P2P Tools

Version: 5.3

1 The LUMP Messaging Protocol

(require (planet erast/p2ptools:1:=0/lump))

LUMP is a simple binary protocol for sending messages with an arbitrary number of arguments over reliable stream ports.

1.1 Messages

The main data structure of LUMP is the message. A message has an id, a sequence number, a referer field, some internal flags, a protocol version, and an argument list.

struct
(struct message (id seqnum referer flags version? args)
#:extra-constructor-name make-message)
  id : (integer-in 0 65535)
  seqnum : (integer-in 0 4294967295)
  referer : (integer-in 0 65535)
  flags : (integer-in 0 15)
  version? : (integer-in 1 16)
  args : (listof/c lump-argument-type?)

Instead of the default message constructor, use new-message or new-response for creating new messages.

1.2 Opening and Closing Sessions

In LUMP, each message carries a sequence number that is increased per session. You must create a new session before sending any messages.

procedure
(new-session) → session?

Creates a new session.

After all messages associated with a session have been sent, use close-session to close it:

procedure
(close-session session) → void/c
session : session?

Closes the session.

1.3 Creating and Writing Messages

procedure
(new-message id [args ...]) → message?
id : (integer-in 0 65535)
args : lump-argument-type? = none/c

Creates a new message with the given command id and an arbitrary number of arguments.

Example:
> (new-message 2 "hello world!" (typed type:int32 7267) #(26 "John"))
#<message>

procedure
(new-response id referer-message [args ...]) → message?
  id : (integer-in 1 65535)
  referer-message : message?
  args : lump-argument-type? = none/c

Creates a response to the given referer message with the given command id and arbitrarily many arguments. The referer field of the returned message contains the seqnum of the original message.

procedure
(write-message session message [out-stream]) → number?
  session : session?
  message : message?
  out-stream : port? = (current-output-port)

Writes a message within the given session to a port and returns the sequence number (seqnum) of the message within that session. This procedure blocks until the message has been written, but it does not call flush-output itself. Call flush-output afterwards to ensure that the message is actually written to a block-buffered port.

1.4 Receiving Messages

procedure
(read-message [in-stream version-check-proc]) → message?
  in-stream : port? = (current-input-port)
  version-check-proc : (number? .->. any/c)
                     = (lambda (version)
                         (when (> version protocol-version)
                           (raise-argument-error
                            'read-message
                            (format "LUMP protocol version ~a or lower" protocol-version)
                            version)))

Synchronously reads a message from the given port, or from the current input port if no port is given, and returns it. The version check procedure can be used to test the version number of an incoming message against the current protocol-version. It is called after the whole header has been read but before any referer or data fields are read, so the stream might be in an undefined state after a failed check.

value
protocol-version : (integer-in 1 16)

The current version of the protocol. This value is independent of the LUMP library version and is only increased when a new version is not backward compatible with previous versions. LUMP is too low-level to provide full forward and backward compatibility. It is advisable never to mix senders and receivers with different protocol versions, and to define your own versioning scheme on top of LUMP serialization if you need such compatibility.
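
The timing of the version check can be illustrated with a small Python sketch of a receiver in another language. It relies on the 7-byte header layout given in section 1.6. The names read_header, default_version_check, and PROTOCOL_VERSION are hypothetical, and the value 1 is only a placeholder for the real protocol version.

```python
import io
import struct

PROTOCOL_VERSION = 1  # placeholder value, assumed for illustration only


def default_version_check(version):
    """Mirror the documented default: reject versions newer than ours."""
    if version > PROTOCOL_VERSION:
        raise ValueError(
            f"expected LUMP protocol version {PROTOCOL_VERSION} or lower, got {version}"
        )


def read_header(stream, version_check=default_version_check):
    """Read the 7-byte header, run the version check, and return its fields."""
    raw = stream.read(7)
    if len(raw) != 7:
        raise EOFError("stream ended inside the LUMP header")
    # "<BHI" = little-endian: 1 byte version/flags, 2 bytes id, 4 bytes seqnum
    first, msg_id, seqnum = struct.unpack("<BHI", raw)
    version = (first >> 4) + 1  # high nibble stores the version minus 1
    flags = first & 0x0F        # low nibble stores the flags
    # The check runs after the whole header but before referer and data are
    # read, so a failing check leaves the rest of the message in the stream.
    version_check(version)
    return version, flags, msg_id, seqnum
```

If the check raises, the referer and data bytes of the rejected message are still unread, which is why the stream should be treated as unusable afterwards.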

1.5 Supported Data Types

LUMP supports numbers, strings, bytes, lists, symbols, and vectors as external argument types that are automatically serialized. In addition to these Racket data types, LUMP also supports the following low-level data types:

Type     Internal  Bytesize  Explanation
bool     0         1         Boolean value as a byte
int8     2         1         Unsigned Byte
int16    3         2         Signed Short
uint16   4         2         Unsigned Short
int32    5         4         Signed 32-bit Integer
uint32   6         4         Unsigned 32-bit Integer
int64    7         8         Signed 64-bit Integer
uint64   8         8         Unsigned 64-bit Integer
text     9         n/a       UTF-8 String (varying length)
symbol   10        n/a       Racket symbol (varying length)
list     11        n/a       List (varying length)
number   12        n/a       Number (varying length)
bytes    13        n/a       Bytes (varying length)
vector   14        n/a       Vector (varying length)

(The values in the Internal column are internal identifiers that are only needed if you want to implement the protocol in another language.)
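
For an implementation in another language, the 16-, 32-, and 64-bit rows of the table above map directly onto fixed-width integer encodings. The Python sketch below assumes that the payload of such an argument is exactly Bytesize bytes in little-endian order and that signed values use two's complement; the table itself does not state this. The names FIXED_INT_TYPES, pack_fixed, and unpack_fixed are hypothetical.

```python
import struct

# name -> (internal id, struct format); ids and sizes are copied from the table
FIXED_INT_TYPES = {
    "int16": (3, "<h"),
    "uint16": (4, "<H"),
    "int32": (5, "<i"),
    "uint32": (6, "<I"),
    "int64": (7, "<q"),
    "uint64": (8, "<Q"),
}


def pack_fixed(type_name, value):
    """Return (internal id, little-endian payload) for a fixed-size integer."""
    type_id, fmt = FIXED_INT_TYPES[type_name]
    # struct.pack raises struct.error if value does not fit the chosen type
    return type_id, struct.pack(fmt, value)


def unpack_fixed(type_name, payload):
    """Decode a payload produced by pack_fixed back into a Python int."""
    _, fmt = FIXED_INT_TYPES[type_name]
    return struct.unpack(fmt, payload)[0]
```

A range check comes for free here, because struct.pack rejects values that do not fit the requested width.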

To use an internal, low-level type, prefix its identifier with type:, so for example type:uint32 is the type for an unsigned 32-bit integer. Use the following procedures to wrap Racket data into a typed structure that can be passed to new-message or new-response.

procedure
(typed type datum) → typed?
type : lump-internal-type?
datum : lump-external-type?

Returns a structure that stores an external type as an explicitly provided internal type. Use this for wrapping Racket numbers of the given byte range into smaller fixed-size internal types.

Example:
> (typed type:uint8 255)
#<typed>

procedure
(untype typed-structure) → lump-external-type?
typed-structure : typed?

Converts a typed structure into a Racket data type, where fixed-size numbers are converted to exact Racket numbers.

Example:
> (untype (typed type:uint8 255))
255

procedure
(lump-internal-type? datum) → boolean?
datum : any/c

Returns true if the given datum represents an internal LUMP type identifier, false otherwise.

Examples:
> (lump-internal-type? type:text)
#t
> (lump-internal-type? 10)
#t
> (lump-internal-type? 130)
#f
> (lump-internal-type? "John")
#f

procedure
(lump-external-type? datum) → boolean?
datum : any/c

Returns true if the given datum has a valid LUMP external data type, i.e. a Racket type that can be serialized and is not a typed structure, false otherwise. Elements of lists or vectors are not checked by this procedure.

Examples:
> (lump-external-type? "John")
#t
> (lump-external-type? '("John" "Mary" "Brian"))
#t
> (lump-external-type? (typed type:int32 2728))
#f
> (lump-external-type? (box 10))
#f

procedure
(lump-argument-type? datum) → boolean?
datum : any/c

Returns true if the given datum can be used as an argument to new-message or new-response and serialized using the LUMP protocol, false otherwise. Elements of lists or vectors are not checked by this procedure.

Examples:
> (lump-argument-type? "John")
#t
> (lump-argument-type? '("John" "Mary" "Brian"))
#t
> (lump-argument-type? (typed type:int32 2728))
#t
> (lump-argument-type? (box 10))
#f

1.6 Description of the LUMP Protocol

You do not need to know the internals of the protocol in order to use it, but the following information is useful if you would like to implement a receiver or sender in another programming language. All numeric values represented by multiple bytes are in little endian format. The structure of a serialized LUMP message is as follows:

Header (7 bytes):
- 4 bits (MSBs of the first byte): Protocol version, stored as a value from 0 to 15 that equals the protocol version number minus 1
- 4 bits (LSBs of the first byte): Flags, of which 2 are currently used:
  - Bit 0: the message has a data portion
  - Bit 1: the message has a referer field
- 2 bytes: Message id
- 4 bytes: Sequence number of the message
Referer (4 bytes, only present if the referer flag is set): Sequence number of the referer message
Data Portion (varying size, only present if the data flag is set):
- 2 bytes: Number argnum of arguments that follow
- For each argument 1...argnum:
  - 1 byte: LUMP internal type of the argument
  - 4 bytes: Length arglen of the following data in bytes
  - arglen bytes: Data of the argument
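
The layout above can be sketched in Python for messages whose arguments are all strings. This is a minimal illustration and not the library's serializer. It assumes that a text argument's data is the raw UTF-8 encoding of the string and that the data flag is left unset when there are no arguments; neither is stated in the description. The names encode_message and decode_message are hypothetical.

```python
import struct

FLAG_DATA = 0b0001     # bit 0: the message has a data portion
FLAG_REFERER = 0b0010  # bit 1: the message has a referer field
TYPE_TEXT = 9          # internal id of "text" in the type table


def encode_message(version, msg_id, seqnum, referer=None, texts=()):
    """Serialize a message whose arguments are all UTF-8 strings."""
    flags = 0
    if texts:
        flags |= FLAG_DATA
    if referer is not None:
        flags |= FLAG_REFERER
    # Header: version minus 1 in the high nibble, flags in the low nibble
    out = struct.pack("<BHI", ((version - 1) << 4) | flags, msg_id, seqnum)
    if referer is not None:
        out += struct.pack("<I", referer)
    if texts:
        out += struct.pack("<H", len(texts))  # argnum
        for text in texts:
            payload = text.encode("utf-8")
            out += struct.pack("<BI", TYPE_TEXT, len(payload)) + payload
    return out


def decode_message(data):
    """Inverse of encode_message; rejects argument types other than text."""
    first, msg_id, seqnum = struct.unpack_from("<BHI", data, 0)
    version, flags = (first >> 4) + 1, first & 0x0F
    pos = 7
    referer = None
    if flags & FLAG_REFERER:
        (referer,) = struct.unpack_from("<I", data, pos)
        pos += 4
    texts = []
    if flags & FLAG_DATA:
        (argnum,) = struct.unpack_from("<H", data, pos)
        pos += 2
        for _ in range(argnum):
            arg_type, arglen = struct.unpack_from("<BI", data, pos)
            pos += 5
            if arg_type != TYPE_TEXT:
                raise ValueError(f"unsupported argument type {arg_type}")
            texts.append(data[pos:pos + arglen].decode("utf-8"))
            pos += arglen
    return version, msg_id, seqnum, referer, texts
```

For example, a version 1 message with id 2, sequence number 5, and the single argument "hi" occupies 16 bytes: 7 for the header, 2 for argnum, 5 for the argument's type and length, and 2 for the data.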

2 The File Transfer Library

(require (planet erast/p2ptools:1:=0/filetransfer))

This library allows you to transfer files over TCP connections. To transfer a file, the receiver must first open a listener, and then the sender must send the file. Sending and receiving are asynchronous, using procedures as event callbacks.

2.1 Receiving Files

procedure
(start-listen local-port save-path from-ip progress-proc final-proc [timeout file-table listen-timeout]) → filetransfer?
  local-port : (and/c exact-nonnegative-integer? (integer-in 0 65535))
  save-path : path?
  from-ip : (or/c string? boolean?)
  progress-proc : ([phase (one-of/c 'listening 'preparing 'receiving)]
                   [path path?]
                   [progress number?]
                   .->. any/c)
  final-proc : ([error-code (one-of/c 'finished 'error)]
                [path path?]
                [total-milliseconds number?]
                [total-bytes-read number?]
                .->. any/c)
  timeout : real? = 60.0
  file-table : (and/c hash? hashtable-mutable?) = (make-hash)
  listen-timeout : real? = 604800.0

Starts listening for incoming connections on the given port. Received files are saved to the folder indicated by save-path, and from-ip specifies the IP address of the remote host from which connections are accepted, or #f if connections from any host are to be accepted. The progress-proc and final-proc procedures are run for each transfer and can be used to track its progress and to be notified when it has finished. The optional timeout value indicates the time after which a network operation fails if no progress has been made during that interval. The optional file-table stores concatenated checksum+suggested filename values that are used to check whether a transfer is to be resumed. For automatic resuming to work, you need to keep this table after partial transfers and provide it again in subsequent transfers. The optional listen-timeout value indicates how long the receiver waits for an incoming connection; if no connection is established within that period, the receiver stops listening.

2.2 Sending Files

procedure
(send-file remote-hostname remote-port file-path suggested-filename progress-proc final-proc [timeout]) → filetransfer?
  remote-hostname : string?
  remote-port : (and/c exact-nonnegative-integer? (integer-in 0 65535))
  file-path : path?
  suggested-filename : string?
  progress-proc : ([phase (one-of/c 'connecting 'preparing 'sending)]
                   [path path?]
                   [progress number?]
                   .->. any/c)
  final-proc : ([error-code (one-of/c 'finished 'error)]
                [path path?]
                [total-milliseconds number?]
                [total-bytes-read number?]
                .->. any/c)
  timeout : real? = 60.0

Sends the file at the given path to the given remote host at the given remote port. The suggested file name tells the receiver how to name the file and is also used to determine whether a previously interrupted file transfer ought to be resumed. However, it is up to the receiver how to actually name the file. The progress and final procedures can be used to track the progress of the file transfer. The optional timeout value indicates the time in seconds after which a network operation fails if no progress has been made during that period. (Currently, not all network operations implement the timeout.)

2.3 Interrupting Transfers

procedure
(filetransfer? datum) → boolean?
datum : any/c

Returns #t if the given datum is a filetransfer object, #f otherwise. Filetransfer objects are returned by start-listen and send-file.

procedure
(kill-transfer transfer) → any/c
transfer : filetransfer?

Kills the current file transfer, disconnecting the remote host immediately and killing any transfer threads. After this function has been called, the socket used for transferring files might be in an unusable state for some time on some systems.

procedure
(finish-transfer transfer [timeout]) → any/c
transfer : filetransfer?
timeout : real? = 259200.0

Blocks until the transfer represented by transfer has finished, and then disconnects from the remote host. If the timeout is reached while the transfer is still in progress, the transfer is killed immediately.

procedure
(wait-transfer transfer [timeout]) → any/c
transfer : filetransfer?
timeout : real? = 259200.0

Waits until the given transfer has finished.

2.4 Resuming Transfers

File transfers are resumed automatically as long as the same file-table hash and the same suggested file name are used for both transfers. When a file is sent by send-file, the sender produces a hash value computed from the middle of the file and sends it together with a suggested file name. The receiver stores these in the file-table hash. If the same file paths and file-table are used on the receiving side, and the sender uses the same suggested file name both times, a file transfer that was previously interrupted and only partially completed is resumed automatically.

2.5 Description of the Transfer Protocol

You do not need to know the internals of the protocol in order to use it, but the following information is useful if you would like to implement a receiver or sender in another programming language. All numeric values represented by multiple bytes are in little endian format. The protocol involves a single handshake reply from the receiver to indicate the position from which the file transfer is to be resumed. A complete transmission of a file works as follows:

The sender sends 20 bytes of checksum data, which is a SHA-1 hash of 16384 bytes from the middle of the file, or of fewer bytes if the file is shorter.
The sender sends a 2-byte value namelen representing the length of the suggested file name, followed by namelen bytes of name data in UTF-8 format.
The receiver then replies with an 8-byte value offset that indicates the position in the file from which to continue the transfer (i.e. offset is 0 if no portion of the file has been received yet).
The sender then sends an 8-byte value datalen, followed by datalen bytes of file content. The content sent starts at position offset of the file, and the total size of the file should be datalen+offset.
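
The four steps can be sketched as framing helpers in Python, working on in-memory bytes rather than sockets. This is an illustration under stated assumptions, not the library's implementation: the checksum window is assumed to be centred on the midpoint of the file, and offset and datalen are assumed to be unsigned, since the description fixes neither. All function names are hypothetical.

```python
import hashlib
import io
import struct

CHUNK = 16384  # size of the checksum window named in the description


def middle_checksum(content):
    """SHA-1 over up to CHUNK bytes from the middle of content (20 bytes).

    Assumption: the window is centred on the midpoint of the file.
    """
    if len(content) <= CHUNK:
        window = content
    else:
        start = (len(content) - CHUNK) // 2
        window = content[start:start + CHUNK]
    return hashlib.sha1(window).digest()


def read_exact(stream, n):
    """Read exactly n bytes or fail."""
    data = stream.read(n)
    if len(data) != n:
        raise EOFError(f"expected {n} bytes, got {len(data)}")
    return data


def build_announcement(content, suggested_name):
    """Steps 1 and 2 (sender): checksum, then namelen and the UTF-8 name."""
    name = suggested_name.encode("utf-8")
    return middle_checksum(content) + struct.pack("<H", len(name)) + name


def parse_announcement(stream):
    """Steps 1 and 2 (receiver): return (checksum, suggested name)."""
    checksum = read_exact(stream, 20)
    (namelen,) = struct.unpack("<H", read_exact(stream, 2))
    return checksum, read_exact(stream, namelen).decode("utf-8")


def build_offset_reply(offset):
    """Step 3 (receiver): the 8-byte resume position."""
    return struct.pack("<Q", offset)


def build_payload(content, offset):
    """Step 4 (sender): datalen followed by the file content from offset."""
    rest = content[offset:]
    return struct.pack("<Q", len(rest)) + rest


def parse_payload(stream):
    """Step 4 (receiver): return the datalen bytes of file content."""
    (datalen,) = struct.unpack("<Q", read_exact(stream, 8))
    return read_exact(stream, datalen)
```

A receiver that already holds the first offset bytes of the file appends the result of parse_payload to them, and can then confirm that datalen+offset matches the size of the reassembled file.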

3 Display Measures

(require (planet erast/p2ptools:1:=0/display-measures))

This library contains utility functions for displaying transfer speeds. The data-rate functions, which use power-of-10 units for the amount of data, are recommended.

procedure
(bits/sec->data-rate-string bit sec [decimals]) → string?
  bit : number?
  sec : number?
  decimals : positive? = 1
procedure
(bytes/sec->data-rate-string byte sec [decimals]) → string?
  byte : number?
  sec : number?
  decimals : positive? = 1
procedure
(bytes/msec->data-rate-string byte msec [decimals]) → string?
  byte : number?
  msec : number?
  decimals : positive? = 1

Produce a string that expresses the data rate, given in bits per second, bytes per second, or bytes per millisecond respectively, rounded to the given number of decimals, using power-of-10 units for the amount of data transferred.

Examples:
> (bits/sec->data-rate-string 0 0)
"0 bit/s"
> (bits/sec->data-rate-string 1817 2)
"1.8 kbit/s"
> (bytes/sec->data-rate-string 1024 1)
"8.2 kbit/s"
> (bytes/sec->data-rate-string 18161881 2 2)
"145.30 Mbit/s"

procedure
(bytes/sec->binary-rate-string byte sec [decimals]) → string?
  byte : number?
  sec : number?
  decimals : positive? = 1
procedure
(bytes/sec->binary-rate-string* byte sec [decimals]) → string?
  byte : number?
  sec : number?
  decimals : positive? = 1
procedure
(bytes/msec->binary-rate-string byte msec [decimals]) → string?
  byte : number?
  msec : number?
  decimals : positive? = 1
procedure
(bytes/msec->binary-rate-string* byte msec [decimals]) → string?
  byte : number?
  msec : number?
  decimals : positive? = 1

Produce a rate string based on power-of-2 units of data at the appropriate scale, e.g. "8.0 KiB/s", rounded to the given number of decimals. The first two functions take the time in seconds and the last two in milliseconds. The starred versions produce the slightly incorrect but more common units "KB/s", "MB/s", etc.
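
As a cross-check of the scaling rule, the following Python sketch reproduces the example outputs documented for these functions. It is not the library's implementation: the name binary_rate_string, its starred flag, and the output for rates below 1024 bytes per second are assumptions made for illustration, and a time value of zero is not handled.

```python
BINARY_PREFIXES = ["", "Ki", "Mi", "Gi", "Ti", "Pi", "Ei", "Zi", "Yi"]


def binary_rate_string(nbytes, seconds, decimals=1, starred=False):
    """Format nbytes per seconds with power-of-2 units.

    seconds must be nonzero; pass msec / 1000 for the millisecond variants.
    """
    rate = nbytes / seconds
    index = 0
    # Divide by 1024 until the value fits the unit or the largest unit is reached
    while rate >= 1024 and index < len(BINARY_PREFIXES) - 1:
        rate /= 1024
        index += 1
    prefix = BINARY_PREFIXES[index]
    if starred:
        prefix = prefix[:1]  # "Ki" becomes "K", "Mi" becomes "M", and so on
    return f"{rate:.{decimals}f} {prefix}B/s"
```

Each step of the loop moves to the next larger unit, so 17161817 bytes in 3 seconds is divided by 1024 twice before it is rendered in MiB/s.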

Examples:
> (bytes/sec->binary-rate-string 17161817 3 2)
"5.46 MiB/s"
> (bytes/sec->binary-rate-string 179171618151816786676278287 2)
"74.1 YiB/s"
> (bytes/sec->binary-rate-string* 17161816719817198 1 1)
"15.2 PB/s"
> (bytes/msec->binary-rate-string* 18161881 2 2)
"8.46 GB/s"
