2013-12-15 10:43:31 +00:00
|
|
|
Block Exchange Protocol v1.0
|
|
|
|
============================
|
|
|
|
|
|
|
|
Introduction and Definitions
|
|
|
|
----------------------------
|
|
|
|
|
|
|
|
The BEP is used between two or more _nodes_ thus forming a _cluster_.
|
|
|
|
Each node has a _repository_ of files described by the _local model_,
|
|
|
|
containing modifications times and block hashes. The local model is sent
|
|
|
|
to the other nodes in the cluster. The union of all files in the local
|
|
|
|
models, with files selected for most recent modification time, forms the
|
2014-01-06 21:27:57 +00:00
|
|
|
_global model_. Each node strives to get it's repository in sync with
|
2013-12-15 10:43:31 +00:00
|
|
|
the global model by requesting missing blocks from the other nodes.
|
|
|
|
|
|
|
|
Transport and Authentication
|
|
|
|
----------------------------
|
|
|
|
|
|
|
|
The BEP itself does not provide retransmissions, compression, encryption
|
|
|
|
nor authentication. It is expected that this is performed at lower
|
|
|
|
layers of the networking stack. A typical deployment stack should be
|
|
|
|
similar to the following:
|
|
|
|
|
|
|
|
|-----------------------------|
|
|
|
|
| Block Exchange Protocol |
|
|
|
|
|-----------------------------|
|
|
|
|
| Compression (RFC 1951) |
|
|
|
|
|-----------------------------|
|
|
|
|
| Encryption & Auth (TLS 1.0) |
|
|
|
|
|-----------------------------|
|
|
|
|
| TCP |
|
|
|
|
|-----------------------------|
|
|
|
|
v v
|
|
|
|
|
|
|
|
The exact nature of the authentication is up to the application.
|
|
|
|
Possibilities include certificates signed by a common trusted CA,
|
|
|
|
preshared certificates, preshared certificate fingerprints or
|
|
|
|
certificate pinning combined with some out of band first verification.
|
|
|
|
|
|
|
|
There is no required order or synchronization among BEP messages - any
|
|
|
|
message type may be sent at any time and the sender need not await a
|
|
|
|
response to one message before sending another. Responses must however
|
|
|
|
be sent in the same order as the requests are received.
|
|
|
|
|
|
|
|
Compression is started directly after a successfull TLS handshake,
|
|
|
|
before the first message is sent. The compression is flushed at each
|
|
|
|
message boundary.
|
|
|
|
|
|
|
|
Messages
|
|
|
|
--------
|
|
|
|
|
|
|
|
Every message starts with one 32 bit word indicating the message version
|
|
|
|
and type. For BEP v1.0 the Version field is set to zero. Future versions
|
|
|
|
with incompatible message formats will increment the Version field. The
|
|
|
|
reserved bits must be set to zero.
|
|
|
|
|
|
|
|
0 1 2 3
|
|
|
|
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
|
|
|
|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
|
|
|
| Ver=0 | Message ID | Type | Reserved |
|
|
|
|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
|
|
|
|
|
|
|
All data following the message header is in XDR (RFC 1014) encoding.
|
|
|
|
The actual data types in use by BEP, in XDR naming convention, are:
|
|
|
|
|
2014-01-07 21:44:21 +00:00
|
|
|
- (unsigned) int -- (unsigned) 32 bit integer
|
|
|
|
- (unsigned) hyper -- (unsigned) 64 bit integer
|
|
|
|
- opaque<> -- variable length opaque data
|
|
|
|
- string<> -- variable length string
|
2013-12-15 10:43:31 +00:00
|
|
|
|
|
|
|
The encoding of opaque<> and string<> are identical, the distinction is
|
|
|
|
solely in interpretation. Opaque data should not be interpreted as such,
|
|
|
|
but can be compared bytewise to other opaque data. All strings use the
|
|
|
|
UTF-8 encoding.
|
|
|
|
|
|
|
|
### Index (Type = 1)
|
|
|
|
|
|
|
|
The Index message defines the contents of the senders repository. A Index
|
|
|
|
message is sent by each peer immediately upon connection and whenever the
|
|
|
|
local repository contents changes. However, if a peer has no data to
|
|
|
|
advertise (the repository is empty, or it is set to only import data) it
|
|
|
|
is allowed but not required to send an empty Index message (a file list of
|
|
|
|
zero length). If the repository contents change from non-empty to empty,
|
|
|
|
an empty Index message must be sent. There is no response to the Index
|
|
|
|
message.
|
|
|
|
|
|
|
|
struct IndexMessage {
|
|
|
|
FileInfo Files<>;
|
|
|
|
}
|
|
|
|
|
|
|
|
struct FileInfo {
|
|
|
|
string Name<>;
|
|
|
|
unsigned int Flags;
|
|
|
|
hyper Modified;
|
2014-01-07 21:44:21 +00:00
|
|
|
unsigned int Version;
|
2013-12-15 10:43:31 +00:00
|
|
|
BlockInfo Blocks<>;
|
|
|
|
}
|
|
|
|
|
|
|
|
struct BlockInfo {
|
|
|
|
unsigned int Length;
|
|
|
|
opaque Hash<>
|
|
|
|
}
|
|
|
|
|
|
|
|
The file name is the part relative to the repository root. The
|
|
|
|
modification time is expressed as the number of seconds since the Unix
|
2014-01-07 21:44:21 +00:00
|
|
|
Epoch. The version field is a counter that increments each time the file
|
|
|
|
changes but resets to zero each time the modification is updated. This
|
|
|
|
is used to signal changes to the file (or file metadata) while the
|
|
|
|
modification time remains unchanged. The hash algorithm is implied by
|
|
|
|
the hash length. Currently, the hash must be 32 bytes long and computed
|
|
|
|
by SHA256.
|
2013-12-15 10:43:31 +00:00
|
|
|
|
|
|
|
The flags field is made up of the following single bit flags:
|
|
|
|
|
|
|
|
0 1 2 3
|
|
|
|
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
|
|
|
|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
2014-01-07 21:44:21 +00:00
|
|
|
| Reserved |I|D| Unix Perm. & Mode |
|
2013-12-15 10:43:31 +00:00
|
|
|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
|
|
|
|
|
|
|
- The lower 12 bits hold the common Unix permission and mode bits.
|
|
|
|
|
|
|
|
- Bit 19 ("D") is set when the file has been deleted. The block list
|
|
|
|
shall contain zero blocks and the modification time indicates the
|
|
|
|
time of deletion or, if deletion time is not reliably determinable,
|
2014-01-07 21:44:21 +00:00
|
|
|
the last known modification time and a higher version number.
|
2013-12-15 10:43:31 +00:00
|
|
|
|
2014-01-07 21:44:21 +00:00
|
|
|
- Bit 18 ("I") is set when the file is invalid and unavailable for
|
|
|
|
synchronization. A peer may set this bit to indicate that it can
|
|
|
|
temporarily not serve data for the file.
|
|
|
|
|
|
|
|
- Bit 0 through 17 are reserved for future use and shall be set to
|
2013-12-15 10:43:31 +00:00
|
|
|
zero.
|
|
|
|
|
|
|
|
### Request (Type = 2)
|
|
|
|
|
|
|
|
The Request message expresses the desire to receive a data block
|
|
|
|
corresponding to a part of a certain file in the peer's repository.
|
|
|
|
|
|
|
|
The requested block must correspond exactly to one block seen in the
|
|
|
|
peer's Index message. The hash field must be set to the expected value by
|
|
|
|
the sender. The receiver may validate that this is actually the case
|
|
|
|
before transmitting data. Each Request message must be met with a Response
|
|
|
|
message.
|
|
|
|
|
|
|
|
struct RequestMessage {
|
|
|
|
string Name<>;
|
|
|
|
unsigned hyper Offset;
|
|
|
|
unsigned int Length;
|
|
|
|
opaque Hash<>;
|
|
|
|
}
|
|
|
|
|
|
|
|
The hash algorithm is implied by the hash length. Currently, the hash
|
|
|
|
must be 32 bytes long and computed by SHA256.
|
|
|
|
|
|
|
|
The Message ID in the header must set to a unique value to be able to
|
|
|
|
correlate the request with the response message.
|
|
|
|
|
|
|
|
### Response (Type = 3)
|
|
|
|
|
|
|
|
The Response message is sent in response to a Request message. In case the
|
|
|
|
requested data was not available (an outdated block was requested, or
|
|
|
|
the file has been deleted), the Data field is empty.
|
|
|
|
|
|
|
|
struct ResponseMessage {
|
|
|
|
opaque Data<>
|
|
|
|
}
|
|
|
|
|
|
|
|
The Message ID in the header is used to correlate requests and
|
|
|
|
responses.
|
|
|
|
|
|
|
|
### Ping (Type = 4)
|
|
|
|
|
|
|
|
The Ping message is used to determine that a connection is alive, and to
|
|
|
|
keep connections alive through state tracking network elements such as
|
|
|
|
firewalls and NAT gateways. The Ping message has no contents.
|
|
|
|
|
|
|
|
struct PingMessage {
|
|
|
|
}
|
|
|
|
|
|
|
|
### Pong (Type = 5)
|
|
|
|
|
|
|
|
The Pong message is sent in response to a Ping. The Pong message has no
|
|
|
|
contents, but copies the Message ID from the Ping.
|
|
|
|
|
|
|
|
struct PongMessage {
|
|
|
|
}
|
|
|
|
|
2013-12-28 13:10:36 +00:00
|
|
|
### IndexUpdate (Type = 6)
|
|
|
|
|
|
|
|
This message has exactly the same structure as the Index message.
|
|
|
|
However instead of replacing the contents of the repository in the
|
|
|
|
model, the Index Update merely amends it with new or updated file
|
|
|
|
information. Any files not mentioned in an Index Update are left
|
|
|
|
unchanged.
|
|
|
|
|
2013-12-15 10:43:31 +00:00
|
|
|
Example Exchange
|
|
|
|
----------------
|
|
|
|
|
|
|
|
A B
|
|
|
|
1. Index-> <-Index
|
|
|
|
2. Request->
|
|
|
|
3. Request->
|
|
|
|
4. Request->
|
|
|
|
5. Request->
|
|
|
|
6. <-Response
|
|
|
|
7. <-Response
|
|
|
|
8. <-Response
|
|
|
|
9. <-Response
|
|
|
|
10. Index->
|
|
|
|
...
|
|
|
|
11. Ping->
|
|
|
|
12. <-Pong
|
|
|
|
|
|
|
|
The connection is established and at 1. both peers send Index records.
|
|
|
|
The Index records are received and both peers recompute their knowledge
|
|
|
|
of the data in the cluster. In this example, peer A has four missing or
|
|
|
|
outdated blocks. At 2 through 5 peer A sends requests for these blocks.
|
|
|
|
The requests are received by peer B, who retrieves the data from the
|
|
|
|
repository and transmits Response records (6 through 9). Node A updates
|
|
|
|
their repository contents and transmits an updated Index message (10).
|
|
|
|
Both peers enter idle state after 10. At some later time 11, peer A
|
|
|
|
determines that it has not seen data from B for some time and sends a
|
|
|
|
Ping request. A response is sent at 12.
|
|
|
|
|