API Overview
OpenDHT offers the following features:
- Distributed shared key->value data-store.
- IPv4 and IPv6 support.
- Storage of arbitrary binary values up to 64 KiB. Keys are 160 bits long.
- Different values under the same key can be distinguished by a 64-bit ID that is unique within that key.
- Every value also has a "value type". Each value type defines potentially complex storage, editing and expiration policies, allowing, for instance, different expiration times per type. The set of supported value types is hardcoded and known by every node.
Note that OpenDHT is not compatible with the BitTorrent Mainline DHT (which only stores IP addresses).
An optional public-key cryptography layer on top of the DHT makes it possible to put signed or encrypted data on the DHT. Signed values can then be edited, but only by their owner (as verified cryptographically). Signed values retrieved from the DHT are automatically checked and are only presented to the user if the signature verification succeeds.
The identity layer also publishes a (usually self-signed) certificate on the DHT that can be used to encrypt data for other nodes. Encrypted values are always signed, and the signature is part of the encrypted data, to hide the signer identity during transmission. For this reason, like other non-signed values, encrypted values can't be edited (because storage nodes can't check the identity of the author).
OpenDHT uses the dht C++ namespace and is composed of a few major classes:
- InfoHash represents a key or a node ID, both of which are 20-byte (160-bit) bitstrings. InfoHash instances can be compared with the comparison operator ==. The user can compute hashes from strings or binary data using the static methods InfoHash::get(); for instance, InfoHash::get("my_key") returns the SHA1 hash of the string "my_key".
- Value represents a value potentially stored on the DHT. dht::Value is the result type of get operations and the argument type of put operations. A dht::Value can easily be built from any binary object, for instance using the constructor dht::Value::Value(const std::vector<uint8_t>&), or C-style with dht::Value::Value(const uint8_t* ptr, size_t len).
- ValueType defines how data is stored on the DHT: preservation time, storage and editing constraints, etc. Every stored Value has an associated value type. Note that a ValueType usually has no impact on data serialization.
- Value::Filter is a class inheriting from std::function<bool(const Value&)>. It lets you define whether a value should be returned to the user. It also defines some useful methods like chain(Value::Filter&&) and chainOr(Value::Filter&&).
- Query, much like the filters, lets you filter values, but also select fields in each value. It roughly corresponds to SQL SELECT and WHERE statements. In fact, one of its constructors takes an SQL-like formatted string as parameter. Fields on which SELECT and WHERE operations are permitted are listed in Value::Fields. This is a subset of the fields a Value contains. The most meaningful distinction between the query and the filter is that the query is executed by the remote nodes, giving you better control over the traffic triggered by your usage of the library.
- Dht is the class implementing the actual distributed hash table and providing basic operations. It requires an already-open UDP socket to send packets. When used alone, the Dht::periodic method must be called regularly and whenever a packet is received.
- SecureDht is a child class of dht::Dht that exposes its APIs, transparently checks signed values (for get and listen operations), decrypts encrypted values (those that we can decrypt), and provides additional methods to publish signed or encrypted values.
- DhtRunner provides a thread-safe interface to SecureDht and manages UDP sockets. DhtRunner is what most applications using OpenDHT should use: the instance can be safely shared and used independently by various components or threads, with networking managed transparently. DhtRunner can launch a dedicated thread or be integrated in the program main loop.
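To make the "20 bytes, compared with ==" description of keys concrete, here is a minimal standalone sketch. Key160 and key_example are hypothetical names for illustration, not OpenDHT types, and the SHA1 hashing performed by InfoHash::get() is deliberately left out.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <string>

// Stand-in for a 160-bit key or node ID (not the real dht::InfoHash).
struct Key160 {
    std::array<std::uint8_t, 20> bytes{}; // zero-initialized

    bool operator==(const Key160& o) const { return bytes == o.bytes; }
    bool operator!=(const Key160& o) const { return !(*this == o); }

    // Renders the key as 40 lowercase hexadecimal characters.
    std::string to_hex() const {
        static const char digits[] = "0123456789abcdef";
        std::string out;
        out.reserve(bytes.size() * 2);
        for (std::uint8_t b : bytes) {
            out.push_back(digits[b >> 4]);
            out.push_back(digits[b & 0x0f]);
        }
        return out;
    }
};

// Usage: two keys are equal only if all 20 bytes match.
void key_example() {
    Key160 a, b;
    assert(a == b);
    b.bytes[19] = 1;
    assert(a != b);
}
```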
Get/listen operations take a callback argument of type GetCallback or GetCallbackSimple (both can be used):
using GetCallback = std::function<bool(const std::vector<std::shared_ptr<dht::Value>>& values)>;
using GetCallbackSimple = std::function<bool(const std::shared_ptr<dht::Value>& value)>;
Query operations take a callback argument of type QueryCallback, defined as:
using QueryCallback = std::function<bool(const std::vector<std::shared_ptr<dht::FieldValueIndex>>& fields)>;
Many operations also use an "operation completed" callback DoneCallback, defined as:
using DoneCallback = std::function<void(bool success)>;
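Because both callback forms are accepted, a per-value callback can be thought of as a batch callback that visits each value in turn. The sketch below illustrates that relationship with a stand-in Value struct and a hypothetical helper named adapt_simple; it is not taken from the OpenDHT sources.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <memory>
#include <vector>

// Stand-in for dht::Value, reduced to an ID for illustration only.
struct Value {
    std::uint64_t id;
};

using GetCallback = std::function<bool(const std::vector<std::shared_ptr<Value>>& values)>;
using GetCallbackSimple = std::function<bool(const std::shared_ptr<Value>& value)>;

// Hypothetical helper: wraps a per-value callback into a batch callback.
// The wrapper stops at the first value for which the simple callback
// returns false, and then returns false for the whole batch.
GetCallback adapt_simple(GetCallbackSimple simple) {
    assert(simple && "a non-empty callback is required");
    return [simple](const std::vector<std::shared_ptr<Value>>& values) {
        for (const auto& v : values)
            if (!simple(v))
                return false; // the caller asked to stop
        return true;          // keep going
    };
}
```

With this shape, application code can be written against a single value at a time while the search machinery only ever deals with batches.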
The Dht class provides the core API. Important methods are:
- Constructor
Dht::Dht(int s, int s6, const InfoHash& id)
The constructor takes open IPv4 and IPv6 UDP sockets used to send packets, and the node ID. At least one open socket must be provided for the Dht instance to be considered running. If no valid socket is available for an address family, the value -1 should be passed instead.
Most applications using OpenDHT should use the DhtRunner class, which instantiates Dht, handles networking transparently and provides a thread-safe interface to the Dht instance.
- Get
void Dht::get(const InfoHash& key, GetCallback cb, DoneCallback donecb={}, Value::Filter f = {}, Query q = {});
Get initiates a search on the network for values associated with the provided key. Results are provided during the search through the second argument, cb. The callback is called each time new values are found on the network, until the search ends or the callback returns false. An optional DoneCallback is called on operation completion (success or failure), after which no further callback is called.
Filter: optional predicate to pre-filter values before they are passed to the callback.
Query: optional query to filter values on remote nodes.
Example using Dht::get:
// node is a running instance of dht::Dht
node.get(
    dht::InfoHash::get("some_key"),
    [](const std::vector<std::shared_ptr<dht::Value>>& values) {
        for (const auto& v : values)
            std::cout << "Got value: " << *v << std::endl;
        return true; // keep looking for values
    },
    [](bool success) {
        std::cout << "Get finished with " << (success ? "success" : "failure") << std::endl;
    }
);
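The callback contract described for get (repeated calls, stop when the callback returns false, one final completion call) can be modeled without any networking. The following is a sketch under stated assumptions: Value is a stand-in struct, simulate_get is a hypothetical driver, and the model always reports success, whereas a real search can also fail.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <vector>

struct Value { std::string data; }; // stand-in for dht::Value
using ValuePtr = std::shared_ptr<Value>;
using GetCallback = std::function<bool(const std::vector<ValuePtr>&)>;
using DoneCallback = std::function<void(bool success)>;
using Filter = std::function<bool(const Value&)>;

// Hypothetical driver: each inner vector plays the role of one response
// received during the search.
void simulate_get(const std::vector<std::vector<ValuePtr>>& batches,
                  GetCallback cb, DoneCallback done_cb = {}, Filter f = {}) {
    assert(cb && "a result callback is required");
    for (const auto& batch : batches) {
        std::vector<ValuePtr> accepted;
        for (const auto& v : batch)
            if (!f || f(*v))        // an empty filter accepts everything
                accepted.push_back(v);
        if (accepted.empty())
            continue;               // nothing new to report for this batch
        if (!cb(accepted))
            break;                  // the user asked to stop the search
    }
    if (done_cb)
        done_cb(true);              // called once; no callback runs afterwards
}
```

The local filter runs before the result callback, so values it rejects never reach user code, but in a real search they would still have crossed the network.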
- Query
void Dht::query(const InfoHash& key, QueryCallback cb, DoneCallback done_cb = {}, Query&& q = {});
Query initiates a search on the network at the provided key for specific value fields. Results are provided during the search through the second argument, cb. The callback is called each time new field values are found on the network, until the search ends or the callback returns false. An optional DoneCallback is called on operation completion (success or failure), after which no further callback is called.
Query: optional query to select fields and filter values on remote nodes.
Example using Dht::query:
// node is a running instance of dht::Dht
node.query(
    dht::InfoHash::get("some_key"),
    [](const std::vector<std::shared_ptr<dht::FieldValueIndex>>& fields) {
        for (const auto& i : fields)
            std::cout << "Got index: " << *i << std::endl;
        return true; // keep looking for field value indexes
    },
    [](bool success) {
        std::cout << "Query finished with " << (success ? "success" : "failure") << std::endl;
    }
);
- Put
void Dht::put(const InfoHash& key, const std::shared_ptr<Value>& value, DoneCallback cb = {});
Put initiates publication of a value on the network at the provided key. See Data serialization for more information about how to build a dht::Value instance. An optional DoneCallback is called on operation completion (success or failure).
If the value ID is dht::Value::INVALID_ID (0) when put is called, the Value::id field is set during the operation to identify the value.
A value remains on the network for its lifetime (10 minutes by default). Call put again with the same key and value to refresh the expiration deadline.
Values can't be edited by default (with the exception of signed values): if a value with the same value ID already exists on the network, the new value is ignored by default.
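The storage rules above (ID assignment, expiration, refresh, no edits by default) can be summarized in a toy single-node model. ToyStore and its members are hypothetical and only mirror the rules as stated here; the counter used to pick IDs is an assumption of the model, and OpenDHT's actual storage code and value type policies are not shown.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

using Clock = std::chrono::steady_clock;

struct Value { // stand-in for dht::Value
    static constexpr std::uint64_t INVALID_ID = 0;
    std::uint64_t id = INVALID_ID;
    std::string data;
};

class ToyStore {
public:
    explicit ToyStore(std::chrono::minutes lifetime = std::chrono::minutes(10))
        : lifetime_(lifetime) {
        assert(lifetime.count() > 0);
    }

    // Returns true if the value was stored or refreshed, false if ignored.
    bool put(const std::string& key, Value& v, Clock::time_point now) {
        if (v.id == Value::INVALID_ID)
            v.id = next_id_++; // assumption: the model picks IDs from a counter
        auto k = std::make_pair(key, v.id);
        auto it = values_.find(k);
        if (it != values_.end() && it->second.expires > now) {
            if (it->second.data != v.data)
                return false;                     // same ID, new content: ignored
            it->second.expires = now + lifetime_; // same value: refresh deadline
            return true;
        }
        values_[k] = Entry{v.data, now + lifetime_};
        return true;
    }

    bool contains(const std::string& key, std::uint64_t id, Clock::time_point now) const {
        auto it = values_.find(std::make_pair(key, id));
        return it != values_.end() && it->second.expires > now;
    }

private:
    struct Entry { std::string data; Clock::time_point expires; };
    std::chrono::minutes lifetime_;
    std::uint64_t next_id_ = 1;
    std::map<std::pair<std::string, std::uint64_t>, Entry> values_;
};
```

Passing the clock value explicitly keeps the model deterministic, which makes the expiration and refresh behavior easy to check.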
Example using Dht::put:
const char* my_data = "42 cats";
// node is a running instance of dht::Dht
node.put(
    dht::InfoHash::get("some_key"),
    dht::Value((const uint8_t*)my_data, std::strlen(my_data))
);
- Listen
size_t Dht::listen(const InfoHash& key, GetCallback cb, Value::Filter f = {}, Query q = {});
Listen initiates a search on the network for values associated with the provided key and keeps the caller informed of new values published at key: the callback cb is called every time there is a new or changed value at key, until cb returns false or the operation is canceled with bool cancelListen(const InfoHash& key, size_t token), where token is the return value of listen. Calling cancelListen has the same effect as returning false from the callback.
Example using Dht::listen:
auto key = dht::InfoHash::get("some_key");
auto token = node.listen(key,
    [](const std::vector<std::shared_ptr<dht::Value>>& values) {
        for (const auto& v : values)
            std::cout << "Found value: " << *v << std::endl;
        return true; // keep listening
    }
);
// later
node.cancelListen(key, std::move(token));
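The token mechanism can be pictured as a per-key table of callbacks. The sketch below is a local model only: ToyListeners and publish are hypothetical names, values are plain strings, and nothing here reflects how OpenDHT tracks listeners internally. Callbacks in this model must not call listen or cancelListen re-entrantly.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

using ValueCallback = std::function<bool(const std::vector<std::string>& values)>;

class ToyListeners {
public:
    // Registers a callback for `key` and returns the token identifying it.
    std::size_t listen(const std::string& key, ValueCallback cb) {
        assert(cb && "a callback is required");
        std::size_t token = next_token_++;
        listeners_[key][token] = std::move(cb);
        return token;
    }

    // Returns true if a listener with this token was registered and removed.
    bool cancelListen(const std::string& key, std::size_t token) {
        auto it = listeners_.find(key);
        return it != listeners_.end() && it->second.erase(token) > 0;
    }

    // Simulates new or changed values appearing at `key`.
    void publish(const std::string& key, const std::vector<std::string>& values) {
        auto it = listeners_.find(key);
        if (it == listeners_.end())
            return;
        auto& table = it->second;
        for (auto l = table.begin(); l != table.end();) {
            if (l->second(values))
                ++l;
            else
                l = table.erase(l); // returning false ends the subscription
        }
    }

private:
    std::size_t next_token_ = 1;
    std::map<std::string, std::map<std::size_t, ValueCallback>> listeners_;
};
```

In this model, returning false from the callback and calling cancelListen both end in the same erase, which matches the equivalence stated above.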
Listen with type template for automatic deserialization:
struct Cloud {
    uint32_t altitude;
    double width, height;
    bool rainbow;
    MSGPACK_DEFINE_MAP(altitude, width, height, rainbow);
};
std::vector<Cloud> found_clouds;
auto key = dht::InfoHash::get("some_key");
auto token = node.listen<Cloud>(key, [&](Cloud&& value) {
    // warning: called from another thread
    found_clouds.emplace_back(std::move(value));
    return true; // keep listening
});
// later
node.cancelListen(key, std::move(token));
A filter is a std::function<bool(const dht::Value&)> predicate used to filter values.
auto coolValueFilter = [](const dht::Value& v) {
    return v.user_type == "cool" and v.data.size() < 64;
};
node.get(
    dht::InfoHash::get("coolKey"),
    [](const std::shared_ptr<dht::Value>& value) {
        std::cout << "That's a cool value: " << *value << std::endl;
        return true; // keep looking for values
    },
    [](bool success) {
        std::cout << "Op went " << (success ? "cool" : "not cool") << std::endl;
    },
    coolValueFilter);
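Filters can also be combined, which is what the chain and chainOr methods mentioned earlier are for. The sketch below models that idea with plain std::function objects. The free functions chain_and, chain_or and accepts are hypothetical stand-ins, and treating an empty filter as "accept everything" is an assumption of this model rather than a documented guarantee.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

struct Value { // stand-in for dht::Value
    std::string user_type;
    std::vector<std::uint8_t> data;
};

using Filter = std::function<bool(const Value&)>;

// Both filters must accept the value. An empty filter imposes no constraint.
Filter chain_and(Filter a, Filter b) {
    if (!a) return b;
    if (!b) return a;
    return [a, b](const Value& v) { return a(v) && b(v); };
}

// Either filter may accept the value. If one side is empty, everything passes.
Filter chain_or(Filter a, Filter b) {
    if (!a || !b) return Filter{};
    return [a, b](const Value& v) { return a(v) || b(v); };
}

bool accepts(const Filter& f, const Value& v) { return !f || f(v); }

// Usage: split the "cool" predicate from the example above into two parts.
void filter_example() {
    Filter is_cool = [](const Value& v) { return v.user_type == "cool"; };
    Filter is_small = [](const Value& v) { return v.data.size() < 64; };
    Value v{"cool", std::vector<std::uint8_t>(8)};
    assert(accepts(chain_and(is_cool, is_small), v));
}
```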
As you can see, the Value::Filter class is really flexible. However, this filtering is only applied on the local node, when values are received in a response. What if you know that the storage you're interested in hosts a large number of values and you don't want to trigger heavy traffic? Use queries!
A similar request, filtered on the remote nodes with a query instead of a local filter, looks as follows:
dht::Where w;
w.id(5); /* the same as Where w("WHERE id=5"); */
node.get(
    dht::InfoHash::get("some_key"),
    [](const std::vector<std::shared_ptr<dht::Value>>& values) {
        for (const auto& v : values)
            std::cout << "This value has passed through the remote nodes' filters " << *v << std::endl;
        return true; // keep looking for values
    },
    [](bool success) {
        std::cout << "Get finished with " << (success ? "success" : "failure") << std::endl;
    }, {}, w
);
All available fields are listed below:
| Field |
|---|
| Id |
| ValueType |
| OwnerPk |
| UserType |
Note: when a query is initialized from a string, field names are written in snake case (for example value_type).
A query can tell whether it is satisfied by another query. For example:
Query q1;
q1.where.id(5); // the whole value with id=5 will be sent
Query q2 {{"SELECT value_type"}};
// q2 the same as Query q("SELECT * WHERE value_type=10,user_type=foo_type");
q2.where.valueType(10).userType("foo_type");
Query q3("SELECT id WHERE id=5"); // only the id=5 will be sent
q1.isSatisfiedBy(q3); // false
q2.isSatisfiedBy(q1); // false
q3.isSatisfiedBy(q1); // true
q2.isSatisfiedBy(q3); // false
SecureDht extends dht::Dht and provides the same API methods (get, put, listen). It adds a public-key cryptography layer on top of the DHT. A user-provided or generated Identity (RSA key pair) is used for signing and decrypting.
Values returned to the user by ::get and ::listen are checked beforehand and filtered: signed values are dropped if their signature verification fails. Similarly, encrypted values are provided decrypted to the user if we can decrypt them, and dropped otherwise.
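The filtering rules for signed and encrypted values can be sketched as a small pipeline. Everything below is a model: ToyValue, Verify, Decrypt and secure_filter are hypothetical names, and the actual signature verification and decryption are replaced by caller-supplied functions.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

struct ToyValue {
    std::string payload;
    bool is_signed;
    std::string recipient; // empty when the value is not encrypted
};

// Returns true when the signature of a signed value checks out.
using Verify = std::function<bool(const ToyValue&)>;
// Returns true and fills `plain` when the value can be decrypted by us.
using Decrypt = std::function<bool(const ToyValue&, std::string& plain)>;

std::vector<ToyValue> secure_filter(const std::vector<ToyValue>& in,
                                    const Verify& verify, const Decrypt& decrypt) {
    assert(verify && decrypt);
    std::vector<ToyValue> out;
    for (const auto& v : in) {
        if (!v.recipient.empty()) {
            std::string plain;
            if (!decrypt(v, plain))
                continue;            // encrypted, but not readable by us: drop
            ToyValue d = v;
            d.payload = plain;       // hand the decrypted content to the user
            out.push_back(d);        // recipient stays set on the result
        } else if (v.is_signed) {
            if (verify(v))
                out.push_back(v);    // signature verified: keep
        } else {
            out.push_back(v);        // plain value: passed through unchanged
        }
    }
    return out;
}
```

Keeping the recipient field on decrypted values lets user code still tell which results arrived encrypted.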
The user can tell whether a value was encrypted by checking the recipient field of the Value (which should be our public key ID).
As a layer on top of Dht, SecureDht can also be used for plain values. Methods like get and put behave the same as in Dht for non-encrypted and non-signed values.
Additionally, SecureDht adds a few methods:
- PutSigned
void putSigned(const InfoHash& hash, const std::shared_ptr<Value>& val, DoneCallback callback);
- PutEncrypted
void putEncrypted(const InfoHash& hash, const InfoHash& to, std::shared_ptr<Value> val, DoneCallback callback);
DhtRunner provides thread-safe access to the running DHT instance and exposes all methods from SecureDht. For more information, see: Running a node in your program