doc/en/transfer-engine.md
+30 -4 (30 additions & 4 deletions)
@@ -94,13 +94,28 @@ The sample program provided in `mooncake-transfer-engine/example/transfer_engine
94
94
95
95
After successfully compiling Transfer Engine, the test program `transfer_engine_bench` can be found in the `build/mooncake-transfer-engine/example` directory.
96
96
97
-
1. **Start the `etcd` service.** This service is used for the centralized highly available management of various metadata for Mooncake, including the internal connection status of Transfer Engine. It is necessary to ensure that both the initiator and target nodes can smoothly access this etcd service, so pay attention to:
98
-
- The listening IP of the etcd service should not be 127.0.0.1; it should be determined in conjunction with the network environment. In the experimental environment, 0.0.0.0 can be used. For example, the following command line can be used to start the required service:
97
+
1. **Start the `metadata` service.** This service provides centralized, highly available management of Mooncake's metadata, including the internal connection status of Transfer Engine. Both the initiator and target nodes must be able to reach this metadata service, so note the following:
98
+
- The listening IP of the metadata service should not be 127.0.0.1; choose it according to the network environment. In an experimental environment, 0.0.0.0 can be used.
99
+
- On some platforms, if the `http_proxy` or `https_proxy` environment variables are set on the initiator and target nodes, they can also affect the communication between Transfer Engine and the metadata service.
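A quick way to check the proxy caveat before launching a node is to look for these variables in the environment. The following is a minimal sketch, not part of Mooncake; the helper name `active_proxy_vars` is hypothetical, and it checks only the two variable names mentioned above.

```python
import os

# The two variable names mentioned in the note above.
PROXY_VARS = ("http_proxy", "https_proxy")


def active_proxy_vars(environ=None):
    """Return the proxy variables that are set to a non-empty value.

    `environ` defaults to the current process environment; passing a dict
    makes the helper easy to exercise without touching real settings.
    """
    env = os.environ if environ is None else environ
    return {name: env[name] for name in PROXY_VARS if env.get(name)}
```

If the returned dictionary is not empty, unsetting those variables in the shell that starts the initiator and target nodes is one option to try.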
100
+
101
+
Transfer Engine supports multiple kinds of metadata services, including `etcd`, `redis`, and `http`. The following describes how to start the metadata service using `etcd` and `http` as examples.
102
+
103
+
1.1. **`etcd`**
104
+
105
+
For example, the following command line can be used to start the etcd service:
- On some platforms, if the initiator and target nodes have set the `http_proxy` or `https_proxy` environment variables, it will also affect the communication between Transfer Engine and the etcd service, reporting the "Error from etcd client: 14" error.
110
+
111
+
1.2. **`http`**
112
+
113
+
For example, you can use the `http` service in the `mooncake-transfer-engine/example/http-metadata-server` example:
114
+
```bash
115
+
# This is 10.0.0.1
116
+
# cd mooncake-transfer-engine/example/http-metadata-server
117
+
go run . --addr=:8080
118
+
```
104
119
105
120
2. **Start the target node.**
106
121
```bash
@@ -117,6 +132,7 @@ After successfully compiling Transfer Engine, the test program `transfer_engine_
117
132
- `--mode=target` indicates the start of the target node. The target node does not initiate read/write requests; it passively supplies or writes data as required by the initiator node.
118
133
> Note: In actual applications, there is no need to distinguish between target nodes and initiator nodes; each node can freely initiate read/write requests to other nodes in the cluster.
119
134
- `--metadata_server` is the address of the metadata server (the full address of the etcd service).
135
+
> Change `--metadata_server` to `--metadata_server=http://10.0.0.1:8080/metadata` and add `--metadata_type=http` when using `http` as the `metadata` service.
120
136
- `--local_server_name` represents the address of this machine, which does not need to be set in most cases. If this option is not set, the value is equivalent to the hostname of this machine (i.e., `hostname(2)`). Other nodes in the cluster will use this address to attempt out-of-band communication with this node to establish RDMA connections.
121
137
> Note: If out-of-band communication fails, the connection cannot be established. Therefore, if necessary, you need to modify the `/etc/hosts` file on all nodes in the cluster to locate the correct node through the hostname.
122
138
- `--device_name` indicates the name of the RDMA network card used in the transfer process.
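The options described above can be combined into a launch command for the target node. The following sketch only assembles the argument list from the flags named in this section; the helper name `target_command`, the default binary path, and the device name used in the usage note are placeholders, not values taken from the documentation.

```python
def target_command(metadata_server, device_name, metadata_type=None,
                   local_server_name=None, binary="./transfer_engine_bench"):
    """Assemble the argument list for starting a target node.

    Only flags described in this section are emitted. `binary` is a
    placeholder path; adjust it to where the benchmark was built.
    """
    args = [binary, "--mode=target", f"--metadata_server={metadata_server}"]
    if metadata_type is not None:
        # Needed when the metadata service is not the default, e.g. "http".
        args.append(f"--metadata_type={metadata_type}")
    if local_server_name is not None:
        # Usually omitted; the hostname of the machine is used instead.
        args.append(f"--local_server_name={local_server_name}")
    args.append(f"--device_name={device_name}")
    return args
```

For the `http` metadata service started earlier, `target_command("http://10.0.0.1:8080/metadata", "mlx5_0", metadata_type="http")` yields the list to pass to `subprocess.run`, where `mlx5_0` stands in for the actual RDMA device name.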
@@ -414,14 +430,24 @@ Value = {
414
430
```
415
431
</details>
416
432
433
+
### HTTP Metadata Server
434
+
435
+
The HTTP server should implement the following three RESTful APIs, taking a metadata server configured as `http://host:port/metadata` as an example:
436
+
437
+
1. `GET /metadata?key=$KEY`: Get the metadata corresponding to `$KEY`.
438
+
2. `PUT /metadata?key=$KEY`: Update the metadata corresponding to `$KEY` to the value of the request body.
439
+
3. `DELETE /metadata?key=$KEY`: Delete the metadata corresponding to `$KEY`.
440
+
441
+
For specific implementation, refer to the demo service implemented in Golang at [mooncake-transfer-engine/example/http-metadata-server](../../mooncake-transfer-engine/example/http-metadata-server).
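To make the three endpoints concrete, here is a minimal in-memory sketch in Python. It is not the Golang demo service: the class and function names are hypothetical, and the status codes (404 for an unknown key, 400 for a malformed PUT) are assumptions, since the text above does not specify them.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.parse import parse_qs, urlparse


class MetadataHandler(BaseHTTPRequestHandler):
    """Handles GET/PUT/DELETE on /metadata?key=$KEY using an in-memory dict."""

    store = {}  # key -> raw value bytes, shared by all requests
    lock = threading.Lock()

    def _key(self):
        """Return the `key` query parameter, or None for a malformed request."""
        url = urlparse(self.path)
        keys = parse_qs(url.query).get("key")
        return keys[0] if url.path == "/metadata" and keys else None

    def _reply(self, status, body=b""):
        self.send_response(status)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        key = self._key()
        with self.lock:
            value = self.store.get(key)
        # Assumption: unknown or missing keys are reported as 404.
        if value is None:
            self._reply(404)
        else:
            self._reply(200, value)

    def do_PUT(self):
        # Read the body first so the connection is left in a clean state.
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        key = self._key()
        if key is None:
            self._reply(400)  # assumption: malformed requests get 400
            return
        with self.lock:
            self.store[key] = body
        self._reply(200)

    def do_DELETE(self):
        key = self._key()
        with self.lock:
            existed = self.store.pop(key, None) is not None
        self._reply(200 if existed else 404)

    def log_message(self, *args):
        pass  # keep the sketch quiet


def start_server(host="127.0.0.1", port=0):
    """Start the server in a background thread; port 0 picks a free port."""
    server = ThreadingHTTPServer((host, port), MetadataHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def call(method, url, body=None):
    """Send one request, bypassing any configured proxy; return (status, body)."""
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    request = urllib.request.Request(url, data=body, method=method)
    try:
        with opener.open(request) as response:
            return response.status, response.read()
    except urllib.error.HTTPError as err:
        return err.code, err.read()
```

The sketch binds to 127.0.0.1 so it can be tried on one machine; for nodes on other hosts, pass a reachable listening address to `start_server`. After `server = start_server()`, a `PUT` followed by a `GET` through `call` on `http://127.0.0.1:<port>/metadata?key=demo` returns the stored body.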
- Pointer to a `TransferMetadata` object, which abstracts the communication logic between the TransferEngine framework and the metadata server. We currently support `etcd` and `redis` protocols, while `metadata_server` represents the IP address or hostname of the etcd or redis server.
450
+
- Pointer to a `TransferMetadata` object, which abstracts the communication logic between the TransferEngine framework and the metadata server. We currently support the `etcd`, `redis`, and `http` protocols, while `metadata_server` represents the IP address or hostname of the etcd or redis server, or the base HTTP URI of the HTTP server.
425
451
426
452
For easier exception handling, TransferEngine must call the init function for secondary construction after it is constructed:
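After Transfer Engine is compiled successfully, the test program `transfer_engine_bench` is generated in the `build/mooncake-transfer-engine/example` directory. The tool calls the Transfer Engine interface so that the initiator node repeatedly reads and writes data blocks in the DRAM of the target node, which demonstrates the basic usage of Transfer Engine and can be used to measure read/write throughput. Currently, the Transfer Engine Bench tool supports the RDMA protocol (GPUDirect is under testing) and the TCP protocol.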