Sherman is a B+Tree on disaggregated memory; it uses one-sided RDMA verbs to perform all index operations. Sherman includes three techniques to boost write performance:
- Hierarchical locks leveraging the on-chip memory of RDMA NICs (a sketch follows this list)
- Coalescing dependent RDMA commands
- Two-level version layout in leaf nodes
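The hierarchical on-chip lock is the most involved of the three. Below is a minimal, hypothetical C++ sketch of the idea (not Sherman's actual code): threads on the same compute node first serialize in local DRAM, so only one thread per node contends on the lock word that lives in the memory-side NIC's on-chip memory. The `rdma_cas`/`rdma_write` calls are stand-ins for one-sided verbs, simulated here with a local atomic so the sketch compiles.

```cpp
#include <atomic>
#include <cstdint>
#include <mutex>
#include <thread>

// Stand-ins for one-sided RDMA verbs issued to the memory node's NIC
// on-chip memory; simulated with a local atomic so the sketch runs.
static std::atomic<uint64_t> fake_onchip_lock_word{0};

static bool rdma_cas(uint64_t expected, uint64_t desired) {
  return fake_onchip_lock_word.compare_exchange_strong(expected, desired);
}
static void rdma_write(uint64_t value) { fake_onchip_lock_word.store(value); }

// Level 1: a per-compute-node lock table in local DRAM.
constexpr size_t kLocalLockTableSize = 1024;
static std::mutex local_lock_table[kLocalLockTableSize];

void lock_btree_node(uint64_t lock_id) {
  // Local level: threads of the same compute node contend without
  // generating any network traffic.
  local_lock_table[lock_id % kLocalLockTableSize].lock();
  // Global level: the single local winner spins with RDMA CAS on the
  // on-chip lock word (0 = free, 1 = held).
  while (!rdma_cas(/*expected=*/0, /*desired=*/1)) {
    std::this_thread::yield();
  }
}

void unlock_btree_node(uint64_t lock_id) {
  rdma_write(0);  // release the global on-chip lock
  local_lock_table[lock_id % kLocalLockTableSize].unlock();
}
```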
For more details, please refer to our paper:
[SIGMOD'22] Sherman: A Write-Optimized Distributed B+Tree Index on Disaggregated Memory. Qing Wang, Youyou Lu, and Jiwu Shu.
Please use Deft for evaluation; it improves Sherman's performance and provides correct synchronization.
Environment requirements:
- Mellanox ConnectX-5 NICs and above
- RDMA driver: MLNX_OFED_LINUX-4.7-3.2.9.0 (if you use MLNX_OFED_LINUX-5.x, you will need to modify the code to resolve interface incompatibilities)
- NIC firmware: version 16.26.4012 and above, to support on-chip memory (use `ibstat` to check the version)
- memcached (to exchange QP information)
- cityhash
- boost 1.53 (to support `boost::coroutines::symmetric_coroutine`)
Before building, configure the following:
1. RDMA NIC Selection.
Modify this line (line 28 in commit 9bb9508) according to the RDMA NIC you want to use, where `ibv_get_device_name(deviceList[i])` is the name of the RNIC (e.g., `mlx5_0`); a sketch of the selection logic follows.
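For illustration, a hypothetical helper (not part of Sherman) that opens the RNIC matching a given name; `mlx5_0` is only an example name.

```cpp
#include <infiniband/verbs.h>
#include <cstdio>
#include <cstring>

// Hypothetical helper: open the RDMA device whose name matches `wanted`
// (e.g., "mlx5_0"); returns nullptr if no such device exists.
struct ibv_context *open_rnic_by_name(const char *wanted) {
  int num = 0;
  struct ibv_device **list = ibv_get_device_list(&num);
  if (list == nullptr) return nullptr;

  struct ibv_context *ctx = nullptr;
  for (int i = 0; i < num; ++i) {
    if (strcmp(ibv_get_device_name(list[i]), wanted) == 0) {
      ctx = ibv_open_device(list[i]);  // open the matching device
      break;
    }
  }
  ibv_free_device_list(list);
  if (ctx == nullptr) {
    fprintf(stderr, "RNIC %s not found\n", wanted);
  }
  return ctx;
}
```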
2. Gid Selection.
If you use RoCE, modify `gidIndex` in this line (line 60 in commit c5ee9d8) according to the output of the shell command `show_gids`; the correct index is usually 3. A sanity-check sketch follows.
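As a sanity check, a hypothetical helper (not in the repository) that verifies the configured GID index actually resolves to a non-zero GID on the given port; the port number and index are whatever you use in your setup.

```cpp
#include <infiniband/verbs.h>
#include <cstdio>
#include <cstring>

// Hypothetical helper: returns true if `gid_index` on `port` resolves to a
// non-zero GID, i.e. the index reported by `show_gids` actually exists.
bool check_gid(struct ibv_context *ctx, uint8_t port, int gid_index) {
  union ibv_gid gid;
  if (ibv_query_gid(ctx, port, gid_index, &gid) != 0) {
    fprintf(stderr, "ibv_query_gid failed for index %d\n", gid_index);
    return false;
  }
  union ibv_gid zero;
  memset(&zero, 0, sizeof(zero));
  return memcmp(&gid, &zero, sizeof(gid)) != 0;  // all-zero means unused entry
}
```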
3. MTU Selection.
If you use RoCE and the MTU of your NIC is not 4200 (check with `ifconfig`), modify the value of `path_mtu` in `src/rdma/StateTrans.cpp`; see the mapping sketch below.
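A hypothetical helper (not part of the repository) showing the mapping this step asks for: translate the interface MTU reported by `ifconfig` into the `ibv_mtu` value you would assign to `path_mtu`. The thresholds are heuristic and leave headroom for RoCE protocol headers.

```cpp
#include <infiniband/verbs.h>

// Hypothetical mapping from the Ethernet interface MTU (bytes, as shown by
// ifconfig) to the RDMA path MTU enum; the path MTU plus protocol headers
// must fit inside the interface MTU, so pick the next value down.
enum ibv_mtu mtu_to_enum(int interface_mtu) {
  if (interface_mtu >= 4200) return IBV_MTU_4096;
  if (interface_mtu >= 2200) return IBV_MTU_2048;
  if (interface_mtu >= 1200) return IBV_MTU_1024;
  if (interface_mtu >= 600)  return IBV_MTU_512;
  return IBV_MTU_256;
}
// Example: a typical RoCE interface MTU of 1500 maps to IBV_MTU_1024, so
// path_mtu in src/rdma/StateTrans.cpp would be set to IBV_MTU_1024.
```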
4. On-Chip Memory Size Selection.
Change the constant `kLockChipMemSize` in `include/Common.h` so that it is no larger than the maximum size of the NIC's on-chip memory.
Build Sherman as follows:
- `cd Sherman`
- `./script/hugepage.sh` to request huge pages from the OS (use `./script/clear_hugepage.sh` to return them)
- `mkdir build; cd build; cmake ..; make -j`
- `cp ../script/restartMemc.sh .`
- Configure `../memcached.conf`, where the 1st line is the memcached server IP and the 2nd line is the memcached server port; an example follows.
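For example, a hypothetical `../memcached.conf` (replace the IP and port with those of your own memcached server):

```
10.0.0.1
11211
```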
For each run with `kNodeCount` servers:
- Execute `./restartMemc.sh` to initialize the memcached server.
- On each server, execute `./benchmark kNodeCount kReadRatio kThreadCount` (for example, `./benchmark 2 50 16` runs with 2 servers, a 50% read ratio, and 16 client threads per server).

We emulate each server as one compute node and one memory node: on each server, as the compute node, we launch `kThreadCount` client threads; as the memory node, we launch one memory thread. `kReadRatio` is the ratio of `get` operations.
In `./test/benchmark.cpp`, we can modify `kKeySpace` and `zipfan` to generate different workloads. In addition, we can enable the macro `USE_CORO` to bind `kCoroCnt` coroutines to each client thread. A sketch of these knobs follows.
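A sketch of what these knobs could look like in `./test/benchmark.cpp`; the names match the ones above, but the types and values here are illustrative assumptions, not the repository's defaults.

```cpp
#include <cstdint>

// Illustrative settings only (not the repository's actual defaults).
#define USE_CORO                       // enable coroutine mode on client threads

const uint64_t kKeySpace = 64ull * 1024 * 1024;  // number of distinct keys
const double   zipfan    = 0.99;                 // Zipfian skew (0 means uniform)
const int      kCoroCnt  = 8;                    // coroutines per client thread
```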
Known issues and TODOs:
- The two-level version layout may induce inconsistency in some concurrent cases; refer to this SIGMOD'23 paper.
- Rewrite `delete` operations.