9 changes: 9 additions & 0 deletions integration/lmcache/.gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
# Build
build/

# Python
__pycache__/
*.py[cod]
*.egg-info/
dist/
.pytest_cache/
39 changes: 39 additions & 0 deletions integration/lmcache/CMakeLists.txt
@@ -0,0 +1,39 @@
cmake_minimum_required(VERSION 3.14)
project(dingofs_connector C CXX)

set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_STANDARD_REQUIRED ON)
set(CMAKE_POSITION_INDEPENDENT_CODE ON)

# Main shared library
add_library(dingofs_connector SHARED
  src/io_engine/file_io.cpp
  src/io_engine/io_engine.cpp
  src/io_engine/io_engine_capi.cpp
)

target_include_directories(dingofs_connector PUBLIC
  $<BUILD_INTERFACE:${CMAKE_CURRENT_SOURCE_DIR}/src/io_engine>
  $<INSTALL_INTERFACE:include>
)

target_compile_options(dingofs_connector PRIVATE -O3 -Wall -Wextra -Wpedantic)
target_link_libraries(dingofs_connector PRIVATE pthread)

set_target_properties(dingofs_connector PROPERTIES
  OUTPUT_NAME "dingofs_connector"
)

# Install
install(TARGETS dingofs_connector
  LIBRARY DESTINATION dingofs_connector
  ARCHIVE DESTINATION dingofs_connector
)
install(FILES src/io_engine/io_engine_capi.h DESTINATION include)

# Tests
option(BUILD_TESTS "Build C++ tests" OFF)
if(BUILD_TESTS)
  enable_testing()
  add_subdirectory(tests/io_engine)
endif()
20 changes: 20 additions & 0 deletions integration/lmcache/Makefile
@@ -0,0 +1,20 @@
# Convenience wrapper around cmake
BUILD_DIR ?= build
CMAKE_FLAGS ?=

.PHONY: all clean test install

all:
	@mkdir -p $(BUILD_DIR)
	cd $(BUILD_DIR) && cmake .. $(CMAKE_FLAGS) && make -j$$(nproc)

test:
	@mkdir -p $(BUILD_DIR)
	cd $(BUILD_DIR) && cmake .. -DBUILD_TESTS=ON $(CMAKE_FLAGS) && make -j$$(nproc)
	cd $(BUILD_DIR) && ctest --output-on-failure

clean:
	rm -rf $(BUILD_DIR)

install: all
	cd $(BUILD_DIR) && make install
40 changes: 40 additions & 0 deletions integration/lmcache/README.md
@@ -0,0 +1,40 @@
# DingoFS Connector for LMCache

A high-performance storage connector for LLM KV cache on [DingoFS](https://github.com/dingodb/dingofs), built with a native C++ I/O engine and Python ctypes bindings.

## Quick Start

### 1. Build Wheel (on build machine)

```bash
cd integration/lmcache
pip install build
python -m build --wheel
```

Output: `dist/dingofs_connector-0.1.0-cp3*-linux_x86_64.whl`

### 2. Install (on target machine)

```bash
# Copy wheel to target machine, then:
pip install dingofs_connector-0.1.0-cp3*-linux_x86_64.whl
```

### 3. Configure LMCache

Add to your LMCache YAML config:

```yaml
chunk_size: 256
remote_url: "dingofs://host:0/mnt/dingofs/kv_cache"
remote_serde: "naive"
remote_storage_plugins:
- dingofs
extra_config:
remote_storage_plugin.dingofs.module_path: "dingofs_connector.adapter"
remote_storage_plugin.dingofs.class_name: "DingoFSConnectorAdapter"
dingofs_num_workers: 8
```

For more details, see the [docs](docs) and [examples](examples) directories.
125 changes: 125 additions & 0 deletions integration/lmcache/docs/design.md
@@ -0,0 +1,125 @@
# Design

## Layer Diagram

```
┌──────────────────────────────────────────────────┐
│ LMCache Integration │
│ DingoFSConnector / DingoFSConnectorAdapter │
│ URL: dingofs://host:0/mnt/path │
├──────────────────────────────────────────────────┤
│ Python NativeIOEngine │
│ ctypes bindings + asyncio eventfd integration │
├──────────────────────────────────────────────────┤
│ C API (io_engine_capi.h) │
│ Stable extern "C" ABI for any language │
├──────────────────────────────────────────────────┤
│ C++ IOEngine (io_engine.h) │
│ Multi-threaded worker pool, batch tiling, │
│ eventfd-based async completion │
├──────────────────────────────────────────────────┤
│ POSIX File I/O (file_io.h) │
│ O_DIRECT, atomic writes (temp + rename) │
└──────────────────────────────────────────────────┘
```

## Design Rationale

**Hourglass pattern** (C++ → C ABI → Python):
Following the [RocksDB](https://github.com/facebook/rocksdb/blob/main/include/rocksdb/c.h) approach, the C++ implementation is wrapped behind a stable C API. This decouples the Python bindings from C++ internals and allows bindings in any FFI-capable language.

**Multi-threaded batch tiling**:
Each batch operation is divided into tiles distributed across worker threads. Tile completion is tracked with atomic counters; the final tile signals an eventfd to wake the caller. This maps well to DingoFS's high-concurrency I/O path.
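The tiling scheme can be sketched in Python (an illustrative stand-in, not the connector's C++ code; a lock-guarded counter plays the role of the atomic counter, and a `threading.Event` plays the role of the eventfd):

```python
import threading
from concurrent.futures import ThreadPoolExecutor

def run_batch(items, num_workers=4, tile_size=8):
    # Split the batch into tiles, one unit of work per worker dispatch.
    tiles = [items[i:i + tile_size] for i in range(0, len(items), tile_size)]
    if not tiles:
        return []
    remaining = len(tiles)
    lock = threading.Lock()      # stands in for the atomic tile counter
    done = threading.Event()     # stands in for the eventfd signal
    results = []

    def process(tile):
        nonlocal remaining
        local = [x * 2 for x in tile]    # placeholder for per-tile I/O work
        with lock:
            results.extend(local)
            remaining -= 1
            if remaining == 0:
                done.set()               # the final tile wakes the caller

    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        for t in tiles:
            pool.submit(process, t)
        done.wait()
    return results
```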

**Zero-copy I/O**:
Read and write operations work directly on caller-provided buffers. No intermediate copies are made in the C++ or Python layers.

**Atomic writes**:
Writes go to a temporary file first, followed by `fdatasync` and an atomic `rename`. This prevents partial reads on crash.
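A minimal Python sketch of the same temp-plus-rename pattern (illustrative only; the connector implements this natively in `file_io.cpp`):

```python
import os
import tempfile

def atomic_write(path, data, sync=True):
    # Create the temp file in the destination directory so the final
    # rename stays atomic (rename is only atomic within one filesystem).
    d = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=d, prefix=".tmp-")
    try:
        os.write(fd, data)
        if sync:
            # fdatasync where available (Linux); fall back to fsync
            getattr(os, "fdatasync", os.fsync)(fd)
    finally:
        os.close(fd)
    # Readers see the old file or the new one, never a partial write.
    os.rename(tmp, path)
```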

## File Mapping

| Layer | Files |
|-------|-------|
| LMCache adapter | `src/dingofs_connector/connector.py`, `src/dingofs_connector/adapter.py` |
| Python bindings | `src/dingofs_connector/native_engine.py` |
| C API | `src/io_engine/io_engine_capi.h`, `src/io_engine/io_engine_capi.cpp` |
| C++ engine | `src/io_engine/io_engine.h`, `src/io_engine/io_engine.cpp` |
| POSIX I/O | `src/io_engine/file_io.h`, `src/io_engine/file_io.cpp` |

## C API Reference

Header: [`src/io_engine/io_engine_capi.h`](../src/io_engine/io_engine_capi.h)

All functions are thread-safe. The opaque handle can be shared across threads.

### Lifecycle

```c
// Create an I/O engine. Returns NULL on failure.
io_engine_t* io_engine_create(const char* base_path, int num_workers,
                              int use_odirect, int sync_mode);

// Destroy and free all resources.
void io_engine_destroy(io_engine_t* engine);

// Graceful shutdown. Idempotent.
void io_engine_close(io_engine_t* engine);
```

### Submit Operations

All submit functions are non-blocking and return a `future_id` (0 on error).

```c
uint64_t io_engine_submit_batch_get(io_engine_t* engine,
    const char** keys, void** bufs, const size_t* lens,
    size_t count, size_t chunk_size);

uint64_t io_engine_submit_batch_set(io_engine_t* engine,
    const char** keys, const void** bufs, const size_t* lens,
    size_t count, size_t chunk_size);

uint64_t io_engine_submit_batch_exists(io_engine_t* engine,
    const char** keys, size_t count);
```

### Completions

```c
// Drain available completions. Returns count written to out.
// Pointers in io_completion_t are valid until the next drain call.
int io_engine_drain_completions(io_engine_t* engine,
    io_completion_t* out, size_t max_completions);
```

`io_completion_t` fields:

| Field | Type | Description |
|-------|------|-------------|
| `future_id` | `uint64_t` | Matches the submit return value |
| `ok` | `int` | 1 = success, 0 = failure |
| `error` | `const char*` | Error message (NULL if ok) |
| `result_bytes` | `const uint8_t*` | EXISTS results: 0/1 per key (NULL otherwise) |
| `result_len` | `size_t` | Length of `result_bytes` |
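From Python, this struct could be mirrored with a hypothetical ctypes definition (field order follows the table above; the authoritative layout is `io_engine_capi.h`):

```python
import ctypes

# Hypothetical ctypes mirror of io_completion_t, assuming the field
# order shown in the table; verify against io_engine_capi.h before use.
class IOCompletion(ctypes.Structure):
    _fields_ = [
        ("future_id", ctypes.c_uint64),
        ("ok", ctypes.c_int),
        ("error", ctypes.c_char_p),
        ("result_bytes", ctypes.POINTER(ctypes.c_uint8)),
        ("result_len", ctypes.c_size_t),
    ]
```

A drain call would then be declared roughly as `lib.io_engine_drain_completions.argtypes = [ctypes.c_void_p, ctypes.POINTER(IOCompletion), ctypes.c_size_t]` with `restype = ctypes.c_int`.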

### Async Notification

```c
// Returns an eventfd that becomes readable when completions are available.
int io_engine_event_fd(const io_engine_t* engine);
```
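The eventfd integrates naturally with `asyncio` via `loop.add_reader`. A runnable sketch (a pipe stands in for the eventfd so the example is self-contained; with a real engine you would pass the fd returned by `io_engine_event_fd()` instead):

```python
import asyncio
import os

async def wait_for_completions(event_fd):
    # Register the fd with the event loop and wake when it becomes readable.
    loop = asyncio.get_running_loop()
    ready = asyncio.Event()
    loop.add_reader(event_fd, ready.set)
    try:
        await ready.wait()
        os.read(event_fd, 8)   # clear the notification (eventfd counter is 8 bytes)
    finally:
        loop.remove_reader(event_fd)

async def demo():
    r, w = os.pipe()           # stand-in for the engine's eventfd
    os.set_blocking(r, False)
    task = asyncio.create_task(wait_for_completions(r))
    await asyncio.sleep(0)     # let the waiter register its reader
    os.write(w, (1).to_bytes(8, "little"))   # "engine" signals completions
    await task
    os.close(r)
    os.close(w)
    return "drained"
```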

### Error Handling

```c
// Thread-local last error. Returns NULL if no error.
const char* io_engine_last_error(void);
```

### Constants

```c
#define IO_ENGINE_SYNC_NONE 0 // No fdatasync
#define IO_ENGINE_SYNC_ALWAYS 1 // fdatasync after every write
```
64 changes: 64 additions & 0 deletions integration/lmcache/docs/tuning.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,64 @@
# Tuning

## Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `dingofs_num_workers` | int | 8 | Number of I/O worker threads |
| `dingofs_use_odirect` | bool | false | Use O_DIRECT for file I/O |
| `dingofs_sync_mode` | str | `"always"` | Write sync strategy |

## Worker Count (`dingofs_num_workers`)

DingoFS performs best under high concurrency. Read throughput scales nearly linearly with thread count.

- **Recommended**: 8-16 workers
- Write throughput is bounded by `fdatasync` latency, so more workers help reads more than writes

## O_DIRECT (`dingofs_use_odirect`)

Bypasses the kernel page cache, reducing memory copies in the FUSE layer.

- **Best for**: Read-once workloads (e.g., prefetch)
- **Requirement**: Buffer address and size must be aligned to the filesystem block size (typically 4096 bytes)
- **Default**: false (easier buffer management)
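From Python, one portable way to obtain buffers that satisfy this alignment is anonymous `mmap`, which returns page-aligned memory (a sketch, assuming a 4096-byte block size; a plain `bytearray` gives no alignment guarantee):

```python
import ctypes
import mmap

def aligned_buffer(size, align=4096):
    """Return writable, page-aligned memory suitable for O_DIRECT-style I/O.

    Anonymous mmap allocations start on a page boundary (page size is a
    multiple of 4096 on common platforms); the length is rounded up so the
    size is also a multiple of the alignment.
    """
    size = (size + align - 1) // align * align   # round length up
    return mmap.mmap(-1, size)
```

The alignment can be checked via `ctypes.addressof(ctypes.c_char.from_buffer(buf))`.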

## Sync Mode (`dingofs_sync_mode`)

Controls whether `fdatasync` is called after every write.

- **`"always"`** (default): Safe, prevents data loss on crash. Slower writes.
- **`"none"`**: Skip fdatasync. Much faster writes. Acceptable when DingoFS provides its own durability guarantees.

## Recommended Configurations

| Workload | Settings |
|----------|----------|
| General KV cache | `num_workers=8`, `sync_mode=always` |
| High-throughput writes | `num_workers=16`, `sync_mode=none` |
| Prefetch (read-once) | `num_workers=8`, `use_odirect=true` |
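For example, the high-throughput write profile could be expressed in the LMCache YAML from the README roughly as follows (assuming the tuning parameters are passed through `extra_config` alongside `dingofs_num_workers`):

```yaml
extra_config:
  remote_storage_plugin.dingofs.module_path: "dingofs_connector.adapter"
  remote_storage_plugin.dingofs.class_name: "DingoFSConnectorAdapter"
  dingofs_num_workers: 16
  dingofs_sync_mode: "none"
```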

## Benchmark Reference

Results from the benchmark example (`examples/benchmark/benchmark.py`):

```
--- Worker Scaling (1MB x 32 chunks) ---
workers= 1 | WRITE 280 MB/s | READ 4000 MB/s
workers= 4 | WRITE 290 MB/s | READ 15000 MB/s
workers= 8 | WRITE 300 MB/s | READ 28000 MB/s
workers=16 | WRITE 310 MB/s | READ 40000 MB/s

--- Sync Mode (1MB x 32 chunks, 8 workers) ---
sync=always | WRITE 280 MB/s
sync=none | WRITE 950 MB/s
```

Run your own benchmark:

```bash
python examples/benchmark/benchmark.py
python examples/benchmark/benchmark.py --base-path /mnt/dingofs/bench
python examples/benchmark/benchmark.py --write-only
python examples/benchmark/benchmark.py --read-only
```