PERFORMANCE_OPTIMIZATION_NOTES
Date: December 15, 2025
Status: Future Enhancements
Priority: P2 (Optional)
Version: v1.3.0
Category: 🔒 Security
This document tracks potential performance optimizations for the vector encryption implementation. These are not critical for Phase 1-2 functionality but could improve performance for large-scale deployments.
Issue: Converting between std::vector<uint8_t> and std::string creates unnecessary copies for large HNSW indexes (multi-GB).
Current Implementation:
```cpp
// serialize: vector<uint8_t> → string (copy)
std::string serialize(const std::vector<uint8_t>& value) {
    return std::string(value.begin(), value.end());
}

// deserialize: string → vector<uint8_t> (copy)
std::vector<uint8_t> deserialize(const std::string& str) {
    return std::vector<uint8_t>(str.begin(), str.end());
}
```

Memory Impact:
- 3 GB HNSW index
- 2 copies: 6 GB peak memory usage
- Additional overhead during encryption/decryption
Optimization Options:
Option A: Move Semantics

```cpp
template<>
std::string EncryptedField<std::vector<uint8_t>>::serialize(std::vector<uint8_t>&& value) {
    // Move the data instead of copying it. Note: for trivially copyable
    // bytes this still copies element by element, but the source buffer
    // can be released immediately afterwards, lowering peak memory.
    return std::string(
        std::make_move_iterator(value.begin()),
        std::make_move_iterator(value.end())
    );
}
```

Pros: Source buffer can be freed right after serialization
Cons: Requires API changes (rvalue references)
Option B: string_view

```cpp
// Use string_view to avoid intermediate copies
std::string_view serializeView(const std::vector<uint8_t>& value) {
    return std::string_view(
        reinterpret_cast<const char*>(value.data()),
        value.size()
    );
}
```

Pros: Zero-copy
Cons: Lifetime management complexity (the view dangles once the vector is reallocated or destroyed)
Option C: Memory-Mapped Encryption

```cpp
// Encrypt directly from/to memory-mapped files
class MemoryMappedEncryption {
public:
    void encryptFile(const std::string& input, const std::string& output);
    void decryptFile(const std::string& input, const std::string& output);
};
```

Pros: No in-memory copies, handles huge files
Cons: Platform-specific, more complex
Recommendation: Option C for large indexes (>1 GB)
Estimated Impact:
- Memory reduction: 50% (6 GB → 3 GB peak)
- Performance: 10-20% faster for large indexes
Issue: Using istreambuf_iterator for multi-GB files is inefficient.
Current Implementation:
```cpp
std::ifstream file(path, std::ios::binary);
std::vector<uint8_t> data(
    (std::istreambuf_iterator<char>(file)),
    std::istreambuf_iterator<char>()
);
```

Problems:
- Character-by-character iteration
- No buffering optimization
- Slow for large files
Option A: Chunked Reading
```cpp
std::vector<uint8_t> readFileChunked(const std::string& path) {
    std::ifstream file(path, std::ios::binary);
    if (!file) {
        throw std::runtime_error("cannot open file: " + path);
    }

    // Get file size
    file.seekg(0, std::ios::end);
    size_t size = static_cast<size_t>(file.tellg());
    file.seekg(0, std::ios::beg);

    // Pre-allocate
    std::vector<uint8_t> data(size);

    // Read in chunks
    const size_t chunk_size = 64 * 1024 * 1024; // 64 MB chunks
    for (size_t i = 0; i < size; i += chunk_size) {
        size_t read_size = std::min(chunk_size, size - i);
        file.read(reinterpret_cast<char*>(data.data() + i), read_size);
    }
    return data;
}
```

Pros: Much faster for large files
Cons: Slightly more complex
Estimated Impact:
- 3 GB file: 20 seconds → 5 seconds (4x faster)
Option B: Memory-Mapped I/O

```cpp
#include <sys/mman.h>

class MMapFile {
public:
    void* map(const std::string& path, size_t& size);
    void unmap(void* addr, size_t size);
};
```

Pros: Fastest possible, OS-optimized
Cons: Platform-specific (Linux/Windows differ)
Recommendation: Option A for cross-platform, Option B for maximum performance
Issue: Large HNSW indexes encrypted/decrypted sequentially.
Current Implementation:
```cpp
// Single-threaded encryption
EncryptedField<std::vector<uint8_t>> encField;
encField.encrypt(indexData, "hnsw_index"); // 3 GB takes ~3 seconds
```

Optimization: Chunk-Based Parallel Encryption
```cpp
class ParallelEncryption {
public:
    std::vector<EncryptedChunk> encryptParallel(
        const std::vector<uint8_t>& data,
        const std::string& key_id,
        size_t num_threads = 8
    ) {
        const size_t chunk_size = 256 * 1024 * 1024; // 256 MB chunks
        std::vector<EncryptedChunk> chunks;

        #pragma omp parallel for num_threads(num_threads)
        for (size_t i = 0; i < data.size(); i += chunk_size) {
            size_t len = std::min(chunk_size, data.size() - i);
            std::vector<uint8_t> chunk(data.begin() + i, data.begin() + i + len);

            EncryptedField<std::vector<uint8_t>> encField;
            encField.encrypt(chunk, key_id);

            #pragma omp critical
            chunks.push_back({i, encField}); // offset i allows ordered reconstruction
        }
        return chunks;
    }
};
```

Estimated Impact:
- 3 GB encryption: 3 seconds → 0.5 seconds (8 cores)
- Near-linear scaling with core count
Challenges:
- Need to store chunk metadata
- Reconstruction on decryption
- Slightly more complex format
Issue: HNSW indexes have some redundancy that could be compressed.
Optimization:
```cpp
class CompressedEncryption {
public:
    Status saveCompressedEncrypted(const std::string& directory) {
        // 1. Save HNSW index
        appr->saveIndex(tempPath);
        // 2. Load into memory
        auto data = readFile(tempPath);
        // 3. Compress (pseudocode; real code would call e.g. ZSTD_compress())
        auto compressed = zstdCompress(data, /*level=*/3);
        // 4. Encrypt compressed data
        EncryptedField<std::vector<uint8_t>> encField;
        encField.encrypt(compressed, "hnsw_index");
        // 5. Save (error handling and return value omitted in this sketch)
        writeFile(encryptedPath, encField.toBase64());
    }
};
```

Estimated Impact:
- Compression ratio: 0.7-0.9 (depends on HNSW structure)
- Storage savings: 10-30%
- CPU overhead: +10-20% (compression time)
Trade-off: CPU time for storage space
Issue: Vectors decrypted one-by-one during rebuildFromStorage().
Current Implementation:
```cpp
for (auto& entity : entities) {
    auto enc = EncryptedField<std::vector<float>>::fromBase64(entity.encryptedField);
    auto vector = enc.decrypt(); // Sequential
    cache[pk] = vector;
}
```

Optimization: Parallel Batch Decryption
```cpp
// Collect all encrypted fields
std::vector<EncryptedField<std::vector<float>>> encrypted_batch;
for (auto& entity : entities) {
    encrypted_batch.push_back(
        EncryptedField<std::vector<float>>::fromBase64(entity.encryptedField)
    );
}

// Parallel decryption
std::vector<std::vector<float>> decrypted_batch(encrypted_batch.size());
#pragma omp parallel for
for (size_t i = 0; i < encrypted_batch.size(); ++i) {
    decrypted_batch[i] = encrypted_batch[i].decrypt();
}

// Cache results
for (size_t i = 0; i < entities.size(); ++i) {
    cache[entities[i].pk] = decrypted_batch[i];
}
```

Estimated Impact:
- 1M vectors: 16 minutes → 2 minutes (8 cores, 8x speedup)
| Optimization | Phase | Priority | Effort | Impact |
|---|---|---|---|---|
| Chunked file reading | 2 | P1 | Low | High |
| Memory-mapped files | 2 | P2 | Medium | High |
| Parallel batch decrypt | 1 | P2 | Medium | High |
| Move semantics | 2 | P2 | Low | Medium |
| Compression | 2 | P3 | Medium | Medium |
| Parallel encryption | 2 | P3 | High | Medium |
- Implement chunked file reading (easy win)
- Add batch decryption for index rebuild
- Memory-mapped I/O for HNSW indexes
- Compression before encryption (optional)
- Parallel chunk-based encryption
- Hardware acceleration (AES-NI already used by OpenSSL)
Before optimizations:
- Measure baseline: 1M vectors, 3 GB HNSW index
- Profile bottlenecks: CPU, memory, I/O
After each optimization:
- Re-measure performance
- Verify correctness
- Document improvement
- EncryptedField implementation: src/security/encrypted_field.cpp
- HNSW save/load: src/index/vector_index.cpp
- Performance analysis: PHASE2_IMPLEMENTATION_REPORT.md
Status: Documentation complete
Implementation: Future work
Priority: P2-P3 (optional performance enhancements)
Last synced: January 02, 2026 | Commit: 6add659
Version: 1.3.0 | As of: December 2025