HadroDB improvements #860
Closed
joocer started this conversation in Improvements
Schema file should contain:
Columns
Dataset
Indexes
save the schema somewhere
optional indexing on writes (and support querying)
deletion support: consider tracking emptied record slots and writing new records into them when one is available and the data fits
set to/from arrow
save values as tuples; create a type which acts like a tuple but has a to_dict() method
write parts in Cython
use an LRU cache to keep records in memory to reduce disk reads
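Two of the ideas above, a tuple-like record type with a to_dict() method and an LRU cache in front of disk reads, could look roughly like the sketch below. All names here (`make_record_type`, `RecordCache`, the `loader` callable standing in for a disk read) are hypothetical and not part of HadroDB.

```python
from collections import OrderedDict, namedtuple


def make_record_type(name, columns):
    """Build a tuple subclass for the given columns that also offers to_dict()."""
    base = namedtuple(name, columns)

    class Record(base):
        __slots__ = ()  # no per-instance __dict__, so records stay tuple-sized

        def to_dict(self):
            # Pair each column name with the value at the same position.
            return dict(zip(self._fields, self))

    Record.__name__ = name
    return Record


class RecordCache:
    """Small LRU cache; `loader` is a stand-in for the real disk read."""

    def __init__(self, loader, capacity=128):
        self._loader = loader
        self._capacity = capacity
        self._entries = OrderedDict()
        self.disk_reads = 0  # counts cache misses, for illustration only

    def get(self, key):
        if key in self._entries:
            # Hit: mark as most recently used and skip the disk.
            self._entries.move_to_end(key)
            return self._entries[key]
        # Miss: load, remember, and evict the least recently used entry.
        self.disk_reads += 1
        value = self._loader(key)
        self._entries[key] = value
        if len(self._entries) > self._capacity:
            self._entries.popitem(last=False)
        return value
```

Because the record is a real tuple subclass, it still supports indexing, unpacking, and equality with plain tuples, so code that expects tuples keeps working while callers that want a mapping can ask for one.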
Write to an LSM WAL; when this reaches a given size (16k?), write to the main dataset.
when reading, check the WAL and the main dataset
tiny datasets may never evict from the WAL
entries are added to indexes on eviction from the WAL
this is likely to cause a noticeable stall at this point, so consider having compaction and flushes in another thread/process
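The WAL behaviour described above can be sketched in memory as follows. This is a minimal illustration under stated assumptions, not HadroDB's implementation: `WalStore` and its attributes are hypothetical names, plain dicts stand in for the on-disk files, the index is reduced to a set of keys, and the 16 KiB default only mirrors the tentative "16k?" figure. The flush runs synchronously here, which is exactly the stall the last point suggests moving to another thread or process.

```python
class WalStore:
    """In-memory stand-in for a WAL in front of a main dataset."""

    def __init__(self, flush_threshold=16 * 1024):
        self.flush_threshold = flush_threshold  # bytes; tentative value
        self.wal = {}              # recent writes, not yet indexed
        self.wal_bytes = 0         # current size of the WAL payloads
        self.main = {}             # stand-in for the main dataset
        self.indexed_keys = set()  # stand-in for a real index

    def put(self, key, value: bytes):
        # Overwrites within the WAL replace the earlier payload size.
        previous = self.wal.get(key)
        if previous is not None:
            self.wal_bytes -= len(previous)
        self.wal[key] = value
        self.wal_bytes += len(value)
        if self.wal_bytes >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Evict everything from the WAL into the main dataset,
        # adding entries to the index only at this point.
        for key, value in self.wal.items():
            self.main[key] = value
            self.indexed_keys.add(key)
        self.wal.clear()
        self.wal_bytes = 0

    def get(self, key):
        # Reads check the WAL first because it holds the newest value.
        if key in self.wal:
            return self.wal[key]
        return self.main.get(key)
```

A dataset whose total writes stay below the threshold never triggers `flush()`, which matches the note that tiny datasets may never evict from the WAL; such data is readable but never indexed in this sketch.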
Folder contains these files
.wal - write-ahead log
.lsm - user data
.art - radix index
.btree - btree index
.bitmap - bitmap index
.hadro - schema
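As a rough sketch, the folder layout above could be resolved with a small helper. It assumes every file shares the dataset's name as its stem (for example `planets.wal`), which the list does not state; `SUFFIXES` and `dataset_files` are hypothetical names used only for illustration.

```python
from pathlib import Path

# Role -> file extension, taken from the list above.
SUFFIXES = {
    "wal": ".wal",        # write log
    "data": ".lsm",       # user data
    "radix": ".art",      # radix index
    "btree": ".btree",    # btree index
    "bitmap": ".bitmap",  # bitmap index
    "schema": ".hadro",   # schema
}


def dataset_files(folder, name):
    """Return the expected path for each file role in a dataset folder."""
    folder = Path(folder)
    return {role: folder / f"{name}{suffix}" for role, suffix in SUFFIXES.items()}
```

For example, `dataset_files("data/planets", "planets")["schema"]` gives the path `data/planets/planets.hadro`; nothing is created on disk.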