HadroDB improvements #860

joocer · 2023-02-12T14:49:02Z


save the schema somewhere
optional indexing on writes (and support querying)
deletion support: consider tracking empty (deleted) record slots and writing new records into an empty slot when one is available and the data fits
set to/from arrow
save values as Tuples, create a type which acts like a tuple but has a to_dict() function
write parts in Cython
use LRU to keep records in memory to reduce disk reads
Write to an LSM WAL; when this reaches a given size (16k?), write to the main dataset.
when reading, check the WAL and the main dataset
tiny datasets may never evict from the WAL
entries are added to indexes on eviction from WAL
this is likely to cause a noticeable stall at this point, so consider having compaction and flushes in another thread/process
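
A rough sketch of the WAL flow described above, to make the read and write paths concrete. Everything here is an assumption for illustration: the class name `TinyLSM`, the in-memory dicts standing in for the `.wal` and `.lsm` files, and the set standing in for the index files. It is not HadroDB code, and a real WAL would be an append-only file on disk.

```python
class TinyLSM:
    """Hypothetical in-memory stand-in for the WAL + main dataset idea."""

    def __init__(self, flush_threshold=16 * 1024):
        # Size in bytes at which the WAL is written to the main dataset
        # (the notes suggest "16k?"; treat the default as a placeholder).
        self.flush_threshold = flush_threshold
        self.wal = {}       # key -> value, stands in for the .wal file
        self.wal_size = 0   # approximate bytes currently held in the WAL
        self.main = {}      # stands in for the .lsm file
        self.index = set()  # keys that have been indexed

    def put(self, key: str, value: bytes) -> None:
        # Overwriting a key already in the WAL replaces its size contribution.
        old = self.wal.get(key)
        if old is not None:
            self.wal_size -= len(key.encode()) + len(old)
        self.wal[key] = value
        self.wal_size += len(key.encode()) + len(value)
        if self.wal_size >= self.flush_threshold:
            self.flush()

    def flush(self) -> None:
        # Evict everything from the WAL into the main dataset and index the
        # entries at that point, as the notes propose. In a real system this
        # is the step that could stall and might run in another thread.
        self.main.update(self.wal)
        self.index.update(self.wal)
        self.wal.clear()
        self.wal_size = 0

    def get(self, key: str, default=None):
        # Reads check the WAL first (newest data), then the main dataset.
        if key in self.wal:
            return self.wal[key]
        return self.main.get(key, default)
```

With a small threshold, a tiny dataset never leaves the WAL, which matches the note above: entries stay unindexed until a flush happens.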

Folder contains these files
.wal - write-ahead log
.lsm - user data
.art - radix index
.btree - btree index
.bitmap - bitmap index
.hadro - schema

joocer · 2023-02-25T18:40:34Z

Schema file should contain

Columns

Dataset

Indexes
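
One possible shape for the schema file, sketched as JSON written to a `.hadro` file. The key names, the column type strings, and the helper names (`save_schema`, `load_schema`, `record_type`) are all assumptions made up for this example, not an agreed format. The column list is also used to build the tuple-like record type with a `to_dict()` method mentioned in the first post.

```python
import json

# Hypothetical schema layout covering the three sections listed above.
example_schema = {
    "dataset": {"name": "example", "format_version": 1},
    "columns": [
        {"name": "id", "type": "INTEGER"},
        {"name": "name", "type": "VARCHAR"},
    ],
    "indexes": [{"column": "id", "kind": "btree"}],
}


def save_schema(path, schema):
    """Write the schema dict to disk as JSON."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(schema, f, indent=2)


def load_schema(path):
    """Read a schema dict back from disk."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


class Record(tuple):
    """A tuple that knows its column names."""

    __slots__ = ()
    columns = ()

    def to_dict(self):
        # Pair each column name with the value at the same position.
        return dict(zip(self.columns, self))


def record_type(schema):
    """Build a Record subclass bound to the schema's column names."""
    names = tuple(col["name"] for col in schema["columns"])
    return type("Record", (Record,), {"columns": names, "__slots__": ()})
```

Usage would look like `Row = record_type(load_schema("example.hadro"))` followed by `Row((1, "alice")).to_dict()`. Values stay as plain tuples in memory and only become dicts when asked.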
