Adding support for in-place delete #2

Open
dangershony opened this issue Jul 23, 2020 · 3 comments

@dangershony
Contributor

This issue is to track, and possibly attempt, adding support for in-place hard delete.

As far as I am aware, to hard delete entries one needs to call defragment. How can we delete at the point where the delete function is called?

The following is a conversation I had with the DBreeze developer that might help:

Imagine the data structure in a DBreeze file (a number means a data block; the marked first char of a block, shown here in parentheses, means the block is not deleted):
(1)111(2)2222(3)3333(4)44(5)555555

Currently, if we want to delete block 2, it will look like this:

(1)11122222(3)3333(4)44(5)555555

Do you want to have it like
1111-----333334445555555 ?

I would expect to see (1)111(3)3333(4)44(5)555555.
Suppose several parallel threads are reading table data: they have already read some bytes and are about to read more bytes from disk starting from address xxxx. At this moment you run the de-fragmentation procedure ((3)3333(4)44(5)555555 must be moved to the address that (2)2222 occupies). All pointers held by those parallel reading threads will be corrupted, and the threads can fail with errors or read incorrect data.

To avoid that, Exclusive/Shared access was developed long ago (see the documentation section "Full tables locking inside of transaction").

It is also expensive and awkward to run the de-fragmentation procedure (purge) after each delete command.
It makes more sense to run it rarely: once per day or month, when DBreeze starts, etc.

The purge itself copies data from one table t1 into another table t2 and then restores t1 from t2
(delete t1, rename t2 to t1 - all happening in one transaction with tran.SynchronizeTable(t1)).
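
A minimal sketch of that purge pattern, assuming DBreeze's transaction API (the copy and cleanup calls shown here are assumptions and may differ by version; the conversation above names the lock call SynchronizeTable):

```csharp
// Hedged sketch of the purge described above: copy live rows from t1 into a fresh
// table t2, then bring the compacted data back under the t1 name. Method names
// other than those quoted in the conversation are assumptions.
using DBreeze;

static class PurgeSketch
{
    public static void Purge(DBreezeEngine engine)
    {
        using (var tran = engine.GetTransaction())
        {
            // Exclusive access so parallel readers cannot hold stale pointers
            // while data is moved (the text above calls this SynchronizeTable).
            tran.SynchronizeTables("t1", "t2");

            // Copy only the live (non-deleted) rows into the fresh table.
            foreach (var row in tran.SelectForward<byte[], byte[]>("t1"))
                tran.Insert("t2", row.Key, row.Value);

            // Drop t1 and recreate its file to reclaim space; the conversation
            // mentions tran.RestoreTableFromTheOtherFile for restoring t1 from t2.
            tran.RemoveAllKeys("t1", true);

            tran.Commit();
        }
    }
}
```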

It is also possible to use tran.RestoreTableFromTheOtherFile
(this also will not save you from the "parallel reading threads losing pointers" problem, so the same Exclusive/Shared technique may be used, or the purge can be run on DBreeze init).

So, all techniques are already available.

So if we have a database of 600 GB and 50 GB of it is deletable, then we need to copy 550 GB to a new location?

I think you understand that partial de-fragmentation of 600 GB could take the same time as de-fragmentation of the file system data.
Copying and renaming will definitely be faster; it is a more sequential operation for DBreeze. You just need to have enough space on the drive.

But, depending upon the nature of your data, you can probably utilize reusable data-blocks (buckets).

Imagine that you have created a data-block with a size of 1 MB or 10 MB. This is reserved space inside a DBreeze table.
There you can create other data structures, such as a serialized dictionary. The most important thing is to stay within the data-block size while it is updated. Many keys can lead to one data-block, and each key must also lead to an element inside that stored serialized dictionary.
There you can remove unused values, insert new ones, and finally update the data-block completely, in a transactional manner.
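
A rough illustration of the bucket idea in plain C# (this deliberately does not use DBreeze's real data-block API; it only shows the concept of keeping a serialized dictionary inside a fixed-size reserved block and rewriting it in place):

```csharp
// Conceptual sketch only: a fixed-size "bucket" holding a serialized dictionary.
// Deletes happen inside the bucket, so nothing outside the reserved block moves.
using System;
using System.Collections.Generic;
using System.Text.Json;

class Bucket
{
    const int BlockSize = 1 * 1024 * 1024;                 // 1 MB reserved space
    readonly Dictionary<string, byte[]> _items = new();

    public void Put(string key, byte[] value) => _items[key] = value;
    public void Remove(string key) => _items.Remove(key);  // in-place delete

    // The serialized form must stay within the reserved block size so it can
    // always be written back to the same offset in a single transactional update.
    public byte[] Serialize()
    {
        var bytes = JsonSerializer.SerializeToUtf8Bytes(_items);
        if (bytes.Length > BlockSize)
            throw new InvalidOperationException("Bucket grew past its reserved block size.");
        return bytes;
    }
}
```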

What would I do? I don't know anything about your workflow. We work in the telematics area; there I would either do nothing, or do de-fragmentation once every several years while moving all the data onto a new type of disk, or, probably, nothing.

If you defragment on delete you don't need to do it on 600 GB, because you do it at the point of removal, so the data stays constantly defragmented.

Imagine you have inserted 100 GB, then inserted 1 more MB, then one month passed and you want to delete that 1 MB. Within that month you have inserted 200 GB:

(1)111-----(3)3333(4)44(5)555555  

1 - 100 GB
----- - your 1 MB
3,4,5 - 200 GB

You have to move blocks 3-5 to where the ----- is.

It could take... hmmm... 5 or 10 minutes; you need to move 500 GB.

My advice is to make a compaction once per year, on startup.
dangershony changed the title from "Adding support for in place delete" to "Adding support for in-place delete" on Jul 23, 2020
@NicolasDorier
Owner

NicolasDorier commented Jul 23, 2020

The best we could do is to track the last deleted entry, so when we have a new entry with a size smaller than that deleted entry, we can fit it into the freed space instead of just allocating more space at the end of the file. This would definitely save lots of space for the UTXO db. It would not completely solve the issue, but frankly speaking, I think running defrag once in a while is good enough.
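
As an illustration of that idea only (hypothetical types, not DBTrie's actual internals), reusing the last freed slot could look roughly like this:

```csharp
// Sketch: remember the last deleted entry's slot and, when a new entry fits,
// write it there instead of appending at the end of the file.
// Hypothetical names; not DBTrie's real allocation code.
class FreedSlot { public long Offset; public int Size; }

class AppendAllocator
{
    long _endOfFile;
    FreedSlot? _lastFreed;

    public void OnDelete(long offset, int size) =>
        _lastFreed = new FreedSlot { Offset = offset, Size = size };

    // Returns the file offset at which a new entry of 'size' bytes is written.
    public long Allocate(int size)
    {
        if (_lastFreed != null && size <= _lastFreed.Size)
        {
            var offset = _lastFreed.Offset;
            _lastFreed = null;            // slot consumed; any remainder stays as slack
            return offset;
        }
        var end = _endOfFile;             // otherwise grow the file as before
        _endOfFile += size;
        return end;
    }
}
```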

There are two defragmentation modes for a table in DBTrie. One is "Copy/Defrag/Rename" and the other is "Defragment Unsafe", which defragments the file directly; but if it crashes in the middle, the table would be irrecoverable.

The time window where things can go wrong should be quite small, though.

@dangershony
Contributor Author

Hmm, interesting. So a strategy for using defrag would be: say every 10k blocks, the code blocks consensus, does the defrag, and continues. Doing it frequently means defrag will be fast and perhaps will not affect sync time by much.
If I have time I will create a Trie implementation of coindb on the Blockcore full node and report the results.
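
A hedged sketch of that strategy (the table abstraction and defrag call here are placeholders, not DBTrie's confirmed API, and the block-connected hook is hypothetical coindb plumbing):

```csharp
// Sketch of "defrag every 10k blocks": pause consensus, compact, continue.
using System.Threading.Tasks;

// Hypothetical abstraction over the DBTrie-backed coindb table.
interface IUtxoTable { Task DefragmentAsync(); }

static class CoinDbMaintenance
{
    public static async Task OnBlockConnectedAsync(int height, IUtxoTable utxoTable)
    {
        if (height % 10_000 == 0)          // every 10k blocks: pause, compact, continue
            await utxoTable.DefragmentAsync();
    }
}
```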

@NicolasDorier
Owner

Yes. You would need to test on a big DB (maybe 10 GB), but I tried on a 2 GB table and it took about 10 seconds.
