28 Feb 18:09

gfosco

641fae6

RocksDB Release v5.18.3

Rocksdb Change Log

5.18.3 (2/11/2019)

Bug Fixes

Fix possible LSM corruption when both range deletions and subcompactions are used. The symptom of this corruption is L1+ files overlapping in the user key space.
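
The symptom described above can be checked mechanically. The following is a minimal sketch, not RocksDB code: it assumes each file in a level is summarized by its smallest and largest user key, that keys compare bytewise, and that a shared boundary key counts as an overlap. The `SstFile` type and `find_overlaps` function are hypothetical names introduced for illustration.

```python
from typing import List, NamedTuple, Tuple


class SstFile(NamedTuple):
    name: str        # hypothetical file name
    smallest: bytes  # smallest user key stored in the file
    largest: bytes   # largest user key stored in the file


def find_overlaps(level_files: List[SstFile]) -> List[Tuple[str, str]]:
    """Return neighbouring file pairs whose user-key ranges overlap.

    Files are sorted by smallest key first. If any two files in the level
    overlap, at least one neighbouring pair in that order overlaps too, so
    an empty result means the level is overlap-free in this toy model.
    """
    ordered = sorted(level_files, key=lambda f: f.smallest)
    overlaps = []
    for prev, cur in zip(ordered, ordered[1:]):
        # Assumption: touching endpoints are treated as an overlap.
        if cur.smallest <= prev.largest:
            overlaps.append((prev.name, cur.name))
    return overlaps
```

A healthy L1+ level in this model returns an empty list; any returned pair corresponds to the symptom named in the fix.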

5.18.2 (01/31/2019)

Public API Change

Change time resolution in FileOperationInfo.
Deletion of blob files now also goes through SstFileManager.

5.18.0 (11/30/2018)

New Features

Introduced JemallocNodumpAllocator memory allocator. When it is in use, the block cache is excluded from core dumps.
Introduced PerfContextByLevel as part of PerfContext which allows storing perf context at each level. Also replaced __thread with thread_local keyword for perf_context. Added per-level perf context for bloom filter and Get query.
With level_compaction_dynamic_level_bytes = true, the level multiplier may be adjusted automatically when Level 0 to Level 1 compaction lags behind.
Introduced DB option atomic_flush. If true, RocksDB supports flushing multiple column families and atomically committing the result to MANIFEST. Useful when WAL is disabled.
Added num_deletions and num_merge_operands members to TableProperties.
Added "rocksdb.min-obsolete-sst-number-to-keep" DB property that reports the lower bound on SST file numbers that are being kept from deletion, even if the SSTs are obsolete.
Add xxhash64 checksum support
Introduced MemoryAllocator, which lets the user specify custom memory allocator for block based table.
Improved DeleteRange to prevent read performance degradation. The feature is no longer marked as experimental.
Enabled checkpoint on readonly db (DBImplReadOnly).

Public API Change

DBOptions::use_direct_reads now affects reads issued by BackupEngine on the database's SSTs.
NO_ITERATORS is split into two counters, NO_ITERATOR_CREATED and NO_ITERATOR_DELETE. Both now only increase, just like other counters.
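
With two counters that only increase, the number of iterators currently alive can presumably be recovered as the difference between them. The sketch below is a toy model of that arithmetic, not the RocksDB Statistics API; the class and method names are hypothetical.

```python
class IteratorTickers:
    """Toy model of two monotonically increasing tickers."""

    def __init__(self) -> None:
        self.no_iterator_created = 0  # mirrors NO_ITERATOR_CREATED
        self.no_iterator_deleted = 0  # mirrors NO_ITERATOR_DELETE

    def on_iterator_created(self) -> None:
        self.no_iterator_created += 1

    def on_iterator_deleted(self) -> None:
        self.no_iterator_deleted += 1

    def live_iterators(self) -> int:
        # Neither ticker ever decreases; the live count is their difference.
        return self.no_iterator_created - self.no_iterator_deleted
```

Keeping both tickers monotonic means rates (iterators created per second, for example) can be computed from two samples without worrying about the value going down in between.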

Bug Fixes

Fix corner case where a write group leader blocked due to write stall blocks other writers in queue with WriteOptions::no_slowdown set.
Fix in-memory range tombstone truncation to avoid erroneously covering newer keys at a lower level, and include range tombstones in compacted files whose largest key is the range tombstone's start key.
Properly set the stop key for a truncated manual CompactRange
Fix slow flush/compaction when DB contains many snapshots. The problem became noticeable to us in DBs with 100,000+ snapshots, though it will affect others at different thresholds.
Fix the bug that WriteBatchWithIndex's SeekForPrev() doesn't see the entries with the same key.
Fix the bug where the user comparator was sometimes fed an InternalKey instead of the user key. The bug manifests during GenerateBottommostFiles.
Fix a bug in WritePrepared txns where if the number of old snapshots goes beyond the snapshot cache size (128 default) the rest will not be checked when evicting a commit entry from the commit cache.
Fixed Get correctness bug in the presence of range tombstones where merge operands covered by a range tombstone always result in NotFound.
Start populating NO_FILE_CLOSES ticker statistic, which was always zero previously.
The default value of NewBloomFilterPolicy()'s argument use_block_based_builder is changed to false. Note that this new default may cause large temp memory usage when building very large SST files.
Fix a deadlock caused by compaction and file ingestion waiting for each other in the event of write stalls.
Make DB ignore dropped column families while committing results of atomic flush.

12 Nov 21:14

gfosco

v5.17.2

f438b98

RocksDB Release v5.17.2

Rocksdb Change Log

5.17.2 (10/24/2018)

Bug Fixes

Fix the bug that WriteBatchWithIndex's SeekForPrev() doesn't see the entries with the same key.

5.17.1 (10/16/2018)

Bug Fixes

Fix slow flush/compaction when DB contains many snapshots. The problem became noticeable to us in DBs with 100,000+ snapshots, though it will affect others at different thresholds.
Properly set the stop key for a truncated manual CompactRange
Fix corner case where a write group leader blocked due to write stall blocks other writers in queue with WriteOptions::no_slowdown set.

New Features

Introduced CacheAllocator, which lets the user specify custom allocator for memory in block cache.

5.17.0 (10/05/2018)

Public API Change

OnTableFileCreated will now be called for empty files generated during compaction. In that case, TableFileCreationInfo::file_path will be "(nil)" and TableFileCreationInfo::file_size will be zero.
Add FlushOptions::allow_write_stall, which controls whether Flush calls start working immediately, even if it causes user writes to stall, or will wait until flush can be performed without causing write stall (similar to CompactRangeOptions::allow_write_stall). Note that the default value is false, meaning we add delay to Flush calls until stalling can be avoided when possible. This is a behavior change compared to previous RocksDB versions, where Flush calls didn't check whether they might cause a stall.
Applications using PessimisticTransactionDB are expected to roll back or commit recovered transactions before starting new ones. This assumption is used to skip concurrency control during recovery.

12 Nov 21:13

gfosco

v5.16.6

cfdea78

RocksDB Release v5.16.6

Rocksdb Change Log

5.16.6 (10/24/2018)

Bug Fixes

Fix the bug that WriteBatchWithIndex's SeekForPrev() doesn't see the entries with the same key.

5.16.5 (10/16/2018)

Bug Fixes

Fix slow flush/compaction when DB contains many snapshots. The problem became noticeable to us in DBs with 100,000+ snapshots, though it will affect others at different thresholds.
Properly set the stop key for a truncated manual CompactRange

5.16.4 (10/10/2018)

Bug Fixes

Fix corner case where a write group leader blocked due to write stall blocks other writers in queue with WriteOptions::no_slowdown set.

5.16.3 (10/1/2018)

Bug Fixes

Fix a crash when CompactFiles runs with CompactionOptions::compression == CompressionType::kDisableCompressionOption. That setting now causes the compression type to be chosen according to the column family-wide compression options.

5.16.2 (9/21/2018)

Bug Fixes

Fix bug in partition filters with format_version=4.

5.16.1 (9/17/2018)

Bug Fixes

Remove trace_analyzer_tool from rocksdb_lib target in TARGETS file.
Fix RocksDB Java build and tests.
Remove sync point in Block destructor.

5.16.0 (8/21/2018)

Public API Change

OnTableFileCreated will now be called for empty files generated during compaction. In that case, TableFileCreationInfo::file_path will be "(nil)" and TableFileCreationInfo::file_size will be zero.
Add FlushOptions::allow_write_stall, which controls whether Flush calls start working immediately, even if it causes user writes to stall, or will wait until flush can be performed without causing write stall (similar to CompactRangeOptions::allow_write_stall). Note that the default value is false, meaning we add delay to Flush calls until stalling can be avoided when possible. This is a behavior change compared to previous RocksDB versions, where Flush calls didn't check whether they might cause a stall.
The merge operands are passed to MergeOperator::ShouldMerge in the reversed order relative to how they were merged (passed to FullMerge or FullMergeV2) for performance reasons
GetAllKeyVersions() now takes an extra argument, max_num_ikeys.

New Features

Changes the format of index blocks by delta encoding the index values, which are the block handles. This saves the encoding of BlockHandle::offset of the non-head index entries in each restart interval. The feature is backward compatible but not forward compatible. It is disabled by default unless format_version 4 or above is used.
Add a new tool: trace_analyzer. It analyzes the trace file generated by the trace_replay API. It can convert the binary trace file to a human-readable text file, output statistics for the analyzed query types (such as access statistics and size statistics), combine the analysis with a dump of the whole key space, and analyze query correlation. Currently supported query types are: Get, Put, Delete, SingleDelete, DeleteRange, Merge, Iterator (Seek, SeekForPrev only).
Add hash index support to data blocks, which helps reduce the CPU utilization of point-lookup operations. This feature is backward compatible with data blocks created without the hash index. It is disabled by default unless BlockBasedTableOptions::data_block_index_type is set to kDataBlockBinaryAndHash.
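
The delta-encoded index values mentioned in the first item of this list can be illustrated with a small model. This is a sketch under stated assumptions, not the on-disk format: it assumes data blocks are laid out back to back, each followed by a fixed 5-byte trailer, so that a block's offset can be derived from its predecessor. The head entry of every restart interval keeps the full (offset, size) handle; the others keep only the size difference. All names here are hypothetical.

```python
from typing import List, Tuple, Union

TRAILER_SIZE = 5          # assumed fixed per-block trailer, in bytes
Handle = Tuple[int, int]  # (offset, size) of a data block
Entry = Union[Handle, int]  # full handle at a restart point, else a size delta


def encode_handles(handles: List[Handle], restart_interval: int) -> List[Entry]:
    """Keep full handles at restart points; store size deltas elsewhere."""
    entries: List[Entry] = []
    for i, (offset, size) in enumerate(handles):
        if i % restart_interval == 0:
            entries.append((offset, size))
            continue
        prev_offset, prev_size = handles[i - 1]
        # The offset is dropped, so it must be derivable from the predecessor.
        if offset != prev_offset + prev_size + TRAILER_SIZE:
            raise ValueError("blocks are not contiguous")
        entries.append(size - prev_size)
    return entries


def decode_handles(entries: List[Entry]) -> List[Handle]:
    """Rebuild full handles from the delta-encoded entries."""
    handles: List[Handle] = []
    for entry in entries:
        if isinstance(entry, tuple):
            handles.append(entry)
        else:
            prev_offset, prev_size = handles[-1]
            handles.append((prev_offset + prev_size + TRAILER_SIZE,
                            prev_size + entry))
    return handles
```

The saving comes from the non-head entries: a small signed size delta typically needs fewer bytes than a full offset plus size. A reader that does not understand the delta form cannot interpret such entries, which is consistent with the change being backward but not forward compatible.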

Bug Fixes

Fix a bug in misreporting the estimated partition index size in properties block.
Avoid creating empty SSTs and subsequently deleting them in certain cases during compaction.

14 Sep 17:21

gfosco

v5.15.10

7e1f37e

RocksDB v5.15.10

Rocksdb Change Log

5.15.10 (9/13/2018)

Bug Fixes

Fix RocksDB Java build and tests.

5.15.9 (9/4/2018)

Bug Fixes

Fix compilation errors on OS X clang due to '-Wsuggest-override'.

5.15.8 (8/31/2018)

Bug Fixes

Further avoid creating empty SSTs and subsequently deleting them during compaction.

5.15.7 (8/24/2018)

Bug Fixes

Avoid creating empty SSTs and subsequently deleting them in certain cases during compaction.

5.15.6 (8/21/2018)

Public API Change

The merge operands are passed to MergeOperator::ShouldMerge in the reversed order relative to how they were merged (passed to FullMerge or FullMergeV2) for performance reasons

5.15.5 (8/16/2018)

Bug Fixes

Fix VerifyChecksum() API not preserving options

5.15.4 (8/11/2018)

Bug Fixes

Fix a bug caused by not generating OnTableFileCreated() notification for a 0-byte SST.

5.15.3 (8/10/2018)

Bug Fixes

Fix a bug in misreporting the estimated partition index size in properties block.

5.15.2 (8/9/2018)

Bug Fixes

Return correct usable_size for BlockContents.

5.15.1 (8/1/2018)

Bug Fixes

Prevent dereferencing invalid STL iterators when there are range tombstones in ingested files.

5.15.0 (7/17/2018)

Public API Change

Remove managed iterator. ReadOptions.managed is not effective anymore.
For bottommost_compression, a compatible CompressionOptions is added via bottommost_compression_opts. To stay backward compatible, a new boolean, enabled, is added to CompressionOptions. compression_opts is always used, regardless of the value of enabled. bottommost_compression_opts is used only when the user sets enabled=true; otherwise compression_opts is used for bottommost_compression by default.
With LRUCache, when high_pri_pool_ratio > 0, the midpoint insertion strategy is enabled: low-pri items are put at the tail of the low-pri list (the midpoint) when they are first inserted into the cache. This makes cache entries that never get hit age out faster, improving cache efficiency when large background scans are present.
For users of Statistics objects created via CreateDBStatistics(), the format of the string returned by its ToString() method has changed.
The "rocksdb.num.entries" table property no longer counts range deletion tombstones as entries.
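
The fallback rule for bottommost_compression_opts described in this list can be written as a small decision function. This is a sketch of the rule as stated in the text, not RocksDB's implementation; the `CompressionOpts` dataclass, its `level` field, and the `opts_for_output` function are hypothetical stand-ins.

```python
from dataclasses import dataclass


@dataclass
class CompressionOpts:
    level: int = -1        # illustrative tuning knob, not a full option set
    enabled: bool = False  # only consulted for bottommost_compression_opts


def opts_for_output(is_bottommost: bool,
                    compression_opts: CompressionOpts,
                    bottommost_compression_opts: CompressionOpts) -> CompressionOpts:
    """Pick the options to use when writing a file at the given level."""
    # bottommost_compression_opts applies only if the user opted in.
    if is_bottommost and bottommost_compression_opts.enabled:
        return bottommost_compression_opts
    # compression_opts is used regardless of its own `enabled` value.
    return compression_opts
```

Existing configurations never set enabled, so they keep using compression_opts everywhere, which is what keeps the new field backward compatible.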

New Features

Changes the format of index blocks by storing the keys in their raw form rather than converting them to InternalKey. This saves 8 bytes per index key. The feature is backward compatible but not forward compatible. It is disabled by default unless format_version 3 or above is used.
Avoid memcpy when reading mmap files with OpenReadOnly and max_open_files==-1.
Support dynamically changing ColumnFamilyOptions::ttl via SetOptions().
Add a new table property, "rocksdb.num.range-deletions", which counts the number of range deletion tombstones in the table.
Improve the performance of iterators doing long range scans by using readahead, when using direct IO.
pin_top_level_index_and_filter (default true) in BlockBasedTableOptions can be used in combination with cache_index_and_filter_blocks to prefetch and pin the top-level index of partitioned index and filter blocks in cache. It has no impact when cache_index_and_filter_blocks is false.
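
The 8 bytes saved per index key by the format_version 3 change in this list correspond to the trailer that turns a user key into an internal key. The sketch below assumes the commonly documented layout, where a 56-bit sequence number and an 8-bit value type are packed into one little-endian 64-bit integer appended to the user key. The function name and the type constant are illustrative assumptions, not RocksDB identifiers.

```python
import struct

VALUE_TYPE_PUT = 1  # assumed value-type tag, used for illustration only


def pack_internal_key(user_key: bytes, sequence: int,
                      value_type: int = VALUE_TYPE_PUT) -> bytes:
    """Append an 8-byte (sequence << 8 | type) trailer to a user key."""
    if not 0 <= sequence < (1 << 56):
        raise ValueError("sequence must fit in 56 bits")
    if not 0 <= value_type < 256:
        raise ValueError("value_type must fit in 8 bits")
    return user_key + struct.pack("<Q", (sequence << 8) | value_type)
```

Storing the raw user key in the index instead of this packed form drops exactly the 8-byte trailer from every index entry.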

Bug Fixes

Fix deadlock with enable_pipelined_write=true and max_successive_merges > 0
Check conflict at output level in CompactFiles.
Fix corruption in non-iterator reads when mmap is used for file reads
Fix bug with prefix search in partition filters where a shared prefix would be ignored from the later partitions. The bug could report an existent key as missing. The bug could be triggered if prefix_extractor is set and partition filters are enabled.
Change default value of bytes_max_delete_chunk to 0 in NewSstFileManager() as it doesn't work well with checkpoints.
Fix a bug caused by not copying the block trailer with compressed SST file, direct IO, prefetcher and no compressed block cache.
Fix writes getting stuck indefinitely when enable_pipelined_write=true. The issue has existed since pipelined write was introduced in 5.5.0.

26 Aug 20:33

ajkr

v5.14.3

6265503

RocksDB 5.14.3

5.14.3 (8/21/2018)

Public API Change

The merge operands are passed to MergeOperator::ShouldMerge in the reversed order relative to how they were merged (passed to FullMerge or FullMergeV2) for performance reasons

Bug Fixes

Fixes DBImpl::FindObsoleteFiles() calling GetChildren() on the same path

22 Aug 00:37

yiwu-arbug

rocksdb-5.14.3

6265503

04 Jul 17:06

gfosco

v5.14.2

5089e12

RocksDB release v5.14.2

5.14.2 (7/3/2018)

Bug Fixes

Change default value of bytes_max_delete_chunk to 0 in NewSstFileManager() as it doesn't work well with checkpoints.
Set DEBUG_LEVEL=0 for RocksJava Mac Release build.

5.14.1 (6/20/2018)

Bug Fixes

Fix block-based table reader pinning blocks throughout its lifetime, causing memory usage increase.
Fix bug with prefix search in partition filters where a shared prefix would be ignored from the later partitions. The bug could report an existent key as missing. The bug could be triggered if prefix_extractor is set and partition filters are enabled.

5.14.0 (5/16/2018)

Public API Change

Add a BlockBasedTableOption to align uncompressed data blocks on the smaller of block size or page size boundary, to reduce flash reads by avoiding reads spanning 4K pages.
The background thread naming convention changed (on supporting platforms) to "rocksdb:", e.g., "rocksdb:low0".
Add a new ticker stat rocksdb.number.multiget.keys.found to count number of keys successfully read in MultiGet calls
Touch-up to write-related counters in PerfContext. New counters added: write_scheduling_flushes_compactions_time, write_thread_wait_nanos. Counters whose behavior was fixed or modified: write_memtable_time, write_pre_and_post_process_time, write_delay_time.
Posix Env's NewRandomRWFile() will fail if the file doesn't exist.
Now, DBOptions::use_direct_io_for_flush_and_compaction only applies to background writes, and DBOptions::use_direct_reads applies to both user reads and background reads. This conforms with Linux's open(2) manpage, which advises against simultaneously reading a file in buffered and direct modes, due to possibly undefined behavior and degraded performance.
Iterator::Valid() always returns false if !status().ok(). So, now when doing a Seek() followed by some Next()s, there's no need to check status() after every operation.
Iterator::Seek()/SeekForPrev()/SeekToFirst()/SeekToLast() always resets status().
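
The block alignment option in the first item of this list comes down to round-up arithmetic. The sketch below shows one way to compute the alignment unit (the smaller of block size and page size, as the text states) and the padding needed to bring a block's length up to the next multiple of it. It is an illustration under the assumption that the alignment is a power of two, not a copy of RocksDB's writer code; both function names are hypothetical.

```python
def alignment_for(block_size: int, page_size: int = 4096) -> int:
    """Alignment unit: the smaller of the block size and the page size."""
    return min(block_size, page_size)


def pad_bytes(block_len: int, alignment: int) -> int:
    """Bytes of padding that round block_len up to a multiple of alignment."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    mask = alignment - 1
    # Already aligned lengths need no padding; the outer mask handles that.
    return (alignment - (block_len & mask)) & mask
```

If every block is padded this way, each block starts on an alignment boundary, so a block no larger than a page never straddles two 4K pages.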

New Features

Introduce TTL for level compaction so that all files older than ttl go through the compaction process to get rid of old data.
TransactionDBOptions::write_policy can be configured to enable WritePrepared 2PC transactions. Read more about them in the wiki.
Add DB properties "rocksdb.block-cache-capacity", "rocksdb.block-cache-usage", "rocksdb.block-cache-pinned-usage" to show block cache usage.
Add Env::LowerThreadPoolCPUPriority(Priority) method, which lowers the CPU priority of background (esp. compaction) threads to minimize interference with foreground tasks.
Fsync parent directory after deleting a file in delete scheduler.
In level-based compaction, if the bottom-pri thread pool was set up via Env::SetBackgroundThreads(), compactions to the bottom level will be delegated to that thread pool.
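
The TTL rule in the first item of this list ("all files older than ttl go through the compaction process") can be sketched as a simple filter. This is a toy model, not RocksDB's compaction picker: it assumes each file has a known creation time in seconds and that a non-positive TTL disables the feature. The function name and the file names in the usage note are hypothetical.

```python
from typing import Dict, List


def files_past_ttl(creation_times: Dict[str, int], now: int,
                   ttl_seconds: int) -> List[str]:
    """Return the files whose age exceeds the TTL, sorted by name."""
    if ttl_seconds <= 0:
        return []  # assumption: a non-positive TTL means the feature is off
    return sorted(name for name, created in creation_times.items()
                  if now - created > ttl_seconds)
```

For example, with creation times {"000010.sst": 100, "000011.sst": 900}, now=1000 and ttl_seconds=500, only "000010.sst" is selected for compaction.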

Bug Fixes

Fsync after writing global seq number to the ingestion file in ExternalSstFileIngestionJob.
Fix WAL corruption caused by race condition between user write thread and FlushWAL when two_write_queue is not set.
Fix BackupableDBOptions::max_valid_backups_to_open to not delete backup files when refcount cannot be accurately determined.
Fix memory leak when pin_l0_filter_and_index_blocks_in_cache is used with partitioned filters
Disable rollback of merge operands in WritePrepared transactions to work around an issue in MyRocks. It can be enabled back by setting TransactionDBOptions::rollback_merge_operands to true.
Fix bug with prefix search in partition filters where a shared prefix would be ignored from the later partitions. The bug could report an existent key as missing. The bug could be triggered if prefix_extractor is set and partition filters are enabled.

Java API Changes

Add BlockBasedTableConfig.setBlockCache to allow sharing a block cache across DB instances.
Added SstFileManager to the Java API to allow managing SST files across DB instances.

18 Jun 17:15

gfosco

v5.13.4

d9c289c

RocksDB 5.13.4

Bug Fixes

Fix regression bug of Prev() with ReadOptions.iterate_upper_bound.

18 Jun 17:11

gfosco

v5.12.5

005c34f

RocksDB v5.12.5

Bug Fixes

Fix regression bug of Prev() with ReadOptions.iterate_upper_bound.

07 Jun 17:27

ajkr

v5.13.3

f9e0a06

RocksDB v5.13.3

5.13.3 (6/6/2018)

Bug Fixes

Fix assertion when reading bloom filter of SST files containing range deletions but no data

Releases: facebook/rocksdb
