Skip to content

RocksDB v6.8.1

Compare
Choose a tag to compare
released this 25 Apr 01:00
· 1 commit to 6.8.fb since this release

6.8.1 (2020-03-30)

Behavior changes

  • Since RocksDB 6.8.0, ttl-based FIFO compaction can drop a file whose oldest key becomes older than options.ttl while others have not. This fix reverts this and makes ttl-based FIFO compaction use the file's flush time as the criterion. This fix also requires that max_open_files = -1 and compaction_options_fifo.allow_compaction = false to function properly.

6.8.0 (2020-02-24)

Java API Changes

  • Major breaking changes to Java comparators, toward standardizing on ByteBuffer for performant, locale-neutral operations on keys (#6252).
  • Added overloads of common API methods using direct ByteBuffers for keys and values (#2283).

Bug Fixes

  • Fix incorrect results while block-based table uses kHashSearch, together with Prev()/SeekForPrev().
  • Fix a bug that prevents opening a DB after two consecutive crash with TransactionDB, where the first crash recovers from a corrupted WAL with kPointInTimeRecovery but the second cannot.
  • Fixed issue #6316 that can cause a corruption of the MANIFEST file in the middle when writing to it fails due to no disk space.
  • Add DBOptions::skip_checking_sst_file_sizes_on_db_open. It disables potentially expensive checking of all sst file sizes in DB::Open().
  • BlobDB now ignores trivially moved files when updating the mapping between blob files and SSTs. This should mitigate issue #6338 where out of order flush/compaction notifications could trigger an assertion with the earlier code.
  • Batched MultiGet() ignores IO errors while reading data blocks, causing it to potentially continue looking for a key and returning stale results.
  • WriteBatchWithIndex::DeleteRange returns Status::NotSupported. Previously it returned success even though reads on the batch did not account for range tombstones. The corresponding language bindings now cannot be used. In C, that includes rocksdb_writebatch_wi_delete_range, rocksdb_writebatch_wi_delete_range_cf, rocksdb_writebatch_wi_delete_rangev, and rocksdb_writebatch_wi_delete_rangev_cf. In Java, that includes WriteBatchWithIndex::deleteRange.
  • Assign new MANIFEST file number when caller tries to create a new MANIFEST by calling LogAndApply(..., new_descriptor_log=true). This bug can cause MANIFEST being overwritten during recovery if options.write_dbid_to_manifest = true and there are WAL file(s).

Performance Improvements

  • Perfom readahead when reading from option files. Inside DB, options.log_readahead_size will be used as the readahead size. In other cases, a default 512KB is used.

Public API Change

  • The BlobDB garbage collector now emits the statistics BLOB_DB_GC_NUM_FILES (number of blob files obsoleted during GC), BLOB_DB_GC_NUM_NEW_FILES (number of new blob files generated during GC), BLOB_DB_GC_FAILURES (number of failed GC passes), BLOB_DB_GC_NUM_KEYS_RELOCATED (number of blobs relocated during GC), and BLOB_DB_GC_BYTES_RELOCATED (total size of blobs relocated during GC). On the other hand, the following statistics, which are not relevant for the new GC implementation, are now deprecated: BLOB_DB_GC_NUM_KEYS_OVERWRITTEN, BLOB_DB_GC_NUM_KEYS_EXPIRED, BLOB_DB_GC_BYTES_OVERWRITTEN, BLOB_DB_GC_BYTES_EXPIRED, and BLOB_DB_GC_MICROS.
  • Disable recycle_log_file_num when an inconsistent recovery modes are requested: kPointInTimeRecovery and kAbsoluteConsistency

New Features

  • Added the checksum for each SST file generated by Flush or Compaction. Added sst_file_checksum_func to Options such that user can plugin their own SST file checksum function via override the FileChecksumFunc class. If user does not set the sst_file_checksum_func, SST file checksum calculation will not be enabled. The checksum information inlcuding uint32_t checksum value and a checksum function name (string). The checksum information is stored in FileMetadata in version store and also logged to MANIFEST. A new tool is added to LDB such that user can dump out a list of file checksum information from MANIFEST (stored in an unordered_map).
  • db_bench now supports value_size_distribution_type, value_size_min, value_size_max options for generating random variable sized value. Added blob_db_compression_type option for BlobDB to enable blob compression.
  • Replace RocksDB namespace "rocksdb" with flag "ROCKSDB_NAMESPACE" which if is not defined, defined as "rocksdb" in header file rocksdb_namespace.h.