Skip to content

Breaking old mental models about hash tables. Experiments, visualisations, and benchmarks showing how to beat the Θ ( log ⁑ ( 1 / 𝛿 ) ) Θ(log(1/Ξ΄)) wall β€” without reordering. Rust code, real numbers, no hand-waving.

License

Notifications You must be signed in to change notification settings

oliverdaff/hash-tables-without-assumptions

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

7 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

Hash Tables Without Assumptions β€” Blog Series Code

This repository contains Rust code examples for the blog series
β€œHash Tables Without Assumptions: Rethinking Open Addressing at Scale”

β†’ Read the series on fdmux.dev


πŸ“š Series Overview

This series explores how modern non-greedy probing techniques β€” like Elastic Hashing and Funnel Hashing β€” challenge the classical $\Theta(\log(1/\delta))$ lower bound on probe complexity in open-addressed hash tables.

We walk through each concept with annotated, runnable Rust code.

Planned posts:

  1. The Invisible Wall Around Hash Tables
    Why greedy probing slows down β€” and how to see it in code.
    βœ… Code included in this repo

  2. Beating the Bound β€” Without Reordering
    Elastic Hashing spreads keys across subarrays to reduce clustering. βœ… Code included in this repo

  3. Inside Elastic Hashing
    Final probe logic, batch-friendly balancing, and sub-logarithmic probes.

  4. Funnel Hashing and Worst-Case Sanity
    A levelled fallback strategy with bounded worst-case probes.

  5. What Faster Hashing Means for Computing
    Why this changes the way we design data-heavy systems.


πŸ“ Repo Layout

This is a Cargo workspace. Each post has its own crate.

hash-tables-without-assumptions/
β”œβ”€β”€ Cargo.toml                     # Workspace definition
β”œβ”€β”€ benchmarks
β”‚Β Β  β”œβ”€β”€ benches
β”‚Β Β  β”œβ”€β”€ gnuplot
β”‚Β Β  └── src
β”œβ”€β”€ posts
β”‚Β Β  β”œβ”€β”€ post1-invisible-wall
β”‚Β Β  β”‚Β Β  └── src
β”‚Β Β  β”œβ”€β”€ post2-elastic-wall
β”‚Β Β  β”‚Β Β  └── src
β”‚Β Β  └── shared
β”‚Β Β      └── src
β”œβ”€β”€ probe-data
β”œβ”€β”€ probe-plots
└── scripts

πŸ¦€ Running Post 1

To run the code for Post 1:

cargo run --bin post1-invisible-wall
cargo run

You’ll see a visualisation of how greedy insertion causes clustering β€” even when the table is mostly empty.

By default, it will initialise a table with 30 slots and insert 15 keys using a simple greedy strategy.

You can customise the run using CLI flags:

just post1 -- --slots 50 --keys 20 --hash-strategy default

Usage:

post1-invisible-wall [OPTIONS]

Options:
  -s, --slots <SLOTS>                  Number of slots in the table [default: 30]
  -k, --keys <KEYS>                    Number of keys to insert [default: 15]
      --hash-strategy <HASH_STRATEGY> Hash strategy to use [default: mod10]
                                      [possible values: default, mod10]
  -h, --help                           Print help

This lets you simulate different load factors and see how clustering appears β€” even when the table is mostly empty β€” based purely on the insertion strategy.


πŸ“Š Running Post 2 β€” Beating the Bound

To explore Post 2, which compares different elastic hashing fallback strategies:

cargo run --bin post2-beating-the-bound

By default, this inserts 100 keys into a hash table with 4 subarrays and 32 slots each β€” using unbalanced and un-rotated hash probing. You'll see a visualisation of how keys cluster or spread depending on the configuration.

You can customise the behaviour using CLI flags:

just post2 -- --subarrays 4 --slots 32 --keys 100 --balanced --rotate-subarrays

Usage

Elastic Hashing Demo

Visualise elastic subarray-based hash table clustering

Options:
  -u, --subarrays <SUBARRAYS>           Number of subarrays [default: 4]
  -s, --slots <SLOTS>                   Number of slots per subarray [default: 32]
  -k, --keys <KEYS>                     Number of keys to insert [default: 100]
  -b, --balanced                        Use coordinated fallback [default: false]
      --rotate-subarrays               Rotate fallback starting point [default: false]
      --hash-strategy <HASH_STRATEGY>  Hash strategy to use [default: default]
                                       [possible values: default, mod10]
  -h, --help                            Print help

πŸ“ˆ Reproducing Benchmarks and Plots

To regenerate all probe data and create updated plots:

nix develop
just plot-probes

This runs:

  • All probe generation tasks (post1, post2-unbalanced-unrotated, post2-unbalanced, post2-balanced)
  • Python script to plot the probe curves

All CSVs are saved to probe-data/, and final images land in probe-plots/.

The output includes the figures used in Post 2 on the blog, comparing fallback strategies and visualising how they scale under load.


πŸ§ͺ Development with Nix

This repo includes a flake.nix to provide a reproducible Rust environment:

nix develop

This gives you:

  • rustc, cargo, rust-analyzer
  • cargo-watch and gnuplot (for benchmarking)

πŸ›  Common Commands with just

Run common dev tasks with just:

just list        # Show available tasks
just post1         # Run Post 1 crate
just post2         # Run Post 1 crate
just check       # Check all workspace crates

πŸ”§ Requirements

  • Rust 1.86+ (edition 2024)
  • just (optional, for convenience)
  • nix (optional, for reproducible dev env)

πŸ“œ License

MIT


πŸ“£ Follow Along

More posts β€” and their code β€” will be published here as the series progresses.

For updates, diagrams, and additional commentary:
@oliverdaff on LinkedIn

About

Breaking old mental models about hash tables. Experiments, visualisations, and benchmarks showing how to beat the Θ ( log ⁑ ( 1 / 𝛿 ) ) Θ(log(1/Ξ΄)) wall β€” without reordering. Rust code, real numbers, no hand-waving.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 2

  •  
  •