Bit Blossom

Learning and Implementing Bloom Filters

When we are comparing two strings, it's an O(m + n) to compare them. There are some good data structures around strings such as Tries, Suffix Trees, etc.
- It's not easy to optimize Tries for any problem, they have good theoretical complexity, in practical applications, it may not be the best.
In practice, hashing is the best approach we can have. We can simply use a Hash Map, why not?
- What could be possible is, we don't have the bucket range to store all the strings, the range of the hash could be 0 to some number N. N is not very large, what we can store is multiple hashes.
- Let's say there is a hash function, the output range of the hash function need not to high, so there are chances for collision. In such scenarios, we can use multiple hash functions to reduce collisions.
  - If we have three hash functions and two strings have the same values for three different hash functions, there is a very low probability that two different strings will give the same output for three different hash functions.
This is concept of Bloom Filters.
- Facebook also uses this for "one hit wonder".
- Let's say we search for something which we don't search that often, the cache and CDN stores the commonly searched details, so that it won't have to keep hitting the servers for the response.
  - The problem with this approach is that, when we store something like this in a cache, which we may never search for again, it's going to space for something which might be more often searched.
  - If we use something like LRU, something more useful might be removed from the cache.
  - Because of this, the cache is poor, it would have to hit the server more, leading slower browser experience.
- What Facebook does is, they won't store it the cache on justing hitting the entity once, they would expect you to search for it multiple times before storing it into the cache.

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
.gitignore		.gitignore
Makefile		Makefile
README.md		README.md
bloom_filter.go		bloom_filter.go
go.mod		go.mod
go.sum		go.sum
hashing_utils.go		hashing_utils.go
main.go		main.go
metrics_utils.go		metrics_utils.go

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Bit Blossom

About

Releases

Packages

Languages

aneesh-aparajit/bit-blossom

Folders and files

Latest commit

History

Repository files navigation

Bit Blossom

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages