BloomFilters-StringSearch

Speeding up string search using bloom filters. I'll be writing a paper describing the algorithm shortly.

Specs: Written in Python 3. The multiprocessing module is needed to run the algorithms in parallel.

Using this code:

Create_BloomFilter_Collection(text,size,no_of_bloom_filters)

This function creates the bloom filter collection that is used to speed up
string searching. It builds a bloom filter collection object; the function
itself returns nothing.

Parameters:
  text - The large text file you want to search in.
  size - Size of the pattern you want to search for (the algorithm is
  sensitive to the pattern size). Defaults to 60.
  no_of_bloom_filters - Number of bloom filters in the collection. Defaults to 15.
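The algorithm itself is not described here yet, so the following is only a minimal sketch of one way a bloom filter collection could be built, not the code in this repository. It assumes the text is split into at most no_of_bloom_filters chunks and that every substring of length size in a chunk is added to that chunk's filter. The names BloomFilter and build_collection, the filter size, and the SHA-256 based hashing are all hypothetical choices made for illustration.

```python
import hashlib


class BloomFilter:
    """Tiny bloom filter backed by a bytearray (illustrative, not the repo's class)."""

    def __init__(self, num_bits=1 << 16, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting a SHA-256 digest.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def build_collection(text, size=60, no_of_bloom_filters=15):
    """Assumed scheme: one filter per chunk, indexing every size-length substring.

    Returns a list of (start, end, BloomFilter) tuples.
    """
    chunk_len = max(size, -(-len(text) // no_of_bloom_filters))  # ceiling division
    collection = []
    for start in range(0, len(text), chunk_len):
        # Extend each chunk by size - 1 characters so that substrings
        # straddling a chunk boundary are still indexed.
        end = min(len(text), start + chunk_len + size - 1)
        bloom = BloomFilter()
        for i in range(start, end - size + 1):
            bloom.add(text[i:i + size])
        collection.append((start, end, bloom))
    return collection
```

Under this assumed scheme the pattern length has to match the size used at build time, which would be one reason for the algorithm to be sensitive to pattern size.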




Search(pattern,collection,textfile)

This function searches for the pattern in the text file and returns the
number of times the pattern occurs in the text.

Parameters:
  pattern - The pattern to search for.
  collection - Path to the bloom filter collection object on your system.
  textfile - The text file associated with the bloom filter collection object.
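As an illustration of how bloom filters can cut down the work of a naive search, here is a minimal in-memory sketch. It is not the repository's Search function: it takes strings and Python objects instead of file paths, and the helper names (build_filters, might_contain, count_occurrences), the filter size, and the hashing scheme are hypothetical. The assumed idea is that each chunk of the text has its own filter, and the naive scan only runs on chunks whose filter reports that the pattern may be present.

```python
import hashlib

NUM_BITS = 1 << 16   # bits per filter (arbitrary choice for this sketch)
NUM_HASHES = 3       # hash functions per item (arbitrary choice)


def bit_positions(item):
    # Derive NUM_HASHES bit positions by salting a SHA-256 digest.
    for seed in range(NUM_HASHES):
        digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
        yield int.from_bytes(digest[:8], "big") % NUM_BITS


def might_contain(bits, item):
    # False means the item was definitely never added to this filter.
    return all((bits >> pos) & 1 for pos in bit_positions(item))


def build_filters(text, size, no_of_bloom_filters):
    """Build one integer bitset per chunk over all size-length substrings."""
    chunk_len = max(size, -(-len(text) // no_of_bloom_filters))  # ceiling division
    filters = []
    for start in range(0, len(text), chunk_len):
        # Overlap chunks by size - 1 so boundary-straddling matches are covered.
        end = min(len(text), start + chunk_len + size - 1)
        bits = 0
        for i in range(start, end - size + 1):
            for pos in bit_positions(text[i:i + size]):
                bits |= 1 << pos
        filters.append((start, end, bits))
    return {"size": size, "filters": filters}


def count_occurrences(pattern, collection, text):
    """Count (possibly overlapping) occurrences of pattern in text."""
    size = collection["size"]
    if len(pattern) != size:
        raise ValueError("pattern length must equal the size used at build time")
    count = 0
    for start, end, bits in collection["filters"]:
        if not might_contain(bits, pattern):
            continue  # the filter rules this chunk out, so skip the scan
        # Naive scan of the chunk confirms real matches and discards
        # false positives from the filter.
        for i in range(start, end - size + 1):
            if text[i:i + size] == pattern:
                count += 1
    return count
```

Because every starting position belongs to exactly one chunk in this sketch, occurrences are not double counted, and a false positive in a filter only costs an extra scan without changing the result.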

See test.pdf for the performance improvements. The labels used there are:
  Naive with BLF - Bloom filters used with the naive string search algorithm, run both serially and in parallel.
  Naive - The naive string search algorithm used directly.