Elias-Fano compression implementation in Python, created for the Data Structures and Algorithms course. The script reads a list of integers from a file, compresses them into two bit arrays (L and U), and prints the compressed representation along with their SHA-256 hash.
The repository where the task was published is here.
This Python script implements the Elias-Fano compression technique for a sorted list of integers. The program reads a list of integers from a file, compresses it using two bit arrays (L and U), and prints the compressed representations along with their SHA-256 hash.
- Input Handling: Reads integers from a specified input file.
- Compression: Compresses the integers using the Elias-Fano scheme into two bit arrays, L and U.
- Output: Prints the bit arrays L and U in binary format.
- Hashing: Generates and prints the SHA-256 hash of the concatenated bit arrays L and U.
elias_fano.py
: Main script containing the compression implementation.README.md
: Documentation for the project.example_*.txt
: Example input files used for testing.
- Python 3.x
- Required Libraries:
sys
,math
,hashlib
To run the script, use the following command:
python elias_fano.py <input_file>
where <input_file> is the path to the file containing the list of integers.
- Reading Input: The script reads integers from the input file and stores them in a list.
- Calculating Parameters:
n
: Number of integers.m
: Maximum integer in the list.l
: Number of bits used for the least significant part, calculated asint(math.log(m / n, 2))
.
- Creating Bit Arrays:
L
: Contains the lastl
bits of each integer.U
: Encodes the remaining bits in unary form.
- Populating Bit Arrays:
- The script splits each integer into parts for
L
andU
and fills the respective bit arrays.
- The script splits each integer into parts for
- Output: Prints the bit arrays
L
andU
in binary format and the SHA-256 hash of the concatenated arrays.
- Ensure the input file contains integers, each on a new line.
- The script does not handle input errors or invalid data types.
- Aris Fetzian
This project is licensed under the MIT License.