python3_lzo_indexer


A Python library for indexing block offsets within LZO-compressed files. The implementation is largely based on the one in the Hadoop library. Hadoop uses index files to split a single LZO-compressed file into several chunks for parallel processing.

Since LZO is a block-based compression algorithm, the file can be split along block boundaries and each block decompressed on its own. The index is a file containing the byte offset of each block in the original LZO file.
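To make the idea of an offset index concrete, here is a minimal sketch of writing and reading one. It assumes each offset is stored as an 8-byte big-endian signed integer, which is how hadoop-lzo index files are commonly described; that layout, and the helper names `write_index`, `read_index`, and `block_ranges`, are illustrative assumptions and not part of this library's API.

```python
import io
import struct

# Assumption: one 8-byte big-endian signed integer per block offset.
OFFSET = struct.Struct(">q")


def write_index(offsets, index_file):
    """Write each block offset to a binary file object."""
    for offset in offsets:
        index_file.write(OFFSET.pack(offset))


def read_index(index_file):
    """Read all block offsets back from a binary file object."""
    data = index_file.read()
    if len(data) % OFFSET.size:
        raise ValueError("index length is not a multiple of 8 bytes")
    return [value for (value,) in OFFSET.iter_unpack(data)]


def block_ranges(offsets, file_size):
    """Pair each block offset with the next one to get (start, end) byte ranges."""
    ends = offsets[1:] + [file_size]
    return list(zip(offsets, ends))


# Hypothetical offsets, round-tripped through an in-memory buffer.
buf = io.BytesIO()
write_index([38, 1000, 5000], buf)
buf.seek(0)
offsets = read_index(buf)
```

Each `(start, end)` pair from `block_ranges` marks a byte range that begins on a block boundary, which is what lets a reader start decompressing in the middle of the file.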

This library is a Python 3 fork of python-lzo-indexer.

Example

The Python code below shows how to index an LZO file. The library also supports indexing a string, and provides a method that returns the individual block offsets should you need to write an index in a format of your own.

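import lzo_indexer

# LZO data and the index are binary, so open both files in binary mode:
# read the compressed file ("rb") and write the index ("wb").
with open("my-file.lzo", "rb") as f, open("my-file.lzo.index", "wb") as index:
    lzo_indexer.index_lzo_file(f, index)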

Command-line Utility

This library also includes a command-line utility that indexes multiple LZO files using the Python indexer. It is a much faster alternative to the command-line utility built into the hadoop-lzo library because it avoids the JVM.

$ lzo_indexer --help

Usage: lzo_indexer [OPTIONS] <files to index>

  Tool for indexing LZO compressed files

Options:
  -t, --threads INTEGER  Processing threads count
  -e, --extension TEXT   Index file extension
  -f, --force            Force re-creation of an index even if it exists
  -h, --help             Show this message and exit.
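The options above map naturally onto a small thread-pool loop. The sketch below is an illustration of that idea, not the utility's actual implementation: `index_many` and its parameters are hypothetical names, the `".index"` default is taken from the file name used in the example earlier, and `index_fn` stands for any callable with the same `(lzo_file, index_file)` shape as `lzo_indexer.index_lzo_file`.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def index_many(paths, index_fn, extension=".index", force=False, threads=4):
    """Index several files concurrently.

    Returns a list of (path, indexed) pairs, where indexed is False
    when an existing index was left in place.
    """

    def index_one(path):
        index_path = path + extension
        # Mirror --force: skip files that already have an index unless forced.
        if os.path.exists(index_path) and not force:
            return path, False
        with open(path, "rb") as f, open(index_path, "wb") as index:
            index_fn(f, index)
        return path, True

    # Mirror --threads: bound the number of files processed at once.
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(index_one, paths))
```

With the library installed, a call might look like `index_many(["a.lzo", "b.lzo"], lzo_indexer.index_lzo_file, threads=2)`.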

Contributions

I welcome any contributions, though I request that any pull requests come with test coverage.
