Repositories list
13 repositories
- 🛥 Vaporetto: Very accelerated pointwise prediction based tokenizer
- 🐎 A fast implementation of the Aho-Corasick algorithm using the compact double-array data structure. (Python wrapper for daachorse)
- 🎤 vibrato: Viterbi-based accelerated tokenizer
- 🛥 Vaporetto is a fast and lightweight pointwise prediction based tokenizer. This is a Python wrapper for Vaporetto.
- Finding all pairs of similar documents time- and memory-efficiently
- 🦞 Rust library of natural language dictionaries using character-wise double-array tries.
- 🐎 A fast implementation of the Aho-Corasick algorithm using the compact double-array data structure in Rust.
- Viterbi-based accelerated tokenizer (Python wrapper)
- Fast match expression optimized for string comparison
- vaporetto-models