Releases: finalfusion/finalfrontier
Fixes, fixes, fixes
- Update to ndarray 0.12
- `ff-train`: show a progress bar while reading the vocabulary.
- Fix a bug in negative sampling for the structured skip-gram model.
- Remove dependency between cloned RNGs.
- Fix an off-by-one error in the loss computation that caused reported losses to be too low.
- Add `ff-compute-accuracy` for evaluating analogies.
- Do not unnecessarily sort the vocabulary upon deserialization.
- `ff-train`: add the `zipf` option to specify the exponent of the Zipf distribution.
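To give a feel for what a Zipf exponent controls, here is a minimal Python sketch of a Zipf distribution over ranks, where the probability of rank k is proportional to 1 / k^s. This is a general illustration only: the function name `zipf_probabilities` is hypothetical, and the sketch does not show how `ff-train` uses the option internally.

```python
def zipf_probabilities(n_ranks, exponent):
    """Return p(k) proportional to 1 / k**exponent for ranks k = 1..n_ranks."""
    # Unnormalized weights: rank 1 always gets weight 1.0.
    weights = [1.0 / (rank ** exponent) for rank in range(1, n_ranks + 1)]
    total = sum(weights)
    # Normalize so that the probabilities sum to 1.
    return [w / total for w in weights]


if __name__ == "__main__":
    # Compare how the probability mass shifts as the exponent grows.
    for s in (0.0, 0.5, 1.0):
        probs = zipf_probabilities(5, s)
        print(s, [round(p, 3) for p in probs])
```

An exponent of 0 gives a uniform distribution, and larger exponents concentrate more of the probability mass on the lowest ranks.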
Memory-mapped models
- Support memory-mapped embedding matrices. This makes loading of the embedding matrices instantaneous and reduces memory use. The use of memory-mapped matrices comes at a small cost in efficiency.
- Normalize stored embeddings with their l2 norms. This avoids normalization when loading a model and simplifies the functionality for similarity/analogy queries. The l2 norms are stored, so the word vectors can be restored with their original magnitudes by multiplying them by their l2 norms.
- Add subword representation embeddings to known words in stored models. This speeds up the retrieval of known word embeddings. There is no loss of information: the original word embeddings can be reconstructed by subtracting their subword embeddings.
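The idea behind memory-mapped embedding matrices can be sketched with Python's standard `mmap` module. The file layout used here (raw little-endian 32-bit floats, row after row, no header) is an assumption made for illustration and is not the finalfrontier model format. The helper names `write_matrix`, `read_row_mmap`, and `demo` are hypothetical.

```python
import mmap
import os
import struct
import tempfile

FLOAT_SIZE = 4  # bytes per little-endian float32 (assumed layout)


def write_matrix(path, rows):
    """Write equally sized rows as raw little-endian float32 values."""
    with open(path, "wb") as f:
        for row in rows:
            f.write(struct.pack("<%df" % len(row), *row))


def read_row_mmap(path, row_index, n_cols):
    """Read a single row through a read-only memory map.

    The file is not read up front; the operating system pages in
    only the region that is actually accessed.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            offset = row_index * n_cols * FLOAT_SIZE
            return list(struct.unpack_from("<%df" % n_cols, mm, offset))


def demo():
    """Write a tiny 2x3 matrix to a temporary file and read back row 1."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        write_matrix(path, [[1.0, 2.0, 3.0], [0.5, 0.25, 0.125]])
        return read_row_mmap(path, 1, 3)
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

Opening the map is cheap regardless of file size, which is why loading appears instantaneous. The cost shows up later, as page faults on first access to each region.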
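The two storage transformations above are reversible, which the following Python sketch illustrates as a round trip. It assumes that the subword embeddings are combined with the word embedding by a plain sum and that normalization is applied after that sum. Both are assumptions for illustration, as are the names `to_stored` and `from_stored`; the sketch also assumes the combined vector is nonzero.

```python
import math


def l2_norm(vec):
    """Euclidean (l2) norm of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in vec))


def to_stored(word_vec, subword_vecs):
    """Add subword embeddings to the word embedding, then normalize.

    Returns the unit-length vector and the l2 norm that was divided out.
    """
    combined = list(word_vec)
    for sub in subword_vecs:
        combined = [c + s for c, s in zip(combined, sub)]
    norm = l2_norm(combined)  # assumed to be nonzero
    return [c / norm for c in combined], norm


def from_stored(unit_vec, norm, subword_vecs):
    """Undo both steps: rescale by the stored norm, then subtract subwords."""
    combined = [u * norm for u in unit_vec]
    for sub in subword_vecs:
        combined = [c - s for c, s in zip(combined, sub)]
    return combined


if __name__ == "__main__":
    word = [3.0, 0.0]
    subwords = [[0.0, 4.0]]
    unit, norm = to_stored(word, subwords)
    print(unit, norm)
    print(from_stored(unit, norm, subwords))
```

Because the stored vectors already have unit length, a cosine similarity between two of them reduces to a dot product, which is what simplifies similarity and analogy queries.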