Releases: finalfusion/finalfrontier

Fixes, fixes, fixes

04 Feb 09:18
  • Update to ndarray 0.12.
  • ff-train: show a progress bar while reading the vocabulary.
  • Fix a bug in negative sampling for the structured skip-gram model.
  • Remove dependency between cloned RNGs.
  • Fix an off-by-one error in the loss computation that caused reported losses to be too low.
  • Add ff-compute-accuracy for evaluating analogies.
  • Do not unnecessarily sort the vocabulary upon deserialization.
  • ff-train: add the zipf option to specify the exponent of the Zipf distribution.
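
To illustrate what the exponent controls, here is a minimal sketch of the Zipf probability mass function over ranks 1..=n. It is not taken from the finalfrontier sources: the function name `zipf_probabilities` is hypothetical, and how ff-train uses the distribution internally is not described in these notes.

```rust
/// Probability of each rank 1..=n under a Zipf distribution with exponent `s`:
/// p(k) = k^(-s) / H(n, s), where H(n, s) is the sum of i^(-s) for i in 1..=n.
/// Hypothetical helper for illustration; not part of finalfrontier's API.
fn zipf_probabilities(n: usize, s: f64) -> Vec<f64> {
    // Unnormalized weight of each rank.
    let weights: Vec<f64> = (1..=n).map(|k| (k as f64).powf(-s)).collect();
    // Generalized harmonic number used as the normalization constant.
    let norm: f64 = weights.iter().sum();
    weights.into_iter().map(|w| w / norm).collect()
}
```

With an exponent of 0 every rank is equally likely; larger exponents shift more probability mass toward the lowest ranks.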

Memory-mapped models

30 Sep 09:32
  • Support memory-mapped embedding matrices. This makes loading of the embedding matrices instantaneous and reduces memory use. The use of memory-mapped matrices comes at a small cost in efficiency.
  • Normalize stored embeddings with their l2 norms. This avoids normalization when loading a model and simplifies the functionality for similarity/analogy queries. The l2 norms are stored, so the word vectors can be restored with their original magnitudes by multiplying them by their l2 norms.
  • Add subword representation embeddings to known words in stored models. This speeds up the retrieval of known word embeddings. There is no loss of information: the original word embeddings can be reconstructed by subtracting their subword embeddings.
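
The two reversible transformations described above can be sketched as follows. This is an illustration under stated assumptions, not finalfrontier's actual code: all function names are hypothetical, and the sketch assumes subword embeddings are combined with the word embedding by a plain sum, which these notes do not confirm.

```rust
/// L2 norm of a vector.
fn l2_norm(v: &[f32]) -> f32 {
    v.iter().map(|x| x * x).sum::<f32>().sqrt()
}

/// Normalize `v` in place to unit length and return the original norm,
/// which is kept so that the original magnitude can be restored later.
fn normalize(v: &mut [f32]) -> f32 {
    let norm = l2_norm(v);
    if norm > 0.0 {
        for x in v.iter_mut() {
            *x /= norm;
        }
    }
    norm
}

/// Restore the original vector by multiplying the unit vector by its stored norm.
fn restore(unit: &[f32], norm: f32) -> Vec<f32> {
    unit.iter().map(|x| x * norm).collect()
}

/// Add (sign = 1.0) or subtract (sign = -1.0) all subword embeddings.
/// Assumption: a plain sum is used to combine word and subword embeddings.
fn combine(word: &[f32], subwords: &[Vec<f32>], sign: f32) -> Vec<f32> {
    let mut out = word.to_vec();
    for sub in subwords {
        for (o, s) in out.iter_mut().zip(sub) {
            *o += sign * s;
        }
    }
    out
}
```

For example, normalizing `[3.0, 4.0]` returns the norm 5 and leaves a unit vector; `restore` recovers the original up to floating-point rounding. Likewise, `combine(&stored, &subwords, -1.0)` undoes `combine(&word, &subwords, 1.0)` up to rounding.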