This repository contains the source code for the experiments in the paper "A Semisupervised Approach for Language Identification based on Ladder Networks".
In 2015, NIST conducted a Language Recognition Evaluation (LRE) i-vector challenge.
The challenge was to identify which language is spoken in a speech sample, given that the language is either
one of 50 given languages or an out-of-set language.
The speech samples were already processed into i-vectors and duration information.
The data was split into training, dev and test sets.
The training data included labeled samples from the 50 given languages.
The dev data included unlabeled samples from both the 50 given languages and the out-of-set languages.
The test data was similar to the dev data, but it could only be used for making submissions to the competition.
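
The task above is an open-set classification problem: 50 known languages plus an out-of-set class that has no labeled training samples. As an illustration only, here is a minimal sketch of one simple way to map per-language probabilities to a 51-way decision. It is not the method used in this repository (which is based on a Ladder Network); the `predict_with_rejection` helper, the `OUT_OF_SET` label index and the `threshold` value are all hypothetical.

```python
import numpy as np

N_LANGUAGES = 50          # in-set languages, per the challenge description
OUT_OF_SET = N_LANGUAGES  # hypothetical label index for the out-of-set class


def predict_with_rejection(probs, threshold=0.5):
    """Pick the most likely in-set language, or OUT_OF_SET if confidence is low.

    probs: array of shape (n_samples, N_LANGUAGES) with rows summing to 1.
    threshold: hypothetical confidence cutoff, not taken from the paper.
    """
    probs = np.asarray(probs, dtype=float)
    best = probs.argmax(axis=1)
    confident = probs.max(axis=1) >= threshold
    return np.where(confident, best, OUT_OF_SET)


# Toy example with made-up class probabilities for two samples.
toy = np.full((2, N_LANGUAGES), 0.1 / (N_LANGUAGES - 1))
toy[0, 3] = 0.9                 # confident in language index 3
toy[1, :] = 1.0 / N_LANGUAGES   # uniform, so it is rejected as out-of-set
labels = predict_with_rejection(toy)
```

In practice the unlabeled dev data, which contains out-of-set samples, is what makes a semisupervised approach attractive here; a fixed threshold such as the one above is only a baseline.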
- Our solution used a modification of the Ladder Network and its published code.
- "The dark knowledge of tongues": fun with the i-vector dataset supplied by the challenge.
- Odyssey 2016, video lecture