Reference for DNN: Variani, E., Lei, X., McDermott, E., Lopez Moreno, I., & Gonzalez-Dominguez, J. (2014). Deep neural networks for small footprint text-dependent speaker verification. In 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 4052-4056). IEEE.
Reference for CNN: Chen, Y. H., Lopez-Moreno, I., Sainath, T. N., Visontai, M., Alvarez, R., & Parada, C. (2015). Locally-connected and convolutional neural networks for small footprint speaker recognition. In Sixteenth Annual Conference of the International Speech Communication Association.
Data: WSJ and LibriSpeech corpora
Features: 32-dimensional log filterbank features extracted with HTK
Labels: frame-level labels obtained by forced alignment with an ASR model built using Kaldi's WSJ recipe
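
For readers without HTK at hand, the following is a minimal NumPy sketch of what 32-dimensional log filterbank features look like. It is not the HTK configuration used in this work: the 16 kHz sample rate, 25 ms Hamming window, 10 ms shift, 512-point FFT, mel-spaced triangular filters, and the function names (`mel_filterbank`, `log_fbank`) are all assumptions made for illustration.

```python
import numpy as np

def hz_to_mel(hz):
    # Common mel-scale formula (assumed here, not taken from the project config).
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(num_filters, n_fft, sample_rate):
    # Triangular filters with edges equally spaced on the mel scale.
    mel_edges = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0),
                            num_filters + 2)
    hz_edges = mel_to_hz(mel_edges)
    fft_freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    fbank = np.zeros((num_filters, n_fft // 2 + 1))
    for i in range(num_filters):
        left, center, right = hz_edges[i], hz_edges[i + 1], hz_edges[i + 2]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        fbank[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fbank

def log_fbank(signal, sample_rate=16000, num_filters=32,
              frame_len=0.025, frame_shift=0.010, n_fft=512):
    # Hypothetical defaults; signal must be at least one frame long.
    frame_size = int(round(frame_len * sample_rate))
    hop = int(round(frame_shift * sample_rate))
    num_frames = 1 + (len(signal) - frame_size) // hop
    # Index matrix that slices the signal into overlapping frames.
    idx = np.arange(frame_size)[None, :] + hop * np.arange(num_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_size)
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2
    energies = power @ mel_filterbank(num_filters, n_fft, sample_rate).T
    # Floor the energies so the log stays finite on silent frames.
    return np.log(np.maximum(energies, 1e-10))
```

Calling `log_fbank(waveform)` on a one-dimensional NumPy array returns a matrix with one row per frame and 32 columns. Values will differ from HTK output, since HTK applies its own pre-emphasis, scaling, and filter placement settings.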
This work was done at the Learning and Extraction of Acoustic Patterns (LEAP) Lab, IISc, under the guidance of Prof. Sriram Ganapathy.