This project classifies two sequential datasets using a recurrent neural network (RNN) and a long short-term memory (LSTM) network: a handwritten character dataset of Kannada/Telugu script stored in coordinate form, and a Consonant-Vowel (CV) segment dataset drawn from conversational speech in Hindi.
The handwritten dataset contains five characters: a, aI, bA, dA and lA. Each character sample is stored in a .txt file as a sequence of 2-dimensional points (x and y coordinates). Results on this dataset:
- RNN
  - Accuracy on Train Set: 0.96
  - Accuracy on Test Set: 0.94
- LSTM
  - Accuracy on Train Set: 0.98
  - Accuracy on Test Set: 0.97
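
As a rough illustration of how such coordinate files might be turned into model input, here is a minimal sketch. The file layout is an assumption (whitespace-separated numbers read as `x1 y1 x2 y2 ...`); the actual .txt files may differ, for example by carrying a leading point count. The helper names `parse_points` and `normalize` are hypothetical and not taken from the project.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def parse_points(text: str) -> List[Point]:
    """Read whitespace-separated numbers as consecutive (x, y) pairs.

    Assumed layout: x1 y1 x2 y2 ... (adjust if the real files differ).
    """
    values = [float(tok) for tok in text.split()]
    if len(values) % 2 != 0:
        raise ValueError("expected an even number of coordinate values")
    return list(zip(values[0::2], values[1::2]))


def normalize(points: List[Point]) -> List[Point]:
    """Center on the centroid and scale so the largest |coordinate| is 1."""
    if not points:
        return []
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    centered = [(x - cx, y - cy) for x, y in points]
    # Fall back to 1.0 when every point coincides with the centroid.
    scale = max(max(abs(x), abs(y)) for x, y in centered) or 1.0
    return [(x / scale, y / scale) for x, y in centered]


# A toy square stroke standing in for the contents of one character file.
sample = "0 0 2 0 2 2 0 2"
points = normalize(parse_points(sample))
```

Normalizing each sequence this way removes differences in position and size between writers, so the recurrent model only sees the shape of the stroke.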
The CV segment dataset is a subset of CV segments from conversational speech in Hindi. Training and test data are separated and provided inside the respective CV segment folder, where each class consists of 39-dimensional Mel-frequency cepstral coefficient (MFCC) features. Results on this dataset:
- RNN
  - Accuracy on Train Set: 0.988
  - Accuracy on Test Set: 0.899
- LSTM
  - Accuracy on Train Set: 0.997
  - Accuracy on Test Set: 0.879
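
The reported figures can be compared through the gap between train and test accuracy. The short sketch below does that arithmetic using only the numbers listed above; the dictionary keys and the function name `generalization_gap` are illustrative choices.

```python
# Accuracies as reported in the lists above: (train, test).
results = {
    ("handwritten", "RNN"): (0.96, 0.94),
    ("handwritten", "LSTM"): (0.98, 0.97),
    ("cv_speech", "RNN"): (0.988, 0.899),
    ("cv_speech", "LSTM"): (0.997, 0.879),
}


def generalization_gap(train_acc: float, test_acc: float) -> float:
    """Train accuracy minus test accuracy, rounded to three decimals."""
    return round(train_acc - test_acc, 3)


gaps = {key: generalization_gap(*acc) for key, acc in results.items()}

for (dataset, model), gap in gaps.items():
    print(f"{dataset:12s} {model:5s} gap = {gap:.3f}")
```

A larger gap means the model fits the training data noticeably better than unseen data, which is one common sign of overfitting.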
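Both datasets consist of long sequences. The LSTM reaches higher training accuracy than the standard RNN on both datasets and higher test accuracy on the handwritten characters (0.97 vs. 0.94). On the CV segments, however, its test accuracy is lower (0.879 vs. 0.899), which suggests that it overfits that dataset. The results therefore only partly confirm the LSTM's advantage over the standard RNN.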
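
To make the difference between the two recurrent units concrete, the following is a minimal pure-Python sketch of one time step of a standard RNN and of an LSTM, using the textbook formulations. It is not the project's implementation: the weights are random placeholders, the hidden size of 4 is arbitrary, and the input frames are random stand-ins for 39-dimensional MFCC vectors.

```python
import math
import random
from typing import Dict, List, Tuple

Vector = List[float]
Params = Dict[str, list]


def make_params(gates: str, n_in: int, n_hid: int, rng: random.Random) -> Params:
    """One (W, U, b) triple per gate name; weights are random placeholders."""
    p: Params = {}
    for g in gates:
        p["W" + g] = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_hid)]
        p["U" + g] = [[rng.uniform(-0.1, 0.1) for _ in range(n_hid)] for _ in range(n_hid)]
        p["b" + g] = [0.0] * n_hid
    return p


def affine(p: Params, g: str, x: Vector, h: Vector) -> Vector:
    """Compute W_g x + U_g h + b_g with plain lists."""
    return [
        sum(w * xi for w, xi in zip(p["W" + g][k], x))
        + sum(u * hi for u, hi in zip(p["U" + g][k], h))
        + p["b" + g][k]
        for k in range(len(h))
    ]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def rnn_step(p: Params, x: Vector, h: Vector) -> Vector:
    """Standard RNN: the hidden state is fully recomputed at every step."""
    return [math.tanh(z) for z in affine(p, "h", x, h)]


def lstm_step(p: Params, x: Vector, h: Vector, c: Vector) -> Tuple[Vector, Vector]:
    """LSTM: gates decide what to write to, keep in, and read from the cell state."""
    i = [sigmoid(z) for z in affine(p, "i", x, h)]    # input gate
    f = [sigmoid(z) for z in affine(p, "f", x, h)]    # forget gate
    o = [sigmoid(z) for z in affine(p, "o", x, h)]    # output gate
    g = [math.tanh(z) for z in affine(p, "g", x, h)]  # candidate values
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new


rng = random.Random(0)
N_IN, N_HID = 39, 4  # 39 matches the MFCC dimension; 4 is an arbitrary choice
frames = [[rng.gauss(0.0, 1.0) for _ in range(N_IN)] for _ in range(5)]

rnn_p = make_params("h", N_IN, N_HID, rng)
lstm_p = make_params("ifog", N_IN, N_HID, rng)

h_rnn = [0.0] * N_HID
h_lstm, c_lstm = [0.0] * N_HID, [0.0] * N_HID
for x in frames:
    h_rnn = rnn_step(rnn_p, x, h_rnn)
    h_lstm, c_lstm = lstm_step(lstm_p, x, h_lstm, c_lstm)
```

The additive cell update `f * c + i * g` is what lets an LSTM carry information across many time steps: when the forget gate is close to 1 and the input gate close to 0, the cell state passes through almost unchanged, whereas the plain RNN squashes its state through `tanh` at every step.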