For the problem statement, see the document Tifin-test_1.pdf.
The goal of this assignment is to build a machine learning model that accurately predicts the intent of a user based on their input. Intent detection is a critical part of most Conversational AI systems as it helps the virtual agent "understand" user needs and respond appropriately.
1.) First, I analysed the problem statement and broke it down into parts.
2.) I then analysed each segment of the problem in terms of context awareness, sentiment analysis, and the different core ML approaches available.
3.) Afterwards, I searched for research papers and references to build the model upon, including related work on contextual awareness and user-sentiment models; for that, you can visit here.
4.) Finally, I explored similar pretrained approaches, analysed their expected inputs and outputs, and worked out how to reproduce their results.
Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv preprint arXiv:1810.04805.
Pennington, J., Socher, R., & Manning, C. (2014). GloVe: Global Vectors for Word Representation. Conference on Empirical Methods in Natural Language Processing (EMNLP).
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., ... & Polosukhin, I. (2017). Attention Is All You Need. Advances in Neural Information Processing Systems.
You can also visit the following research and documentation:
Hugging Face Transformers Documentation: https://huggingface.co/docs
Scikit-learn Documentation: https://scikit-learn.org/stable/
Machine Learning Research Papers
Multi-Class Formulation: Each utterance is assigned exactly one intent label.
Pros:
1.) Simpler and more efficient to implement.
2.) Most intent detection datasets align with this approach.
Cons:
1.) Cannot handle cases where an utterance belongs to multiple intents.
Multi-Label Formulation: Each utterance can be assigned multiple intent labels.
Pros:
1.) Flexible and capable of handling real-world overlapping intents.
Cons:
1.) More complex implementation and evaluation (a sketch contrasting the two label formats follows below).
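To make the difference concrete, here is a minimal PyTorch sketch (illustrative, not taken from the project notebooks) contrasting the label shape and loss function each formulation implies:

import torch

num_intents = 4
logits = torch.randn(1, num_intents)  # model scores for one utterance

# Multi-class: exactly one intent per utterance -> integer class index + cross-entropy.
single_label = torch.tensor([2])
ce_loss = torch.nn.functional.cross_entropy(logits, single_label)

# Multi-label: any subset of intents -> binary indicator vector + sigmoid BCE.
multi_label = torch.tensor([[0., 1., 1., 0.]])
bce_loss = torch.nn.functional.binary_cross_entropy_with_logits(logits, multi_label)

Since the provided dataset pairs each utterance with a single intent label, this assignment follows the multi-class setup.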
The provided dataset contains user utterances with their corresponding intent labels, referred to as "sentences" and "labels".
Key steps include:
Exploratory Data Analysis (EDA): Understanding label distribution, checking for class imbalance, and preprocessing text data (e.g., removing special characters, stopwords).
Preprocessing: Converting text to lowercase, tokenization, and vectorization using methods like TF-IDF or pretrained embeddings (a minimal sketch follows this list).
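As an illustration of the cleaning step, here is a minimal sketch; the exact cleaning rules in the notebook may differ, and the example sentences are toy data:

import re

def clean_text(text):
    # Lowercase and replace special characters with spaces, as described above.
    text = text.lower()
    return re.sub(r"[^a-z0-9\s]", " ", text).strip()

sentences = ["Book a flight to Delhi!", "What's the weather today?"]  # toy examples
cleaned = [clean_text(s) for s in sentences]
print(cleaned)  # ['book a flight to delhi', 'what s the weather today']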
Model: BERT with TF-IDF features.
Convert text into numerical features using TF-IDF.
Train a BERT model on the given dataset with default parameters.
Evaluate using a validation dataset.
Initial Accuracy: The baseline model achieved an accuracy of 40.23%, which was significantly lower than desired.
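The report pairs TF-IDF features with BERT for the baseline; as a hedged illustration of the TF-IDF side of that pipeline only, the sketch below trains a plain scikit-learn classifier (LogisticRegression is a stand-in here, not necessarily what the notebook uses) and evaluates it on a validation split:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Toy data; in the assignment these come from the provided sentences/labels columns.
sentences = ["book a flight", "cancel my flight", "play some music",
             "stop the music", "book a hotel room", "pause the song"]
labels = [0, 0, 1, 1, 0, 1]

X_train, X_val, y_train, y_val = train_test_split(sentences, labels, test_size=0.33, random_state=42)

# Convert text into numerical features using TF-IDF, then fit the classifier.
vectorizer = TfidfVectorizer(stop_words="english", max_features=5000)
X_train_vec = vectorizer.fit_transform(X_train)
X_val_vec = vectorizer.transform(X_val)

clf = LogisticRegression(max_iter=1000)
clf.fit(X_train_vec, y_train)
print("validation accuracy:", accuracy_score(y_val, clf.predict(X_val_vec)))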
Model: Fine-tuned BERT (Bidirectional Encoder Representations from Transformers).
Tokenized inputs using the BERT tokenizer.
1.) Train a pretrained BERT model for sequence classification on the dataset.
2.) Fine-tune hyperparameters (learning rate, batch size) to improve performance; a fine-tuning sketch follows this list.
3.) Final Accuracy After Tuning: After extensive hyperparameter tuning, the BERT model achieved an accuracy of 84.123%.
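A minimal fine-tuning sketch with Hugging Face Transformers follows. The model name, toy data, and hyperparameters (learning rate 2e-5, batch size 16, 3 epochs) are illustrative defaults, not necessarily the tuned values that produced 84.123%:

import torch
from torch.utils.data import DataLoader, TensorDataset
from transformers import BertTokenizer, BertForSequenceClassification

sentences = ["book a flight", "play some music"]  # toy utterances
labels = [0, 1]
num_intents = 2

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=num_intents)

# Tokenize inputs using the BERT tokenizer, as described above.
enc = tokenizer(sentences, padding=True, truncation=True, max_length=64, return_tensors="pt")
dataset = TensorDataset(enc["input_ids"], enc["attention_mask"], torch.tensor(labels))
loader = DataLoader(dataset, batch_size=16, shuffle=True)

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for epoch in range(3):
    for input_ids, attention_mask, batch_labels in loader:
        optimizer.zero_grad()
        out = model(input_ids=input_ids, attention_mask=attention_mask, labels=batch_labels)
        out.loss.backward()  # Transformers computes cross-entropy when labels are passed
        optimizer.step()

After training, torch.save(model.state_dict(), "pretrained_model.pth") is one plausible way to produce the checkpoint referenced in the usage steps below.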
Baseline Model: Python, scikit-learn, Pandas, NumPy, PyTorch (torch), and Transformers.
Advanced Model: PyTorch, Hugging Face Transformers.
Baseline Model (TF-IDF + BERT)
Initial Accuracy: ~40.23%
For this baseline accuracy, you can visit the Colab notebook here.
1.) Simple preprocessing and feature engineering produced limited results.
2.) The model struggled with complex language semantics.
Final Accuracy: ~84.123%
For the final accuracy, you can download the notebook here.
1.) BERT's contextual embeddings significantly improved performance.
2.) Hyperparameter tuning was critical in achieving higher accuracy.
Expand the dataset using back-translation or paraphrasing to increase diversity.
Experiment with learning rates, batch sizes, and epochs for better performance.
Use techniques like oversampling minority classes or weighted loss functions to improve predictions on underrepresented intents (see the sketch after this list).
Combine traditional and transformer-based models for robust predictions.
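For the class-imbalance idea above, one option is to weight the cross-entropy loss by inverse class frequency. A hedged sketch (the label array is toy data and names are illustrative):

import numpy as np
import torch
from sklearn.utils.class_weight import compute_class_weight

train_labels = np.array([0, 0, 0, 0, 1, 2, 2])  # toy imbalanced labels
weights = compute_class_weight(class_weight="balanced",
                               classes=np.unique(train_labels),
                               y=train_labels)
loss_fn = torch.nn.CrossEntropyLoss(weight=torch.tensor(weights, dtype=torch.float))
# Pass logits and integer labels to loss_fn instead of relying on the model's built-in loss.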
git clone https://github.com/Blacksujit/Intent-Detection
Visit the pretrained model notebook here.
Run the notebook.
Save the pretrained model in the same directory, at the path pretrained_model.pth.
You can now use the saved model for predictions; a minimal loading sketch follows.
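This sketch assumes pretrained_model.pth holds a state dict for BertForSequenceClassification; the actual save format in the notebook may differ, and NUM_INTENTS is a placeholder for your label count:

import torch
from transformers import BertTokenizer, BertForSequenceClassification

NUM_INTENTS = 7  # placeholder: set to the number of intent labels in the dataset

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=NUM_INTENTS)
model.load_state_dict(torch.load("pretrained_model.pth", map_location="cpu"))
model.eval()

inputs = tokenizer("please book a flight to delhi", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print("predicted intent id:", logits.argmax(dim=-1).item())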
1.) Using these approaches, we figured out how to obtain the maximum prediction accuracy from the model.
2.) With larger datasets, we could also use core ML approaches to prepare our own models, but hyperparameter tuning of a pretrained model already achieves the maximum accuracy.