A complete end-to-end sentiment analysis system that uses an Enhanced BERT model with a FastAPI backend and a React frontend. The project classifies tweets into four categories: Positive, Negative, Neutral, and Irrelevant.
- Enhanced BERT Model: Custom BERT architecture with multi-head attention and deeper classification layers
- ELMo+BERT Model: Combined architecture using ELMo embeddings and BERT for enhanced performance
- 4-Class Classification: Positive, Negative, Neutral, Irrelevant sentiment detection
- Google Drive Integration: Automatic model download from cloud storage
- FastAPI Backend: High-performance REST API with CORS support
- React Frontend: Modern, responsive user interface with real-time predictions
- Docker Support: Fully containerized deployment
- Intelligent Preprocessing: Advanced text preprocessing for social media content
- Real-time Predictions: Fast sentiment analysis with confidence scores
- Multiple Model Architectures: Support for both Enhanced BERT and ELMo+BERT models
twitter_sentiment_bert_only/
├── backend/
│   ├── models/
│   │   ├── __init__.py
│   │   ├── enhanced_bert.py      # Enhanced BERT model architecture
│   │   ├── elmo_bert.py          # ELMo+BERT combined model architecture
│   │   └── elmo_bert.ipynb       # Jupyter notebook with model development
│   ├── utils/
│   │   ├── __init__.py
│   │   ├── drive_loader.py       # Google Drive API integration
│   │   ├── model_loader.py       # Model loading and initialization
│   │   ├── prediction.py         # Prediction pipeline
│   │   └── preprocessing.py      # Text preprocessing utilities
│   ├── server.py                 # FastAPI application
│   ├── requirements.txt          # Python dependencies
│   ├── Dockerfile                # Docker configuration
│   ├── .env.example              # Environment variables template
│   └── .gitignore                # Git ignore rules
├── frontend/
│   ├── src/
│   │   ├── App.tsx               # Main React sentiment analyzer UI
│   │   ├── components/ui/        # Reusable UI components
│   │   └── ...
│   ├── package.json              # Node.js dependencies
│   ├── tailwind.config.ts        # Tailwind CSS configuration
│   └── ...
└── README.md
- Python 3.12+
- Node.js 18+
- Docker (optional)
- Google Drive API credentials
- Clone the repository
  git clone https://github.com/VoMinhKhoii/hcmut-project-cuoi-khoa.git
  cd hcmut-project-cuoi-khoa/backend
- Create virtual environment
  python -m venv .venv
  source .venv/bin/activate  # On Windows: .venv\Scripts\activate
- Install dependencies
  pip install -r requirements.txt
- Setup Google Drive API
  - Get Google Drive API credentials from Google Cloud Console
  - Create a .env file in the backend directory:
    GOOGLE_DRIVE_FOLDER_ID=your_folder_id_here
    REFRESH_TOKEN=your_refresh_token_here
    CLIENT_ID=your_client_id_here
    CLIENT_SECRET=your_client_secret_here
- Run the server
  python server.py
  # Or with uvicorn
  uvicorn server:app --host 0.0.0.0 --port 8888
- Navigate to frontend directory
  cd ../frontend
- Install dependencies
  npm install
- Start development server
  npm run dev
- Access the application
  - Frontend: http://localhost:8080
  - Backend API: http://localhost:8888
  - API Documentation: http://localhost:8888/docs
- Build and run with Docker
  cd backend
  docker build -t sentiment-api .
  docker run -p 8888:8888 sentiment-api
- Run frontend separately
  cd frontend
  npm run dev
POST /predict
{
  "text": "I love this product! It works amazing!"
}
Response:
{
  "text": "I love this product! It works amazing!",
  "sentiment": "positive",
  "confidence": 0.9245,
  "probabilities": {
    "positive": 0.9245,
    "negative": 0.0123,
    "neutral": 0.0532,
    "irrelevant": 0.0100
  },
  "processed_text": "i love this product it works amazing"
}
POST /predict-elmo-bert
Uses the combined ELMo+BERT architecture for potentially enhanced performance:
{
  "text": "This movie is absolutely fantastic!"
}
Response: Same format as above, but uses the ELMo+BERT model for prediction.
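The two prediction endpoints can be called from any HTTP client. Below is a minimal sketch using only the Python standard library. It assumes the backend is running at http://localhost:8888 as described in the setup steps; the helper names `build_request` and `predict` are illustrative and not part of the project.

```python
import json
import urllib.request

# Assumption: the backend listens on the address used in the setup steps.
BASE_URL = "http://localhost:8888"


def build_request(text: str, endpoint: str = "/predict") -> urllib.request.Request:
    """Build a JSON POST request for one of the prediction endpoints."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(text: str, endpoint: str = "/predict") -> dict:
    """Send the request and decode the JSON response (needs a running server)."""
    with urllib.request.urlopen(build_request(text, endpoint), timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, `predict("I love this product! It works amazing!")` should return a dictionary in the response format shown above, and passing `endpoint="/predict-elmo-bert"` routes the same text to the ELMo+BERT model.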
GET /
Returns API status and model information.
The Enhanced BERT model includes:
- Base BERT: bert-base-uncased as the foundation
- Multi-head Attention: 8-head self-attention mechanism
- Deep Classification: 3-layer fully connected network with residual connections
- Layer Normalization: Normalization layers for stable training
- Dropout: Regularization to prevent overfitting
The ELMo+BERT model (DICET architecture) combines:
- ELMo Embeddings: Contextualized word representations from TensorFlow Hub
- BERT Representations: Pre-trained BERT embeddings
- Feature Fusion: Concatenation of ELMo (1024-dim) and BERT (768-dim) features
- BiLSTM Encoder: Bidirectional LSTM for sequence modeling (256 hidden units)
- Additive Attention: Attention mechanism for feature weighting
- Classification Head: Fully connected layers with dropout and ReLU activation
- Gaussian Noise: Regularization during training (σ=0.3)
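To make the feature sizes in the ELMo+BERT list concrete, here is a small worked sketch that traces tensor shapes through the pipeline for one input sequence. The 1024, 768, and 256 values come from the list above. Everything else is an assumption for illustration: that ELMo and BERT features are aligned per token, that the BiLSTM concatenates its forward and backward states, and that attention pools the sequence into a single vector. The names `FusionDims` and `shape_trace` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FusionDims:
    elmo_dim: int = 1024     # ELMo feature size stated in the list above
    bert_dim: int = 768      # BERT feature size stated in the list above
    lstm_hidden: int = 256   # BiLSTM hidden units stated in the list above
    num_classes: int = 4     # Positive, Negative, Neutral, Irrelevant

    @property
    def fused_dim(self) -> int:
        # Concatenation adds the two feature sizes.
        return self.elmo_dim + self.bert_dim

    @property
    def bilstm_out_dim(self) -> int:
        # Assumption: forward and backward hidden states are concatenated.
        return 2 * self.lstm_hidden


def shape_trace(seq_len: int, dims: FusionDims = FusionDims()) -> list[tuple[str, tuple[int, ...]]]:
    """Return (stage, shape) pairs for a single sequence of seq_len tokens."""
    return [
        ("elmo", (seq_len, dims.elmo_dim)),
        ("bert", (seq_len, dims.bert_dim)),
        ("fused", (seq_len, dims.fused_dim)),
        ("bilstm", (seq_len, dims.bilstm_out_dim)),
        # Assumption: attention collapses the sequence into one vector.
        ("attention_pooled", (dims.bilstm_out_dim,)),
        ("logits", (dims.num_classes,)),
    ]
```

Under these assumptions the fused feature is 1024 + 768 = 1792 wide per token, and the BiLSTM output is 2 × 256 = 512 wide.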
- Accuracy: ~85-90% on validation set
- Classes: 4-class classification (Positive, Negative, Neutral, Irrelevant)
- Inference Speed: <100ms per prediction
- Model Size: ~110MB (BERT base)
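The confidence score and per-class probabilities reported by the API are the kind of values a softmax over four class logits produces. The sketch below shows that conversion under stated assumptions: the label order in `LABELS` and the rounding to four decimals are guesses based on the example response, and `to_response` is a hypothetical helper, not the project's prediction code.

```python
import math

# Assumption: class order is illustrative; the real model may order labels differently.
LABELS = ["positive", "negative", "neutral", "irrelevant"]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def to_response(logits):
    """Turn raw logits into sentiment, confidence, and per-class probabilities."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return {
        "sentiment": LABELS[best],
        "confidence": round(probs[best], 4),
        "probabilities": {label: round(p, 4) for label, p in zip(LABELS, probs)},
    }
```

In this sketch the confidence is simply the probability of the winning class, which matches the example response where both values are 0.9245.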
- URL replacement with tokens
- Mention handling (@user → @USER)
- Hashtag extraction and processing
- Emoji conversion to descriptive text
- Punctuation normalization
- Contraction expansion
- Case normalization with emphasis detection
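The preprocessing steps listed above can be sketched with a few regular expressions. This is a simplified illustration, not the contents of preprocessing.py: the `URL` placeholder token, the tiny contraction and emoji tables, and the rule that a fully upper-case word counts as emphasis are all assumptions. Punctuation is removed outright because the example API response shows a punctuation-free processed_text.

```python
import re

# Hypothetical lookup tables; a real pipeline would use much larger ones.
CONTRACTIONS = {"can't": "cannot", "won't": "will not", "don't": "do not", "it's": "it is"}
EMOJI_NAMES = {"\u2764\ufe0f": "red heart", "\U0001F600": "grinning face"}
PLACEHOLDERS = {"URL", "@USER"}  # "URL" is an assumed token name

URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#(\w+)")
PUNCT_RE = re.compile(r"[^\w\s]")


def preprocess(text: str) -> dict:
    hashtags = HASHTAG_RE.findall(text)          # hashtag extraction
    text = URL_RE.sub(" URL ", text)             # URL replacement with a token
    text = MENTION_RE.sub(" @USER ", text)       # @user -> @USER
    text = HASHTAG_RE.sub(r" \1 ", text)         # keep the hashtag word, drop '#'
    for emoji, name in EMOJI_NAMES.items():      # emoji -> descriptive text
        text = text.replace(emoji, f" {name} ")
    emphasized, words = [], []
    for word in text.split():
        if word in PLACEHOLDERS:
            words.append(word)
            continue
        core = word.strip(".,!?")
        if len(core) > 1 and core.isupper():     # emphasis: fully upper-case word
            emphasized.append(core.lower())
        lowered = word.lower()                   # case normalization
        for short, full in CONTRACTIONS.items(): # contraction expansion
            lowered = lowered.replace(short, full)
        cleaned = PUNCT_RE.sub("", lowered)      # punctuation removal
        if cleaned:
            words.append(cleaned)
    return {"text": " ".join(words), "hashtags": hashtags, "emphasized": emphasized}
```

Calling `preprocess("I love this product! It works amazing!")["text"]` yields the same string as the processed_text field in the example response.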
- Real-time Analysis: Instant sentiment prediction
- Interactive UI: Clean, modern interface built with React and Tailwind CSS
- Confidence Visualization: Progress bars showing prediction confidence
- Example Texts: Pre-loaded examples for quick testing
- Error Handling: Graceful error messages and loading states
- Responsive Design: Works on desktop and mobile devices
Create a .env file in the backend directory:
# Google Drive Configuration
GOOGLE_DRIVE_FOLDER_ID=your_folder_id_containing_model
REFRESH_TOKEN=your_google_refresh_token
CLIENT_ID=your_google_client_id
CLIENT_SECRET=your_google_client_secret
# Build the image
docker build -t sentiment-api ./backend
# Run the container
docker run -p 8888:8888 --env-file ./backend/.env sentiment-api
The application is ready for deployment on:
- Google Cloud Run
- AWS ECS
- Azure Container Instances
- Heroku
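Whichever platform is used, the container needs the four Google Drive variables from the .env file so it can download the model. A startup check along the following lines can surface a missing value early. It is a minimal sketch: the variable names come from the .env template, while `missing_env_vars` and `require_env` are hypothetical helpers that are not part of the project.

```python
import os

# Variable names taken from the .env template in this README.
REQUIRED_VARS = ("GOOGLE_DRIVE_FOLDER_ID", "REFRESH_TOKEN", "CLIENT_ID", "CLIENT_SECRET")


def missing_env_vars(environ=None):
    """Return the required variables that are unset or blank."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


def require_env(environ=None):
    """Raise a clear error if any required variable is missing."""
    missing = missing_env_vars(environ)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
```

Calling `require_env()` before the model is loaded would fail fast with a readable message instead of an authentication error later in the download.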
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add some amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Vu Tri Khai - Initial work - VuTriKhai
- HCMUT for the project opportunity
- Hugging Face for the BERT model
- FastAPI and React communities for excellent frameworks
- Google Drive API for model storage solution
For questions or support, please contact:
- Email: khai.vutri@hcmut.edu.vn
- GitHub: @khaivutri
Built with ❤️ at Ho Chi Minh City University of Technology