This repository presents a hybrid anime recommendation system built with deep learning and end-to-end MLOps practices.
The system combines Collaborative Filtering and Content-Based Filtering to deliver personalized anime recommendations to users.
Beyond model training, the project covers the path to production: automated CI/CD, data and model versioning, experiment monitoring, and autoscaled deployment on Kubernetes. It pairs ML pipelines with cloud-native deployment to demonstrate real-world engineering skills.
Highlights:
- Hybrid recommender system (Collaborative + Content-Based)
- MLOps-driven workflow (DVC, Comet ML, Jenkins, Docker, Kubernetes)
- Cloud-native architecture deployed on Google Kubernetes Engine (GKE)
- Scalable, reproducible, and automated end-to-end ML lifecycle
- Personalized Recommendations: Top-10 anime suggestions per user
- Hybrid Recommendation Engine: Combines user behavior + anime metadata
- MLOps Integration: Data versioning, experiment tracking, CI/CD pipelines
- Cloud Deployment: Google Cloud Storage + Kubernetes cluster (GKE)
- Experiment Transparency: Metrics & embeddings logged via Comet ML
- Scalability & Reliability: Autoscaling pods, load balancing, and CI/CD
- Python 3.8
- Flask (backend web app)
- TensorFlow / Keras (deep learning models)
- Pandas, NumPy (data wrangling)
- DVC: Dataset & model versioning
- Comet ML: Experiment tracking & visualization
- Docker: Containerization for portability
- Kubernetes (GKE): Orchestration & scaling
- Jenkins: CI/CD automation
- Google Cloud Storage (GCS): Artifact & dataset storage
flowchart TD
A[User Input: User ID] --> B[Flask Web Interface]
B --> C[Hybrid Recommendation Pipeline]
C --> D[Collaborative Filtering: User-Anime Interactions]
C --> E[Content-Based Filtering: Metadata & Synopsis]
D & E --> F[Score Fusion & Ranking]
F --> G[Top-10 Anime Recommendations]
G --> H[Results Rendered on Web Page]
- Learns latent user-anime factors from the rating matrix
- Embedding model (`RecommenderNet`) with:
  - `user_id` → `user_embedding`
  - `anime_id` → `anime_embedding`
- Optimized with binary crossentropy for recommendation accuracy
- Leverages metadata features: genres, synopsis, anime type, ratings
- Computes semantic similarity between anime via embeddings
- Weighted combination of collaborative and content-based predictions
- Balances personalization with content diversity
- Produces ranked top-10 recommendations per user
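The three stages above can be sketched in plain Python. This is a minimal illustration, not the repository's `RecommenderNet`: it assumes the collaborative score is a sigmoid over the dot product of the user and anime embeddings, that content similarity is cosine similarity against a vector summarizing what the user liked, and that the two are fused with a fixed weight. The names `collaborative_score`, `hybrid_top_k`, and `cf_weight`, and the 0.6/0.4 split, are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def collaborative_score(user_vec, anime_vec):
    """Dot product of latent factors squashed to (0, 1) with a sigmoid."""
    return 1.0 / (1.0 + math.exp(-dot(user_vec, anime_vec)))


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    norm = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    return dot(a, b) / norm if norm else 0.0


def hybrid_top_k(user_vec, liked_content_vec, catalog, cf_weight=0.6, k=10):
    """Rank anime by a weighted sum of collaborative and content scores.

    `catalog` maps anime_id -> (collaborative_embedding, content_embedding).
    `cf_weight` is an assumed tuning knob, not a value from the project.
    """
    scored = []
    for anime_id, (cf_vec, content_vec) in catalog.items():
        cf = collaborative_score(user_vec, cf_vec)
        # Rescale cosine from [-1, 1] to [0, 1] so both scores share a range.
        content = (cosine_similarity(liked_content_vec, content_vec) + 1.0) / 2.0
        scored.append((cf_weight * cf + (1.0 - cf_weight) * content, anime_id))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [anime_id for _, anime_id in scored[:k]]


# Toy data: 2-dimensional embeddings for three titles.
catalog = {
    "anime_a": ([1.0, 0.0], [1.0, 0.0]),
    "anime_b": ([0.0, 1.0], [0.0, 1.0]),
    "anime_c": ([-1.0, 0.0], [-1.0, 0.0]),
}
top = hybrid_top_k([2.0, 0.0], [1.0, 0.0], catalog, k=2)
print(top)  # ['anime_a', 'anime_b']
```

Raising `cf_weight` favors titles that similar users rated highly, while lowering it favors titles whose metadata resembles what the user already liked.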
flowchart LR
A[GitHub Commit] --> B[Jenkins CI/CD Pipeline]
B --> C[Install Dependencies & Run Tests]
C --> D[DVC Pull Data & Models from GCS]
D --> E[Build Docker Image]
E --> F[Push Docker Image to Google Container Registry]
F --> G[Deploy to Kubernetes: GKE]
G --> H[LoadBalancer Service: Exposes Flask Web App]
Pipeline Stages:
- Code Integration: GitHub → Jenkins
- Data Retrieval: DVC pulls datasets & artifacts from GCS
- Containerization: Build & push Docker image to GCR
- Deployment: Apply Kubernetes manifests (`deployment.yaml`)
- Scaling: the Kubernetes Horizontal Pod Autoscaler (HPA) adjusts the number of pods to keep the app available under load
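As a rough guide to what these stages run, the dry-run sketch below lists one plausible command per stage. The real pipeline lives in the `Jenkinsfile`; the commands here are standard `pip`, `dvc`, `docker`, and `kubectl` invocations, and the image reference is a placeholder because the project ID and tag are not stated in this README.

```python
import shlex

# Placeholder image reference (assumption): substitute your own project and tag.
IMAGE = "gcr.io/<your-project>/anime-recommender:latest"


def pipeline_commands(image=IMAGE, manifest="deployment.yaml"):
    """Return (stage name, command) pairs in execution order."""
    return [
        ("Install dependencies", ["pip", "install", "-r", "requirements.txt"]),
        ("Data retrieval", ["dvc", "pull"]),
        ("Build image", ["docker", "build", "-t", image, "."]),
        ("Push image", ["docker", "push", image]),
        ("Deploy", ["kubectl", "apply", "-f", manifest]),
    ]


if __name__ == "__main__":
    # Print the commands instead of executing them.
    for stage, command in pipeline_commands():
        print(f"[{stage}] {shlex.join(command)}")
```

Printing instead of executing keeps the sketch safe to run on a machine without Docker or cluster credentials.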
Every experiment is logged to the Comet ML dashboard, including:
- Training & validation losses
- User & anime embeddings
- Hyperparameters & metrics
- Model checkpoints
Logging these makes runs comparable and reproducible, and gives a basis for tuning the models.
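A minimal sketch of how such logging might look, assuming `experiment` is a `comet_ml.Experiment` and using its `log_parameters` and `log_metric` methods. The helper `log_training_run`, the stand-in `RecordingExperiment` class, and the hyperparameter and loss values are all hypothetical; the stand-in exists only so the example runs without a Comet API key.

```python
def log_training_run(experiment, hyperparams, history):
    """Log hyperparameters once, then each metric value with its epoch as the step.

    `history` maps a metric name to a list holding one value per epoch.
    """
    experiment.log_parameters(hyperparams)
    for name, values in history.items():
        for epoch, value in enumerate(values, start=1):
            experiment.log_metric(name, value, step=epoch)


class RecordingExperiment:
    """Demo stand-in that records calls instead of sending them to Comet."""

    def __init__(self):
        self.parameters = {}
        self.metrics = []

    def log_parameters(self, parameters):
        self.parameters.update(parameters)

    def log_metric(self, name, value, step=None):
        self.metrics.append((name, value, step))


# Illustrative values only; they are not results from this project.
demo = RecordingExperiment()
log_training_run(
    demo,
    {"embedding_size": 128, "batch_size": 1024},
    {"loss": [0.69, 0.52], "val_loss": [0.70, 0.58]},
)
print(demo.metrics)
```

In a real run, the stand-in would be replaced by an `Experiment` created with your own Comet credentials.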
.
├── application.py     # Flask web app
├── pipeline/          # ML training & prediction pipelines
├── src/               # Core ML code (data processing, models)
├── config/            # Config files & paths
├── artifacts/         # DVC-tracked datasets & models
├── static/            # CSS, JS & frontend assets
├── templates/         # HTML templates (Flask)
├── deployment.yaml    # Kubernetes spec
├── Dockerfile         # Docker build config
├── Jenkinsfile        # CI/CD pipeline
├── requirements.txt   # Python dependencies
└── setup.py           # Package setup
SAMI-CODEAI