A machine learning recommendation engine that suggests Google Play Store applications similar to a given app. The system uses content-based filtering: it analyzes app features such as categories, ratings, reviews, and genres, and recommends the apps whose features are closest to those of the requested app.
- Intelligent Recommendations: Uses Cosine Similarity on TF-IDF vectorized text features and scaled numerical data.
- Robust REST API: Fully documented API endpoints for integrating recommendations into frontend applications.
- Hybrid Feature Engineering: Combines textual data (App Name, Category, Genres) with numerical metrics (Reviews, Ratings, Installs) when matching apps.
- Dockerized Deployment: Production-ready Dockerfile for easy containerization and deployment.
- Popularity Metrics: Specific endpoints to fetch trending and popular applications.
- Live Health Monitoring: Health check endpoints to monitor model status and system readiness.
- Backend Framework: Flask (Python)
- Machine Learning: Scikit-Learn, NumPy, Pandas
- Data Processing: TF-IDF Vectorization, MinMax Scaling, OneHot Encoding
- Containerization: Docker, Gunicorn
GoogleAppsRecom/
├── app/ # Application Source Code
│ ├── __init__.py # App Flask Factory
│ ├── main.py # Entry point for Gunicorn
│ ├── recommender.py # Core ML Recommendation Logic
│ ├── routes.py # API Endpoints
│ ├── models.py # Database Models
│ └── utils.py # Utility functions
├── data/ # Dataset Directory
│ └── googleplaystore.csv # Source Data
├── models/ # Serialized ML Models
│ ├── tfidf_vectorizer.pkl
│ ├── similarity_matrix.pkl
│ └── ...
├── train_model.py # Script to Train & Save Models
├── run.py # Local Development Server
├── Dockerfile # Docker Configuration
└── requirements.txt # Python Dependencies
- Python 3.9+ installed
- pip package manager
- Clone the Repository

      git clone https://github.com/yourusername/GoogleAppsRecom.git
      cd GoogleAppsRecom

- Install Dependencies

      pip install -r requirements.txt

- Data Setup: Ensure your Google Play Store dataset (googleplaystore.csv) is placed in the data/ directory.
- Train the Model: Before running the app, generate the similarity matrices and model artifacts:

      python train_model.py

  This will create the necessary .pkl files in the models/ directory.
- Run the Application

      python run.py

  The server will start at http://localhost:5000.
- Build the Image

      docker build -t google-apps-recom .

- Run the Container

      docker run -p 5000:5000 google-apps-recom

Retrieves a list of recommended apps similar to the requested app.
- Endpoint: /api/recommend
- Method: GET or POST
- Query Params: ?app_name=<name>
- Response:

      {
        "status": "success",
        "app_name": "Photo Editor",
        "recommendations": [
          { "App": "Photo Editor Pro", "Category": "PHOTOGRAPHY", "Rating": 4.5 },
          ...
        ]
      }

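A minimal client sketch for this endpoint, using only the Python standard library. It assumes the server is reachable at http://localhost:5000 as in the local setup, and that the response has the shape shown above. The helper names `build_recommend_url` and `fetch_recommendations` are hypothetical and not part of the project.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:5000"  # assumed local development address


def build_recommend_url(app_name, base_url=BASE_URL):
    """Build the GET URL for /api/recommend with a URL-encoded app name."""
    return f"{base_url}/api/recommend?{urlencode({'app_name': app_name})}"


def fetch_recommendations(app_name, base_url=BASE_URL, timeout=10):
    """Call the endpoint and return the 'recommendations' list from the JSON body.

    Requires a running server; returns an empty list if the key is absent.
    """
    with urlopen(build_recommend_url(app_name, base_url), timeout=timeout) as resp:
        payload = json.load(resp)
    return payload.get("recommendations", [])
```

With the server running, `fetch_recommendations("Photo Editor")` would return the list under `recommendations`. Encoding the name with `urlencode` keeps app names that contain spaces or punctuation from breaking the query string.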
Returns a list of top-performing apps based on review counts and ratings.
- Endpoint: /api/popular
- Method: GET
- Query Params: ?count=10
- Response:

      { "success": true, "popular_apps": [...] }

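How review counts and ratings are combined is not stated here. One plausible ordering, shown purely as an assumption, sorts by review count and breaks ties by rating. The function `top_popular` and the sample rows are hypothetical.

```python
def top_popular(apps, count=10):
    """Return the `count` apps with the most reviews, ties broken by rating.

    This ordering is an illustrative assumption; the service may weight
    reviews and ratings differently.
    """
    ranked = sorted(apps, key=lambda a: (a["Reviews"], a["Rating"]), reverse=True)
    return ranked[:count]


# Made-up rows for illustration only.
sample = [
    {"App": "A", "Reviews": 1200, "Rating": 4.1},
    {"App": "B", "Reviews": 5000, "Rating": 3.9},
    {"App": "C", "Reviews": 5000, "Rating": 4.6},
]
popular_names = [a["App"] for a in top_popular(sample, count=2)]
```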
Check the status of the ML models and server health.
- Endpoint: /api/health
- Method: GET
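What the health check reports is not specified here. As a sketch, readiness could be derived from whether the serialized artifacts exist on disk. The function name, the returned keys, and the list of required files are assumptions; only tfidf_vectorizer.pkl and similarity_matrix.pkl are named in the project tree.

```python
from pathlib import Path

# Artifacts named in the project tree; the real service may require more.
REQUIRED_ARTIFACTS = ("tfidf_vectorizer.pkl", "similarity_matrix.pkl")


def health_status(models_dir="models"):
    """Report which required model files are present under `models_dir`."""
    root = Path(models_dir)
    missing = [name for name in REQUIRED_ARTIFACTS if not (root / name).is_file()]
    return {
        "status": "healthy" if not missing else "degraded",
        "models_available": not missing,
        "missing_artifacts": missing,
    }
```

A check like this only confirms that the files exist, not that they load correctly, so a real endpoint would probably also report whether the models were loaded at startup.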
The recommendation engine operates using a Content-Based Filtering strategy:
- Textual Analysis: App Names, Categories, and Genres are concatenated and transformed using TF-IDF Vectorization over unigrams and bigrams (n-gram range 1 to 2), so that apps sharing words and phrases score as similar.
- Numerical Scaling: Metrics like Reviews, Size, and Installs are normalized to a common range with MinMax scaling so that features measured on large scales do not dominate.
- Categorical Encoding: Metadata such as Content Rating and Type is one-hot encoded.
- Similarity Calculation: A Cosine Similarity Matrix is computed across all features to find the nearest neighbors (most similar apps) in the multi-dimensional feature space.
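The four steps above can be sketched with scikit-learn as follows. This is an illustration under assumptions, not the project's `recommender.py`: the four rows are made up, only Reviews and Installs are used as numeric columns, the blocks are stacked without any weighting, and `recommend` is a hypothetical helper.

```python
import numpy as np
from scipy.sparse import csr_matrix, hstack
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

# Made-up rows; real data would come from data/googleplaystore.csv.
apps = [
    {"App": "Photo Editor", "Category": "PHOTOGRAPHY", "Genres": "Photography",
     "Reviews": 1000, "Installs": 50000, "Type": "Free", "Content Rating": "Everyone"},
    {"App": "Photo Editor Pro", "Category": "PHOTOGRAPHY", "Genres": "Photography",
     "Reviews": 1500, "Installs": 100000, "Type": "Free", "Content Rating": "Everyone"},
    {"App": "Chess Master", "Category": "GAME", "Genres": "Board",
     "Reviews": 800000, "Installs": 10000000, "Type": "Free", "Content Rating": "Everyone"},
    {"App": "Tax Calculator", "Category": "FINANCE", "Genres": "Finance",
     "Reviews": 200, "Installs": 1000, "Type": "Paid", "Content Rating": "Everyone"},
]
names = [a["App"] for a in apps]

# 1. Textual analysis: TF-IDF over unigrams and bigrams of the joined text fields.
text = [f"{a['App']} {a['Category']} {a['Genres']}" for a in apps]
X_text = TfidfVectorizer(ngram_range=(1, 2)).fit_transform(text)

# 2. Numerical scaling: map each numeric column to [0, 1].
numeric = np.array([[a["Reviews"], a["Installs"]] for a in apps], dtype=float)
X_num = MinMaxScaler().fit_transform(numeric)

# 3. Categorical encoding: one-hot columns for Type and Content Rating.
categorical = [[a["Type"], a["Content Rating"]] for a in apps]
X_cat = OneHotEncoder(handle_unknown="ignore").fit_transform(categorical)

# 4. Similarity: stack all blocks column-wise, then compare every pair of rows.
features = hstack([X_text, csr_matrix(X_num), X_cat]).tocsr()
similarity = cosine_similarity(features)


def recommend(app_name, k=2):
    """Return the k apps most similar to `app_name`, excluding the app itself."""
    idx = names.index(app_name)
    order = np.argsort(-similarity[idx])  # indices sorted by descending similarity
    return [names[i] for i in order if i != idx][:k]
```

Because the blocks are simply concatenated, each one contributes at its own scale; multiplying a block by a weight before stacking is one way to tune how much text, numeric, and categorical features influence the result.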
Developed by [Your Name] - Data Scientist & ML Engineer.