ImdataScientistSachin/Google-Apps-Recommendation-

Google Apps Recommendation System

Python Flask Scikit-Learn Docker

A machine learning-powered recommendation engine that suggests Google Play Store applications similar to a given app. The system uses content-based filtering, comparing app features such as category, rating, review count, and genre to rank the closest matches.

🚀 Features

  • Intelligent Recommendations: Uses Cosine Similarity on TF-IDF vectorized text features and scaled numerical data.
  • Robust REST API: Fully documented API endpoints for integrating recommendations into frontend applications.
  • Hybrid Feature Engineering: Combines textual data (App Name, Category, Genres) with numerical metrics (Reviews, Ratings, Installs) for high-accuracy matching.
  • Dockerized Deployment: Production-ready Dockerfile for easy containerization and deployment.
  • Popularity Metrics: Dedicated endpoints for fetching trending and popular applications.
  • Live Health Monitoring: Health check endpoints to monitor model status and system readiness.

🛠️ Technology Stack

  • Backend Framework: Flask (Python)
  • Machine Learning: Scikit-Learn, NumPy, Pandas
  • Data Processing: TF-IDF Vectorization, Min-Max Scaling, One-Hot Encoding
  • Containerization: Docker, Gunicorn

📂 Project Structure

GoogleAppsRecom/
├── app/                    # Application Source Code
│   ├── __init__.py         # Flask application factory
│   ├── main.py             # Entry point for Gunicorn
│   ├── recommender.py      # Core ML Recommendation Logic
│   ├── routes.py           # API Endpoints
│   ├── models.py           # Database Models
│   └── utils.py            # Utility functions
├── data/                   # Dataset Directory
│   └── googleplaystore.csv # Source Data
├── models/                 # Serialized ML Models
│   ├── tfidf_vectorizer.pkl
│   ├── similarity_matrix.pkl
│   └── ...
├── train_model.py          # Script to Train & Save Models
├── run.py                  # Local Development Server
├── Dockerfile              # Docker Configuration
└── requirements.txt        # Python Dependencies

⚡ Getting Started

Prerequisites

  • Python 3.9+ installed
  • pip package manager

Local Installation

  1. Clone the Repository

    git clone https://github.com/yourusername/GoogleAppsRecom.git
    cd GoogleAppsRecom

  2. Install Dependencies

    pip install -r requirements.txt

  3. Data Setup

    Ensure your Google Play Store dataset (googleplaystore.csv) is placed in the data/ directory.

  4. Train the Model

    Before running the app, generate the similarity matrices and model artifacts:

    python train_model.py

    This will create the necessary .pkl files in the models/ directory.

  5. Run the Application

    python run.py

    The server will start at http://localhost:5000.
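
Once step 4 has written the .pkl files, it can be useful to confirm they load before starting the server. The following is a minimal sketch, not the repository's own loading code: `load_artifacts` is a hypothetical helper, and it assumes the artifacts were saved with Python's standard `pickle` module.

```python
import pickle
from pathlib import Path


def load_artifacts(model_dir="models"):
    """Load every .pkl file in model_dir into a dict keyed by file stem.

    Hypothetical helper for illustration; assumes plain pickle files.
    """
    artifacts = {}
    for path in sorted(Path(model_dir).glob("*.pkl")):
        with path.open("rb") as fh:
            artifacts[path.stem] = pickle.load(fh)
    return artifacts
```

With the layout shown in Project Structure, the returned dict would contain keys such as `tfidf_vectorizer` and `similarity_matrix`. Only unpickle files you generated yourself, since loading a pickle can execute arbitrary code.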

🐳 Docker Deployment

  1. Build the Image

    docker build -t google-apps-recom .

  2. Run the Container

    docker run -p 5000:5000 google-apps-recom

🔌 API Documentation

1. Get Recommendations

Retrieves a list of recommended apps similar to the requested app.

  • Endpoint: /api/recommend
  • Method: GET or POST
  • Query Params: ?app_name=<name>
  • Response:
    {
      "status": "success",
      "app_name": "Photo Editor",
      "recommendations": [
        { "App": "Photo Editor Pro", "Category": "PHOTOGRAPHY", "Rating": 4.5 },
        ...
      ]
    }
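
A client call to this endpoint might look like the sketch below. It uses only the Python standard library and assumes the server is running at http://localhost:5000 as described in Getting Started; the helper names `recommend_url` and `fetch_recommendations` are illustrative and not part of the project.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the local development server from "Run the Application".
BASE_URL = "http://localhost:5000"


def recommend_url(app_name, base_url=BASE_URL):
    """Build the GET URL for /api/recommend, encoding the app name safely."""
    return f"{base_url}/api/recommend?{urlencode({'app_name': app_name})}"


def fetch_recommendations(app_name, base_url=BASE_URL, timeout=10):
    """Call the endpoint and return the "recommendations" list from the JSON body."""
    with urlopen(recommend_url(app_name, base_url), timeout=timeout) as resp:
        payload = json.load(resp)
    # Fall back to an empty list if the key is missing from the response.
    return payload.get("recommendations", [])
```

Against a running server, `fetch_recommendations("Photo Editor")` would return the list shown under `recommendations` above. Encoding the query string with `urlencode` matters because app names often contain spaces and punctuation.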

2. Get Popular Apps

Returns a list of top-performing apps based on review counts and ratings.

  • Endpoint: /api/popular
  • Method: GET
  • Query Params: ?count=10
  • Response:
    {
      "success": true,
      "popular_apps": [...]
    }
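
This README does not state how review counts and ratings are combined, so the following is only one plausible way to rank apps by popularity. The scoring formula (rating multiplied by the base-10 log of the review count) and the function names are assumptions for illustration, and the sample rows are made up.

```python
import math


def popularity_score(rating, reviews):
    """Assumed score: rating weighted by the log of the review count."""
    return rating * math.log10(1 + reviews)


def top_popular(apps, count=10):
    """Return the `count` highest-scoring apps, best first."""
    ranked = sorted(
        apps,
        key=lambda a: popularity_score(a["Rating"], a["Reviews"]),
        reverse=True,
    )
    return ranked[:count]


# Made-up rows for illustration only.
apps = [
    {"App": "A", "Rating": 4.8, "Reviews": 120},
    {"App": "B", "Rating": 4.3, "Reviews": 250_000},
    {"App": "C", "Rating": 3.9, "Reviews": 9_000},
]
```

The log damps very large review counts so that a highly rated app with few reviews does not outrank a well-rated app with many. The `count` argument plays the same role as the `?count=10` query parameter.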

3. System Health

Checks the status of the ML models and the server.

  • Endpoint: /api/health
  • Method: GET

🧠 Model Details

The recommendation engine operates using a Content-Based Filtering strategy:

  1. Textual Analysis: App Names, Categories, and Genres are concatenated and transformed using TF-IDF Vectorization (unigrams and bigrams) to capture word-level overlap between apps.
  2. Numerical Scaling: Metrics like Reviews, Size, and Installs are normalized to the [0, 1] range using MinMaxScaler so that no single large-valued metric dominates.
  3. Categorical Encoding: Metadata like Content Rating and Type is encoded via one-hot encoding.
  4. Similarity Calculation: A Cosine Similarity Matrix is computed over the combined feature vectors to find the nearest neighbors (most similar apps) in the multi-dimensional feature space.
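
The four steps above can be sketched with scikit-learn as follows. This is a minimal illustration rather than the contents of train_model.py: the rows are made up, the column names follow the public Google Play Store dataset, and stacking the three feature blocks without extra weights is an assumption. `most_similar` is a hypothetical helper.

```python
import numpy as np
import pandas as pd
from scipy.sparse import csr_matrix, hstack
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

# Made-up rows for illustration only.
df = pd.DataFrame({
    "App": ["Photo Editor", "Photo Editor Pro", "Chess Master", "Sudoku Daily"],
    "Category": ["PHOTOGRAPHY", "PHOTOGRAPHY", "GAME", "GAME"],
    "Genres": ["Photography", "Photography", "Board", "Puzzle"],
    "Reviews": [1200, 56000, 800, 15000],
    "Installs": [100000, 5000000, 50000, 1000000],
    "Type": ["Free", "Free", "Paid", "Free"],
    "Content Rating": ["Everyone", "Everyone", "Everyone", "Teen"],
})

# 1. Text: concatenate the fields, then TF-IDF over unigrams and bigrams.
text = df["App"] + " " + df["Category"] + " " + df["Genres"]
text_features = TfidfVectorizer(ngram_range=(1, 2)).fit_transform(text)

# 2. Numeric: scale each column to the [0, 1] range.
numeric_features = MinMaxScaler().fit_transform(df[["Reviews", "Installs"]])

# 3. Categorical: one-hot encode (sparse output by default).
categorical_features = OneHotEncoder(handle_unknown="ignore").fit_transform(
    df[["Type", "Content Rating"]]
)

# 4. Stack the blocks side by side and compute pairwise cosine similarity.
features = hstack(
    [text_features, csr_matrix(numeric_features), categorical_features]
).tocsr()
similarity = cosine_similarity(features)  # dense (n_apps, n_apps) array


def most_similar(app_name, top_n=2):
    """Return the names of the top_n apps most similar to app_name."""
    idx = df.index[df["App"] == app_name][0]
    order = np.argsort(similarity[idx])[::-1]  # highest similarity first
    return [df["App"].iloc[i] for i in order if i != idx][:top_n]
```

Because the blocks are simply concatenated, their relative scale affects the ranking: a shared Type or Content Rating contributes as much to the dot product as a perfect text match. Weighting each block before stacking is one way to tune this.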

Author

Developed by [Your Name] - Data Scientist & ML Engineer.
