Prerequisites | Quick Start | Service Endpoints | Architecture | Project Organization | UI Showcase

## Prerequisites
- Python
- Conda or venv
- Docker
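Before starting, it's worth confirming the tools are available on your `PATH`; a minimal sanity check (the version flags below are the standard ones for these tools):

```sh
python --version          # any recent Python 3 release
conda --version           # or, for venv: python -m venv --help
docker --version
docker compose version    # Compose v2; older installs use `docker-compose --version`
```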
## Quick Start

- Clone the repo: `git clone https://github.com/brnaguiar/mlops-next-watch.git`
- Create the Conda environment: `make env`
- Activate the environment: `source activate nwenv`
- Install requirements, dependencies, and assets: `make dependencies`
- Pull the datasets: `make datasets`
- Configure containers and secrets: `make init`
- Run Docker Compose: `make run`
- Populate the production database with users: `make users`
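For convenience, the whole sequence from a fresh checkout (the same commands as above; the `cd` assumes the default clone directory name):

```sh
git clone https://github.com/brnaguiar/mlops-next-watch.git
cd mlops-next-watch       # default clone directory name
make env                  # create the Conda environment
source activate nwenv
make dependencies         # install requirements, dependencies, and assets
make datasets             # pull the datasets
make init                 # configure containers and secrets
make run                  # start the Docker Compose stack
make users                # populate the production database with users
```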
## Service Endpoints

- Jupyter: `http://localhost:8888`
- MLflow: `http://localhost:5000`
- MinIO Console: `http://localhost:9001`
- Airflow: `http://localhost:8080`
- Streamlit frontend: `http://localhost:8501`
- FastAPI backend: `http://localhost:8000`
- Grafana dashboard: `http://localhost:3000`
- Prometheus: `http://localhost:9090`
- Pushgateway: `http://localhost:9091`
- Spark UI: `http://localhost:8081`
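Once the stack is up, a couple of quick liveness checks; this sketch assumes the services run with their default configurations (MLflow and the Airflow webserver both expose a `/health` endpoint, and FastAPI serves auto-generated docs at `/docs`):

```sh
curl -s http://localhost:5000/health   # MLflow tracking server: prints "OK"
curl -s http://localhost:8080/health   # Airflow webserver: JSON with scheduler/metadatabase status
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8000/docs   # FastAPI docs: expect 200
```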
## Project Organization

```
├── LICENSE
│
├── Makefile               <- Makefile with commands like `make env` or `make run`
│
├── README.md              <- The top-level README for developers using this project
│
├── data
│   ├── 01-external        <- Data from third-party sources
│   ├── 01-raw             <- Data in a raw format
│   ├── 02-processed       <- The pre-processed data for modeling
│   └── 03-train           <- Split pre-processed data for model training
│
├── airflow
│   ├── dags               <- Airflow DAGs
│   ├── logs               <- Airflow logging
│   ├── plugins            <- Airflow default directory for plugins like custom Operators, Sensors, etc. (we use the `include` dir in `dags` for this purpose instead)
│   └── config             <- Airflow configurations and settings
│
├── assets                 <- Project assets like JAR files used in Spark sessions
│
├── models                 <- Trained and serialized models, model predictions, or model summaries
│
├── notebooks              <- Jupyter notebooks used in experimentation
│
├── docker                 <- Docker data and configurations
│
├── images                 <- Project images
│
├── requirements.local     <- Required site-packages
│
├── requirements.minimal   <- Required dist-packages
│
├── setup.py               <- Makes the project pip-installable (`pip install -e .`) so `src` can be imported
│
├── src                    <- Source code for use in this project
│   ├── collaborative      <- Source code for the collaborative recommendation strategy
│   │   ├── models         <- Collaborative models
│   │   ├── nodes          <- Data processing, validation, training, etc. functions (nodes) that represent units of work
│   │   └── pipelines      <- Orchestrated collections of nodes, arranged in a sequence or a directed acyclic graph (DAG)
│   ├── conf               <- Configuration files and parameters for the project
│   ├── main.py            <- Main script, mostly used to run pipelines
│   ├── scripts            <- Scripts, e.g. to create credentials files and populate databases
│   ├── frontend           <- Source code for the application interface
│   └── utils              <- Project utils like handlers and controllers
│
├── tox.ini                <- Settings for flake8
│
└── pyproject.toml         <- Settings for the project and for tools like isort, black, pytest, etc.
```
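As the `setup.py` entry notes, the project is pip-installable in editable mode, which is what makes `src` importable from notebooks and scripts:

```sh
pip install -e .   # editable install, per the setup.py description above
```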
Project based on the cookiecutter data science project template.