This repository presents a simple pipeline that computes per-vehicle statistics from a data source. The goal is to give the user a view of the total trips made, total kilometers traveled, total moving time, and total stopped time per vehicle and per month.
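To make the target output concrete, here is a minimal sketch of the per-vehicle, per-month aggregation described above. The record fields (`vehicle`, `start`, `km`, `moving_s`, `stopped_s`) and the sample values are illustrative assumptions, not the repository's actual schema or ETL code.

```python
from collections import defaultdict
from datetime import datetime


def monthly_statistics(trips):
    """Sum trip metrics per (vehicle, month) pair."""
    totals = defaultdict(
        lambda: {"trips": 0, "km": 0.0, "moving_s": 0, "stopped_s": 0}
    )
    for trip in trips:
        # Group by calendar month of the trip start, e.g. "2022-01".
        month = datetime.fromisoformat(trip["start"]).strftime("%Y-%m")
        row = totals[(trip["vehicle"], month)]
        row["trips"] += 1
        row["km"] += trip["km"]
        row["moving_s"] += trip["moving_s"]
        row["stopped_s"] += trip["stopped_s"]
    return dict(totals)


# Hypothetical sample data; field names and values are assumptions.
SAMPLE_TRIPS = [
    {"vehicle": "ABC-1234", "start": "2022-01-03T08:00:00",
     "km": 12.5, "moving_s": 1500, "stopped_s": 300},
    {"vehicle": "ABC-1234", "start": "2022-01-20T17:30:00",
     "km": 7.5, "moving_s": 900, "stopped_s": 120},
    {"vehicle": "ABC-1234", "start": "2022-02-01T09:15:00",
     "km": 3.0, "moving_s": 400, "stopped_s": 60},
    {"vehicle": "XYZ-9876", "start": "2022-01-11T12:00:00",
     "km": 20.0, "moving_s": 2400, "stopped_s": 600},
]

for (vehicle, month), row in sorted(monthly_statistics(SAMPLE_TRIPS).items()):
    print(vehicle, month, row)
```

Each output row corresponds to one vehicle in one month, which is the granularity the pipeline reports.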
- AWS Account (sign up)
- Linux System
- SED
- Make
- Docker
├── /env
| ├── /airflow
| | ├── /logs
| | ├── /plugins
| ├── /app
| | ├── /statistic_per_vehicle
| | | ├── /extractor
| | | ├── /loader
| | | ├── /sender
| ├── /datasource
| ├── /datawarehouse
| ├── /kafka
| | ├── /connectors
| | ├── /libs
| | ├── /sink/config
| | ├── /source/config
| ├── /kibana
| | ├── /pgsync
├── /imgs
├── /source
| ├── /app
| | | ├── /dags
| | | ├── /etls
| | | | ├── /statistic_per_vehicle
| | | | | ├── querys
| | | ├── /services
| | | | ├── /email_sender
| ├── /data
| | | ├── /statistic_per_vehicle
The steps to set up the environment are a bit involved, so follow them carefully.
- Create an S3 bucket called trip-statistics (Creating a bucket) with all public access allowed
- AWS Region: us-east-1
- Fill in your email, AWS access key, and AWS secret access key in the yourconfig.sh file (it's in the root directory of the repository)
- Finally, with a terminal open in the root directory of the repository, run the following command:
$ make
- If you want to clean up your environment, run:
$ make clean
Note: You can start the environment components individually using the make command, but keep in mind that some components depend on others.
With the environment up, you can access Airflow to trigger the statistic per vehicle DAG.
- In your browser, access the portal through the following URL: localhost:8080
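As an alternative to the browser, the DAG could also be triggered through Airflow's stable REST API. The sketch below only illustrates the idea. It assumes that the basic-auth API backend is enabled in this environment, that the DAG id is `statistic_per_vehicle` (a guess, check the actual id in the Airflow UI), and that the `airflow`/`airflow` login from this README is valid for the API. The helper names are hypothetical.

```python
import base64
import json
import urllib.request

AIRFLOW_URL = "http://localhost:8080"
DAG_ID = "statistic_per_vehicle"  # assumption: verify the real DAG id in the UI


def build_trigger_request(dag_id, user, password, base_url=AIRFLOW_URL):
    """Build (but do not send) a POST request that creates a new DAG run."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url=f"{base_url}/api/v1/dags/{dag_id}/dagRuns",
        data=json.dumps({"conf": {}}).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


def trigger_dag(dag_id=DAG_ID, user="airflow", password="airflow"):
    """Send the request; requires the environment to be up."""
    request = build_trigger_request(dag_id, user, password)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode())
```

With the environment running, calling `trigger_dag()` should return the JSON description of the new DAG run. If the API rejects the request, the basic-auth backend is probably not enabled and the browser route remains the way to go.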
Environment information:
- URI to access the data warehouse (postgresdb): localhost:3307
- Data warehouse login:
user: postgres
password: postgres
- Database: mobi7_code_interview
- Table: consumer_statistics
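The data warehouse settings above can be combined into a single connection URI. The following is a minimal sketch, assuming the third-party `psycopg2` driver is installed for the actual query. The helper names `build_dsn` and `fetch_statistics` are hypothetical and not part of the repository.

```python
from urllib.parse import quote

# Read a few rows from the statistics table listed above.
QUERY = "SELECT * FROM consumer_statistics LIMIT %s"


def build_dsn(user="postgres", password="postgres", host="localhost",
              port=3307, database="mobi7_code_interview"):
    """Build a PostgreSQL connection URI from the settings in this README."""
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(database, safe='')}"
    )


def fetch_statistics(limit=10):
    """Query the warehouse; needs psycopg2 and a running environment."""
    import psycopg2  # third-party driver, assumed to be installed

    connection = psycopg2.connect(build_dsn())
    try:
        with connection.cursor() as cursor:
            cursor.execute(QUERY, (limit,))
            return cursor.fetchall()
    finally:
        connection.close()
```

`build_dsn()` percent-encodes the credentials, so a password containing characters such as `@` would still produce a valid URI.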
NOTE: The Airflow grid does not show up on the platform. The reason is explained in this GitHub thread. The fix is expected in Airflow version 2.3.2. If you want the Airflow grid to appear anyway, log in to the platform using the following credentials:
- Airflow login:
user: airflow
password: airflow
- Analytics dashboard:
NOTE: You can create your own Kibana dashboard via the following URL:
localhost:5601