-
Python dependencies:
pip3 install duckdb pandas tabulate
-
Download the dataset from S3:
s3://alex-datasets/dmv/dmv_fuel_type_passengers.csv
-
Copy the downloaded dataset into the root directory of the repo.
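Before running anything, it can help to confirm that the dataset really is in the repo root. The following is a minimal sketch, not part of the repo: the helper name `find_dataset` is hypothetical, and the file name is taken from the S3 URL above.

```python
from pathlib import Path

# File name taken from the S3 URL in the download step.
DATASET_NAME = "dmv_fuel_type_passengers.csv"


def find_dataset(repo_root="."):
    """Return the dataset path if it sits in the repo root, else None.

    Hypothetical helper for illustration; the repo does not ship it.
    """
    candidate = Path(repo_root) / DATASET_NAME
    return candidate if candidate.is_file() else None


if __name__ == "__main__":
    path = find_dataset()
    if path is None:
        print(f"{DATASET_NAME} not found in the repo root; copy it there first.")
    else:
        print(f"Found dataset: {path} ({path.stat().st_size} bytes)")
```

Run it from the root directory of the repo so that the default `repo_root="."` points at the right place.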
-
Create a copy of the template (group_template). Prefix your group folder name with group_:
cp -rf group_template group_lightning_speed
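The same copy step can be sketched in Python, which makes it easy to enforce the group_ prefix. This is an illustrative sketch only: `create_group_folder` is a hypothetical helper, not something the repo provides, and unlike `cp -rf` it refuses to overwrite an existing folder.

```python
import shutil
from pathlib import Path


def create_group_folder(group_name, repo_root=".", template="group_template"):
    """Copy the template folder to a new folder whose name starts with group_.

    Hypothetical helper for illustration; the repo does not ship it.
    """
    # Add the required prefix if the caller left it out.
    if not group_name.startswith("group_"):
        group_name = f"group_{group_name}"
    root = Path(repo_root)
    target = root / group_name
    # copytree raises FileExistsError if the target already exists,
    # which protects an existing group folder from being clobbered.
    shutil.copytree(root / template, target)
    return target


if __name__ == "__main__":
    print(create_group_folder("lightning_speed"))
```

Passing either "lightning_speed" or "group_lightning_speed" yields the folder group_lightning_speed.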
-
Run your solution:
python3 group_lightning_speed/aggregation.py
-
Run all solutions:
Aggregation:
python3 gather_results.py aggregation `ls -d group_*`
Join:
python3 gather_results.py join `ls -d group_*`
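On systems where the backtick expansion is unavailable (for example, a non-POSIX shell), the two commands above can be assembled in Python instead. This is a rough sketch under a few assumptions: `gather_command` is a hypothetical helper, it keeps only directories (whereas `ls -d group_*` would also list matching files), and it uses the current interpreter rather than whatever `python3` resolves to.

```python
import subprocess
import sys
from pathlib import Path


def gather_command(task, repo_root="."):
    """Build the argv list for gather_results.py, approximating `ls -d group_*`.

    Hypothetical helper for illustration; the repo does not ship it.
    """
    # Collect every group_* directory (this includes group_template, as the
    # shell glob would) and sort the names for a stable order.
    groups = sorted(p.name for p in Path(repo_root).glob("group_*") if p.is_dir())
    return [sys.executable, "gather_results.py", task, *groups]


if __name__ == "__main__":
    # Run both tasks from the root directory of the repo.
    for task in ("aggregation", "join"):
        subprocess.run(gather_command(task), check=True)
```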
-
Check in your changes:
git add group_lightning_speed
git commit -m "updated group_lightning_speed"
git fetch && git rebase origin/main && git push origin main