RideFare is a portfolio-grade pricing intelligence product that rebuilds a notebook-centered ride fare analysis into a reproducible data pipeline, a documented ML workflow, and a public Spanish-language web app. The repository shows how raw ride and weather files become validated marts, explainable model artifacts, and a deployed editorial interface for urban mobility storytelling.

| Challenge | System | Outcome |
|---|---|---|
| Legacy analysis lived in a notebook and ad-hoc scripts | Rebuilt as Python commands, DuckDB + dbt marts, and typed frontend contracts | Reproducible pipeline from raw CSVs to deployed product |
| Public pricing story needed to work without a live inference API | Versioned JSON exports feed a static-first Next.js experience | Stable previews, deterministic deploys, and transparent artifacts |
| Model outputs had to be explainable enough for portfolio storytelling | Temporal evaluation, SHAP exports, and a bounded scenario simulator | ML behavior is visible in docs, artifacts, and the public UI |

RideFare ships four public routes:
- `/` introduces the project as an editorial analytics product
- `/dashboard` turns the analytics mart into a public pricing intelligence surface
- `/como-funciona` translates the pipeline and ML workflow into reader-friendly narrative
- `/escenarios` exposes the exported simulator artifacts through an explainable scenario lab

| Layer | Stack | Role |
|---|---|---|
| Data ingestion | Python, Polars, Pandera | validate, normalize, and store clean ride/weather inputs |
| Analytics modeling | DuckDB, dbt, Parquet | build stable marts for analytics and ML consumption |
| Machine learning | scikit-learn, XGBoost, SHAP | temporal evaluation, baseline comparison, explainability, and exports |
| Web product | Next.js, TypeScript, Tailwind CSS, Framer Motion, Apache ECharts | deliver the public Spanish-language interface |
| Automation | pytest, Ruff, GitHub Actions, release-please, Vercel | validate, refresh artifacts, deploy previews/production, and manage releases |

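The temporal evaluation and baseline comparison named in the Machine learning row can be illustrated with a minimal, library-free sketch. It is not the repository's training code: the row fields (`timestamp`, `fare`), the 80/20 split, and the mean-fare baseline are assumptions chosen only to show the idea of splitting by time instead of at random.

```python
from datetime import datetime
from statistics import mean


def temporal_split(rows, train_fraction=0.8):
    """Sort rows by timestamp and split so no test row precedes a train row."""
    ordered = sorted(rows, key=lambda r: r["timestamp"])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


def baseline_mae(train, test):
    """Mean absolute error of a constant predictor: the mean training fare."""
    prediction = mean(r["fare"] for r in train)
    return mean(abs(r["fare"] - prediction) for r in test)


# Hypothetical toy rides, deliberately out of order (not the project's sample data).
rides = [
    {"timestamp": datetime(2024, 1, day), "fare": fare}
    for day, fare in [(5, 14.0), (1, 10.0), (3, 12.0), (2, 11.0), (4, 13.0)]
]

train, test = temporal_split(rides, train_fraction=0.8)
print(len(train), len(test), baseline_mae(train, test))
```

Any candidate model would then be scored on the same later slice, so an improvement over the baseline reflects performance on rides the model has not seen in time.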
The implementation converges on these repo-level interfaces:
- `ridefare ingest`
- `ridefare transform`
- `ridefare train`
- `ridefare export-web`
- `scripts/refresh-public-artifacts.ps1`
- versioned public artifacts under:
  - `data/processed/analytics/web`
  - `data/processed/ml/web`

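One way to keep exported JSON artifacts stable across refreshes is to serialize them deterministically. The sketch below is an assumption-heavy illustration, not the project's exporter: the helper name `write_versioned_export`, the per-run subdirectory, and the `summary.json` file name are all hypothetical, and the demo writes to a temporary directory instead of the real artifact folders.

```python
import json
import tempfile
from pathlib import Path


def write_versioned_export(root, run_id, name, payload):
    """Write payload as JSON under <root>/<run_id>/<name>.json (hypothetical layout)."""
    target_dir = Path(root) / run_id
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / f"{name}.json"
    # sort_keys makes the output independent of dict insertion order.
    text = json.dumps(payload, sort_keys=True, indent=2) + "\n"
    target.write_text(text, encoding="utf-8")
    return target


with tempfile.TemporaryDirectory() as root:
    first = write_versioned_export(root, "local-demo", "summary", {"b": 2, "a": 1})
    first_bytes = first.read_bytes()
    second = write_versioned_export(root, "local-demo", "summary", {"a": 1, "b": 2})
    # Same content in a different key order yields byte-identical files.
    same_bytes = first_bytes == second.read_bytes()

print(same_bytes)
```

Byte-identical output for identical content keeps diffs quiet when artifacts are committed or compared between builds.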
powershell -ExecutionPolicy Bypass -File .\scripts\bootstrap.ps1
powershell -ExecutionPolicy Bypass -File .\scripts\refresh-public-artifacts.ps1 -RunId local-demo
powershell -ExecutionPolicy Bypass -File .\scripts\validate-python.ps1
corepack pnpm --filter web dev

If you want to run the pipeline step by step instead of the refresh wrapper:

ridefare ingest --rides-path data/samples/raw/PFDA_rides.csv --weather-path data/samples/raw/PFDA_weather.csv
ridefare transform
ridefare train --run-id local-demo
ridefare export-web --run-id local-demo

- `ci.yml` validates Python, regenerates public artifacts in workspace, and verifies the web build against those exports
- `pipeline-refresh.yml` refreshes the public JSON subsets when backend or sample data changes reach `master`
- `vercel-preview.yml` publishes preview deployments for pull requests
- `vercel-production.yml` deploys the production site from `master`
- `release-please.yml` manages release PRs and changelog automation

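The four step-by-step `ridefare` commands shown earlier can also be driven from Python. This is a minimal sketch of such a wrapper, not the contents of `refresh-public-artifacts.ps1`: the helper names `build_pipeline_commands` and `run_pipeline` are hypothetical, and only the subcommands and flags that appear in this README are used.

```python
import subprocess


def build_pipeline_commands(rides_path, weather_path, run_id):
    """Return the pipeline steps in execution order as argument lists."""
    return [
        ["ridefare", "ingest", "--rides-path", rides_path, "--weather-path", weather_path],
        ["ridefare", "transform"],
        ["ridefare", "train", "--run-id", run_id],
        ["ridefare", "export-web", "--run-id", run_id],
    ]


def run_pipeline(commands, runner=subprocess.run):
    """Run each step in order; check=True stops at the first failing step."""
    for command in commands:
        runner(command, check=True)


commands = build_pipeline_commands(
    "data/samples/raw/PFDA_rides.csv",
    "data/samples/raw/PFDA_weather.csv",
    "local-demo",
)
for command in commands:
    print(" ".join(command))
```

Calling `run_pipeline(commands)` would execute the steps and requires the `ridefare` CLI to be installed in the active environment. The injectable `runner` argument lets the ordering be tested without running anything.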
Operational details live in:
RideFare/
|- apps/web/ # Public product in Spanish
|- src/ridefare/ # Production Python package
|- dbt/ # Analytics and ML marts
|- data/ # Raw, interim, processed, and sample zones
|- docs/ # Architecture, ML, UI, ADRs, and runbooks
|- tests/ # Unit and integration coverage
|- scripts/ # Bootstrap, validation, and artifact refresh helpers
`- .github/workflows/ # CI, refresh, deploy, and release automation
- RideFare is not an online inference service
- the public simulator consumes exported artifacts, not live backend predictions
- the current sample data is intentionally small for reproducible portfolio execution
- auth, user accounts, and stateful product features remain out of scope
- the legacy notebook is preserved only as reference material, not as the operational center
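
To make the "exported artifacts, not live backend predictions" boundary concrete, here is a minimal sketch of a bounded scenario lookup. The artifact schema, the `distance_km` input, and the fare values are all invented for illustration; the real exported schema is not shown in this README, and the public simulator itself runs in the web app.

```python
import json

# Hypothetical export: precomputed fares for a small, bounded grid of inputs.
EXPORT_JSON = """
{
  "bounds": {"distance_km": [1, 5]},
  "fares": {"1": 6.0, "2": 8.5, "3": 11.0, "4": 13.5, "5": 16.0}
}
"""


def clamp(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


def simulate_fare(artifact, distance_km):
    """Look up a precomputed fare; inputs outside the bounds are clamped, not extrapolated."""
    low, high = artifact["bounds"]["distance_km"]
    bounded = clamp(round(distance_km), low, high)
    return artifact["fares"][str(bounded)]


artifact = json.loads(EXPORT_JSON)
for requested in (3, 40, -2):
    print(requested, simulate_fare(artifact, requested))
```

Because every answer comes from the exported grid, the simulator can never return a value the training run did not produce, which is what keeps it explainable without an inference API.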
If you find this project useful, please consider giving the repository a star.



