A Python-based scraper that automatically collects daily passenger flow data from Chennai Metro Rail Limited (CMRL).
This project scrapes passenger data from CMRL's Passenger Flow Portal and stores it in CSV format. The data includes:
- Hourly passenger counts
- Station-wise passenger flow for Line 1 and Line 2
- Ticket type distribution statistics
- Data is scraped automatically at 12:15 AM IST daily using GitHub Actions
- Historical data available from January 20, 2025
- Data is stored in the `data/` directory in CSV format:
  - `passenger_flow_hourly.csv`: Hourly passenger counts
  - `passenger_flow_line_01.csv`: Line 1 station-wise data
  - `passenger_flow_line_02.csv`: Line 2 station-wise data
  - `passenger_ticket_count.csv`: Daily ticket type statistics
- For more details check here
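
As a quick way to inspect the collected data, the sketch below reads one of the CSV files listed above using only the Python standard library. The file and directory names come from this README. The `load_rows` helper, the UTF-8 encoding, and the presence of a header row are assumptions made for illustration, and no specific column names are assumed.

```python
import csv
from pathlib import Path

# Directory and file names are taken from this README. The column layout is
# not documented here, so nothing below relies on specific header names.
DATA_DIR = Path("data")


def load_rows(csv_path):
    """Read a CSV file and return its rows as a list of dicts keyed by header.

    Hypothetical helper: assumes a header row and UTF-8 encoding.
    """
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


hourly_path = DATA_DIR / "passenger_flow_hourly.csv"
if hourly_path.exists():
    rows = load_rows(hourly_path)
    print(f"{len(rows)} rows loaded")
    if rows:
        # Show whatever columns the file actually contains.
        print("columns:", list(rows[0].keys()))
else:
    print(f"{hourly_path} not found; clone the repository or run the scraper first")
```

Run it from the repository root so that the relative `data/` path resolves. All values come back as strings, so convert counts to integers before doing arithmetic on them.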
- Clone the repository:

      git clone https://github.com/heshinth/cmrl-passenger-data-scraper.git
      cd cmrl-passenger-data-scraper

- Install dependencies using `uv`:

      uv sync --all-extras

- Run the scraper manually:

      python main.py

- Add JSON schema validation for API response changes
- Add data validation checks
- Add API documentation
- Implement error notification system
Contributions are welcome! Feel free to open issues or submit pull requests.
For any inquiries or issues, please open an issue on the GitHub repository.
- This project scrapes data directly from CMRL's Passenger Flow Portal
- The data is provided "as is" without any guarantees of accuracy or completeness
- This is an unofficial tool and is not affiliated with CMRL
- The scraping process may be affected by changes to CMRL's website structure
This project is licensed under the MIT License. See the LICENSE file for details.