Judicial Data Scrapers for Nepal's Court System
Nepal Government Modernization (NGM) is a specialized data collection service that systematically scrapes and structures judicial data from Nepal's court system. The service collects case information, hearing records, and legal proceedings from all levels of Nepal's judiciary, making this public information accessible in a structured, queryable format.
NGM automates the collection of judicial data from Nepal's court websites, transforming unstructured web pages into a comprehensive database of court cases and proceedings. This enables:
- Transparency: Making court proceedings accessible to citizens, researchers, and journalists
- Accountability: Tracking case progression and judicial decisions over time
- Research: Enabling data-driven analysis of Nepal's judicial system
- Integration: Providing structured data for other services in the Jawafdehi ecosystem
NGM collects data from all levels of Nepal's court system:
Supreme Court:
- The highest court in Nepal
- Final appellate jurisdiction
- Constitutional interpretation
High Courts:
- 18 High Courts across Nepal
- Appellate jurisdiction over district courts
- Original jurisdiction in certain matters
District Courts:
- 77 District Courts (one per district)
- Original jurisdiction for most civil and criminal cases
- First level of judicial proceedings
Special Court:
- Specialized court for corruption and financial crimes
- Critical for anti-corruption efforts
- High-profile cases involving public officials
Case information:
- Case Numbers: Unique identifiers (format: DDD-SS-DDDD)
- Registration Details: When and where cases were filed
- Case Types: Classification of legal matters (भ्रष्टाचार, चेक अनादर, etc.)
- Parties: Plaintiffs and defendants with addresses
- Legal Sections: Applicable laws and regulations
- Case Status: Current state (चालु, फैसला भएको, etc.)
- Verdicts: Final decisions and verdict dates
Hearing information:
- Hearing Dates: When cases appear in court (BS and AD formats)
- Bench Information: Which judges are hearing the case
- Bench Types: Single bench (एकल इजलास) or joint bench (संयुक्त इजलास)
- Judge Names: Presiding judges for each hearing
- Lawyer Information: Legal representation for both sides
- Hearing Outcomes: Decisions, adjournments, and orders
Party and entity information:
- Party Details: Structured information about plaintiffs and defendants
- Addresses: Geographic information for parties
- Entity Resolution: Links to Nepal Entity Service for standardized entity identification
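The case number format listed above (DDD-SS-DDDD) can be checked before a record is stored. The sketch below is only an illustration: it assumes D is an ASCII digit and S is an uppercase Latin letter, which the format string suggests but does not confirm. The names `CaseNumber` and `parse_case_number` and the field names are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Assumption: D = ASCII digit, S = uppercase Latin letter (e.g. "081-CR-0123").
CASE_NUMBER_RE = re.compile(r"([0-9]{3})-([A-Z]{2})-([0-9]{4})")


class CaseNumber(NamedTuple):
    prefix: str  # first DDD group
    code: str    # SS group
    serial: str  # final DDDD group


def parse_case_number(raw: str) -> Optional[CaseNumber]:
    """Return the three parts of a case number, or None if it does not match."""
    match = CASE_NUMBER_RE.fullmatch(raw.strip())
    return CaseNumber(*match.groups()) if match else None


print(parse_case_number(" 081-CR-0123 "))
print(parse_case_number("81-CR-123"))
```

Returning `None` rather than raising lets a pipeline decide whether to drop or flag a malformed record.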
NGM is built on Scrapy, a powerful web scraping framework that provides:
- Robust error handling and retry logic
- Concurrent request processing
- Middleware for custom processing
- Pipeline architecture for data transformation
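As a rough illustration of how these Scrapy features are usually switched on, the fragment below shows a `settings.py` using standard Scrapy setting names. The values, the project name `ngm`, and the middleware and pipeline class paths are assumptions made for this example, not NGM's actual configuration.

```python
# settings.py (sketch): standard Scrapy setting names, illustrative values.
BOT_NAME = "ngm"  # hypothetical project name

# Error handling and retry logic
RETRY_ENABLED = True
RETRY_TIMES = 3
RETRY_HTTP_CODES = [408, 429, 500, 502, 503, 504]

# Concurrent request processing, throttled to stay gentle on court servers
CONCURRENT_REQUESTS = 8
CONCURRENT_REQUESTS_PER_DOMAIN = 2
DOWNLOAD_DELAY = 1.0
AUTOTHROTTLE_ENABLED = True

# Middleware for custom processing (hypothetical class path)
DOWNLOADER_MIDDLEWARES = {
    "ngm.middlewares.NgmDownloaderMiddleware": 543,
}

# Pipelines run in ascending order of their number (hypothetical class paths)
ITEM_PIPELINES = {
    "ngm.pipelines.NormalizeTextPipeline": 100,
    "ngm.pipelines.PostgresUpsertPipeline": 300,
}
```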
Data is stored in a PostgreSQL database with four main tables:
- Courts: Master table of all courts in Nepal
- Court Cases: Case metadata and registration information
- Court Case Hearings: Individual hearing records over time
- Case Entities: Structured party information
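To make the relationships between the four tables concrete, here is a minimal schema sketch. It uses Python's built-in `sqlite3` so it runs without a database server, whereas NGM itself uses PostgreSQL. All table and column names are assumptions chosen for illustration.

```python
import sqlite3

# Hypothetical column names; only the four-table layout comes from the text above.
SCHEMA = """
CREATE TABLE courts (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    level TEXT NOT NULL              -- e.g. supreme, high, district, special
);
CREATE TABLE court_cases (
    id INTEGER PRIMARY KEY,
    court_id INTEGER NOT NULL REFERENCES courts(id),
    case_number TEXT NOT NULL,
    case_type TEXT,
    status TEXT,
    UNIQUE (court_id, case_number)   -- one row per case per court
);
CREATE TABLE court_case_hearings (
    id INTEGER PRIMARY KEY,
    case_id INTEGER NOT NULL REFERENCES court_cases(id),
    hearing_date_bs TEXT,
    hearing_date_ad TEXT,
    bench_type TEXT
);
CREATE TABLE case_entities (
    id INTEGER PRIMARY KEY,
    case_id INTEGER NOT NULL REFERENCES court_cases(id),
    side TEXT NOT NULL,              -- plaintiff or defendant
    name TEXT NOT NULL,
    address TEXT
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
print(tables)
```

Hearings and entities both point at a case, and each case points at a court, so one case can accumulate many hearings and parties over time.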
NGM includes specialized spiders for each court type:
- supreme_court_cases.py - Supreme Court case listings
- supreme_case_enrichment.py - Detailed Supreme Court case information
- high_court_cases.py - High Court case listings
- district_court_cases.py - District Court case listings
- district_case_enrichment.py - Detailed District Court case information
- special_court_cases.py - Special Court case listings
- special_case_enrichment.py - Detailed Special Court case information
- kanun_patrika.py - Legal gazette scraper
Data collection runs in two stages:
- Case Listing: Scrape daily causelists to collect basic case information
- Case Enrichment: Follow links to detail pages for comprehensive case data
This two-stage approach keeps data collection efficient while respecting server resources.
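The listing and enrichment stages can be sketched in plain Python, independent of Scrapy. This is a simplified illustration, not NGM's spider code: the function names, the record fields, and the `fetch_detail` callable (standing in for an HTTP request to a detail page) are all hypothetical.

```python
from typing import Callable, Dict, Iterable, List

Case = Dict[str, str]


def list_cases(causelist_rows: Iterable[Case]) -> List[Case]:
    """Stage 1: keep the basic fields from a causelist and mark each case pending."""
    return [
        {
            "case_number": row["case_number"],
            "detail_url": row["detail_url"],
            "enrichment_status": "pending",
        }
        for row in causelist_rows
    ]


def enrich_cases(cases: List[Case], fetch_detail: Callable[[str], Case]) -> List[Case]:
    """Stage 2: fetch the detail page only for cases that are still pending."""
    for case in cases:
        if case["enrichment_status"] != "pending":
            continue
        try:
            case.update(fetch_detail(case["detail_url"]))
            case["enrichment_status"] = "enriched"
        except Exception:  # broad on purpose in this sketch: any failure marks the case
            case["enrichment_status"] = "failed"
    return cases


# Usage with an in-memory stand-in for the court website.
fake_detail_pages = {"/case/1": {"case_type": "भ्रष्टाचार"}}
rows = [
    {"case_number": "081-CR-0001", "detail_url": "/case/1"},
    {"case_number": "081-CR-0002", "detail_url": "/case/2"},
]
cases = enrich_cases(list_cases(rows), lambda url: fake_detail_pages[url])
print([c["enrichment_status"] for c in cases])  # ['enriched', 'failed']
```

Because stage 2 skips anything that is not pending, detail pages that were already fetched are not requested again on a later run.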
Date handling:
- Dual Format Support: Stores dates in both Bikram Sambat (BS) and Gregorian (AD) formats
- Automatic Conversion: Uses the nepali library for accurate date conversion
- Timezone Awareness: Proper handling of Nepal timezone (UTC+5:45)
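The timezone point matters because Nepal's offset of UTC+5:45 can move a timestamp onto a different calendar day. The sketch below shows this with the standard library only. The helper name `to_nepal_time` is hypothetical, and the BS conversion is deliberately left out because it depends on the conversion library's API.

```python
from datetime import datetime, timedelta, timezone

# Nepal Time is a fixed offset of UTC+5:45 with no daylight saving.
NPT = timezone(timedelta(hours=5, minutes=45), name="NPT")


def to_nepal_time(moment: datetime) -> datetime:
    """Convert a timezone-aware datetime to Nepal Time."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return moment.astimezone(NPT)


scraped_at = datetime(2024, 1, 1, 20, 0, tzinfo=timezone.utc)
local = to_nepal_time(scraped_at)
print(local.isoformat())  # 2024-01-02T01:45:00+05:45
```

A scrape that runs at 20:00 UTC on 1 January is already on 2 January in Nepal, so the AD date stored for a record should be taken from the converted value.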
Text normalization:
- Unicode Standardization: Consistent Nepali text representation
- Whitespace Handling: Proper trimming and formatting
- Character Encoding: UTF-8 throughout the pipeline
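A minimal sketch of this kind of text cleanup, using only the standard library. The choice of NFC as the normalization form is an assumption, since the text above does not say which form NGM uses, and `normalize_text` is a hypothetical name.

```python
import unicodedata


def normalize_text(value: str) -> str:
    """Apply Unicode NFC, then trim and collapse runs of whitespace."""
    composed = unicodedata.normalize("NFC", value)
    # str.split() with no arguments splits on any Unicode whitespace,
    # including non-breaking spaces that often appear in scraped HTML.
    return " ".join(composed.split())


cleaned = normalize_text("  फैसला \u00a0 भएको\n")
print(cleaned)
print(cleaned.encode("utf-8").decode("utf-8") == cleaned)
```

Normalizing before storage means two visually identical strings compare equal, which matters for the duplicate checks and entity matching that come later in the pipeline.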
Deduplication:
- Unique Constraints: Case number + court identifier ensures no duplicates
- Upsert Logic: Updates existing records with new information
- Hearing Tracking: Multiple hearings for the same case properly linked
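The unique constraint and upsert behaviour can be illustrated as follows. The sketch runs on `sqlite3` for convenience. PostgreSQL accepts the same `INSERT ... ON CONFLICT ... DO UPDATE` form, though a PostgreSQL driver would use its own placeholder style. Table and column names are assumptions, as is the choice to keep an existing verdict date when the new record has none.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE court_cases (
        court_id INTEGER NOT NULL,
        case_number TEXT NOT NULL,
        status TEXT,
        verdict_date TEXT,
        UNIQUE (court_id, case_number)
    )
    """
)

UPSERT = """
INSERT INTO court_cases (court_id, case_number, status, verdict_date)
VALUES (?, ?, ?, ?)
ON CONFLICT (court_id, case_number) DO UPDATE SET
    status = excluded.status,
    -- keep the stored verdict date if the new scrape has none
    verdict_date = COALESCE(excluded.verdict_date, court_cases.verdict_date)
"""


def upsert_case(conn, court_id, case_number, status, verdict_date=None):
    """Insert a case, or update the existing row for the same court and number."""
    conn.execute(UPSERT, (court_id, case_number, status, verdict_date))


upsert_case(conn, 1, "081-CR-0001", "चालु")
upsert_case(conn, 1, "081-CR-0001", "फैसला भएको", "2024-06-01")
print(conn.execute("SELECT status, verdict_date FROM court_cases").fetchall())
```

Scraping the same causelist twice therefore updates one row instead of creating a second one.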
Incremental scraping:
- Scraped Dates Table: Tracks which dates have been successfully collected
- Resume Capability: Can restart from last successful scrape
- Status Tracking: Monitors enrichment status (pending, enriched, failed)
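Resuming from a record of scraped dates comes down to a set difference over a date range. In the sketch below the scraped dates table is represented by an in-memory set, and the function name `pending_dates` is hypothetical.

```python
from datetime import date, timedelta
from typing import List, Set


def pending_dates(start: date, end: date, scraped: Set[date]) -> List[date]:
    """Return every date from start to end (inclusive) not yet scraped."""
    total_days = (end - start).days
    candidates = (start + timedelta(days=offset) for offset in range(total_days + 1))
    return [day for day in candidates if day not in scraped]


already_scraped = {date(2024, 1, 1), date(2024, 1, 2), date(2024, 1, 4)}
todo = pending_dates(date(2024, 1, 1), date(2024, 1, 5), already_scraped)
print([d.isoformat() for d in todo])  # ['2024-01-03', '2024-01-05']
```

Recording a date only after its causelist was collected successfully means an interrupted run leaves that date in the to-do list for the next run.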
Nepal Entity Service integration:
- Links case parties to standardized entities
- Enables cross-case entity tracking
- Supports entity-based search and analysis
Jawafdehi platform integration:
- Provides court case data for corruption tracking
- Enables case-based accountability features
- Supports public access to judicial information
Research and analysis:
- Structured data enables statistical analysis
- Supports case outcome research
- Facilitates judicial performance studies
Part of the Jawafdehi Project: Nepal's open database for transparency and accountability.
License: See LICENSE file for details.
Contact: For questions or collaboration opportunities, please reach out through the Jawafdehi project channels.