Skip to content

danielhz/seia_data

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

34 Commits
 
 
 
 
 
 

Repository files navigation

SEIA Data

A set of tools for crawling, storing and analyzing data from www.e-seia.cl, a chilean government site.

Requirements

A python interpreter and a unix like system with wget.

Usage

To download all pages you can use:

get_seia_project_list.py <pages>

Where pages is the number of pages that you must to lookup in the search results manually. For large numbers of pages this can take a while.

Downloaded pages will be stored in seia_project_list/ folder.

After download the list you can generate a YAML version with:

ruby project_list_parser.rb > project_list.yaml

To-do

  • Add range of time in the get_seia_project_list.py command argmuments.

  • Get the number of pages automatically for a time range.

  • Parse list pages to get the porjects id.

  • Download the project data.

  • Store the project data.

  • Write some queries.

License

This is free software under GPL v3.

About

A set of tools for crawling, storing and analysis for data in https://www.e-seia.cl

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages