The Open RFPs Project is modeled after Sunlight Labs' OpenStates. Open RFPs collects data about contracting activities, including RFP listings and awards, and publishes that information in a standardized format.
IRC: Find us on #openrfps on Freenode.
The first thing to contribute is the location of the best starting page in your state for someone to create a scraper. You can add that to the wiki page.
It's early days, and we're still figuring out the best development toolchain and methods for structuring these scrapers. Expect this section of the README to morph into its own separate guide in the near future.
At present, this project is focused on building scrapers that collect RFP data into JSON documents. The scrapers can be found in the scrapers/ directory, with a separate directory for each state named after that state's two-letter abbreviation (for example: CA, OR).
An RFP scraper for a given state should have at least three files in its directory:
- config.yml: Basic configuration and metadata for the parsers. See our example config.yml.
- rfps.coffee: This is the important one, as it handles the scraping of RFPs from the specified government's website. See an example, or read the annotated source.
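To give a feel for what the scraping file does, here is a minimal sketch in plain JavaScript. It is not the project's actual scraper interface: the `scrape(opts, done)` signature, the field names (`title`, `url`, `due`) and the sample HTML are all assumptions made for illustration, and the real output fields are defined in OUTPUT.md. A real scraper would fetch a live page and use a proper HTML parser rather than a regular expression.

```javascript
// Hypothetical sketch: the function names, callback shape, and output fields
// below are illustrative assumptions, not the project's real interface.

// Stand-in for a page fetched from a state procurement site.
const SAMPLE_HTML = `
<table>
  <tr><td><a href="/rfp/101">Road resurfacing</a></td><td>2014-03-01</td></tr>
  <tr><td><a href="/rfp/102">School laptops</a></td><td>2014-03-15</td></tr>
</table>`;

// Build one plain object per table row found in the HTML string.
function parseRfps(html, baseUrl) {
  const rowPattern =
    /<tr><td><a href="([^"]+)">([^<]+)<\/a><\/td><td>([^<]+)<\/td><\/tr>/g;
  const rfps = [];
  let match;
  while ((match = rowPattern.exec(html)) !== null) {
    rfps.push({
      title: match[2].trim(),
      url: baseUrl + match[1],
      due: match[3].trim(), // hypothetical field name
    });
  }
  return rfps;
}

// Assumed shape: a function that hands its results to a callback.
function scrape(opts, done) {
  done(parseRfps(SAMPLE_HTML, opts.baseUrl));
}

scrape({ baseUrl: 'https://example.gov' }, (rfps) => {
  // Emit the collected records as a pretty-printed JSON document.
  console.log(JSON.stringify(rfps, null, 2));
});
```

Keeping the parsing step in its own function makes it possible to test it against a saved HTML sample without touching the network.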
Other governmental bodies are also welcome. Should you write a scraper for them, you can add them in a cities/[CITYNAME] directory inside the appropriate state's directory.
So far, we have:
- Cities: ca/cities/san-francisco
- Counties: ca/counties/alameda
- School districts: ca/schools/busd
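The directory layout described above can be summarized as a small path helper. This is only an illustrative sketch: the `scraperDir` function and the `KINDS` list are hypothetical names, and lowercasing the state abbreviation is an assumption based on the paths listed here.

```javascript
const path = require('path');

// Sub-directory names taken from the paths listed above.
const KINDS = ['cities', 'counties', 'schools'];

// Hypothetical helper: returns the directory a scraper is expected to live in.
function scraperDir(state, kind, name) {
  // Assumption: state directories use the lowercase two-letter abbreviation.
  const stateDir = path.posix.join('scrapers', state.toLowerCase());
  if (kind === undefined) return stateDir; // state-level scraper
  if (!KINDS.includes(kind)) throw new Error(`unknown kind: ${kind}`);
  if (!name) throw new Error('a name is required for non-state scrapers');
  return path.posix.join(stateDir, kind, name);
}

console.log(scraperDir('GA'));                            // scrapers/ga
console.log(scraperDir('ca', 'cities', 'san-francisco')); // scrapers/ca/cities/san-francisco
console.log(scraperDir('ca', 'schools', 'busd'));         // scrapers/ca/schools/busd
```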
Make sure your scraper provides the same three files described above in its directory. We're happy to accept contributions of any kind, but remember that our primary goal is all 50 states.
We've chosen Node.js because of its module-loading implementation, its accessibility to the programming community ("Everyone knows Javascript!"), and its asynchronous-by-default approach. As with most Node.js projects, we use npm to package this project and specify its dependencies. We like CoffeeScript for its expressiveness and improvements over JavaScript, but you can write your scraper in any language that compiles to JavaScript.
- Node.js + npm
- Install the coffee-script package globally: npm install -g coffee-script
- Install the rest of the dependencies: npm install
We've built a lightweight command-line interface to help you run and test scrapers. If you run bin/openrfps --help from the project root, you'll see some info:
Usage: openrfps [options] [command]
Commands:
run <file> run a scraper and output the results
test <file> test a scraper
help [cmd] display help for [cmd]
Options:
-h, --help output usage information
-V, --version output the version number
While starting to develop a scraper, you'll probably want to use a command like:
bin/run-scraper scrapers/ga/rfps.coffee
This command will:
- Run the Georgia RFP scraper.
- Cache its results to scrapers/ga/rfps.json.
- Pretty-print the returned JSON.
Once you're confident that your results are shaping up, try running them against our test suite:
bin/test-scraper scrapers/ga/rfps.coffee
By default, the test command will use the cached .json file that we downloaded earlier.
To run both the scraper and the tests with one command:
bin/test-scraper scrapers/ga/rfps.coffee --force
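The caching behavior described above (reuse the cached .json file by default, re-run the scraper when forced) might look roughly like the sketch below. It is not the project's implementation: `cachePathFor`, `loadResults`, and the synchronous `runScraper` callback are hypothetical, and the demo uses a fake scraper in a temporary directory.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// The cache file sits next to the scraper: rfps.coffee -> rfps.json.
function cachePathFor(scraperFile) {
  const { dir, name } = path.parse(scraperFile);
  return path.join(dir, `${name}.json`);
}

// Return cached results unless `force` is set or no cache exists yet.
// Assumption: runScraper is synchronous here to keep the sketch short.
function loadResults(scraperFile, runScraper, { force = false } = {}) {
  const cachePath = cachePathFor(scraperFile);
  if (!force && fs.existsSync(cachePath)) {
    return JSON.parse(fs.readFileSync(cachePath, 'utf8'));
  }
  const results = runScraper();
  fs.writeFileSync(cachePath, JSON.stringify(results, null, 2));
  return results;
}

// Demo in a temp directory with a fake scraper that counts its runs.
const tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'openrfps-'));
const scraperFile = path.join(tmpDir, 'rfps.coffee');
let runs = 0;
const fakeScraper = () => {
  runs += 1;
  return [{ title: `run ${runs}` }];
};

loadResults(scraperFile, fakeScraper);                  // no cache yet: runs the scraper
loadResults(scraperFile, fakeScraper);                  // cache hit: scraper is not run
loadResults(scraperFile, fakeScraper, { force: true }); // forced: runs again
console.log(runs); // 2
```

Working from the cache by default keeps test runs fast and avoids hitting a government website on every iteration.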
See OUTPUT.md for the current schema.
We're doing this for two reasons:
- Because citizens have a right to know what kinds of RFPs their governments are releasing to the public, who is being awarded these contracts, and how much those projects cost.
- Because we want to open up the marketplace, and we believe that process starts with usability and accessibility. State procurement websites are very challenging to use, even for highly computer-literate individuals, to say nothing of automating the bidding process.
We think that enabling more companies to compete for these contracts can unlock a lot of potential for civic innovation, increase competition, decrease the cost of government, and increase the level of service delivery. We hope you'll join us for the long haul.
We're excited to partner with government agencies who are willing to publish their data in an open, standard format from the start. You can contact us using this form.