Create a test data set #25

Open
diafygi opened this issue May 19, 2019 · 0 comments

diafygi commented May 19, 2019

In order to test the ballotapi server, we need to create a test data set that can be loaded into the database.

Brainstorm on data repo structure

We want people to be able to browse, comment on, fork, branch, and open pull requests against our core ballot data set, so I think we need to keep it in a git repo. Also, within the data/ folder, I want people to be able to find the big elections easily, so I think we should split elections into national (federal), state, and local levels.

/ballotapi-data/README.md (overview for the overall data repository)
/ballotapi-data/CONTRIBUTING.md (instructions on how to contribute)
/ballotapi-data/LICENSE (public domain license)
/ballotapi-data/tests/*.py (tests for sanity checking the database)
/ballotapi-data/data/ (folder for the actual data, split by national/state/local elections)
/ballotapi-data/data/national_elections/2019-01-01_primary/README.md (notes about the particular election)
/ballotapi-data/data/national_elections/2019-01-01_primary/election.yaml (the election object)
/ballotapi-data/data/national_elections/2019-01-01_primary/precincts/*.yaml (precinct objects)
/ballotapi-data/data/national_elections/2019-01-01_primary/contests/*.yaml (contest objects)
/ballotapi-data/data/state_elections/* (same structure as national elections)
/ballotapi-data/data/local_elections/* (same structure as national elections)
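
As a rough idea of what one of the tests/*.py sanity checks could look like, here is a minimal sketch that walks the layout above and reports election folders that are missing pieces. The function names (`check_election_dir`, `check_data_repo`) and the exact rules (for example, requiring at least one file under precincts/ and contests/) are assumptions for illustration, not something decided in this issue.

```python
from __future__ import annotations

from pathlib import Path

# Top-level folders under data/, taken from the proposed layout.
LEVELS = ("national_elections", "state_elections", "local_elections")


def check_election_dir(election_dir: Path) -> list[str]:
    """Return a list of problems found in one election folder (hypothetical rules)."""
    problems = []
    if not (election_dir / "election.yaml").is_file():
        problems.append(f"{election_dir}: missing election.yaml")
    for sub in ("precincts", "contests"):
        sub_dir = election_dir / sub
        # Assumption: every election needs at least one precinct and one contest file.
        if not sub_dir.is_dir() or not any(sub_dir.glob("*.yaml")):
            problems.append(f"{election_dir}: no {sub}/*.yaml files")
    return problems


def check_data_repo(root: Path) -> list[str]:
    """Check every election folder under root/data/<level>/ and collect problems."""
    problems = []
    for level in LEVELS:
        level_dir = root / "data" / level
        if not level_dir.is_dir():
            continue  # a level with no elections yet is not treated as an error
        for election_dir in sorted(p for p in level_dir.iterdir() if p.is_dir()):
            problems.extend(check_election_dir(election_dir))
    return problems
```

Calling `check_data_repo(Path("ballotapi-data"))` returns an empty list when every election folder has the expected files, and one message per missing piece otherwise.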

Thoughts on using YAML

I'm leaning towards using YAML instead of JSON for defining objects in the authoritative data set because it's so much more flexible, including the ability to have comments. I want people who are fixing edge cases to be able to add comments and annotations beyond what is shown on the API, so that others can easily see later why a data point is what it is (in addition to the git log information).
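
To make the comment argument concrete, here is a small sketch comparing the two formats. The election fields (`id`, `date`) and the annotation text are hypothetical, since the object schema is not defined in this issue. The JSON check uses only the standard library; the YAML parse is optional and assumes the third-party PyYAML package is installed.

```python
import json

# Hypothetical election object; field names are illustrative only.
ANNOTATED_YAML = """\
id: 2019-01-01_primary
# Example annotation: explains why this value was corrected by hand.
date: 2019-01-01
"""

# The same idea in JSON: the comment line makes the document invalid.
COMMENTED_JSON = """\
{
  // Example annotation: explains why this value was corrected by hand.
  "id": "2019-01-01_primary"
}
"""


def json_accepts(text: str) -> bool:
    """Return True if the standard json module can parse the text."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


print("JSON with a comment parses:", json_accepts(COMMENTED_JSON))

try:
    import yaml  # PyYAML, third-party; only needed for this optional step
except ImportError:
    yaml = None

if yaml is not None:
    # safe_load discards comments, so annotations stay in the repo
    # and never reach the data served by the API.
    print(yaml.safe_load(ANNOTATED_YAML))
```

Since the loader drops comments while parsing, the annotations would live only in the git repo, which fits the goal of keeping notes beyond what the API shows.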

diafygi self-assigned this May 19, 2019