[0.2.dev0] Template for loading data #93

Merged
smmaurer merged 20 commits into master from data-loading
Feb 20, 2019

@smmaurer commented Feb 14, 2019

This PR adds the initial chunk of data I/O functionality, as described in issue #66.

New template: urbansim_templates.io.TableFromDisk()

Template for registering data tables from CSV or HDF5 (h5) files, to replace datasources.py. See the docstrings and the Sphinx file for full documentation, which will appear online after this PR is merged.

Basic usage: Create an instance of the template class and set some properties (table name, file type, path, etc.). "Registering" the instance with ModelManager saves it to disk and creates an Orca step containing instructions to set up the table. "Running" the object or step registers the Orca table. The data itself is read from disk lazily, only when it's needed.

New standard template property: autorun

When you register a template instance with ModelManager, it now checks for a property called autorun. If it's present and True, ModelManager will immediately "run" the step.

This is helpful for preparatory logic such as this new template: if you have table-loading steps defined in your configs directory, the tables will automatically be registered with Orca when you initialize a ModelManager session.
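As a rough illustration of the autorun check, here is a minimal sketch assuming the behavior described above: a step runs at registration time only if it has an `autorun` property that is True. `ToyManager`, `ToyLoadStep`, and `ToyModelStep` are hypothetical names for this example and do not reflect ModelManager's actual implementation.

```python
# Hypothetical sketch; not the real ModelManager code.
run_log = []  # records which steps were run, in order


class ToyManager:
    def __init__(self):
        self.steps = {}

    def register(self, step):
        self.steps[step.name] = step
        # A missing autorun attribute is treated the same as False.
        if getattr(step, "autorun", False) is True:
            step.run()


class ToyLoadStep:
    """Preparatory step (e.g. table loading) that opts in to autorun."""
    autorun = True

    def __init__(self, name):
        self.name = name

    def run(self):
        run_log.append(self.name)


class ToyModelStep:
    """Ordinary step with no autorun property; runs only when asked."""

    def __init__(self, name):
        self.name = name

    def run(self):
        run_log.append(self.name)


manager = ToyManager()
manager.register(ToyLoadStep("load_buildings"))
manager.register(ToyModelStep("price_model"))
print(run_log)  # ['load_buildings']
```

Using `getattr` with a default means existing templates that never define `autorun` keep their current behavior.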

Data validation

The new template includes a validate() method that checks some basic expectations about the data that's being loaded:

  • the table has a unique index (or multi-index)
  • if the table has columns whose names match the indexes of previously registered tables, those columns make sense as join keys
  • likewise, columns of previously registered tables whose names match the index of the new table make sense as join keys
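The checks listed above can be sketched without any dependencies. This is an illustration under stated assumptions, not the template's validate() method: the helper names are hypothetical, tables are represented as plain lists rather than DataFrames, and "makes sense as a join key" is assumed here to mean that every non-missing value in the column appears in the other table's index.

```python
def has_unique_index(index_values):
    """True if no index value repeats. Multi-index entries can be tuples."""
    return len(set(index_values)) == len(index_values)


def check_join_key(column_values, other_index):
    """Report how many non-missing column values appear in another index.

    Assumption for this sketch: a column is a sensible join key when all
    of its non-missing values are found in the other table's index.
    """
    index_set = set(other_index)
    present = [v for v in column_values if v is not None]
    matched = sum(1 for v in present if v in index_set)
    return {
        "n_values": len(present),
        "n_matched": matched,
        "all_match": matched == len(present),
    }


# A 'buildings' table indexed by building_id, and a 'households' table
# with a building_id column that should point back to it.
buildings_index = [1, 2, 3]
households_building_id = [1, 1, 3, 4, None]  # 4 is not a known building

unique_ok = has_unique_index(buildings_index)
multi_ok = has_unique_index([(1, "a"), (1, "b"), (1, "a")])
report = check_join_key(households_building_id, buildings_index)
```

Here `unique_ok` is True, `multi_ok` is False because `(1, "a")` repeats, and `report` shows 3 of 4 non-missing values matched. The same `check_join_key` helper covers both directions of the check; only the roles of the new and the previously registered table are swapped.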

This gets us much of the way to either (a) automatically generating Orca "broadcasts" between tables or (b) performing table merges without needing broadcasts. See issue #78 for more on this.

Versioning

0.2.dev0

To do before merging

  • implement and test h5 features
  • finish documentation
  • update changelog
  • finalize versioning

coveralls commented Feb 15, 2019

Coverage decreased (-0.06%) to 88.843% when pulling 9409d2f on data-loading into 1a5f2bf on master.

@smmaurer merged commit 1879a72 into master Feb 20, 2019
@smmaurer deleted the data-loading branch February 20, 2019 03:16
This was referenced Feb 20, 2019
@smmaurer changed the title from "Template for data loading" to "[0.2.dev0] Template for loading data" Feb 21, 2019