Utilities for merging tables #100

Closed
smmaurer opened this issue Mar 5, 2019 · 0 comments

smmaurer commented Mar 5, 2019

🎉 ISSUE 100 🎉

This issue sketches out functionality for validating implicit merge relationships and for performing merges. See issues #78 and #94.

utils.validate_table()

Check some basic expectations about the table:

  • the index or multi-index is unique
  • if a column name matches the name of another table's index, do its values make sense as join keys?
  • validate against an orca_test spec, if one is available

This functionality is already implemented in the validate() method of the LoadTable() template, so we can just pull it out.
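A rough sketch of what the first two checks could look like with pandas. The signature (a DataFrame plus a dict of other tables) is hypothetical, the orca_test step is left out, and "make sense as join keys" is interpreted here as "every non-null value appears in the other table's index", which is an assumption rather than the settled rule:

```python
import pandas as pd


def validate_table(df, other_tables=None):
    """Hypothetical sketch of the checks listed above (orca_test step omitted)."""
    # 1. The index (single or multi-level) must uniquely identify each row.
    if not df.index.is_unique:
        raise ValueError("index contains duplicate values")

    # 2. Assumption: a column named after another table's index level is an
    #    implicit join key, so its non-null values should all exist there.
    for name, other in (other_tables or {}).items():
        for key in other.index.names:
            if key is None or key not in df.columns:
                continue
            known = set(other.index.get_level_values(key))
            missing = set(df[key].dropna()) - known
            if missing:
                raise ValueError(
                    f"column '{key}' has values not found in the index of "
                    f"'{name}': {sorted(missing)[:5]}"
                )
    return True
```

For example, a households table with a building_id column would be checked against a buildings table whose index is named building_id.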

utils.validate_all_tables()

Run validate_table() for all registered tables.

I think validation will not run automatically when tables are registered, for performance reasons, but we can recommend that users run it at whatever point makes sense for their workflow.
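A minimal sketch of the run-everything helper, assuming the tables are available as a plain dict of DataFrames and that the per-table check is passed in as a callable. Both are simplifications for illustration; the real version would presumably look tables up in Orca. Collecting failures rather than stopping at the first one is a design choice made here, not something decided above:

```python
import pandas as pd


def validate_all_tables(tables, validate):
    """Hypothetical sketch: run `validate(name, df)` on every table.

    Returns a dict mapping table name to error message for each failure,
    so a single explicit call reports all problems at once.
    """
    errors = {}
    for name, df in tables.items():
        try:
            validate(name, df)
        except ValueError as exc:
            errors[name] = str(exc)
    return errors


def check_unique_index(name, df):
    # Stand-in for the per-table validation; only checks index uniqueness.
    if not df.index.is_unique:
        raise ValueError(f"table '{name}' has duplicate index values")
```

A user would call this once, explicitly, for instance after all tables are loaded and before running model steps, which keeps the cost out of table registration.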

utils.merge_tables()

Replaces orca.merge_tables(), using implicit join keys instead of "broadcasts". Should support multi-indexes.

Stricter and more deterministic than Orca:

  • tables merged in listed order
  • requested column names must be unique
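To make the idea concrete, here is a sketch of merging on implicit join keys with pandas. The signature is hypothetical, and two readings are assumed: the join key for each table is the set of its index names, which must already exist as columns or index levels on the left side, and "unique" means each requested column is found in exactly one table:

```python
import pandas as pd


def merge_tables(tables, columns=None):
    """Hypothetical sketch: left-join DataFrames in listed order.

    Each table after the first is joined on its own index names, so no
    "broadcast" needs to be declared. Multi-indexes use all their levels.
    """
    if columns is not None:
        # Assumption: a requested column must appear in exactly one table.
        for col in columns:
            owners = [i for i, t in enumerate(tables) if col in t.columns]
            if len(owners) != 1:
                raise ValueError(
                    f"column '{col}' found in {len(owners)} tables, expected 1"
                )

    merged = tables[0]
    for right in tables[1:]:
        keys = list(right.index.names)
        available = set(merged.columns) | set(merged.index.names)
        if None in keys or not set(keys) <= available:
            raise ValueError(f"no implicit join key for index {keys}")
        # DataFrame.join raises if column names overlap and no suffix is
        # given, which keeps the merge strict about duplicate names.
        merged = merged.join(right, on=keys, how="left")

    return merged[list(columns)] if columns is not None else merged
```

Because tables are joined strictly left to right, listing a table before the one that supplies its key raises an error instead of silently reordering the merge.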