Graph Spec improvements #227

Open
EvanDietzMorris opened this issue May 10, 2024 · 2 comments

Comments

@EvanDietzMorris
Contributor

There are a few obvious things we need to change about the way graph specs are processed.

  1. We should be able to specify multiple graph specs at once and queue up graph builds from all of them.

  2. We also need subgraph dependencies to be able to cross from one spec to another. For example, the Baseline graph is shared by robokopkg and yobokop, but currently there's no way to define it in one place and have both reference that single definition.

  3. Right now you can build just one graph from a graph spec, but it still checks for the latest version of every source in the spec (for sources that don't have a pinned version). This is bad because when you only want to build one graph, checking every source is a waste of time, and a failed version check could disrupt building a graph that doesn't even use that source.
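To make the three points above concrete, here is a rough sketch of one way they could fit together. It is not the current implementation: the dict-based spec shape, the function names (`merge_specs`, `sources_for`, `resolve_versions`), and the source names `SourceA` through `SourceD` are all made up for illustration. The idea is to merge several specs into one registry so subgraph references can cross specs, and to check latest versions only for the unpinned sources that the requested graph actually uses.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical parsed-spec shape (NOT the project's real schema):
# {graph_id: {"sources": {source_id: pinned_version_or_None},
#             "subgraphs": [graph_id, ...]}}
Spec = Dict[str, dict]


def merge_specs(specs: List[Spec]) -> Spec:
    """Combine graph definitions from several specs into one registry."""
    registry: Spec = {}
    for spec in specs:
        for graph_id, graph_def in spec.items():
            if graph_id in registry:
                raise ValueError(f"graph id defined twice: {graph_id}")
            registry[graph_id] = graph_def
    return registry


def sources_for(graph_id: str, registry: Spec,
                _path: Tuple[str, ...] = ()) -> Dict[str, Optional[str]]:
    """Collect the sources one graph needs, following subgraph references."""
    if graph_id in _path:
        raise ValueError(f"circular subgraph reference: {graph_id}")
    graph_def = registry[graph_id]
    sources: Dict[str, Optional[str]] = {}
    for sub_id in graph_def.get("subgraphs", []):
        sources.update(sources_for(sub_id, registry, _path + (graph_id,)))
    # The graph's own sources take precedence over inherited ones.
    sources.update(graph_def.get("sources", {}))
    return sources


def resolve_versions(sources: Dict[str, Optional[str]],
                     get_latest_version: Callable[[str], str]) -> Dict[str, str]:
    """Look up the latest version only for sources without a pinned version."""
    return {
        source_id: pinned if pinned is not None else get_latest_version(source_id)
        for source_id, pinned in sources.items()
    }


# Baseline lives in one spec; graphs in another spec reference it by id.
shared_spec = {"Baseline": {"sources": {"SourceA": None, "SourceB": "v1.0"}}}
product_spec = {
    "robokopkg": {"subgraphs": ["Baseline"], "sources": {"SourceC": None}},
    "yobokop": {"subgraphs": ["Baseline"], "sources": {"SourceD": None}},
}

checked: List[str] = []


def fake_latest(source_id: str) -> str:
    """Stand-in for a real version check; records which sources were checked."""
    checked.append(source_id)
    return "latest"


registry = merge_specs([shared_spec, product_spec])
build_queue = ["robokopkg"]  # only the graphs we actually want to build
plans = {
    graph_id: resolve_versions(sources_for(graph_id, registry), fake_latest)
    for graph_id in build_queue
}
```

With only robokopkg queued, `SourceD` never gets a version check, so a failure there could not block the build.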

@EvanDietzMorris
Contributor Author

It might also be nice, though it's a much lower priority, to be able to reference another graph but build it without a particular source. For example, the rule mining kp is the baseline minus tmkp, but currently there's no way to do that without making another copy of the spec that's mostly redundant.
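One possible shape for this, purely as a sketch: let a subgraph reference carry an optional list of sources to leave out. The `exclude` key, the `expand_sources` helper, the graph id `rule_mining`, and the source names other than tmkp are hypothetical and not part of the existing spec format.

```python
from typing import Dict, Set

# Hypothetical spec shape: a subgraph reference is a dict with a graph id
# and an optional "exclude" list of source ids to drop from that subgraph.
registry: Dict[str, dict] = {
    "Baseline": {"sources": ["SourceA", "SourceB", "tmkp"]},
    "rule_mining": {
        "subgraphs": [{"graph_id": "Baseline", "exclude": ["tmkp"]}],
    },
}


def expand_sources(graph_id: str, registry: Dict[str, dict]) -> Set[str]:
    """Return the source ids for a graph, applying per-reference exclusions."""
    graph_def = registry[graph_id]
    sources = set(graph_def.get("sources", []))
    for ref in graph_def.get("subgraphs", []):
        inherited = expand_sources(ref["graph_id"], registry)
        sources |= inherited - set(ref.get("exclude", []))
    return sources


rule_mining_sources = expand_sources("rule_mining", registry)
baseline_sources = expand_sources("Baseline", registry)
```

The exclusion applies only to the reference that declares it, so Baseline itself and any other graph that references it still include tmkp.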

@EvanDietzMorris
Contributor Author

Another simple change that would be nice: currently, if the load_manager pipeline fails for a single data source, it crashes the build of the entire graph that source was part of. Instead, it should continue processing the rest of the data sources (without attempting to build the graph), so that when you come back and fix the failure, the rest of the work is already done.
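A minimal sketch of that behavior, assuming the per-source pipeline and the graph build can be treated as plain callables. `process_sources`, `build_graph_if_ready`, `run_pipeline`, and `build_graph` are illustrative names, not load_manager's actual API.

```python
from typing import Callable, Dict, List, Tuple


def process_sources(source_ids: List[str],
                    run_pipeline: Callable[[str], None]) -> Dict[str, Exception]:
    """Run the pipeline for every source, recording failures instead of stopping."""
    failures: Dict[str, Exception] = {}
    for source_id in source_ids:
        try:
            run_pipeline(source_id)
        except Exception as exc:  # keep going so the other sources still get done
            failures[source_id] = exc
    return failures


def build_graph_if_ready(graph_id: str,
                         source_ids: List[str],
                         run_pipeline: Callable[[str], None],
                         build_graph: Callable[[str], None]
                         ) -> Tuple[bool, Dict[str, Exception]]:
    """Process all sources, then build the graph only if none of them failed."""
    failures = process_sources(source_ids, run_pipeline)
    if failures:
        return False, failures
    build_graph(graph_id)
    return True, failures


# Stand-ins for the real pipeline and build step, for demonstration only.
processed: List[str] = []
built: List[str] = []


def fake_pipeline(source_id: str) -> None:
    if source_id == "SourceB":
        raise RuntimeError("download failed")
    processed.append(source_id)


was_built, failures = build_graph_if_ready(
    "example_graph", ["SourceA", "SourceB", "SourceC"], fake_pipeline, built.append
)
```

Here `SourceA` and `SourceC` are still processed even though `SourceB` fails, and the graph build is skipped until the failure is fixed.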
