Better association of experiments with data #18

Open
aidanheerdegen opened this issue Jun 8, 2021 · 2 comments
Labels
enhancement New feature or request

Comments

@aidanheerdegen
Contributor

Currently there is no good way to find the experimental configuration and run directory for a dataset that is present in the cookbook.

This is important for a number of reasons: investigating model configuration, documentation, and also being able to spin a new experiment off from the old one.

My favoured solution: add a url field to the metadata to specify a git(hub) repository for the experiment control repo. Then strongly encourage (force) everyone who has data in the main DB to push their config to GitHub and add the URL to their metadata.

@aidanheerdegen aidanheerdegen added the enhancement New feature or request label Jun 8, 2021
@aekiss
Contributor

aekiss commented Jun 8, 2021

Sounds like a good idea. One complication is that there are multiple commits in each run... but having any of them in the metadata would be a lot better than none.

FYI there's a bit in sync_data.sh that clones the run's git history to the sync location, in an attempt to address this issue:
https://github.com/COSIMA/1deg_jra55_iaf/blob/master/sync_data.sh#L148-L152
Not everyone uses this script, though.

@aidanheerdegen
Contributor Author

I like the idea of using the presence of a metadata.yaml file to signify the root directory of an experiment. This makes it simpler to build scripts that walk directory trees looking for experiments, and it means that if you want your data indexed, you need to have a metadata.yaml file.
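
A rough sketch of what such a directory walk could look like, using only the Python standard library. The function name `find_experiment_roots` is hypothetical, and the rule that the walk stops descending once it finds a metadata.yaml (that is, experiments are never nested inside each other) is an assumption, not something settled in this thread.

```python
import os
from pathlib import Path


def find_experiment_roots(top, marker="metadata.yaml"):
    """Yield every directory under `top` that contains the marker file."""
    for dirpath, dirnames, filenames in os.walk(top):
        if marker in filenames:
            yield Path(dirpath)
            # Assumption: an experiment root does not contain other
            # experiments, so there is no need to look any deeper here.
            # Clearing dirnames in place tells os.walk to skip the subtree.
            dirnames.clear()
```

An indexing script could then loop over `find_experiment_roots("/path/to/data")` and ignore anything that is not under one of the yielded directories.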

Once the metadata.yaml file is compulsory, you can start checking for fields like url. You could even check that the link is valid.
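
As an illustration only, a check along these lines might look like the sketch below. It assumes the url field sits on its own top-level line of metadata.yaml and reads it with a naive regular expression rather than a real YAML parser; the function names are hypothetical. "Valid" here only means the value is a well-formed http(s) URL. Confirming that the link actually resolves would need a network request, which is left out.

```python
import re
from pathlib import Path
from urllib.parse import urlparse

# Naive stand-in for a YAML parser: matches a top-level "url: value" line.
URL_LINE = re.compile(r"^url:[ \t]*(\S+)[ \t]*$", re.MULTILINE)


def read_url_field(metadata_path):
    """Return the value of the top-level url field, or None if absent."""
    text = Path(metadata_path).read_text()
    match = URL_LINE.search(text)
    return match.group(1).strip("'\"") if match else None


def is_plausible_repo_url(url):
    """Syntactic check only: http(s) scheme and a non-empty host."""
    if not url:
        return False
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def check_metadata(metadata_path):
    """Classify a metadata file as 'ok', 'missing url' or 'invalid url'."""
    url = read_url_field(metadata_path)
    if url is None:
        return "missing url"
    if not is_plausible_repo_url(url):
        return "invalid url"
    return "ok"
```

An indexer could call `check_metadata` on each experiment's metadata.yaml and warn about, or refuse to index, anything that does not come back as "ok".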
