Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set a new descriptive schema on mutations if none exists #383

Merged
merged 4 commits into from
Jun 4, 2024

Conversation

hyanwong
Copy link
Member

@hyanwong hyanwong commented Jun 4, 2024

And emit warning if mutation metadata cannot be set. This also means that existing mutation metadata is kept. However, since we now validate the metadata, it increases the time taken. E.g. dating the unified genealogy chr 2q (12462233 mutations) the metadata setting goes from 84 seconds to 403 seconds on my laptop (for comparison, the 10 EP steps take ~ 7 mins).

There might be a way to do this faster without validation if no node metadata already exists, I suppose.

@hyanwong
Copy link
Member Author

hyanwong commented Jun 4, 2024

New code avoids decoding the mutation metadata if there is none there. This reduces the time to 310 seconds for the metadata setting on chr2q. I think this is probably a fair compromise: it's always going to take time to validate 12 million JSON entries rather than just dumping raw bytes.

@hyanwong
Copy link
Member Author

hyanwong commented Jun 4, 2024

Aha - it's the validation that takes time: the encoding is quick. I've made a change to only bother validating if there is existing data, and that brings it down to 95 secs

@hyanwong
Copy link
Member Author

hyanwong commented Jun 4, 2024

I'm just going to merge this now. Is that OK @nspope ? Do you want to look over it beforehand?

And emit warning if mutation metadata cannot be set
Also allow overwriting, and save node time schema using this
@hyanwong hyanwong merged commit 446f4ff into tskit-dev:main Jun 4, 2024
3 checks passed
@hyanwong hyanwong mentioned this pull request Jun 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant