-
Notifications
You must be signed in to change notification settings - Fork 128
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: support treetime ambiguous date format #1305
base: master
Are you sure you want to change the base?
Conversation
Add support for the treetime ambiguous date format `[float_min:float_max]`, e.g. `[2004.43:2005.58]` This is useful for when ambiguity is not limited to month/year boundaries. Resolves #1304
Codecov ReportPatch coverage is
📢 Thoughts on this report? Let us know!. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for implementing this! It's a simple solution to something we've discussed (I forget where exactly) and seems to work well for you.
@@ -121,6 +121,12 @@ def get_numerical_date_from_value(value, fmt=None, min_max_year=None): | |||
except InvalidDate as error: | |||
raise AugurError(str(error)) from error | |||
return [treetime.utils.numeric_date(d) for d in ambig_date] | |||
if value.startswith('[') and value.endswith(']') and len(value[1:-1].split(':'))==2: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For added context, this is the corresponding logic in TreeTime: treetime/utils.py#L305-L312
@@ -121,6 +121,12 @@ def get_numerical_date_from_value(value, fmt=None, min_max_year=None): | |||
except InvalidDate as error: | |||
raise AugurError(str(error)) from error | |||
return [treetime.utils.numeric_date(d) for d in ambig_date] | |||
if value.startswith('[') and value.endswith(']') and len(value[1:-1].split(':'))==2: | |||
# Treetime ambiguous date format, e.g. [2019.5:2020.5] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious how you get the numerical dates in metadata. Is this a standard format outside of TreeTime, or do you run ISO dates through something like treetime.utils.numeric_date
to get the float representation?
Asking because I think a good next step after this gets merged is to generalize for any [A:B]
date pair where A
and B
can be anything that's currently supported as an exact scalar date (e.g. [2023-07-31:2023-08-06]
). I did some prototyping of this a while back but never got back to creating a proper PR for it.
try: | ||
return [float(x) for x in value[1:-1].split(':')] | ||
except ValueError as error: | ||
raise AugurError(str(error)) from error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: customize the error message.
When given something like [2023-07-31:2023-08-06]
, the ValueError
message looks something like:
could not convert string to float: '2023-07-31'
It's obvious what this means to us as developers, but maybe not so obvious to users. Something like this might be more clear:
The date range format must be `[A:B]` where both `A` and `B` are numerical dates.
Shifting the design decision a bit in the motivating issue. I'll plan to update this PR with that and address the conversations above. |
Add support for the treetime ambiguous date format
[float_min:float_max]
, e.g.[2004.43:2005.58]
This is useful for when ambiguity is not limited to month/year boundaries, e.g. in case of Danish SARS-CoV-2 sequences which have their dates binned rounded down to the most recent Monday.
Resolves #1304
With this PR, we can use cross-month ambiguous dates that aren't whole year, and within-month ambiguous dates as well.
E.g. this wouldn't have been possible previously without unconstraining everything but the year:
Testing
What steps should be taken to test the changes you've proposed?
If you added or changed behavior in the codebase, did you update the tests, or do you need help with this?
ncov-simplest
, turning Danish dates into ranges here and running refine with the new parsing feature hereChecklist