-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Automate hospital admission patch #2043
base: main
Are you sure you want to change the base?
Conversation
custom_run_flag = ( | ||
False if not params["indicator"].get("custom_run", False) else params["indicator"].get("custom_run", False) | ||
) | ||
if not logger: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be "if logger:" ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope, basically how the patching code has worked is that it's a wrapper to called the run script within in a for loop with some customization and in order to make sure that logging is different from patching and a regular run is to pass on a logger as a parameter that's created from patch.
if it's a regular run, it's not going to have that
So the logic goes, if the logger exists already, then it's logger from patch, if not we need to create the logger
|
||
|
||
def merge_existing_backfill_files(backfill_dir, backfill_file, issue_date, logger): | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add more explanations to this function? Otherwise people can easily get confused by this one and the function below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hopefully the explanation I had made sense.
Documenting comments here not mentioned in the PR.
There would be a cascading effect of having to fixing the subsequent merged parquet files. Not sure if that level of outage should be handled in an automated fashion due to touching many more backfill files.
Totally valid edge case that I forgot to consider and need to check for the edge case. |
|
||
It will generate data for that range of issue dates, and store them in batch issue format: | ||
[name-of-patch]/issue_[issue-date]/doctor-visits/actual_data_file.csv | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apart from docstring here, I suggest adding some patching instructions like this in the indicator readme too.
if issue_date is None: | ||
assert current_date.date() == latest_timestamp.date(), "no drop for today" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if issue_date is None: | |
assert current_date.date() == latest_timestamp.date(), "no drop for today" | |
assert current_date.date() == latest_timestamp.date(), "no drop for today" |
This is fine without the issue date check, since there might be times where a date (or more) in a patch date range really has no source drop on that date.
Since latest_timestamp
is only grabbing latest timestamp in the input_dir
, not on the ftp server, and the patch code downloads files into input_dir
one issue date at a time, the old assert would still do what it's supposed to do fine in patching context.
Description
automate patching for hospital-admission
Changelog
Itemize code/test/documentation changes and files added/removed.
Associated Issue(s)