Automate hospital admission patch #2043

Open
wants to merge 15 commits into
base: main

Conversation

aysim319
Contributor

Description

automate patching for hospital-admission

Changelog

Itemize code/test/documentation changes and files added/removed.

  • change1
  • change2

Associated Issue(s)

claims_hosp/delphi_claims_hosp/patch.py (outdated)
custom_run_flag = (
False if not params["indicator"].get("custom_run", False) else params["indicator"].get("custom_run", False)
)
if not logger:
Contributor

@jingjtang jingjtang Sep 2, 2024

Should this be "if logger:"?

Contributor Author

Nope. The patching code is a wrapper that calls the run script inside a for loop with some customization. To keep patch logging separate from regular-run logging, patch creates its own logger and passes it to the run script as a parameter.

A regular run doesn't receive a logger.

So the logic is: if a logger already exists, it's the logger from patch; if not, we need to create one.
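
The pattern described here can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the names `run_module` and `patch`, the logger names, and the use of the standard `logging` module are all assumptions made for the example.

```python
import logging


def run_module(params, logger=None):
    """Hypothetical run entry point: creates a logger only when none is passed in."""
    if not logger:
        # Regular run: nobody handed us a logger, so create one here.
        logger = logging.getLogger("regular_run")
    logger.info("running for issue date %s", params.get("issue_date"))
    return logger


def patch(issue_dates):
    """Hypothetical patch wrapper: one logger, reused for every run in the loop."""
    patch_logger = logging.getLogger("patch")
    used = []
    for issue_date in issue_dates:
        # Passing the logger in means run_module skips its own logger creation.
        used.append(run_module({"issue_date": issue_date}, logger=patch_logger))
    return used
```

With this shape, `if not logger:` is the correct check: the branch runs only when the caller did not supply a logger.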



def merge_existing_backfill_files(backfill_dir, backfill_file, issue_date, logger):
"""
Contributor

@jingjtang jingjtang Sep 2, 2024

Could you add more explanation to this function? Otherwise people can easily confuse it with the function below.

Contributor Author

Hopefully the explanation I added makes sense.

Contributor Author

aysim319 commented Sep 9, 2024

Documenting comments here that were not mentioned in the PR.

@minhkhul mentioned in a DM: do we need to consider patches for outages longer than n_checked days?

There would be a cascading effect of having to fix the subsequent merged parquet files. I'm not sure an outage of that size should be handled in an automated fashion, since it touches many more backfill files.

The second thing @minhkhul pointed out: we need to consider outage days where the date falls between the start and the end date.

This is a totally valid edge case that I forgot to consider and still need to check for.


It will generate data for that range of issue dates, and store them in batch issue format:
[name-of-patch]/issue_[issue-date]/doctor-visits/actual_data_file.csv
"""
Contributor

Apart from docstring here, I suggest adding some patching instructions like this in the indicator readme too.
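
As a starting point for such instructions, the batch issue layout from the docstring could be illustrated with a small sketch. The helper name `issue_dirs`, its parameters, and the `YYYYMMDD` rendering of the issue date are assumptions for illustration; the `doctor-visits` path segment is copied verbatim from the docstring above.

```python
from datetime import date, timedelta
from pathlib import Path


def issue_dirs(patch_dir, start_issue, end_issue, source="doctor-visits"):
    """Hypothetical helper: one output directory per issue date, both ends inclusive."""
    n_days = (end_issue - start_issue).days
    return [
        # Assumed date format: issue_YYYYMMDD
        Path(patch_dir) / f"issue_{(start_issue + timedelta(days=i)):%Y%m%d}" / source
        for i in range(n_days + 1)
    ]


for directory in issue_dirs("name-of-patch", date(2024, 8, 30), date(2024, 9, 1)):
    print(directory.as_posix())
```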

Comment on lines +26 to +27
if issue_date is None:
assert current_date.date() == latest_timestamp.date(), "no drop for today"
Contributor

Suggested change
- if issue_date is None:
-     assert current_date.date() == latest_timestamp.date(), "no drop for today"
+ assert current_date.date() == latest_timestamp.date(), "no drop for today"

This is fine without the issue date check, since one or more dates in a patch date range may genuinely have no source drop.
latest_timestamp only grabs the latest timestamp in input_dir, not on the FTP server, and the patch code downloads files into input_dir one issue date at a time, so the old assert still does what it's supposed to do in the patching context.
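
A minimal sketch of why the unconditional assert works when files arrive locally one issue date at a time. Everything here is an assumption for illustration: the helper names, and especially the `YYYYMMDD_drop.csv` file naming scheme, which is invented so the example is self-contained and is not the real drop format.

```python
from datetime import datetime
from pathlib import Path


def latest_local_timestamp(input_dir):
    """Hypothetical helper: newest date encoded in local file names (not on the server)."""
    stamps = [
        datetime.strptime(path.name.split("_")[0], "%Y%m%d")
        for path in Path(input_dir).glob("*_drop.csv")
    ]
    if not stamps:
        raise FileNotFoundError(f"no drop files in {input_dir}")
    return max(stamps)


def check_drop(input_dir, current_date):
    """Raise AssertionError unless the newest local drop matches current_date."""
    latest_timestamp = latest_local_timestamp(input_dir)
    assert current_date.date() == latest_timestamp.date(), "no drop for today"
```

Because only the local directory is inspected, a patch run that has downloaded files up to a given issue date sees that date as the latest, and a date with no drop fails the assert as intended.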

3 participants