Release Delphi Epidata 4.1.25 #1508

Merged
merged 25 commits into from
Jul 29, 2024
639e159
Merge pull request #1490 from cmu-delphi/bot/sync-main-dev
melange396 Jul 9, 2024
5f23936
docs: very minor dev doc update (#1491)
dshemetov Jul 10, 2024
e26f2c9
covid act now links and context around deactivation
nmdefries Jul 10, 2024
de4883f
Merge pull request #1492 from cmu-delphi/ndefries/CAN-update-source-l…
nmdefries Jul 15, 2024
df213ee
put hhs docs under inactive signals
minhkhul Jul 15, 2024
7799011
add back .md suffix to ind-combo
nmdefries Jul 15, 2024
9340ba5
Merge pull request #1493 from cmu-delphi/inactive-hhs-docs
nmdefries Jul 15, 2024
829e9f1
Merge pull request #1494 from cmu-delphi/ndefries/indcombo-suffix
nmdefries Jul 15, 2024
44ce849
refactor: use delphi_utils.logger instead of copied file
dshemetov Jul 17, 2024
81179c5
lint: trailing whitespace changes
dshemetov Jul 17, 2024
8805f3c
repo: ignore lint in blame
dshemetov Jul 17, 2024
b4b9232
Merge pull request #1488 from cmu-delphi/ds/logger2
melange396 Jul 17, 2024
546f2f6
fix: wrong blame commit
dshemetov Jul 17, 2024
46a46a2
Merge pull request #1495 from cmu-delphi/dshemetov-patch-1
melange396 Jul 18, 2024
35d67a7
One-time version check (#1456)
rzats Jul 18, 2024
243a22f
New CTIS publication
capnrefsmmat Jul 21, 2024
5711322
Merge pull request #1499 from cmu-delphi/ctis/pub
melange396 Jul 23, 2024
b941e98
troubleshooting gh action for signals sync (gdoc-->csv)
melange396 Jul 25, 2024
a4a5fcf
avoid linking keywords in PR template
melange396 Jul 25, 2024
af96e25
Merge pull request #1503 from cmu-delphi/signals_sync_action_fix
melange396 Jul 25, 2024
72b4050
py client version check fixes and cleanup (#1497)
melange396 Jul 25, 2024
3e39e22
Merge pull request #1504 from cmu-delphi/pr_template_keyword_removal
melange396 Jul 26, 2024
d9d024e
Update Google Docs Meta Data (#1501)
github-actions[bot] Jul 29, 2024
5ec88ef
chore: release delphi-epidata 4.1.25
minhkhul Jul 29, 2024
704e898
Update (client) CHANGELOG.md -- release date & missing PR
melange396 Jul 29, 2024
2 changes: 1 addition & 1 deletion .bumpversion.cfg
@@ -1,5 +1,5 @@
[bumpversion]
current_version = 4.1.24
current_version = 4.1.25
commit = False
tag = False

2 changes: 2 additions & 0 deletions .git-blame-ignore-revs
@@ -20,3 +20,5 @@ b9ceb400d9248c8271e8342275664ac5524e335d
07ed83e5768f717ab0f9a62a9209e4e2cffa058d
# style(black): format wiki acquisition
923852eafa86b8f8b182d499489249ba8f815843
# lint: trailing whitespace changes
81179c5f144b8f25421e799e823e18cde43c84f9
2 changes: 1 addition & 1 deletion .github/PULL_REQUEST_TEMPLATE.md
@@ -1,4 +1,4 @@
closes|addresses <!--list issues closed or partially-addressed by this PR -->
addresses issue(s) #ISSUE <!--list issue(s) associated with this PR -->

### Summary:

5 changes: 4 additions & 1 deletion .github/workflows/update_gdocs_data.yml
@@ -21,7 +21,10 @@ jobs:
restore-keys: |
${{ runner.os }}-pipd-
- name: Install Dependencies
run: pip install -r requirements.dev.txt
run: |
pip -V
python -m pip install pip==22.0.2
pip install -r requirements.dev.txt
- name: Update Docs
run: inv update-gdoc
- name: Create pull request into dev
2 changes: 1 addition & 1 deletion dev/local/setup.cfg
@@ -1,6 +1,6 @@
[metadata]
name = Delphi Development
version = 4.1.24
version = 4.1.25

[options]
packages =
37 changes: 21 additions & 16 deletions docs/api/covidcast-signals/covid-act-now.md
@@ -15,11 +15,13 @@ grand_parent: COVIDcast Main Endpoint
* **Time type:** day (see [date format docs](../covidcast_times.md))
* **License:** [CC BY-NC](../covidcast_licensing.md#creative-commons-attribution-noncommercial)

The COVID Act Now (CAN) data source provides COVID-19 testing statistics, such as positivity rates and total tests performed.
The county-level positivity rates and test totals are pulled directly from CAN.
While CAN provides this data potentially from multiple sources, we only use data sourced from the
The [COVID Act Now (CAN)](https://covidactnow.org/) data source provides COVID-19 testing statistics, such as positivity rates and total tests performed.
The county-level positivity rates and test totals are pulled directly from CAN using [their API](https://covidactnow.org/data-api).
While CAN provides this data potentially from multiple sources, we only use data that CAN sources from the
[CDC's COVID-19 Integrated County View](https://covid.cdc.gov/covid-data-tracker/#county-view).

Delphi's mirror of the CAN data was deactivated in December 2021 (last issue 2021-12-10) in favor of the [DSEW CPR data](./dsew-cpr.md), which reports the same information under the `covid_naat_pct_positive_7dav` signal.


| Signal | Description |
|--------------------------------|----------------------------------------------------------------|
@@ -34,9 +36,9 @@ While CAN provides this data potentially from multiple sources, we only use data

## Estimation

The quantities received from CAN / CDC are the county-level positivity rate and total tests,
which are based on the counts of PCR specimens tested.
In particular, they are also already smoothed with a 7-day-average.
We receive county-level positivity rate and total tests from CAN, originating from the CDC.
These quantities are based on the counts of PCR specimens tested.
They are also already smoothed with a 7-day-average.

For a fixed location $$i$$ and time $$t$$, let $$Y_{it}$$ denote the number of PCR specimens
tested that have a positive result. Let $$N_{it}$$ denote the total number of PCR specimens tested.
@@ -79,38 +81,41 @@ $$

### Smoothing

No additional smoothing is done to avoid double-smoothing, since the data pulled from CAN / CDC
No additional smoothing is done to avoid double-smoothing, since the CAN data
is already smoothed with a 7-day-average.

## Limitations

Estimates for geographical levels beyond counties may be inaccurate due to how aggregations
are done on smoothed values instead of the raw values. Ideally we would aggregate raw values
Estimates for geographical levels beyond counties may be inaccurate because our aggregations
are performed on smoothed values instead of the raw values.
Ideally we would aggregate raw values
then smooth, but the raw values are not accessible in this case.

The positivity rate here should not be interpreted as the population positivity rate as
The reported test positivity rate should not be interpreted as the population positivity rate as
the tests performed are typically not randomly sampled, especially for early data
with lower testing volumes.

A few counties, most notably in California, are also not covered by this data source.

Entries with zero total tests performed are also suppressed, even if it was actually the case that
Entries with zero total tests performed are suppressed, even if it was actually the case that
no tests were performed for the day.

## Lag and Backfill

The lag for these signals varies depending on the reporting patterns of individual counties.
Most counties have their latest data report with a lag of 2 days, while others can take 9 days
or more in the case of California counties.
or more, as is the case with California counties.

These signals are also backfilled as backlogged test results could get assigned to older 7-day timeframes.
Most recent test positivity rates do not change substantially with backfill (having a median delta of close to 0).
However, most recent total tests performed is expected to increase in later data revisions (having a median increase of 7%).
Revisions are sometimes made to the data. For example, backlogged test results can get assigned to past dates.
The majority of recent test positivity rates do not change substantially with backfill (having a median delta of close to 0).
However, the majority of recent total tests performed is expected to increase in later data revisions (having a median increase of 7%).
Values more than 5 days in the past are expected to remain fairly static (with total tests performed
having a median increase of 1% or less), as most major revisions have already occurred.

## Source and Licensing

County-level testing data is scraped by CAN from the
County-level testing data is scraped by [CAN](https://covidactnow.org/) from the
[CDC's COVID-19 Integrated County View](https://covid.cdc.gov/covid-data-tracker/#county-view),
and made available through [CAN's API](https://covidactnow.org/tools).

The data is made available under a [CC BY-NC](../covidcast_licensing.md#creative-commons-attribution-noncommercial) license.
2 changes: 1 addition & 1 deletion docs/api/covidcast-signals/hhs.md
@@ -1,6 +1,6 @@
---
title: Department of Health & Human Services
parent: Data Sources and Signals
parent: Inactive Signals
grand_parent: COVIDcast Main Endpoint
---

6 changes: 3 additions & 3 deletions docs/epidata_development.md
@@ -49,7 +49,7 @@ $ [sudo] make test pdb=1
$ [sudo] make test test=repos/delphi/delphi-epidata/integrations/acquisition
```

You can read the commands executed by the Makefile [here](../dev/local/Makefile).
You can read the commands executed by the Makefile [here](https://github.com/cmu-delphi/delphi-epidata/blob/dev/dev/local/Makefile).

## Rapid Iteration and Bind Mounts

@@ -87,8 +87,8 @@ You can test your changes manually by:

What follows is a worked demonstration based on the `fluview` endpoint. Before
starting, make sure that you have the `delphi_database_epidata`,
`delphi_web_epidata`, and `delphi_redis` containers running; if you don't, see
the Makefile instructions above.
`delphi_web_epidata`, and `delphi_redis` containers running (with `docker ps`);
if you don't, see the Makefile instructions above.

First, let's insert some fake data into the `fluview` table:

6 changes: 5 additions & 1 deletion docs/symptom-survey/publications.md
@@ -26,6 +26,10 @@ Pandemic"](https://www.pnas.org/topic/548) in *PNAS*:

Research publications using the survey data include:

- C.K. Ettman, E. Badillo-Goicoechea, E.A. Stuart (2024). [Financial
strain, schooling modality and mental health of US adults living
with children during the COVID-19 pandemic](https://doi.org/10.1136/jech-2023-221672).
*Journal of Epidemiology & Community Health*.
- K. Sasse, R. Mahabir, O. Gkountouna, A. Crooks, A. Croitoru (2024).
[Understanding the determinants of vaccine hesitancy in the United
States: A comparison of social surveys and social media](https://doi.org/10.1371/journal.pone.0301488).
@@ -41,7 +45,7 @@ Research publications using the survey data include:
- Z. Yang, R. Krishnan, and B. Li (2024). [The interplay between individual
mobility, health risk, and economic choice: A holistic model for COVID-19
policy intervention](https://doi.org/10.1287/ijds.2023.0013). *INFORMS
Journal on Data Science*.
Journal on Data Science* 3 (1), 6-27.
- A. Srivastava, J. M. Ramirez, S. Díaz-Aranda, J. Aguilar, A. F. Anta, A. Ortega,
and R. E. Lillo (2024). [Nowcasting temporal trends using indirect surveys](https://doi.org/10.1609/aaai.v38i20.30242).
In *Proceedings of the 38th AAAI Conference on Artificial Intelligence* 38,
22 changes: 21 additions & 1 deletion integrations/client/test_delphi_epidata.py
@@ -2,8 +2,8 @@

# standard library
import time
import json
from json import JSONDecodeError
from requests.models import Response
from unittest.mock import MagicMock, patch

# first party
@@ -306,6 +306,26 @@ def test_sandbox(self, get, post):
Epidata.debug = False
Epidata.sandbox = False

@patch('requests.get')
def test_version_check(self, get):
"""Test that the _version_check() function correctly logs a version discrepancy."""
class MockJson:
def __init__(self, content, status_code):
self.content = content
self.status_code = status_code
def raise_for_status(self): pass
def json(self): return json.loads(self.content)
get.reset_mock()
get.return_value = MockJson(b'{"info": {"version": "0.0.1"}}', 200)

Epidata._version_check()

captured = self.capsys.readouterr()
output = captured.err.splitlines()
self.assertEqual(len(output), 1)
self.assertIn("Client version not up to date", output[0])
self.assertIn("\'latest_version\': \'0.0.1\'", output[0])
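
The pattern used by `test_version_check` (a fake HTTP response plus captured stderr) can also be sketched with the standard library alone. This is a minimal illustration and not the project's test: `version_check`, `CLIENT_VERSION`, and `fake_get` are hypothetical stand-ins introduced here, the HTTP getter is injected instead of patching `requests.get`, and the log line format only approximates what `Epidata.log` writes.

```python
import io
import sys
from contextlib import redirect_stderr
from unittest.mock import MagicMock

CLIENT_VERSION = "4.1.25"  # version string taken from this release


def version_check(http_get):
    """Hypothetical stand-in for Epidata._version_check().

    `http_get` plays the role of requests.get so that no network is needed.
    """
    try:
        response = http_get("https://pypi.org/pypi/delphi-epidata/json", timeout=5)
        latest_version = response.json()["info"]["version"]
    except Exception as e:
        # Any failure is logged and swallowed; the check must never break the client.
        sys.stderr.write(str({"event": "Error getting latest client version",
                              "exception": str(e)}) + "\n")
        return
    if latest_version != CLIENT_VERSION:
        sys.stderr.write(str({"event": "Client version not up to date",
                              "client_version": CLIENT_VERSION,
                              "latest_version": latest_version}) + "\n")


# Fake response whose .json() reports an older version than the client.
fake_response = MagicMock()
fake_response.json.return_value = {"info": {"version": "0.0.1"}}
fake_get = MagicMock(return_value=fake_response)

# Capture everything written to stderr while the check runs.
buffer = io.StringIO()
with redirect_stderr(buffer):
    version_check(fake_get)
lines = buffer.getvalue().splitlines()

assert len(lines) == 1
assert "Client version not up to date" in lines[0]
```

Injecting the getter keeps the sketch free of third-party imports; the test in this PR reaches the same effect by patching `requests.get` and reading stderr through a `capsys` fixture.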

def test_geo_value(self):
"""test different variants of geo types: single, *, multi."""

2 changes: 1 addition & 1 deletion src/acquisition/covid_hosp/common/database.py
@@ -11,7 +11,7 @@

# first party
import delphi.operations.secrets as secrets
from delphi.epidata.common.logger import get_structured_logger
from delphi_utils import get_structured_logger

Columndef = namedtuple("Columndef", "csv_name sql_name dtype")

3 changes: 1 addition & 2 deletions src/acquisition/covidcast/csv_importer.py
@@ -13,10 +13,9 @@
import pandas as pd

# first party
from delphi_utils import Nans
from delphi_utils import get_structured_logger, Nans
from delphi.utils.epiweek import delta_epiweeks
from delphi.epidata.common.covidcast_row import CovidcastRow
from delphi.epidata.common.logger import get_structured_logger

DataFrameRow = NamedTuple('DFRow', [
('geo_id', str),
2 changes: 1 addition & 1 deletion src/acquisition/covidcast/csv_to_database.py
@@ -11,7 +11,7 @@
from delphi.epidata.acquisition.covidcast.csv_importer import CsvImporter, PathDetails
from delphi.epidata.acquisition.covidcast.database import Database, DBLoadStateException
from delphi.epidata.acquisition.covidcast.file_archiver import FileArchiver
from delphi.epidata.common.logger import get_structured_logger
from delphi_utils import get_structured_logger


def get_argument_parser():
20 changes: 10 additions & 10 deletions src/acquisition/covidcast/database.py
@@ -14,7 +14,7 @@

# first party
import delphi.operations.secrets as secrets
from delphi.epidata.common.logger import get_structured_logger
from delphi_utils import get_structured_logger
from delphi.epidata.common.covidcast_row import CovidcastRow


@@ -117,28 +117,28 @@ def insert_or_update_batch(self, cc_rows: List[CovidcastRow], batch_size=2**20,
get_structured_logger("insert_or_update_batch").fatal(err_msg)
raise DBLoadStateException(err_msg)

# NOTE: `value_update_timestamp` is hardcoded to "NOW" (which is appropriate) and
# NOTE: `value_update_timestamp` is hardcoded to "NOW" (which is appropriate) and
# `is_latest_issue` is hardcoded to 1 (which is temporary and addressed later in this method)
insert_into_loader_sql = f'''
INSERT INTO `{self.load_table}`
(`source`, `signal`, `time_type`, `geo_type`, `time_value`, `geo_value`,
`value_updated_timestamp`, `value`, `stderr`, `sample_size`, `issue`, `lag`,
`value_updated_timestamp`, `value`, `stderr`, `sample_size`, `issue`, `lag`,
`is_latest_issue`, `missing_value`, `missing_stderr`, `missing_sample_size`)
VALUES
(%s, %s, %s, %s, %s, %s,
UNIX_TIMESTAMP(NOW()), %s, %s, %s, %s, %s,
(%s, %s, %s, %s, %s, %s,
UNIX_TIMESTAMP(NOW()), %s, %s, %s, %s, %s,
1, %s, %s, %s)
'''

# all load table entries are already marked "is_latest_issue".
# if an entry in the load table is NOT in the latest table, it is clearly now the latest value for that key (so we do nothing (thanks to INNER join)).
# if an entry *IS* in both load and latest tables, but latest table issue is newer, unmark is_latest_issue in load.
fix_is_latest_issue_sql = f'''
UPDATE
`{self.load_table}` JOIN `{self.latest_view}`
USING (`source`, `signal`, `geo_type`, `geo_value`, `time_type`, `time_value`)
SET `{self.load_table}`.`is_latest_issue`=0
WHERE `{self.load_table}`.`issue` < `{self.latest_view}`.`issue`
UPDATE
`{self.load_table}` JOIN `{self.latest_view}`
USING (`source`, `signal`, `geo_type`, `geo_value`, `time_type`, `time_value`)
SET `{self.load_table}`.`is_latest_issue`=0
WHERE `{self.load_table}`.`issue` < `{self.latest_view}`.`issue`
'''

# TODO: consider handling cc_rows as a generator instead of a list
2 changes: 1 addition & 1 deletion src/acquisition/covidcast/file_archiver.py
@@ -6,7 +6,7 @@
import shutil

# first party
from delphi.epidata.common.logger import get_structured_logger
from delphi_utils import get_structured_logger

class FileArchiver:
"""Archives files by moving and compressing."""
2 changes: 1 addition & 1 deletion src/client/delphi_epidata.R
@@ -15,7 +15,7 @@ Epidata <- (function() {
# API base url
BASE_URL <- getOption('epidata.url', default = 'https://api.delphi.cmu.edu/epidata/')

client_version <- '4.1.24'
client_version <- '4.1.25'

auth <- getOption("epidata.auth", default = NA)

2 changes: 1 addition & 1 deletion src/client/delphi_epidata.js
@@ -22,7 +22,7 @@
}
})(this, function (exports, fetchImpl, jQuery) {
const BASE_URL = "https://api.delphi.cmu.edu/epidata/";
const client_version = "4.1.24";
const client_version = "4.1.25";

// Helper function to cast values and/or ranges to strings
function _listitem(value) {
30 changes: 27 additions & 3 deletions src/client/delphi_epidata.py
@@ -18,7 +18,7 @@

from aiohttp import ClientSession, TCPConnector, BasicAuth

__version__ = "4.1.24"
__version__ = "4.1.25"

_HEADERS = {"user-agent": "delphi_epidata/" + __version__ + " (Python)"}

@@ -43,8 +43,6 @@ class Epidata:
BASE_URL = "https://api.delphi.cmu.edu/epidata"
auth = None

client_version = __version__

debug = False # if True, prints extra logging statements
sandbox = False # if True, will not execute any queries

@@ -54,6 +52,25 @@ def log(evt, **kwargs):
kwargs['timestamp'] = time.strftime("%Y-%m-%d %H:%M:%S %z")
return sys.stderr.write(str(kwargs) + "\n")

# Check that this client's version matches the most recent available. This
# is intended to run just once per program execution, on initial module load.
# See the bottom of this file for the ultimate call to this method.
@staticmethod
def _version_check():
try:
request = requests.get('https://pypi.org/pypi/delphi-epidata/json', timeout=5)
latest_version = request.json()['info']['version']
except Exception as e:
Epidata.log("Error getting latest client version", exception=str(e))
return

if latest_version != __version__:
Epidata.log(
"Client version not up to date",
client_version=__version__,
latest_version=latest_version
)

# Helper function to cast values and/or ranges to strings
@staticmethod
def _listitem(value):
@@ -692,3 +709,10 @@ async def async_make_calls(param_combos):
future = asyncio.ensure_future(async_make_calls(param_list))
responses = loop.run_until_complete(future)
return responses



# This should only run once per program execution, on initial module load,
# as a result of how Python's module system works:
# https://docs.python.org/3/reference/import.html#the-module-cache
Epidata._version_check()
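
The comment above relies on Python's module cache: a module body runs on the first import and later imports reuse the entry in `sys.modules`. The following sketch illustrates that behavior in isolation, under the assumption that nothing reloads the module or removes it from the cache. The module name `run_once_demo` and the helper `count_module_executions` are hypothetical and exist only for this demonstration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

MODULE_NAME = "run_once_demo"  # hypothetical throwaway module

# The module body appends one line to a log file next to itself each time
# the body is executed, which lets us count executions from the outside.
MODULE_SOURCE = '''
from pathlib import Path
with Path(__file__).with_suffix(".log").open("a") as log:
    log.write("module body executed\\n")
'''


def count_module_executions(n_imports):
    """Import the throwaway module n_imports times; return how often its body ran."""
    with tempfile.TemporaryDirectory() as tmp:
        module_path = Path(tmp) / f"{MODULE_NAME}.py"
        module_path.write_text(MODULE_SOURCE)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            for _ in range(n_imports):
                importlib.import_module(MODULE_NAME)
            log_path = module_path.with_suffix(".log")
            return len(log_path.read_text().splitlines())
        finally:
            # Clean up so the demo leaves the interpreter state untouched.
            sys.path.remove(tmp)
            sys.modules.pop(MODULE_NAME, None)


executions = count_module_executions(3)
assert executions == 1  # three imports, one execution of the module body
```

The same reasoning suggests that a module-level call such as `Epidata._version_check()` fires once per process, although an explicit `importlib.reload` or a second interpreter process would run it again.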
2 changes: 1 addition & 1 deletion src/client/packaging/npm/package.json
@@ -2,7 +2,7 @@
"name": "delphi_epidata",
"description": "Delphi Epidata API Client",
"authors": "Delphi Group",
"version": "4.1.24",
"version": "4.1.25",
"license": "MIT",
"homepage": "https://github.com/cmu-delphi/delphi-epidata",
"bugs": {
2 changes: 1 addition & 1 deletion src/client/packaging/pypi/.bumpversion.cfg
@@ -1,5 +1,5 @@
[bumpversion]
current_version = 4.1.24
current_version = 4.1.25
commit = False
tag = False

9 changes: 9 additions & 0 deletions src/client/packaging/pypi/CHANGELOG.md
@@ -3,6 +3,15 @@
All notable future changes to the `delphi_epidata` python client will be documented in this file.
The format is based on [Keep a Changelog](http://keepachangelog.com/).

## [4.1.25] - 2024-07-29

### Includes
- https://github.com/cmu-delphi/delphi-epidata/pull/1456
- https://github.com/cmu-delphi/delphi-epidata/pull/1497

### Changed
- Added a one-time check which logs a warning when the newest client version does not match the client version in use.

## [4.1.24] - 2024-07-09

### Includes