Feature/patient and obs flatten #150

Merged
quinnwai merged 7 commits into unique-col-names from feature/patient-and-obs-flatten
Oct 20, 2025

Conversation

@quinnwai
Contributor

@quinnwai quinnwai commented Oct 17, 2025

More Patient Metadata Use Cases

Description

  1. Flatten patient observations to ResearchSubject
  2. Flatten Patient onto DocumentReference (DocRef -> Specimen -> Patient each connected by subject.reference)
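
The second item can be sketched roughly as follows. This is a minimal illustration on plain dicts, not the code in this PR: `build_index`, `follow_subject`, `flatten_patient_onto_docref`, the `patient_` column prefix, and the default `patient_fields` are all hypothetical names chosen for the example. It assumes each hop is a FHIR-style `subject.reference` string such as `Specimen/s1`.

```python
from typing import Any, Dict, Iterable, Optional, Sequence

Resource = Dict[str, Any]


def build_index(resources: Iterable[Resource]) -> Dict[str, Resource]:
    """Index resources by their 'ResourceType/id' reference string."""
    return {f"{r['resourceType']}/{r['id']}": r for r in resources}


def follow_subject(resource: Optional[Resource], index: Dict[str, Resource]) -> Optional[Resource]:
    """Return the resource pointed to by subject.reference, or None if absent."""
    if resource is None:
        return None
    reference = resource.get("subject", {}).get("reference")
    return index.get(reference) if reference else None


def flatten_patient_onto_docref(
    docref: Resource,
    index: Dict[str, Resource],
    patient_fields: Sequence[str] = ("gender", "birthDate"),  # hypothetical defaults
) -> Dict[str, Any]:
    """Walk DocRef -> Specimen -> Patient and copy selected patient fields onto one row."""
    row: Dict[str, Any] = {"id": docref["id"]}
    specimen = follow_subject(docref, index)
    patient = follow_subject(specimen, index)
    # Leave the row untouched if the chain is broken or does not end at a Patient.
    if patient is None or patient.get("resourceType") != "Patient":
        return row
    row["patient_id"] = patient["id"]
    for field in patient_fields:
        if field in patient:
            row[f"patient_{field}"] = patient[field]
    return row
```

A DocumentReference whose chain does not resolve to a Patient keeps only its own fields, so rows are never dropped by the flattening step in this sketch.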

Motivation and Context

Needed to populate important HTAN info

Validation Checklist

  • On local, can successfully g3t push --step publish
    • SMMART
    • HTAN
  • Using SMMART project, differences between the old and new dataframes for
    • Specimen: none (identical)
    • DocumentReference has patient fields
    • ResearchSubject has flattened patient Observations
  • Using HTAN project, dataframe for
    • Specimen, ResearchSubject, GroupMember, MedAdmin are identical to old versions
    • DocRef has flattened patient fields
  • I have added tests to cover my changes. (NA)
  • All new and existing tests passed
    • Unit tests
    • Integration tests
  • Reviewer has tested this feature locally
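
The old-versus-new comparisons in the checklist could be reproduced with something like the sketch below. It is only an illustration under assumptions: tables are represented as lists of row dicts keyed by an `id` column, and `compare_tables` is a hypothetical helper, not part of this repository. It uses only the standard library so it runs without extra dependencies.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def compare_tables(old: List[Row], new: List[Row], key: str = "id") -> Dict[str, List[str]]:
    """Summarize column and row differences between two tables of row dicts."""
    old_cols = {col for row in old for col in row}
    new_cols = {col for row in new for col in row}
    shared_cols = sorted(old_cols & new_cols)

    old_by_key = {row[key]: row for row in old}
    new_by_key = {row[key]: row for row in new}

    # A row counts as changed only if a column present in both tables differs.
    changed = sorted(
        k
        for k in old_by_key.keys() & new_by_key.keys()
        if any(old_by_key[k].get(c) != new_by_key[k].get(c) for c in shared_cols)
    )
    return {
        "added_columns": sorted(new_cols - old_cols),
        "removed_columns": sorted(old_cols - new_cols),
        "added_rows": sorted(new_by_key.keys() - old_by_key.keys()),
        "removed_rows": sorted(old_by_key.keys() - new_by_key.keys()),
        "changed_rows": changed,
    }
```

An "identical" result corresponds to every list being empty, while the DocumentReference case would be expected to show only entries under `added_columns`.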

@matthewpeterkort
Collaborator

Ran the unit tests and flake8 is failing, which is not surprising. It would be good to clean that up; otherwise LGTM:

gen3_util % python -m pytest tests/unit       
================================================================================================ test session starts =================================================================================================
platform darwin -- Python 3.12.11, pytest-8.4.2, pluggy-1.6.0
rootdir: /Users/peterkor/Desktop/gen3_util
configfile: pytest.ini
plugins: anyio-4.9.0, cov-7.0.0
collected 51 items                                                                                                                                                                                                   

tests/unit/dataframer/test_dataframer.py ............................                                                                                                                                          [ 54%]
tests/unit/meta/test_meta.py ..                                                                                                                                                                                [ 58%]
tests/unit/test_coding_conventions.py F                                                                                                                                                                        [ 60%]
tests/unit/test_commit_class.py ..                                                                                                                                                                             [ 64%]
tests/unit/test_flatten_fhir_example.py ........                                                                                                                                                               [ 80%]
tests/unit/test_hash_types.py ...                                                                                                                                                                              [ 86%]
tests/unit/test_indexclient.py .                                                                                                                                                                               [ 88%]
tests/unit/test_mime_type.py ..                                                                                                                                                                                [ 92%]
tests/unit/test_none_fields.py .                                                                                                                                                                               [ 94%]
tests/unit/test_num_parallel.py .                                                                                                                                                                              [ 96%]
tests/unit/test_read_dvc.py ..                                                                                                                                                                                 [100%]

====================================================================================================== FAILURES ======================================================================================================
______________________________________________________________________________________________ test_coding_conventions _______________________________________________________________________________________________

    def test_coding_conventions():
        """Check python conventions on key directories"""
        script_dir = os.path.dirname(os.path.abspath(__file__))
        directories = [
            os.path.join(script_dir, "../../gen3_tracker"),
            os.path.join(script_dir, "../../tests"),
        ]
        failures = []
        for directory in directories:
            cmd_str = f"flake8 {directory} --max-line-length 256 --exclude test_flatten_fhir_example.py"
            completed = subprocess.run(cmd_str, shell=True)
            if completed.returncode != 0:
                _ = f"FAILURE: Python formatting and style for directory {directory}/"
                failures.append(_)
                print(_)
    
>       assert len(failures) == 0, failures
E       AssertionError: ['FAILURE: Python formatting and style for directory /Users/peterkor/Desktop/gen3_util/tests/unit/../../gen3_tracker/', 'FAILURE: Python formatting and style for directory /Users/peterkor/Desktop/gen3_util/tests/unit/../../tests/']
E       assert 2 == 0
E        +  where 2 = len(['FAILURE: Python formatting and style for directory /Users/peterkor/Desktop/gen3_util/tests/unit/../../gen3_tracker/', 'FAILURE: Python formatting and style for directory /Users/peterkor/Desktop/gen3_util/tests/unit/../../tests/'])

/Users/peterkor/Desktop/gen3_util/tests/unit/test_coding_conventions.py:23: AssertionError
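
The assertion above only names the failing directories, so the flake8 violations themselves do not appear in the failure message. One possible way to surface them, sketched under assumptions: `run_check`, `flake8_command`, and `collect_failures` are hypothetical helpers, and flake8 is assumed to be installed in the active interpreter. The flags are the ones the existing test already passes.

```python
import subprocess
import sys
from typing import Iterable, List, Sequence, Tuple


def run_check(cmd: Sequence[str]) -> Tuple[int, str]:
    """Run a command without a shell and return (exit code, combined output)."""
    completed = subprocess.run(list(cmd), capture_output=True, text=True)
    return completed.returncode, completed.stdout + completed.stderr


def flake8_command(directory: str) -> List[str]:
    """Build the flake8 invocation for one directory, using the test's existing flags."""
    return [
        sys.executable, "-m", "flake8", directory,
        "--max-line-length", "256",
        "--exclude", "test_flatten_fhir_example.py",
    ]


def collect_failures(commands: Iterable[Sequence[str]]) -> List[str]:
    """Return one message per failing command, including the tool's own output."""
    failures = []
    for cmd in commands:
        returncode, output = run_check(cmd)
        if returncode != 0:
            failures.append(f"FAILURE: {' '.join(cmd)}\n{output}")
    return failures
```

With this shape, `assert not failures, "\n".join(failures)` would print the individual violations, which may help when results differ between machines.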

@matthewpeterkort matthewpeterkort self-requested a review October 20, 2025 17:50
@quinnwai
Contributor Author

Great, thanks. Hopefully the formatting error is specific to your local environment, since the GitHub Actions unit tests are passing. On my local machine:

$pytest tests/unit/test_coding_conventions.py 
======================================================== test session starts ========================================================
platform darwin -- Python 3.12.2, pytest-8.4.1, pluggy-1.6.0
rootdir: /Users/wongq/projects/gen3_util
configfile: pytest.ini
plugins: anyio-4.10.0, cov-6.2.1
collected 1 item                                                                                                                    

tests/unit/test_coding_conventions.py .                                                                                       [100%]

========================================================= 1 passed in 0.68s =========================================================
(venv) wongq@RNB13688:~/projects/gen3_util (feature/patient-and-obs-flatten)$black tests/unit/
All done! ✨ 🍰 ✨
14 files left unchanged.

@quinnwai quinnwai merged commit 0f8e16a into unique-col-names Oct 20, 2025
1 check passed
@quinnwai quinnwai deleted the feature/patient-and-obs-flatten branch October 20, 2025 18:01