Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
bf95b3b to a3680f4
Pull Request Overview
This PR adds comprehensive documentation for the dataframer.py module, including detailed explanations of its components, a field map aligned to test fixtures, and a sequence diagram. The PR also includes several minor code improvements related to configuration handling and workflow organization.
- Adds new documentation files explaining the dataframer module's functionality and architecture
- Fixes configuration profile handling to improve error resilience
- Optimizes authentication initialization in CLI push command to avoid unnecessary calls during dry runs
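The dry-run optimization in the last bullet can be illustrated with a minimal sketch. It is not the code from gen3_tracker/git/cli.py: the `push` function, the `make_auth` callback, and `fake_auth` are hypothetical names chosen for illustration. The idea is only that the credential lookup is deferred until the command knows it will perform a real push.

```python
from typing import Callable, List


def push(files: List[str], dry_run: bool, make_auth: Callable[[], object]) -> List[str]:
    """Hypothetical push step: build the auth object only when it will be used."""
    if dry_run:
        # A dry run makes no network calls, so credentials are never loaded.
        return [f"would push {name}" for name in files]
    auth = make_auth()  # deferred until we know this is a real push
    return [f"pushed {name} with {auth}" for name in files]


calls: List[str] = []


def fake_auth() -> str:
    # Stand-in for an expensive or failure-prone credential lookup.
    calls.append("auth")
    return "token"


dry_result = push(["a.txt"], dry_run=True, make_auth=fake_auth)
real_result = push(["a.txt"], dry_run=False, make_auth=fake_auth)
```

Passing the auth factory as a callback also makes the behavior easy to check in a unit test, since the test can count how many times the factory was invoked.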
Reviewed Changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| docs/dataframer.md | New comprehensive documentation for dataframer module with field mapping |
| gen3_tracker/cli.py | Improves profile loading with error handling and removes help check logic |
| `gen3_tracker/config/__init__.py` | Fixes error message to use config.gen3.profile instead of undefined profile variable |
| gen3_tracker/git/cli.py | Optimizes authentication initialization to only occur when not in dry-run mode |
| tests/unit/test_mime_type.py | Removes trailing whitespace |
| .github/workflows/*.yaml | Updates workflow configurations and adds git setup for unit tests |
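As a rough illustration of the more resilient profile loading described for gen3_tracker/cli.py, the sketch below returns `None` instead of raising when the config file is missing or malformed. It is an assumption-laden example, not the project's implementation: `load_default_profile`, the "first section is the default profile" rule, and the `example-profile` section are all hypothetical.

```python
import configparser
import os
import tempfile
from typing import Optional


def load_default_profile(config_path: str) -> Optional[str]:
    """Return the first profile (section) name, or None if unavailable."""
    parser = configparser.ConfigParser()
    try:
        # read() skips files that do not exist and returns the ones it parsed.
        parsed = parser.read(config_path)
    except configparser.Error:
        # Malformed file: treat it like a missing one instead of crashing.
        return None
    if not parsed or not parser.sections():
        return None
    return parser.sections()[0]


# Missing file: no exception, just no profile.
missing_profile = load_default_profile("/nonexistent/gen3-client-config.ini")

# Well-formed file: the first section name is returned.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "config.ini")
    with open(path, "w") as fh:
        fh.write("[example-profile]\napi_key = placeholder\n")
    found_profile = load_default_profile(path)
```

Callers can then decide what to do when no profile is available, which is what lets unit tests run on a machine without a gen3-client-config.ini.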
Review Steps
1. Install feature/dataframer-doc ✔️
2. Configure Credentials (CALYPR-prod) ✔️
3. Run All Tests
LGTM — great doc addition and dataframer tests passing ✔️
Small issue: an end-to-end test in test_end_to_end_workflow.py hung (I may have been too impatient and not given it enough time to complete), but that doesn't block the dataframer tests or functionality. Approved!
Integration tests are not expected to run, only unit tests.
Description
This PR adds comprehensive documentation for gen3_tracker/meta/dataframer.py and aligns the public field map to the simplified_resources unit-test fixture. It also includes a Mermaid sequence diagram covering ingestion/upsert and the analytical read path (flattened_procedures()).
Included docs (new):
- docs/dataframer/dataframer_documentation.md: what dataframer.py is for, key components, entities deep-dive, notable behaviors.
- docs/dataframer/Appendix_Field_Map_Aligned.md: appendix field map aligned to the simplified_resources fixture in tests/unit/dataframer/test_dataframer.py.
- docs/dataframer/dataframer_sequence_diagram.md: Mermaid sequence diagram (copy/paste-ready for GitHub/Docs).
Improvements (enables tests without gen3-client-config.ini):
Motivation and Context
- dataframer.py ingests, upserts, and flattens FHIR resources for analytics.
- simplified_resources), reducing onboarding friction and preventing future regressions.
How Has This Been Tested?
- tests/unit/dataframer/test_dataframer.py (simplified_resources) to ensure all mapped keys match (e.g., single identifier string, flattened DocumentReference attachment fields like md5, source_path, contentType, size, url, title, creation, and Observation attributes such as effectiveDateTime, valueCodeableConcept, sequencer, Gene, etc.).
- Environment: Python 3.11, SQLite (system default), macOS/Linux; ran pytest -q tests/unit/dataframer/.
Types of Changes
Checklist
- docs/dataframer/).
Related files/refs:
- tests/unit/dataframer/test_dataframer.py (fixture: simplified_resources)
- docs/dataframer/dataframer_documentation.md, docs/dataframer/Appendix_Field_Map_Aligned.md, docs/dataframer/dataframer_sequence_diagram.md
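For readers unfamiliar with the ingest, upsert, and flatten flow that the description attributes to dataframer.py, the following is a minimal sketch of the general pattern using only the standard library. It is not the module's actual schema or API: the `resources` table, the `upsert` and `flatten_document_reference` helpers, and the sample `doc` are hypothetical. Fields such as md5 and source_path are omitted because how the real module derives them is not shown in this PR.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE resources (key TEXT PRIMARY KEY, resource_type TEXT, body TEXT)"
)


def upsert(resource: dict) -> None:
    """Insert a resource, or replace the stored copy with the same type/id."""
    key = f"{resource['resourceType']}/{resource['id']}"
    conn.execute(
        "INSERT OR REPLACE INTO resources (key, resource_type, body) VALUES (?, ?, ?)",
        (key, resource["resourceType"], json.dumps(resource)),
    )


def flatten_document_reference(resource: dict) -> dict:
    """Lift the first attachment's fields to top-level keys of a flat row."""
    attachment = resource["content"][0]["attachment"]
    row = {"id": resource["id"], "resourceType": resource["resourceType"]}
    for field in ("contentType", "size", "url", "title", "creation"):
        if field in attachment:
            row[field] = attachment[field]
    return row


doc = {
    "resourceType": "DocumentReference",
    "id": "doc-1",
    "content": [
        {
            "attachment": {
                "contentType": "text/plain",
                "size": 10,
                "url": "file:///a.txt",
                "title": "a.txt",
            }
        }
    ],
}

upsert(doc)
# Ingesting the same id again overwrites the earlier copy instead of duplicating it.
updated = {**doc, "content": [{"attachment": {**doc["content"][0]["attachment"], "size": 20}}]}
upsert(updated)

rows = conn.execute(
    "SELECT body FROM resources WHERE resource_type = 'DocumentReference'"
).fetchall()
flattened = [flatten_document_reference(json.loads(body)) for (body,) in rows]
```

Storing each resource as a JSON body keyed by type and id keeps ingestion idempotent, while flattening happens only on the read path, which mirrors the split between ingestion/upsert and the analytical read path mentioned in the description.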