Abstract table names #225
Conversation
Allows us to change where we read/write data from in a single place, so we can more easily switch between environments.
…prevent circular dependency
Pull Request Overview
This PR introduces a centralized configuration system for managing database table names and file paths through a new table_names.py module. The changes replace hard-coded table names and file paths throughout the codebase with references to a configurable TableNames dataclass, making the system more maintainable and environment-aware.
Key changes:
- Creates a new `table_names.py` module with a `TableNames` dataclass and environment-based configuration
- Refactors all files to import and use `table_names` instead of hard-coded strings
- Updates Databricks workflow configurations to remove redundant path parameters
- Standardizes function signatures by removing default parameters and using explicit Spark session passing
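As a rough illustration, a centralized module of this shape could look like the following. The field names, environment variable, and catalog/schema values are assumptions for the sketch; the actual contents of `table_names.py` may differ.

```python
# Hypothetical sketch of a centralized table-name module; the real
# table_names.py likely defines different fields and environments.
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class TableNames:
    """Single place to configure where tables are read from / written to."""

    catalog: str
    schema: str

    @property
    def apc(self) -> str:
        # Fully qualified table name, e.g. "nhp.dev.apc" (illustrative).
        return f"{self.catalog}.{self.schema}.apc"


def _from_environment() -> TableNames:
    # Choose the schema once, from an (assumed) environment variable,
    # instead of hard-coding it at every call site.
    env = os.environ.get("NHP_ENV", "dev")
    return TableNames(catalog="nhp", schema=env)


# Module-level instance that the rest of the codebase imports.
table_names = _from_environment()
```

Importers would then write `from nhp.data.table_names import table_names` and reference attributes such as `table_names.apc`, so switching environments requires no changes at call sites.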
Reviewed Changes
Copilot reviewed 68 out of 69 changed files in this pull request and generated 12 comments.
| File | Description |
|---|---|
| src/nhp/data/table_names.py | New centralized configuration module defining all table names and paths |
| src/nhp/data/reference/*.py | Updated to use table_names for reference data tables and paths |
| src/nhp/data/raw_data/*.py | Updated to use table_names for raw data tables |
| src/nhp/data/nhp_datasets/*.py | Updated to use table_names for HES dataset references |
| src/nhp/data/population_projections/*.py | Updated to use table_names for population projection paths and tables |
| src/nhp/data/model_data/*.py | Updated to use table_names and standardize function signatures |
| src/nhp/data/inputs_data/*.py | Updated to use table_names and refactor main functions |
| src/nhp/data/aggregated_data/*.py | Updated to use table_names for aggregated data tables |
| src/nhp/data/get_spark.py | Simplified to remove catalog/schema parameters |
| databricks_workflows/*.yaml | Removed redundant path parameters now handled by table_names |
StatsRhian left a comment
Tom and I chatted through the changes in a code walk. Nice to get this stuff abstracted.
Previously, all table names and save paths were hard-coded. If the storage location of a table changed, every instance would need to be updated. This was going to prove an issue when we migrate to UDAL.

This PR abstracts the table names and file save paths into a new module, `nhp.data.table_names`, so we can more easily update names as needed. It also tidies up some potential circular dependencies and moves reference files out of more obscure locations, adding them into the code base as appropriate.
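The migration benefit the review describes can be sketched as follows. This is illustrative only: the class, method, and catalog/schema names are assumptions, not the real `nhp.data.table_names` API.

```python
# Illustrative only: a frozen config object in the spirit of TableNames;
# field names and environment values (including UDAL specifics) are assumed.
from dataclasses import dataclass


@dataclass(frozen=True)
class TableNames:
    catalog: str
    schema: str

    def table(self, name: str) -> str:
        # One place to change if storage locations move.
        return f"{self.catalog}.{self.schema}.{name}"


dev = TableNames(catalog="nhp", schema="dev")
prod = TableNames(catalog="nhp", schema="prod")

# Call sites reference the config object, so migrating to a new location
# changes only the constructed instance, not every read/write site:
print(dev.table("apc"))   # nhp.dev.apc
print(prod.table("apc"))  # nhp.prod.apc
```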