12 task 2b primary locations work #15

Merged
merged 49 commits into from
Jul 31, 2024
Commits
778f04f
add seaborn
Hussein-Mahfouz Apr 17, 2024
6da1bab
start on primary locations notebook
Hussein-Mahfouz Apr 17, 2024
b8bcc7f
start on primary locations notebook
Hussein-Mahfouz Apr 17, 2024
ae9d1c7
primary locations: distance per age group
Hussein-Mahfouz Apr 18, 2024
5e060e9
example config file for osmox
Hussein-Mahfouz Apr 19, 2024
9d25499
filtering spc to study area
Hussein-Mahfouz Apr 25, 2024
1feda70
filtering spc to study area
Hussein-Mahfouz Apr 25, 2024
97a0d8b
move exploratory plots to different notebook
Hussein-Mahfouz Apr 25, 2024
a97e8d4
first pass at assigning edu to zones
Hussein-Mahfouz Apr 25, 2024
6615879
speed up get_possible_zones()
Hussein-Mahfouz Apr 26, 2024
89bcc83
updating files before merging incoming pr
Hussein-Mahfouz Apr 30, 2024
e51a04e
first test :)
Hussein-Mahfouz Apr 30, 2024
4a47e54
remove placeholder
Hussein-Mahfouz Apr 30, 2024
86ff56d
activities per zone for sampling
Hussein-Mahfouz May 2, 2024
e6617cf
Merge branch 'main' into 12-task-2b-primary-locations-work
sgreenbury May 3, 2024
fab0d2b
fix typo
Hussein-Mahfouz May 3, 2024
6e2e9a4
initial logic for zone sampling
Hussein-Mahfouz May 3, 2024
1261627
editing get_possible_zones to ensure all shortlisted zones have the d…
Hussein-Mahfouz May 7, 2024
09f61c1
almost assigning a zone
Hussein-Mahfouz May 7, 2024
c548fa7
functions to assign activities to specific zones
Hussein-Mahfouz May 8, 2024
3d1dc9b
draft assignment to specific location
Hussein-Mahfouz May 8, 2024
0f759f9
edits to activity assignment
Hussein-Mahfouz May 9, 2024
5c08cfd
edits to activity assignment 2
Hussein-Mahfouz May 9, 2024
300a02a
fix hard coded parameter
Hussein-Mahfouz May 10, 2024
8459a66
exploring pam, ref #18
Hussein-Mahfouz May 13, 2024
79bbfee
removing old get_possible_zones function
Hussein-Mahfouz May 13, 2024
6acecb9
move education_type logic upstream to get_possible_zones
Hussein-Mahfouz May 13, 2024
10ae82b
ensuring all activities have a location
Hussein-Mahfouz May 14, 2024
264416d
maps for education trips
Hussein-Mahfouz May 17, 2024
5bb4ac5
import numpy
Hussein-Mahfouz May 22, 2024
d45fc0d
remove long console outputs
Hussein-Mahfouz May 22, 2024
bb3b388
Exclude G004, exclude notebooks from pre-commit
sgreenbury Jul 17, 2024
b61db0e
Lints
sgreenbury Jul 17, 2024
3e7f8d3
Complete fixing lints
sgreenbury Jul 17, 2024
b44e4ac
Add script version of notebook
sgreenbury Jul 17, 2024
2bbe1a3
Formatting
sgreenbury Jul 17, 2024
addd3f7
Update location with external centroids data
sgreenbury Jul 17, 2024
3cdb29c
Update location with external centroids data
sgreenbury Jul 17, 2024
99b9e8f
Update to prep script and notebook to Leeds
sgreenbury Jul 27, 2024
c17612b
Parallelise matching, fix typo, use sample instead of head
sgreenbury Jul 27, 2024
fc124fe
Add transformation between CRS, remove category recoding
sgreenbury Jul 27, 2024
44381f3
Update deps
sgreenbury Jul 27, 2024
e857dd7
Update notebook
sgreenbury Jul 29, 2024
dce764d
Lints
sgreenbury Jul 31, 2024
0153e84
Remove select_activity since now included as library function
sgreenbury Jul 31, 2024
9e83df0
Lints
sgreenbury Jul 31, 2024
3b565f4
Update gitignore
sgreenbury Jul 31, 2024
4405cd2
Merge remote-tracking branch 'origin/main' into 12-task-2b-primary-lo…
sgreenbury Jul 31, 2024
914882b
Apply pre-commit
sgreenbury Jul 31, 2024
3 changes: 3 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -160,3 +160,6 @@ Thumbs.db

# Misc
.copier-answers.yml

# Ignore data path
data/
3 changes: 3 additions & 0 deletions .pre-commit-config.yaml
Original file line number Diff line number Diff line change
@@ -1,3 +1,6 @@
# Exclude notebooks from pre-commit
exclude: \.ipynb$

ci:
autoupdate_commit_msg: "chore: update pre-commit hooks"
autofix_commit_msg: "style: pre-commit fixes"
Expand Down
35 changes: 17 additions & 18 deletions BACKGROUND.md
Original file line number Diff line number Diff line change
Expand Up @@ -17,19 +17,19 @@ Since the SPC currently uses 2011 OA11CD and MSOA11CD codes, 2011 boundaries wil

### Adding activity patterns to synthetic population

#### NTS data
#### NTS data
- We are currently using the entire NTS sample, but this could include trips with unrepresentative distances (e.g. commuting distances in London are not the same as in Liverpool). See https://github.com/Urban-Analytics-Technology-Platform/acbm/issues/16

#### Household level matching
#### Household level matching
- We use categorical matching at the household level (level 1) and then propensity score matching (PSM) at the individual level (level 2)
- We need to implement PSM from the beginning to ensure that each individual in the SPC is matched to at least one sample from the NTS. See https://github.com/Urban-Analytics-Technology-Platform/acbm/issues/13
- Matching variables are decided using trial and error (see [2_match_households_and_individuals](https://github.com/Urban-Analytics-Technology-Platform/acbm/blob/d2f9e747c3d55148316661b13b1650fac4a5a4ad/notebooks/2_match_households_and_individuals.ipynb)). Using PSM would allow us to use all variables
- For each SPC household, we randomly select one of the matched NTS households
- Rest of the assumptions are outlined in the [wiki page](https://github.com/Urban-Analytics-Technology-Platform/acbm/wiki/Adding-activity-patterns-to-synthetic-population)

#### Individual level matching
#### Individual level matching
- Done based on age_group and sex only. PSM without replacement

### Assigning activities to geographic locations

#### Mode and trip purpose mapping 
Expand All @@ -51,11 +51,11 @@ Since the SPC currently uses 2011 OA11CD and MSOA11CD codes, 2011 boundaries wil
- For education POIs, I've done the following:   

> "kindergarden": ["education_kg", "work"],
>
>
> "school": ["education_school", "work"],
>
>
> "university": ["education_university", "work"],
>
>
> "college": ["education_college", "work"],

##### Selecting feasible zones for each activity
Expand All @@ -68,27 +68,27 @@ Since the SPC currently uses 2011 OA11CD and MSOA11CD codes, 2011 boundaries wil
- If an individual in the NTS has an "education" activity, I map their age to an education type. See the age_group_mapping dictionary in 3_locations_primary:

> age_group_mapping = {
>
>
> 1: "education_kg", # "0-4"
>
>
> 2: "education_school", # "5-10"
>
>
> 3: "education_school", # "11-16"
>
>
> 4: "education_university", # "17-20"
>
>
> 5: "education_university", # "21-29"
>
>
> 6: "education_university", # "30-39"
>
>
> 7: "education_university", # "40-49"
>
>
> 8: "education_university", # "50-59"
>
>
> 9: "education_university" # "60+"
> }


- When selecting a location for an education activity in [select_zone](https://github.com/Urban-Analytics-Technology-Platform/acbm/blob/c548fa7a6398dd0afde1398f7799e418b6068cd6/src/acbm/assigning.py#L578), we try to select a zone that has a POI matching the person's age group. If we can't, we choose any other feasible zone with an education POI
- This logic should be moved upstream to [get_possible_zones](https://github.com/Urban-Analytics-Technology-Platform/acbm/blob/c548fa7a6398dd0afde1398f7799e418b6068cd6/src/acbm/assigning.py#L201). For each activity, we should always ensure that the list of feasible zones includes at least one zone with the specific POI category. This should be added to the [filter_by_activity](https://github.com/Urban-Analytics-Technology-Platform/acbm/blob/c548fa7a6398dd0afde1398f7799e418b6068cd6/src/acbm/assigning.py#L374) logic, which currently looks only at the activity purpose from the NTS (e.g. "education"). We need to add the extra level of detail from age_group_mapping and filter on that instead
- We select a zone from the feasible zones probabilistically based on total floor area of the POIs that match the relevant activity. See [select_zone](https://github.com/Urban-Analytics-Technology-Platform/acbm/blob/c548fa7a6398dd0afde1398f7799e418b6068cd6/src/acbm/assigning.py#L578)
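
The selection rule described in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the code in `assigning.py`: the function name `select_zone_sketch`, the shape of the `floor_area_by_zone` dictionary, and the zone IDs and floor areas are all assumptions made up for the example. Only `age_group_mapping` is taken from the notes.

```python
import random

# Age group -> education POI category, as listed in the notes above.
age_group_mapping = {
    1: "education_kg", 2: "education_school", 3: "education_school",
    4: "education_university", 5: "education_university",
    6: "education_university", 7: "education_university",
    8: "education_university", 9: "education_university",
}

# Hypothetical input: total POI floor area per zone and POI category.
floor_area_by_zone = {
    "zone_a": {"education_school": 1200.0},
    "zone_b": {"education_school": 300.0, "education_university": 5000.0},
    "zone_c": {"education_kg": 150.0},
}


def select_zone_sketch(feasible_zones, age_group, floor_areas, rng=None):
    """Pick one feasible zone, weighted by floor area of matching POIs.

    Falls back to any education POI when no feasible zone has a POI of
    the category implied by the person's age group. Returns None when no
    feasible zone has any education floor area at all.
    """
    rng = rng or random.Random()
    category = age_group_mapping[age_group]

    # Preferred: floor area of POIs matching the age-specific category.
    weights = {
        zone: floor_areas.get(zone, {}).get(category, 0.0)
        for zone in feasible_zones
    }
    if not any(w > 0 for w in weights.values()):
        # Fallback: floor area of any education POI in the zone.
        weights = {
            zone: sum(
                area
                for cat, area in floor_areas.get(zone, {}).items()
                if cat.startswith("education")
            )
            for zone in feasible_zones
        }

    candidates = [zone for zone, w in weights.items() if w > 0]
    if not candidates:
        return None
    return rng.choices(
        candidates, weights=[weights[z] for z in candidates], k=1
    )[0]
```

With these made-up inputs, a person in age group 5 choosing between `zone_a` and `zone_b` always gets `zone_b`, since it is the only one with university floor area. If the feasible zones were `zone_a` and `zone_c`, neither has a university POI, so the fallback samples between them in proportion to their total education floor area.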
Expand All @@ -103,4 +103,3 @@ Since the SPC currently uses 2011 OA11CD and MSOA11CD codes, 2011 boundaries wil
- (**DONE** [here](https://github.com/Urban-Analytics-Technology-Platform/acbm/commit/6acecb928ea2b9bf26952eb45b86f2918a6dccdf)) migrate logic for age_group_mapping from `select_zone()` to `get_possible_zones()`
- edit `get_possible_zones()` to ensure it never returns an empty list of zones. See above for how to do this
- 14/05/2024: I created another function, `fill_missing_zones()`. See [this commit](https://github.com/Urban-Analytics-Technology-Platform/acbm/commit/10ae82b3923cdc51474d3721df80e332ea74ba03#diff-48d91584494e303c162dd8c5b8881de35f33976f2f688cd5a56db01b7ff1f233)

4 changes: 2 additions & 2 deletions notebooks/1_prep_synthpop.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -53,7 +53,7 @@
"source": [
"# Pick a region with SPC output saved\n",
"path = \"../data/external/spc_output/raw/\"\n",
"region = \"west-yorkshire\""
"region = \"leeds\""
]
},
{
Expand Down Expand Up @@ -195,7 +195,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.8"
"version": "3.11.9"
}
},
"nbformat": 4,
Expand Down