-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add prepare_sector_split.R #30
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mainly a rubber stamp/ review of syntax, but LGTM
abcd_id <- abcd %>% | ||
dplyr::distinct(.data$company_id, .data$name_company) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT: single line pipe (i don't really care that much, just making note)
# identify lost_companies_sector_split and write to csv for inspection | ||
lost_companies_sector_split <- companies_sector_split %>% | ||
dplyr::anti_join( | ||
abcd_id, | ||
by = c("company_id") | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT: single line pipe
config_files <- config::get("file_names") | ||
config_match_prio <- config::get("match_prioritize") | ||
config_prepare_sector_split <- config::get("sector_split") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this add new/necessary keys to the config
? Should that be documented somewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, they are already in the example.config.yml
It should be documented, but at this point I prefer getting the full workflow set up before writing an extensive documentation about these. But will open a task for it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good.
We should consider how exactly we want to slice and dice the architecture here, and what/ where things should be documented (e.g. {config.yml + scripts} could also be {function args + functions} in a package).
The technical profile document will inform this, and ideally we can discuss it a bit more in the Tech Review.
closes #9
closes #10
workflow.multi.loanbook
gains moduleprepare_sector_split.R
, which calculates equal weights and primary energy based sector weights for companies"worst_case"
sector split, which was an option in the previousworkflow.aggregate.loanbooks
. This option has not been used and is unnecessarily complicated.workflow.aggregate.loanbooks
run_match_prioritize.R
gains an option to split loan values based on outputs ofprepare_sector_split.R
. Option is set viaconfig.yml
primary_energy_efficiency
andunit_conversion
) needed to calculate the primary energy based sector split are integrated directly intoprepare_sector_split.R
. Previously, they were generated indata-raw/
, but since we are currently not using a package structure and the data sets are not expected to be used elsewhere, they are integrated directly in the script for now.prepare_sector_split.R
can remove inactive companies based on the output ofprepare_abcd.R
, in case the optionremove_inactive_companies == TRUE