Handle _id columns in codelists for smoother use in sp_add_codelist() #90

Open
dan-bart opened this issue Jun 13, 2021 · 2 comments

@dan-bart

dt <- sp_get_table("budget-central", 2017)
dt %>% sp_add_codelist("finmisto")
returns:
"Error: Something went wrong with matching the codelist to the data for period 2017-01-31.
Please inspect the dates on the codelist to make sure there are no duplicate items valid for one given date.
You may want to filter/edit the codelist manually and pass it to the add_codelist function as an object."

I had to rename the column and specify it in the function call:
finm <- sp_get_codelist("finmisto")
finm <- rename(finm, "finmisto" = "finmisto_id")
dt2 <- dt %>% sp_add_codelist(finm, by = "finmisto")

Maybe it would be beneficial to rename this column in the finmisto codelist inside sp_get_codelist()? (Unless the current name is used elsewhere.)

@petrbouchal
Owner

Thanks. I suspect that because the codelist contains these *_id columns, it is meant to be added in some other way.

I'd rather not rename codelist columns en masse to something not contained in the data until I understand why the column is called X in the data and X_id in the codelist. I think this might be a case of a secondary codelist, as documented here: https://petrbouchal.xyz/statnipokladna/articles/statnipokladna.html#primary-and-secondary-codelists. I just don't know what this one is secondary to.

For this particular codelist, sp_add_codelist() actually joins on the ico column, which does not identify codelist rows uniquely, hence the error message.
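
One way to see why a join key fails like this is to count codelist rows per key value. A minimal sketch with made-up data follows; the toy tibble, its values and the nazev column are hypothetical and do not reflect the real finmisto codelist, which would also need its validity dates taken into account:

```r
library(dplyr)

# Hypothetical stand-in for a codelist; all values are invented for illustration.
codelist <- tibble(
  finmisto_id = c("1001", "1002", "1003"),
  ico         = c("00000001", "00000001", "00000002"),
  nazev       = c("Office A", "Office B", "Office C")
)

# Count rows per candidate key; any n > 1 means the key is not unique.
dupes_ico <- codelist %>% count(ico) %>% filter(n > 1)
dupes_id  <- codelist %>% count(finmisto_id) %>% filter(n > 1)

nrow(dupes_ico)  # non-zero here: ico repeats in the toy data
nrow(dupes_id)   # zero here: finmisto_id is unique in the toy data
```

If the key that ends up being joined on shows duplicates in a check like this, each data row can match more than one codelist row, which is the situation the error message describes.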

Here you probably do want to join the non-_id column in the data to the _id column in the codelist, which can be done more concisely like so:

dt %>% sp_add_codelist("finmisto", by = c("finmisto" = "finmisto_id"))

(The by parameter is passed on to left_join() as is, so it can be specified in the same way.)
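
To illustrate the named-vector form of by in plain dplyr, here is a small sketch on toy data. The tibbles and their values are hypothetical; only the column names finmisto and finmisto_id come from this thread, and sp_add_codelist() may do more than a bare left_join() (for example around validity dates):

```r
library(dplyr)

# Hypothetical data: the key is called finmisto in the data
# and finmisto_id in the codelist.
dt_toy <- tibble(finmisto = c("1001", "1003"), amount = c(10, 20))
codelist_toy <- tibble(
  finmisto_id = c("1001", "1002", "1003"),
  nazev       = c("Office A", "Office B", "Office C")
)

# Named vector: the name is the column in the left table,
# the value is the column in the right table.
joined <- left_join(dt_toy, codelist_toy, by = c("finmisto" = "finmisto_id"))

# The result keeps the left-hand key name (finmisto) and adds nazev.
names(joined)
```

Because the key is unique in the toy codelist, the join returns exactly one row per data row.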

But I also found a genuine bug that might affect this. It is already fixed on GitHub, so just go ahead and use the GitHub version for now.

@petrbouchal petrbouchal changed the title sp_add_codelist for "finmisto" requires preprocessing of the codelist Handle _id columns in codelists for smoother use in sp_add_codelist() Jun 14, 2021
@petrbouchal
Owner

(I updated the issue title to a more general one and will keep it open as a watching brief, in case I figure out the logic of the key column names.)
