incorrect comorbidity calculations for pulmonary and cancer codes, maybe more #9
Hello Jack!
@jackwasey Excellent observation! We need to standardize our approaches across all these packages. To that end, we first need to analyze how ICD codes change over time (e.g., was 498.82 ever a valid ICD code? In this case, no) and specifically determine whether any ICD-9/ICD-10 codes that have been added or removed over time were part of any of the categories for Charlson or Elixhauser.

We also need to discuss a higher-level question: if ICD codes change over time (which they do) but algorithms like Charlson and Elixhauser are not updated periodically, who is accountable for keeping those algorithms "current"? Programmers or researchers? I am generally a fan of ellessenne's approach: the fact that a data set might contain invalid ICD codes is the data curator's fault and should not be addressed by the package.

PS: I usually used my own programs to extract comorbidities. Recently, for a specific project, I decided to use R and compared the available packages.
Yes, agreed that nobody ever talks about the codes changing year to year, although in reality these changes are relatively slight and may hardly change the comorbidity results at all; my approach with `icd` is no exception. Regarding your questions: researchers (like me) are pretty much forced to do what has been done before, which is to gloss over the annual changes. Ideally, we could at least audit year-to-year changes, which I did make possible with `icd`.
I'm sorry you experienced a crash with `icd`.

Looking again for common ground for us to collaborate, it would be nice to share test data or test code. These can be used to compare results, and we will only grow stronger from finding out how our different implementations give different results. Speed is important for big data sets, and this is something I've been working hard on, so we can also compare speed with some common benchmark code. Happy to have this discussion.
Sharing test data and test code would be a great idea.
Thanks @salmasian and @jackwasey for the interesting discussion!
Hi Folks,

Please take a look at this PDF paper, which I've submitted to JSS. I roughly benchmarked the three packages and the results are in the paper. The benchmark code is in the GitHub repo if you want to run it. I also included a more detailed discussion about whether to assign incorrect three-digit codes to a neighboring comorbidity or not.

Still interested in sharing ideas on code validation. At some point there are diminishing returns, and it seems most researchers are just not interested in the microscopic details of ICD code accuracy when calculating comorbidities.

Jack
Both your packages are great, but it would be very helpful if you could better describe or explain the differences. My question is: why does the regular sum score differ between the two packages for ICD-9 "Quan" Elixhauser? I restrict the question to "Quan" Elixhauser because I prefer Elixhauser over Charlson, and it seems to me that Alessandro's package `comorbidity` implements the Quan version. To illustrate the problematic difference with a reproducible example, I use the Vermont sample data that is shipped with Jack's package `icd`.
Here, I calculate a regular sum score with `icd`:
Here, I calculate a regular sum score with `comorbidity`:
Why does the sum score differ between the two packages?

Anders
This is interesting, thank you @aalexandersson! I replicated your example using simulated ICD-9 data (only valid codes):

```r
library(comorbidity)
library(icd)
library(tidyverse)

# Simulating real ICD-9 codes
set.seed(1)
N <- 15000
x <- data.frame(
  id = sample(1:1000, size = N, replace = TRUE),
  code = sample_diag(N, version = "ICD9_2015"),
  stringsAsFactors = FALSE)
head(x)
#>    id  code
#> 1 266  3089
#> 2 373  V554
#> 3 573 85254
#> 4 909 45384
#> 5 202  2191
#> 6 899 20970
```

Package `icd`:

```r
icd_quan <- icd9_comorbid_quan_elix(x) %>%
  apply(2, as.integer)
elixsum <- rowSums(icd_quan, na.rm = TRUE)
table(elixsum)
#> elixsum
#>   0   1   2   3   4   5   6
#> 244 366 235 114  34   6   1
```

Package `comorbidity`:

```r
comorbidity9 <- comorbidity(x = x, id = "id", code = "code", score = "elixhauser_icd9")
elixsum <- rowSums(comorbidity9[2:32], na.rm = TRUE)
table(elixsum)
#> elixsum
#>   0   1   2   3   4   5   6
#> 245 355 238 113  35  11   3
```

These are real ICD-9 codes, so in principle there should be no difference if the only difference between the two packages is how they handle invalid codes.

P.s.: @jackwasey thanks for the draft of your JSS paper. I am a bit busy at the moment with other stuff, but I will definitely read it sooner rather than later! I have some opinions on assigning incorrect three-digit codes to a neighboring comorbidity or not, but I want to read your thoughts more carefully before commenting.

Edit: I noticed that
One immediate issue is that
@jackwasey @ellessenne
Is the remaining difference because
Which mismatched codes does each package find? Using the Stata command, I get these mismatches:
That is too many to review manually, so for now I only look at the first four mismatches (as noted above). For id 1,
If you compare the results with the IDs in the same order, there are only three mismatches. The first one I looked at: I haven't had time to look any deeper, or at why the sums don't match. I'm glad for this validation, though.
How do I sort the `icd` output by id?
I use something like:
Beware the row name vs. id column distinction in R, too.
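Jack's original snippet was not preserved in this thread; a minimal sketch of one way to sort the `icd` output by id (assuming, as in the examples here, that the comorbidity matrix stores the ids as character row names) might look like:

```r
library(icd)

# icd's comorbidity functions return one row per id, with the ids as
# (character) row names, so sort them numerically rather than lexically
icd_quan <- icd9_comorbid_quan_elix(x)
icd_quan <- icd_quan[order(as.numeric(row.names(icd_quan))), ]
```

Lexical sorting would put id "10" before id "2", hence the `as.numeric()` conversion.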
Thank you, Jack. The R code snippet sorts the input data with N = 15,000, but it seems to me that it is the output data with N = 1,000 that needs to be sorted by id. One part I really like with `comorbidity` is that the output keeps the id column.
Will do. FYI,
Hi guys,

```r
library(comorbidity)
library(icd)
#> Welcome to the 'icd' package for finding comorbidities and interpretation of ICD-9 and ICD-10 codes.
#> ?icd to get started, then see the vignettes and help for details and examples.
#>
#> Suggestions and contributions are welcome at https://github.com/jackwasey/icd . Please cite this package if you find it useful in your published work: citation(package = "icd")
library(tidyverse)
#> ── Attaching packages ──────────────────────────────────────────────── tidyverse 1.2.1 ──
#> ✔ ggplot2 2.2.1     ✔ purrr   0.2.4
#> ✔ tibble  1.4.2     ✔ dplyr   0.7.4
#> ✔ tidyr   0.8.0     ✔ stringr 1.3.1
#> ✔ readr   1.1.1     ✔ forcats 0.3.0
#> ── Conflicts ─────────────────────────────────────────────────── tidyverse_conflicts() ──
#> ✖ dplyr::explain() masks icd::explain()
#> ✖ dplyr::filter() masks stats::filter()
#> ✖ dplyr::lag()    masks stats::lag()

# Simulating real ICD-9 codes
set.seed(1)
N <- 15000
x <- data.frame(
  id = sample(1:1000, size = N, replace = TRUE),
  code = sample_diag(N, version = "ICD9_2015"), # sample_diag simulates only valid codes
  stringsAsFactors = FALSE
)
head(x)
#>    id  code
#> 1 266  3089
#> 2 373  V554
#> 3 573 85254
#> 4 909 45384
#> 5 202  2191
#> 6 899 20970
x <- x[order(x[["id"]]), ]
row.names(x) <- NULL
head(x)
#>   id  code
#> 1  1  7966
#> 2  1 01226
#> 3  1 64260
#> 4  1 23877
#> 5  1 11509
#> 6  1 74710

# Package icd
icd_quan <- icd9_comorbid_quan_elix(x, hierarchy = FALSE)
icd_quan <- data.frame(id.icd = row.names(icd_quan), score.icd = rowSums(icd_quan), stringsAsFactors = FALSE)

# Package comorbidity
comorbidity9 <- comorbidity(x = x, id = "id", code = "code", score = "elixhauser_icd9") %>%
  mutate(id.comorbidity = as.character(id)) %>%
  rename(score.comorbidity = score) %>%
  select(id.comorbidity, score.comorbidity)

# Merge and identify those that differ
singledf <- full_join(icd_quan, comorbidity9, by = c("id.icd" = "id.comorbidity")) %>%
  mutate(same = as.numeric(score.icd == score.comorbidity)) %>%
  rename(id = id.icd)
x[["id"]] <- as.character(x[["id"]])
x <- left_join(x, singledf, "id")

# Show
filter(x, same == 0)
#>    id  code score.icd score.comorbidity same
#> 1  43  1639         2                 3    0
#> 2  43  6114         2                 3    0
#> 3  43  9658         2                 3    0
#> 4  43 39890         2                 3    0
#> 5  43 52525         2                 3    0
#> 6  43 83119         2                 3    0
#> 7  43 40211         2                 3    0
#> 8  43 66901         2                 3    0
#> 9  43 40211         2                 3    0
#> 10 43 33901         2                 3    0
#> 11 43 94502         2                 3    0
#> 12 43 37312         2                 3    0
#> 13 43 71680         2                 3    0
#> 14 43  3545         2                 3    0
#> 15 43 V1221         2                 3    0
#> [...]
```

I kept only the output for the individual with id 43.

Anyway, I think there is no real, unique solution for the "keeping the ID" problem: it is more efficient to use matrices (as evident from your benchmarks, @jackwasey), but on the other hand they can only contain single-type data. I designed `comorbidity` to return a data frame instead.
Thanks! So, there are 35 mismatched ids? That's worse than 3 but much better than 727 :-)
Alright, thank you! I still find that three minor renal codes differ. You found a minor error in the renal section of the transcribed Quan Elixhauser ICD-9 map: I had accidentally written 588 instead of 588.0. The following code shows the output is now identical:
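The snippet itself was lost in this thread; a hypothetical reconstruction of such a check (assuming `x` as simulated above, the indicator columns of the `comorbidity` output in positions 2 to 32, and the corrected map in the current `icd`) would be along these lines:

```r
library(comorbidity)
library(icd)

# Recompute the per-id sum scores with both packages and compare them
icd_quan <- icd9_comorbid_quan_elix(x, hierarchy = FALSE)
icd_sum <- rowSums(icd_quan)[order(as.numeric(row.names(icd_quan)))]

com <- comorbidity(x = x, id = "id", code = "code", score = "elixhauser_icd9")
com_sum <- rowSums(com[, 2:32])

all.equal(unname(icd_sum), com_sum)
```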
There is still the question of the scores, which I'll get to. Most people use the van Walraven system for scoring Elixhauser comorbidities, not just the sum of the number of flags. What scoring system did you implement?
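For context, `icd` exposes weighted-scoring helpers; a sketch of computing van Walraven weighted scores (assuming a recent version of `icd` that exports `van_walraven()`) might look like:

```r
library(icd)

# van Walraven weights applied to the Quan-Elixhauser comorbidity flags;
# x is a data frame with an id column and an ICD-9 code column
vw_scores <- van_walraven(x)
head(vw_scores)
```

Unlike a plain sum of flags, the van Walraven score weights each comorbidity (some weights are negative), so two patients with the same number of flags can have different scores.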
References:
I have an unrelated question, though: in my experience as an applied researcher, I have never used the actual weighted comorbidity score for anything; I always ended up using single comorbidities, or even adding additional comorbidities based on ICD codes, as I was dealing with large healthcare data and registries. What is your experience on this point? I reckon this is probably just my biased experience, so that's why I thought about asking! I am genuinely curious here 😄
Interesting to hear that. I also use the comorbidities directly, usually for matching.
My experience is the same. However, applied researchers differ widely in skills and needs. I work for the Florida cancer data registry.

Before discussing scoring other than sum scores, it would help me if we first could resolve the mismatches so far: do we agree that there are 35 mismatched IDs in the example data in terms of sum scores? Jack wrote:
But I got 29 "element mismatches":
Would you please make sure you have the latest development version of `icd`?
I had up-to-date packages, except for `icd`, which I have now updated to the development version.
As a result, when I re-run Jack's code, the 29 element mismatches are resolved:
However, there are still these 9 mismatched ids when running Alessandro's code:
I do not understand what the "element mismatches" were, but that's okay. More importantly, do you still have the 9 mismatched ids listed above?
I agree with Jack that it is now time to focus on the scoring. In Alessandro's code, I do not understand the `score` column: it does not seem to equal the plain sum of the comorbidity indicators. This can be verified by running Alessandro's code until this line:
There, run this code:
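The snippet itself was lost in this thread; a check along these lines (assuming `comorbidity9` is the raw output of `comorbidity()` before the renaming steps, with the indicator columns in positions 2 to 32 and a `score` column, as in the code above) would show whether the two quantities agree:

```r
library(comorbidity)

# Compare the package-supplied score with a plain sum of the indicator columns
comorbidity9 <- comorbidity(x = x, id = "id", code = "code", score = "elixhauser_icd9")
table(comorbidity9$score == rowSums(comorbidity9[, 2:32]))
```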
Here is a list of all 9 observations for the mismatched ids:
Hello @aalexandersson! The difference in score is due to the fact that `comorbidity` applies the comorbidity hierarchy via the `assign0` argument. See below: calling `comorbidity()` with `assign0 = FALSE` gives identical results:

```r
library(comorbidity)
library(icd)
#> Welcome to the 'icd' package for finding comorbidities and interpretation of ICD-9 and ICD-10 codes.
#> ?icd to get started, then see the vignettes and help for details and examples.
#>
#> Suggestions and contributions are welcome at https://github.com/jackwasey/icd . Please cite this package if you find it useful in your published work: citation(package = "icd")
requireNamespace("testthat")
#> Loading required namespace: testthat
library(tidyverse)
#> -- Attaching packages --------------------------------- tidyverse 1.2.1 --
#> v ggplot2 2.2.1     v purrr   0.2.4
#> v tibble  1.4.2     v dplyr   0.7.4
#> v tidyr   0.8.0     v stringr 1.3.1
#> v readr   1.1.1     v forcats 0.3.0
#> -- Conflicts ------------------------------------ tidyverse_conflicts() --
#> x dplyr::explain() masks icd::explain()
#> x dplyr::filter() masks stats::filter()
#> x dplyr::lag()    masks stats::lag()

set.seed(1)
N <- 15000
x <- data.frame(id = sample(1:1000, size = N, replace = TRUE),
                code = sample_diag(N, version = "ICD9_2015"),
                stringsAsFactors = FALSE)
x <- x[order(x[["id"]]), ]
row.names(x) <- NULL

# Package icd
icd_quan <- icd9_comorbid_quan_elix(x, hierarchy = FALSE)
icd_quan <- data.frame(id.icd = row.names(icd_quan), score.icd = rowSums(icd_quan),
                       stringsAsFactors = FALSE)

# Package comorbidity
comorbidity9 <- comorbidity(x = x, id = "id", code = "code", score = "elixhauser_icd9", assign0 = FALSE) %>%
  mutate(id.comorbidity = as.character(id)) %>%
  rename(score.comorbidity = score) %>%
  select(id.comorbidity, score.comorbidity)

# Merge and identify those that differ
singledf <- full_join(icd_quan, comorbidity9, by = c(id.icd = "id.comorbidity")) %>%
  mutate(same = as.numeric(score.icd == score.comorbidity)) %>%
  rename(id = id.icd)
x[["id"]] <- as.character(x[["id"]])
x <- left_join(x, singledf, "id")

# Show
filter(x, same == 0)
#> [1] id                code              score.icd         score.comorbidity
#> [5] same
#> <0 rows> (or 0-length row.names)
```
Hello, as I mentioned before, `icd` has functions for this. I'm glad at last we have verified the calculations!
This has been a really helpful discussion, so can we go back to the original problem now? Let me know when you've read the discussion about interpretation of incorrect three-digit codes in my JSS article.
@jackwasey I also read the draft of your JSS paper (very well written, by the way; good job): summarising your main points,

Please correct me if I am mistaken. I have some ideas on validation (well, sort of) to play around with and potentially implement in `comorbidity`.

Alessandro

P.s.: it was very interesting to read about the performance of the three packages!
Thank you both. The official Stata command |
Thanks @aalexandersson, I was not aware of Stata's
@jackwasey @ellessenne Here is the same test data as before:
Here are the AHRQ sum scores from `icd`:
This is the output:
Here are the AHRQ sum scores from `comorbidity`:
This is the output:
Why do the two sets of sum scores differ for AHRQ? How can the two programs be made to calculate the same sum scores for AHRQ?
Hi @aalexandersson, could you try running the same code with `assign0 = FALSE`?
Here is the output when instead using the syntax `assign0 = FALSE`:
I see. Let me have a deeper look, I'll get back to you ASAP!
For regular (unweighted) sum scores, I think we should use
@aalexandersson I had a quick look at some of the differences, and I think it may depend on which codes have been used to detect the comorbidities; I am overall using the "Enhanced ICD-9-CM" codes from Quan et al., not sure what @jackwasey is using, though.
I like your idea of having no default for `assign0`.
For AHRQ comorbidities, it seems to me that @jackwasey in `icd` uses a different (more recent) version of the AHRQ mapping. I base this conclusion on reading jackwasey/icd#112 and https://cran.r-project.org/web/packages/icd/vignettes/compare-maps.html . I guess that explains the differences in sum scores between `icd` and `comorbidity`.
I just remembered that for
Thanks for bringing this up, and thanks to you both for looking into it. I will have some time soon to dig into this.
I used
I think this is the remaining issue: I will try to look into this next week. Any feedback is welcome. After all, you both are the experts! At least, I am learning :-)
Here is yet another attempt to be more precise about the "AHRQ Elixhauser" comorbidities and their AHRQ scoring. Again, this is simply my understanding of what you both developed.

The (29) AHRQ Elixhauser comorbidities:

- The package
Hi, all. I have great appreciation for the work put into these packages -- thank you! I just wanted to pop in with my opinion on how I would deal with invalid or new ICD codes. I think, as a user, the most important thing is to be informed. I have data with a few invalid ICD codes that, using {comorbidity}, get assigned to a comorbidity. (Appropriately? Inappropriately? A matter of perspective.) If I were writing this (I'm not skilled enough to do so well!), I would inform the user about these codes (I love a default verbose = TRUE!). Looking at the implementation of

Thanks again for the thoughtful work on these!
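A sketch of the kind of warning the commenter describes, using `icd`'s code validation (this assumes the `is_valid()` helper exported by recent versions of `icd`; older versions named it `icd_is_valid()`, and the `short_code` argument indicates codes without a decimal point):

```r
library(icd)

# Warn the user about codes that are not valid ICD-9 before
# computing comorbidities, instead of silently matching them
bad <- x$code[!is_valid(x$code, short_code = TRUE)]
if (length(bad) > 0) {
  warning("Ignoring ", length(bad), " invalid ICD-9 code(s): ",
          paste(unique(bad), collapse = ", "))
}
```

A `verbose = TRUE` default in the comorbidity function itself could emit this message automatically.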
Hi, I'm the author of the R package 'icd', and I'm glad to see that several of us have worked on solving the comorbidity computation problem. Just noticed your package today. Also, glad to see you live in my home country!
cc @patrickmdnet who is the author of 'medicalrisk'
I took some time this morning to compare the comorbidity computations between our packages, both in speed and content. I was distressed to see we all differed from each other, particularly in the COPD/chronic lung disease and cancer/tumor categories. I dug into your source code and noticed you grep for descendants of a non-existent top-level code (498), giving a false positive for chronic lung disease with a random test code 498.82. It is an open question what we should all do when potentially valid but utterly non-existent codes appear, particularly as different annual revisions may gain or lose codes, and we would probably want to sweep them all up when looking for comorbidities.
In 'icd' I took the view that I would count non-existent descendants of extant codes, but I would exclude codes which had no parent with any association with a comorbidity.
I didn't look into the cancer side, but there were many more discrepancies which I suspect to be of the same origin.
Can I suggest randomly generating strings for testing? You can see I do this in the icd source code. I see you generated test data by sampling only valid codes.
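A minimal sketch of the idea (not the actual `icd` test code; this hypothetical generator just builds ICD-9-shaped strings, most of which will not be real codes):

```r
# Generate random ICD-9-like strings to exercise the handling of
# non-existent codes, e.g. descendants of a missing top-level code
set.seed(42)
random_codes <- replicate(1000, paste0(
  sample(c(as.character(0:9), "V", "E"), 1),
  paste(sample(0:9, sample(2:4, 1), replace = TRUE), collapse = "")))
head(random_codes)
```

Feeding such strings to each package and comparing the resulting comorbidity flags would surface exactly the kind of 498.82-style discrepancies described above.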
One way for us all to work together might be for you to continue to implement comorbidities how you wish, and consider importing the 'icd' package for validation and explanation of actual codes, which I've put a lot of time into. I'm open to considering other ways for us to collaborate.
Best wishes,
Jack