
must.include and can.include don't work simultaneously #27

@julienvollering


I notice that can.include and must.include each work on their own, but not together. See my reprex below.
I can't think of any reason why, in principle, they couldn't both be used. It would be very useful to be able to build a sample that includes legacy samples but is also restricted to a subset of the population.

Looking at the R code, it seems to me that the problem lies with can.include <- 1:nrow(dat) in clhs.data.frame (https://github.com/pierreroudier/clhs/blob/8d45408d030b74b81073a8aa5fc4aec4d860ce06/R/clhs-data.frame.R#L122C1-L122C33): if must.include is given, can.include becomes the rest of the rows, which would explain why a can.include passed alongside it has no effect. I don't know, however, whether changing this in the R code would work with the C++ implementation...
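
A minimal base-R sketch of the behaviour I would expect. The helper name candidate_pool is hypothetical and not part of clhs; it only illustrates keeping a user-supplied can.include and removing the rows already fixed by must.include, instead of resetting can.include to every remaining row. Whether the C++ code path can consume such a pool is an open question.

```r
# Hypothetical helper, not part of clhs: rows the optimiser may still draw
# from once the must.include rows are fixed in the sample.
candidate_pool <- function(n_rows, can.include = NULL, must.include = NULL) {
  # Fall back to every row only when the user gave no can.include
  if (is.null(can.include)) can.include <- seq_len(n_rows)
  # Rows already forced into the sample are no longer free candidates
  setdiff(can.include, must.include)
}

pool <- candidate_pool(1000, can.include = 1:500, must.include = 1:25)
range(pool)
#> [1]  26 500
```

With no can.include, the same helper returns all rows except must.include, which matches the current behaviour described above.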

library(clhs)
#> The legacy packages maptools, rgdal, and rgeos, underpinning the sp package,
#> which was just loaded, were retired in October 2023.
#> Please refer to R-spatial evolution reports for details, especially
#> https://r-spatial.org/r/2023/05/15/evolution4.html.
#> It may be desirable to make the sf package available;
#> package maintainers should consider adding sf to Suggests:.

df <- data.frame(
  a = runif(1000), 
  b = rnorm(1000), 
  c = sample(LETTERS[1:5], size = 1000, replace = TRUE))

res1 <- clhs(df, size = 50, use.cpp = TRUE, iter = 5000, progress = FALSE, simple = FALSE,
            can.include = 1:500)
#> Warning: NAs introduced by coercion
range(res1$index_samples) # can.include correctly restricts range
#> [1]  10 498

res2 <- clhs(df, size = 50, use.cpp = TRUE, iter = 5000, progress = FALSE, simple = FALSE,
             must.include = 1:25)  
#> Warning: NAs introduced by coercion
print(res2$index_samples, N=50) # must.include correctly guarantees selection
#>  [1] 922 279 712 736 754 962 569 810 557 554 109 244 847 356 629 352 724 686 879
#> [20] 819 286 448 825 133 523   1   2   3   4   5   6   7   8   9  10  11  12  13
#> [39]  14  15  16  17  18  19  20  21  22  23  24  25
range(res2$index_samples) 
#> [1]   1 962

res3 <- clhs(df, size = 50, use.cpp = TRUE, iter = 5000, progress = FALSE, simple = FALSE,
            can.include = 1:500, must.include = 1:25)
#> Warning: NAs introduced by coercion
range(res3$index_samples) # in combination with must.include, can.include does not restrict range
#> [1]   1 954
print(res3$index_samples, N=50)
#>  [1] 399 830 841 360 625 548 448  19 232 954 199 785 441 322 603 252 754 804 244
#> [20] 484 167 611 236 261 462   1   2   3   4   5   6   7   8   9  10  11  12  13
#> [39]  14  15  16  17  18  19  20  21  22  23  24  25

Created on 2024-06-20 with reprex v2.0.2
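
To make the expectation checkable, here is a small base-R sketch. respects_constraints is a hypothetical helper, not part of clhs; it assumes a valid sample contains every must.include row and otherwise only rows from can.include.

```r
# Hypothetical helper, not part of clhs: TRUE when a vector of sampled row
# indices contains all must.include rows and nothing outside the allowed rows.
respects_constraints <- function(idx, can.include, must.include) {
  allowed <- union(can.include, must.include)
  all(must.include %in% idx) && all(idx %in% allowed)
}

# The must.include rows plus three indices copied from the res3 output above
respects_constraints(c(1:25, 399, 830, 841),
                     can.include = 1:500, must.include = 1:25)
#> [1] FALSE
```

Indices 830 and 841 fall outside 1:500, so the check fails, which is the behaviour this issue reports.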

Session info
sessionInfo()
#> R version 4.3.1 (2023-06-16 ucrt)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 19045)
#> 
#> Matrix products: default
#> 
#> 
#> locale:
#> [1] LC_COLLATE=English_United Kingdom.utf8 
#> [2] LC_CTYPE=English_United Kingdom.utf8   
#> [3] LC_MONETARY=English_United Kingdom.utf8
#> [4] LC_NUMERIC=C                           
#> [5] LC_TIME=English_United Kingdom.utf8    
#> 
#> time zone: Europe/Oslo
#> tzcode source: internal
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> loaded via a namespace (and not attached):
#>  [1] styler_1.10.2     digest_0.6.33     fastmap_1.1.1     xfun_0.41        
#>  [5] magrittr_2.0.3    glue_1.6.2        R.utils_2.12.3    knitr_1.45       
#>  [9] htmltools_0.5.7   rmarkdown_2.25    lifecycle_1.0.4   cli_3.6.1        
#> [13] R.methodsS3_1.8.2 vctrs_0.6.4       reprex_2.0.2      withr_2.5.2      
#> [17] compiler_4.3.1    R.oo_1.25.0       R.cache_0.16.0    purrr_1.0.2      
#> [21] rstudioapi_0.15.0 tools_4.3.1       evaluate_0.23     yaml_2.3.7       
#> [25] rlang_1.1.2       fs_1.6.3

