---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
# svyVarSel
<!-- badges: start -->
<!-- badges: end -->
This package fits linear and logistic LASSO and elastic net models to complex survey data.
It depends on the `survey` and `glmnet` packages.
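As a rough illustration of what weighting the estimation means, the sketch below passes observation weights to `glmnet::glmnet()` on simulated data. The data, the weights `w` and all object names are made up for this example. It is not a description of the package's internals, which additionally select lambda by means of replicate weights.

```{r sketch-weighted-glmnet, eval = FALSE}
library(glmnet)

set.seed(1)
n <- 200  # hypothetical sample size
p <- 10   # hypothetical number of covariates

# Simulated covariates, binary response and made-up sampling weights
x <- matrix(rnorm(n * p), nrow = n, ncol = p)
y <- rbinom(n, size = 1, prob = plogis(x[, 1] - x[, 2]))
w <- runif(n, min = 1, max = 5)

# Weighted logistic elastic net over glmnet's default lambda path
fit <- glmnet(x, y, family = "binomial", weights = w, alpha = 0.5)

# Rows: intercept plus covariates; columns: one per lambda value
dim(coef(fit))
```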
Five functions are available in the package:
- `welnet`: This is the **main function**. It fits elastic net (linear or logistic) models to complex survey data, including ridge and LASSO regression models depending on the selected mixing parameter. Sampling weights are considered in the estimation process, and the lambda that minimizes the error is selected by means of different replicate weights methods.
- `wlasso`: fits LASSO prediction (linear or logistic) models to complex survey data, considering sampling weights in the estimation process and selecting the lambda that minimizes the error by means of different replicate weights methods (equivalent to the `welnet()` function when `alpha = 1`).
- `welnet.plot`: plots objects of class `welnet`, indicating the estimated error for each lambda value and the number of covariates in the model that minimizes the error.
- `wlasso.plot`: plots objects of class `wlasso`, indicating the estimated error for each lambda value and the number of covariates in the model that minimizes the error.
- `replicate.weights`: randomly defines training and test sets by means of the replicate weights methods analyzed throughout the paper. The functions `welnet()` and `wlasso()` depend on this function to define training and test sets. In particular, the following methods are available through this function:
  - Methods that depend on the function `as.svrepdesign` from the `survey` package: Jackknife Repeated Replication (`JKn`), Bootstrap (`bootstrap` and `subbootstrap`) and Balanced Repeated Replication (`BRR`).
  - **New proposals:** Design-based cross-validation (`dCV`), split-sample repeated replication (`split`) and extrapolation (`extrapolation`).
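
For the methods that rely on `as.svrepdesign`, the following minimal sketch shows what the `survey` package produces on its own. It uses the `apistrat` example data shipped with `survey` and is only meant to illustrate the underlying replicate weights; it does not show how `replicate.weights()` uses them internally. The object names are chosen for this example.

```{r sketch-svrepdesign, eval = FALSE}
library(survey)

# Example data shipped with the survey package
data(api, package = "survey")

# Stratified design: one stratum per school type
dstrat <- svydesign(ids = ~1, strata = ~stype, weights = ~pw,
                    fpc = ~fpc, data = apistrat)

# Convert to a replicate-weights design using stratified jackknife (JKn)
jkn <- as.svrepdesign(dstrat, type = "JKn")

# Matrix of analysis weights: one row per observation, one column per replicate
rw <- weights(jkn, type = "analysis")
dim(rw)
```

Changing `type` to `"bootstrap"` or `"subbootstrap"` gives the corresponding replicate weights from the same design object.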
## Installation
To install it from [CRAN](https://CRAN.R-project.org):
```{r, eval = FALSE}
install.packages("svyVarSel")
```
To install the updated version of the package from GitHub:
```{r, eval = FALSE}
devtools::install_github("aiparragirre/svyVarSel")
```
## Example
Fit a logistic elastic net model as follows:
```{r example}
library(svyVarSel)
data(simdata_lasso_binomial)
mcv <- welnet(data = simdata_lasso_binomial,
              col.y = "y", col.x = 1:50,
              family = "binomial",
              alpha = 0.5,
              cluster = "cluster", strata = "strata", weights = "weights",
              method = "dCV", k = 10, R = 20)
```
Or equivalently:
```{r example-cont}
mydesign <- survey::svydesign(ids = ~cluster, strata = ~strata, weights = ~weights,
                              nest = TRUE, data = simdata_lasso_binomial)
mcv <- welnet(col.y = "y", col.x = 1:50, design = mydesign,
              family = "binomial", alpha = 0.5,
              method = "dCV", k = 10, R = 20)
```
Then, plot the result as follows:
```{r example-plot}
welnet.plot(mcv)
```
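Since `wlasso()` is described above as equivalent to `welnet()` with `alpha = 1`, a LASSO fit on the same data might look like the sketch below. It assumes that `wlasso()` accepts the same design and method arguments as `welnet()` apart from `alpha`; check `?wlasso` for the exact signature. The object name `mlasso` is chosen for this example.

```{r sketch-wlasso, eval = FALSE}
library(svyVarSel)
data(simdata_lasso_binomial)

# Assumption: same arguments as the welnet() call, without the mixing parameter
mlasso <- wlasso(data = simdata_lasso_binomial,
                 col.y = "y", col.x = 1:50,
                 family = "binomial",
                 cluster = "cluster", strata = "strata", weights = "weights",
                 method = "dCV", k = 10, R = 20)

# Plot the estimated error across the lambda grid
wlasso.plot(mlasso)
```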
If you only need the replicate weights, for instance to use them in other analyses, call the `replicate.weights()` function directly:
```{r example-rw}
newdata <- replicate.weights(data = simdata_lasso_binomial,
                             method = "dCV",
                             cluster = "cluster",
                             strata = "strata",
                             weights = "weights",
                             k = 10, R = 20,
                             rw.test = TRUE)
```
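To see what was added, one option is to compare the column names before and after the call. This sketch assumes that `replicate.weights()` returns the input data frame with the replicate weights appended as extra columns; the names of those columns are not assumed here. The object name `added` is chosen for this example.

```{r sketch-inspect-rw, eval = FALSE}
library(svyVarSel)
data(simdata_lasso_binomial)

newdata <- replicate.weights(data = simdata_lasso_binomial,
                             method = "dCV",
                             cluster = "cluster",
                             strata = "strata",
                             weights = "weights",
                             k = 10, R = 20,
                             rw.test = TRUE)

# Columns present in the output but not in the original data
# (assumed to hold the replicate weights)
added <- setdiff(names(newdata), names(simdata_lasso_binomial))
length(added)
head(added)
```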