diff --git a/docs/KSIC2025/index.html b/docs/KSIC2025/index.html
index 19052a4..ffa6760 100644
--- a/docs/KSIC2025/index.html
+++ b/docs/KSIC2025/index.html
@@ -388,6 +388,10 @@ margin-right: 0; }
+
+
+
+
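The goal is not to balance the baseline (X) but to emulate an RCT (O)
Understand the pros and cons of matching and IPTW
The goal is not to balance the baseline (X) but to emulate an RCT (O)
Matching estimates the ATT (average treatment effect on the treated); IPTW estimates the ATE (average treatment effect)
Matching is simple but shrinks the sample. IPTW keeps the sample, but the analysis is more complex and the weights can cause problems
With three or more groups, match to the smallest group or use IPTW with the twang package.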
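ATE (average treatment effect) vs ATT (average treatment effect on the treated)
Matching gives the ATT, IPTW gives the ATE
openstat.ai supports two-group matching and IPTW, followed by analysis
-A new methodology called clone-censor-weight
+Two groups: the MatchIt package; three groups: the twang package
Understand logistic regression, nearest-neighbor matching, and calipers
Three-group matching: run matching twice, each time against the group with the smallest N
openstat.ai supports two-group matching/IPTW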
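Analysis issues
+Should the pair information be used after matching? (e.g., stratified Cox)
Is it acceptable to run matching/IPTW separately by sex?
The goal is causal inference with an RCT-like design
+\(ITE = Y_{1i} - Y_{0i}\): only heaven knows (both outcomes are never observed for the same person)
\(ATE = E[Y_1 - Y_0]\): what an RCT estimates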
\(ATT = E[Y_1 - Y_0 | T=1]\), \(ATC = E[Y_1 - Y_0 | T=0]\)
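To make the three estimands concrete, here is a small simulation sketch in base R. Everything in it (the single confounder `x`, the effect size, the sample size) is an assumption chosen for illustration, not data from this deck. Because both potential outcomes are generated, the ITE is visible here, which it never is in real data.

```r
set.seed(1)
n   <- 10000
x   <- rnorm(n)                 # a single confounder (assumed)
ps  <- plogis(x)                # true probability of treatment
trt <- rbinom(n, 1, ps)         # observed treatment
y0  <- x + rnorm(n)             # potential outcome without treatment
y1  <- y0 + 1 + 0.5 * x         # potential outcome with treatment; effect grows with x
ite <- y1 - y0                  # individual effect, known only in a simulation

ate <- mean(ite)                # whole population
att <- mean(ite[trt == 1])      # treated only
atc <- mean(ite[trt == 0])      # controls only

# Naive comparison of observed group means, confounded by x
naive <- mean(y1[trt == 1]) - mean(y0[trt == 0])

round(c(ATE = ate, ATT = att, ATC = atc, naive = naive), 2)
```

When the effect differs between people who tend to be treated and people who do not, ATE, ATT and ATC are different numbers, and the naive difference matches none of them.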
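What is a propensity score?
+So if we pick controls whose PS is close to that of the treated patients, the baselines of the two groups should become similar!
+RCT: every person has a 50:50 chance of being assigned to either arm
+PS matching: what is the chance that a person with PS 0.7 ends up in either arm? 50:50
+So does PS matching earn the same credibility as an RCT?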
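The matching idea above can be sketched in base R: estimate the PS with logistic regression, then pair each treated patient with the nearest unused control within a caliper. This is a hand-rolled greedy sketch on simulated data, not the algorithm of the MatchIt package; the data, the variable names, and the caliper of 0.2 SD of the logit PS are all assumptions.

```r
set.seed(42)
n   <- 500
age <- round(runif(n, 35, 90))
trt <- rbinom(n, 1, plogis(-4 + 0.065 * age))   # older patients are treated more often
dat <- data.frame(id = seq_len(n), age, trt)

# Step 1: propensity score from logistic regression
dat$ps <- fitted(glm(trt ~ age, data = dat, family = binomial))

# Step 2: greedy 1:1 nearest-neighbor matching without replacement
caliper  <- 0.2 * sd(qlogis(dat$ps))            # assumed rule of thumb
treated  <- dat[dat$trt == 1, ]
controls <- dat[dat$trt == 0, ]
pairs    <- list()
for (i in seq_len(nrow(treated))) {
  if (nrow(controls) == 0) break
  d <- abs(qlogis(controls$ps) - qlogis(treated$ps[i]))
  j <- which.min(d)
  if (d[j] <= caliper) {                        # keep the pair only if close enough
    pairs[[length(pairs) + 1]] <- data.frame(treated_id = treated$id[i],
                                             control_id = controls$id[j])
    controls <- controls[-j, ]                  # each control is used at most once
  }
}
pairs   <- do.call(rbind, pairs)
matched <- dat[dat$id %in% c(pairs$treated_id, pairs$control_id), ]

# Mean age by group before and after matching
raw_diff     <- diff(tapply(dat$age, dat$trt, mean))
matched_diff <- diff(tapply(matched$age, matched$trt, mean))
c(n_before = nrow(dat), n_after = nrow(matched))
```

Treated patients with no control inside the caliper are dropped, which is why the matched sample is smaller than the original cohort.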
+Inverse probability of treatment weighting
+What is the probability that a person with PS 0.7 belongs to each group?
+\[0.7 \times \frac{1}{0.7} = 0.3 \times \frac{1}{0.3} = 1\]
+So does IPTW estimate the ATE or the ATT?
+ATE weight
+\[
+w_i \text{ for } ATE =
+\begin{cases}
+\frac{1}{p_i} & \text{if treated} \\
+\frac{1}{1 - p_i} & \text{if control}
+\end{cases}
+\] Like an RCT that randomly assigns a population twice the size of the full sample (treated + control)
+ATT weight
+\[
+w_i \text{ for } ATT =
+\begin{cases}
+1 & \text{if treated} \\
+\frac{p_i}{1 - p_i} & \text{if control}
+\end{cases}
+\] Like randomly assigning a population made of treated + treated
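The two weight formulas translate directly into R. The helper name `iptw_weights` is hypothetical and exists only for this sketch; the two propensity scores are taken from the first two rows of the patient table in this deck (one treated patient, one control).

```r
# Hypothetical helper: ATE and ATT weights from treatment status and propensity score
iptw_weights <- function(treated, ps) {
  data.frame(
    ATE = ifelse(treated == 1, 1 / ps, 1 / (1 - ps)),
    ATT = ifelse(treated == 1, 1, ps / (1 - ps))
  )
}

treated <- c(1, 0)                    # first row treated, second row control
ps      <- c(0.1718567, 0.3163092)    # propensity scores of those two rows
w <- iptw_weights(treated, ps)
round(w, 6)
```

The results agree with the ATE and ATT columns of the table for those two rows. A treated patient with a low PS gets a large ATE weight because that patient stands in for many similar people who were not treated.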
name | age | TAVI | Survival | Propensity_score | ATE | ATT |
---|---|---|---|---|---|---|
빈센조 | 35 | 1 | 1 | 0.1718567 | 5.818801 | 1.0000000 |
루카스 | 48 | 0 | 1 | 0.3163092 | 1.462649 | 0.4626495 |
제이슨 | 50 | 0 | 1 | 0.3435664 | 1.523383 | 0.5233834 |
토마스 | 53 | 1 | 0 | 0.3864109 | 2.587919 | 1.0000000 |
리오넬 | 55 | 0 | 1 | 0.4160330 | 1.712426 | 0.7124256 |
카밀라 | 68 | 1 | 1 | 0.6136449 | 1.629607 | 1.0000000 |
아칸지 | 70 | 1 | 1 | 0.6424478 | 1.556547 | 1.0000000 |
에밀리 | 75 | 0 | 1 | 0.7097903 | 3.445784 | 2.4457837 |
노이어 | 80 | 1 | 0 | 0.7690095 | 1.300374 | 1.0000000 |
호날두 | 85 | 1 | 0 | 0.8192224 | 1.220670 | 1.0000000 |
앨리스 | 40 | 0 | 1 | 0.2202580 | 1.282475 | 0.2824754 |
밥 | 45 | 0 | 1 | 0.2777194 | 1.384504 | 0.3845035 |
찰리 | 52 | 0 | 0 | 0.3718948 | 1.592090 | 0.5920901 |
다니엘 | 60 | 1 | 0 | 0.4923210 | 2.031195 | 1.0000000 |
엘리자베스 | 62 | 1 | 1 | 0.5231400 | 1.911534 | 1.0000000 |
프랭크 | 67 | 0 | 1 | 0.5989249 | 2.493298 | 1.4932985 |
그레이스 | 73 | 1 | 1 | 0.6837417 | 1.462541 | 1.0000000 |
헨리 | 77 | 0 | 1 | 0.7345263 | 3.766852 | 2.7668518 |
이사벨 | 82 | 1 | 0 | 0.7901901 | 1.265518 | 1.0000000 |
제임스 | 88 | 1 | 0 | 0.8450253 | 1.183396 | 1.0000000 |
존 | 42 | 0 | 1 | 0.2421700 | 1.319557 | 0.3195571 |
마리아 | 49 | 0 | 1 | 0.3297948 | 1.492080 | 0.4920803 |
피터 | 54 | 1 | 0 | 0.4011317 | 2.492947 | 1.0000000 |
사라 | 59 | 1 | 0 | 0.4769187 | 2.096793 | 1.0000000 |
데이비드 | 61 | 0 | 1 | 0.5077378 | 2.031438 | 1.0314379 |
제니퍼 | 46 | 0 | 1 | 0.2902582 | 1.408963 | 0.4089632 |
케빈 | 51 | 1 | 1 | 0.3576063 | 2.796371 | 1.0000000 |
레베카 | 76 | 0 | 1 | 0.7223278 | 3.601369 | 2.6013690 |
토니 | 81 | 1 | 0 | 0.7797825 | 1.282409 | 1.0000000 |
엘리 | 83 | 1 | 0 | 0.8002318 | 1.249638 | 1.0000000 |
스티브 | 37 | 1 | 1 | 0.1901277 | 5.259622 | 1.0000000 |
안나 | 47 | 0 | 1 | 0.3031256 | 1.434979 | 0.4349789 |
마이클 | 51 | 0 | 1 | 0.3576063 | 1.556678 | 0.5566777 |
제시카 | 68 | 1 | 0 | 0.6136449 | 1.629607 | 1.0000000 |
댄 | 84 | 1 | 1 | 0.8099085 | 1.234707 | 1.0000000 |
소피아 | 59 | 0 | 1 | 0.4769187 | 1.911749 | 0.9117489 |
브라이언 | 62 | 1 | 1 | 0.5231400 | 1.911534 | 1.0000000 |
나탈리 | 78 | 0 | 1 | 0.7463771 | 3.942861 | 2.9428614 |
대니얼 | 84 | 1 | 0 | 0.8099085 | 1.234707 | 1.0000000 |
엘레나 | 89 | 1 | 0 | 0.8529311 | 1.172428 | 1.0000000 |
로버트 | 39 | 0 | 1 | 0.2098490 | 1.265581 | 0.2655808 |
줄리아 | 46 | 0 | 1 | 0.2902582 | 1.408963 | 0.4089632 |
스콧 | 53 | 0 | 0 | 0.3864109 | 1.629755 | 0.6297551 |
니콜 | 77 | 1 | 0 | 0.7345263 | 1.361422 | 1.0000000 |
앤드류 | 75 | 1 | 1 | 0.7097903 | 1.408867 | 1.0000000 |
케이트 | 54 | 0 | 1 | 0.4011317 | 1.669816 | 0.6698162 |
라이언 | 59 | 1 | 1 | 0.4769187 | 2.096793 | 1.0000000 |
미셸 | 86 | 0 | 1 | 0.8281768 | 5.819935 | 4.8199354 |
조셉 | 88 | 1 | 0 | 0.8450253 | 1.183396 | 1.0000000 |
엘레나 | 83 | 1 | 0 | 0.8002318 | 1.249638 | 1.0000000 |
Group | Treatment | Control |
---|---|---|
ATT | 70 | 69.04 |
ATE | 62.17 | 63.1 |
Original Cohort | 70(63.72) | 56.35(63.72) |
IPTW gives the ATE, so should we always use it?
+Truncated weight
+The analysis becomes more difficult
+Account for the weights in GLM and Cox models (the weights option of glm and coxph, or svyglm and svycoxph)
Weighted Kaplan-Meier curves (the weights option of survfit, or svykm)
log-rank test(X), survey rank test(O)
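The truncated weights and the `weights` option of `glm` listed above can be sketched in base R as follows. The data are simulated, and the 1st and 99th percentile cut-offs are an assumption; the slides do not state which cut-offs to use.

```r
set.seed(7)
n   <- 2000
age <- runif(n, 35, 90)
trt <- rbinom(n, 1, plogis(-4 + 0.065 * age))
y   <- rbinom(n, 1, plogis(1.5 - 0.02 * age - 0.3 * trt))   # simulated binary outcome

ps <- fitted(glm(trt ~ age, family = binomial))
w  <- ifelse(trt == 1, 1 / ps, 1 / (1 - ps))                # ATE weights

# Truncate extreme weights at the 1st and 99th percentiles (assumed cut-offs)
lim     <- quantile(w, c(0.01, 0.99))
w_trunc <- pmin(pmax(w, lim[1]), lim[2])
rbind(raw = range(w), truncated = range(w_trunc))

# Weighted logistic regression; quasibinomial avoids the warning that
# binomial() raises for non-integer weights
fit <- glm(y ~ trt, family = quasibinomial, weights = w_trunc)
exp(coef(fit)["trt"])                                       # weighted odds ratio
```

The point estimate from the weighted `glm` is usable, but its model-based standard errors treat the weights as if they were counts. That is one reason the survey-based route (`svyglm` on a design built with `svydesign`) is the usual choice for inference.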
## Gaussian
-glm_gaussian <- glm(mpg~cyl + disp, data = mtcars)
-glmshow.display(glm_gaussian, decimal = 2)
$first.line
-[1] "Linear regression predicting mpg\n"
-
-$table
- crude coeff.(95%CI) crude P value adj. coeff.(95%CI) adj. P value
-cyl "-2.88 (-3.51,-2.24)" "< 0.001" "-1.59 (-2.98,-0.19)" "0.034"
-disp "-0.04 (-0.05,-0.03)" "< 0.001" "-0.02 (-0.04,0)" "0.054"
-
-$last.lines
-[1] "No. of observations = 32\nR-squared = 0.7596\nAIC value = 167.1456\n\n"
## Gaussian
+glm_gaussian <- glm(mpg~cyl + disp, data = mtcars)
+glmshow.display(glm_gaussian, decimal = 2)
$first.line
+[1] "Linear regression predicting mpg\n"
+
+$table
+ crude coeff.(95%CI) crude P value adj. coeff.(95%CI) adj. P value
+cyl "-2.88 (-3.51,-2.24)" "< 0.001" "-1.59 (-2.98,-0.19)" "0.034"
+disp "-0.04 (-0.05,-0.03)" "< 0.001" "-0.02 (-0.04,0)" "0.054"
+
+$last.lines
+[1] "No. of observations = 32\nR-squared = 0.7596\nAIC value = 167.1456\n\n"
## Binomial
-glm_binomial <- glm(vs~cyl + disp, data = mtcars, family = binomial)
-glmshow.display(glm_binomial, decimal = 2)
$first.line
-[1] "Logistic regression predicting vs\n"
-
-$table
- crude OR.(95%CI) crude P value adj. OR.(95%CI) adj. P value
-cyl "0.2 (0.08,0.56)" "0.002" "0.15 (0.02,1.02)" "0.053"
-disp "0.98 (0.97,0.99)" "0.002" "1 (0.98,1.03)" "0.715"
-
-$last.lines
-[1] "No. of observations = 32\nAIC value = 23.8304\n\n"
## Binomial
+glm_binomial <- glm(vs~cyl + disp, data = mtcars, family = binomial)
+glmshow.display(glm_binomial, decimal = 2)
$first.line
+[1] "Logistic regression predicting vs\n"
+
+$table
+ crude OR.(95%CI) crude P value adj. OR.(95%CI) adj. P value
+cyl "0.2 (0.08,0.56)" "0.002" "0.15 (0.02,1.02)" "0.053"
+disp "0.98 (0.97,0.99)" "0.002" "1 (0.98,1.03)" "0.715"
+
+$last.lines
+[1] "No. of observations = 32\nAIC value = 23.8304\n\n"
TableSubgroupMultiGLM(status ~ sex, var_subgroups = c("kk", "kk1"), data = lung, family = "binomial")
Variable Count Percent OR Lower Upper P value P for interaction
-sex2 Overall 228 100 3.01 1.65 5.47 <0.001 <NA>
-1 kk <NA> <NA> <NA> <NA> <NA> <NA> 0.476
-2 0 38 16.9 7 0.7 70.03 0.098 <NA>
-3 1 187 83.1 2.94 1.55 5.57 0.001 <NA>
-4 kk1 <NA> <NA> <NA> <NA> <NA> <NA> 0.984
-5 0 8 3.6 314366015.19 0 Inf 0.997 <NA>
-6 1 217 96.4 2.85 1.55 5.25 0.001 <NA>
Variable Count Percent Point Estimate Lower Upper sex=1 sex=2 P value P for interaction
-sex Overall 228 100 1.91 1.14 3.2 100 100 0.014 <NA>
-1 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
-2 kk <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> 0.525
-3 0 38 16.9 2.88 0.31 26.49 10 100 0.35 <NA>
-4 1 187 83.1 1.84 1.08 3.14 100 100 0.026 <NA>
-5 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
-6 kk1 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> 0.997
-7 0 8 3.6 <NA> <NA> <NA> 0 100 <NA> <NA>
-8 1 217 96.4 1.88 1.12 3.17 100 100 0.018 <NA>
TableSubgroupMultiGLM(status ~ sex, var_subgroups = c("kk", "kk1"), data = lung, family = "binomial")
Variable Count Percent OR Lower Upper P value P for interaction
+sex2 Overall 228 100 3.01 1.65 5.47 <0.001 <NA>
+1 kk <NA> <NA> <NA> <NA> <NA> <NA> 0.476
+2 0 38 16.9 7 0.7 70.03 0.098 <NA>
+3 1 187 83.1 2.94 1.55 5.57 0.001 <NA>
+4 kk1 <NA> <NA> <NA> <NA> <NA> <NA> 0.984
+5 0 8 3.6 314366015.19 0 Inf 0.997 <NA>
+6 1 217 96.4 2.85 1.55 5.25 0.001 <NA>
Variable Count Percent Point Estimate Lower Upper sex=1 sex=2 P value P for interaction
+sex Overall 228 100 1.91 1.14 3.2 100 100 0.014 <NA>
+1 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
+2 kk <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> 0.525
+3 0 38 16.9 2.88 0.31 26.49 10 100 0.35 <NA>
+4 1 187 83.1 1.84 1.08 3.14 100 100 0.026 <NA>
+5 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
+6 kk1 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> 0.997
+7 0 8 3.6 <NA> <NA> <NA> 0 100 <NA> <NA>
+8 1 217 96.4 1.88 1.12 3.17 100 100 0.018 <NA>