Randomization Assessment and Balance Testing #9

Open
jwbowers opened this issue Feb 23, 2019 · 4 comments

Comments

@jwbowers
Collaborator

People commonly use F-tests or likelihood ratio tests to assess balance, even in cluster-randomized studies. Hansen and Bowers 2008 showed that these approaches do not control the false positive rate in small samples. Explain and compare the different approaches.
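One way to check whether a large-sample balance test controls the false positive rate in a given design is to simulate it: hold the covariate fixed, redraw the random assignment many times, and count how often the test rejects at the nominal level. The sketch below is only an illustration under stated assumptions. It uses a single covariate and a normal-approximation test of the difference in means as a stand-in for the F-test or LR test, it assumes complete randomization with a fixed number of treated units, and the function names and toy data are hypothetical.

```python
import random
from statistics import NormalDist, mean, variance


def asymptotic_pvalue(x, z):
    """Two-sided normal-approximation p-value for the treated-minus-control
    difference in means of covariate x (a stand-in for an F or LR test)."""
    treated = [xi for xi, zi in zip(x, z) if zi == 1]
    control = [xi for xi, zi in zip(x, z) if zi == 0]
    diff = mean(treated) - mean(control)
    # Randomization standard error of the difference under complete randomization.
    se = (variance(x) * (1 / len(treated) + 1 / len(control))) ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(diff / se)))


def false_positive_rate(x, n_treated, alpha=0.05, n_sims=5000, seed=0):
    """Share of random assignments in which the large-sample test rejects,
    even though assignment is random by construction."""
    rng = random.Random(seed)
    z = [1] * n_treated + [0] * (len(x) - n_treated)
    rejections = 0
    for _ in range(n_sims):
        rng.shuffle(z)  # a fresh complete randomization
        if asymptotic_pvalue(x, z) < alpha:
            rejections += 1
    return rejections / n_sims


# Hypothetical small sample with a skewed covariate.
x_small = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 10.0]
rate = false_positive_rate(x_small, n_treated=4)
print(f"Rejection rate at nominal 0.05: {rate:.3f}")
```

Comparing the printed rate with the nominal 0.05 gives a rough sense of how far the approximation can be trusted for that sample size and covariate distribution. In a cluster-randomized design the shuffle would need to operate on clusters rather than on individual units.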

@jwbowers
Collaborator Author

Perhaps also include a section on re-randomization. If the numbers of control and treated units are the same, then the simple mean difference is still an unbiased estimator of the average treatment effect. However, the t-test in a linear model using the treatment variable should be conservative, so the precision benefits of re-randomization might vanish unless more direct permutation-based randomization inference is used. See Morgan and Rubin.
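A minimal sketch of what "more direct" randomization inference could look like under re-randomization: the reference distribution for the test statistic is built by redrawing assignments with the same accept/reject rule that produced the actual assignment. The acceptance rule here (an absolute covariate mean difference below a hypothetical `threshold`) is an assumption chosen for brevity, not the criterion from any particular paper, and all function names and data are illustrative.

```python
import random
from statistics import mean


def diff_in_means(values, z):
    """Treated-minus-control difference in means."""
    treated = [v for v, zi in zip(values, z) if zi == 1]
    control = [v for v, zi in zip(values, z) if zi == 0]
    return mean(treated) - mean(control)


def draw_assignment(x, n_treated, threshold, rng):
    """Re-randomize until the covariate mean difference is within threshold.
    Assumes the threshold is attainable; otherwise this loops forever."""
    z = [1] * n_treated + [0] * (len(x) - n_treated)
    while True:
        rng.shuffle(z)
        if abs(diff_in_means(x, z)) <= threshold:
            return list(z)


def rerandomization_pvalue(y, x, z_obs, threshold, n_draws=2000, seed=0):
    """Two-sided p-value for the sharp null of no effect, using only
    assignments that the re-randomization rule would have accepted."""
    rng = random.Random(seed)
    n_treated = sum(z_obs)
    observed = abs(diff_in_means(y, z_obs))
    hits = 0
    for _ in range(n_draws):
        z = draw_assignment(x, n_treated, threshold, rng)
        if abs(diff_in_means(y, z)) >= observed:
            hits += 1
    return (hits + 1) / (n_draws + 1)


# Hypothetical data: outcome correlated with the covariate, no treatment effect.
rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(20)]
y = [xi + rng.gauss(0, 0.5) for xi in x]
threshold = 0.2
z_obs = draw_assignment(x, n_treated=10, threshold=threshold, rng=rng)
p_value = rerandomization_pvalue(y, x, z_obs, threshold)
print(f"Randomization p-value under re-randomization: {p_value:.3f}")
```

Because the reference distribution excludes poorly balanced assignments, it can be narrower than the one a conventional t-test implicitly assumes, which is where the precision gain from re-randomization would show up.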

@donaldpgreen
Collaborator

donaldpgreen commented Feb 24, 2019 via email

@jwbowers
Collaborator Author

I think that permutation-based or exact randomization inference would give correct p-values for the F-test. The issue I saw at EGAP (and see elsewhere) is the use of asymptotic F-tests and LR tests (after, say, multinomial logit or logit). These are probably reasonable approximations to randomization inference in large samples (just as our d^2 test is a large-sample approximation), but they get funkier in cluster-randomized contexts, smaller samples, etc. The idea here is to help people judge when the F-test/LR-test/d^2 test approaches are sensible and when one should verify them. I also suspect that the d^2 test would break down at smaller samples than the F-test or LR test would. But this is just a suspicion, since all are supposed to be convenient, fast, large-sample approaches.
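To make the permutation-based alternative concrete, here is a rough sketch of a randomization p-value for an omnibus balance statistic. As a simplifying assumption, the statistic is the sum of squared standardized mean differences across covariates. It is not the F statistic, the LR statistic, or d^2 (it ignores covariance among covariates), but the same shuffling logic would apply to any of them. Function names and data are hypothetical, and complete randomization of individual units is assumed.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def omnibus_stat(covariates, z):
    """Sum over covariates of the squared treated-minus-control mean
    difference, each scaled by that covariate's sample variance."""
    total = 0.0
    for x in covariates:
        treated = [xi for xi, zi in zip(x, z) if zi == 1]
        control = [xi for xi, zi in zip(x, z) if zi == 0]
        m = mean(x)
        var = sum((xi - m) ** 2 for xi in x) / (len(x) - 1)
        total += (mean(treated) - mean(control)) ** 2 / var
    return total


def permutation_pvalue(covariates, z, n_perm=2000, seed=0):
    """Monte Carlo approximation to the exact randomization p-value."""
    rng = random.Random(seed)
    observed = omnibus_stat(covariates, z)
    z_perm = list(z)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(z_perm)  # keeps the number of treated units fixed
        if omnibus_stat(covariates, z_perm) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical small experiment: 20 units, 3 covariates, 10 treated.
rng = random.Random(1)
covariates = [[rng.gauss(0, 1) for _ in range(20)] for _ in range(3)]
z = [1] * 10 + [0] * 10
rng.shuffle(z)
p = permutation_pvalue(covariates, z)
print(f"Permutation p-value for the omnibus statistic: {p:.3f}")
```

Comparing this p-value with the one from the asymptotic test on the same data is one way to judge whether the large-sample approximation is adequate. For a cluster-randomized study the shuffle would permute treatment at the cluster level instead.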

@donaldpgreen
Collaborator

donaldpgreen commented Feb 25, 2019 via email

2 participants