-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathsetx.sthlp
198 lines (145 loc) · 8.22 KB
/
setx.sthlp
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
.-
help for ^setx^
.-
Sets values of explanatory variables (x's)
------------------------------------------
^setx^
^setx^ function [weight] [^if^ exp] [^in^ range] [^, noinh^er ^nocw^del ^nob^ound]
^setx^ varname1 function1 varname2 function2 varname3 function3 ...
[weight] [^if^ exp] [^in^ range] [^, noi^nher ^nocw^del ^nob^ound]
^setx^ (varname1 varname2) function1 (varname3 varname4) function2 ...
[weight] [^if^ exp] [^in^ range] [^, noi^nher ^nocw^del ^nob^ound]
where
function = mean|median|min|max|p#|math|#|^`^macro^'^|varname^[^#^]^
Description
-----------
After estimating parameters with @estsimp@ or @relogit@, use ^setx^ to set values
for the explanatory variables (the X's), change values that have already been
set, or list the values that have been chosen. The main value-types are
^mean^ arithmetic mean
^median^ median
^min^ minimum
^max^ maximum
^p^# #th percentile
math a mathematical expression, such as 5*5 or sqrt(23)
# a numeric value, such as 5
^`^macro^'^ the contents of a local macro
^[^#^]^ the value in the #th observation of the dataset
Setx will not accept spaces in mathematical expressions unless you enclose the
expression in parentheses. For instance, ^setx x4 ln(20)^ and ^setx x4 (ln( 20 ))^
are equivalent valid commands, but ^setx x4 ln( 20 )^ is a syntax error.
If you ran relogit with the wc() or pc() options, indicating that the data
were selected on the dependent variable, ^setx^ will correct the selection
bias when calculating summary statistics. For this reason, means and percentiles
produced by ^setx^ may differ from means and percentiles of the (biased) sample.
When the proportion of 1's in the population is known only to fall within a
range, such as pc(.2 .3), ^setx^ will calculate bounds on the values of the
explanatory variables. The result will two X-vectors, the first assuming that
the true proportion of 1's is at its lower bound, and the second conditional
on the true proportion being at its upper bound. The program will pass these
vectors to ^relogitq^ and use them to calculate bounds on quantities of interest.
To set each explanatory variable at a single value that falls midway between
its upper and lower bounds, use the ^nob^ound option that is described below.
If you used multiply imputed datasets at the estimation stage, ^setx^ will use
those same imputed datasets to calculate values for the explanatory variables.
For instance, ^estsimp x1 mean^ would calculate the mean of x1 across all the
imputed datasets.
When using ^setx^ or any other Stata command to calculate summary statistics such
as means, medians, minimums, maximums, and percentiles, it is important to
define the sample. At the estimation stage, Stata automatically disregards
observations that do not satisfy the "if", "in", and "weight" conditions
specified by the user. It also ignores observations with missing values on
one or more variables. Before setting a particular variable equal to its mean
or any other summary statistic, users must decide whether to calculate the
statistic based only on observations that were used during the estimation
stage, or to include other observations in the calculation.
By default, ^setx^ inherits the if-in-weight conditions from @estsimp@ and
disregards (casewise-deletes) any observation with missing values on the
dependent or explanatory variables. You can specify different if-in-weight
conditions by including them in the ^setx^ command line, and you can disregard
all inherited conditions by using the ^noinh^er and ^nocw^del options described
below.
^setx^ relies upon five globals: the matrices mrt_xc, mrt_xcH, and mrt_xcL,
and the macros mrt_vt and mrt_seto. If you change the values of these
globals, the program may not work properly.
^setx^ accepts aweights and fweights. It also accepts the special options
listed below.
Options
-------
^noinh^er causes setx to ignore all if-in-weight conditions that are inherited
from estsimp. The user can specify new if-in-weight conditions by typing
them as part of the setx command.
^nocw^del forces setx to calculate summary statistics based on all valid
observations for a given variable, even if the observations contain missing
values for the other variables. If ^nocw^del is not specified, setx will
casewise-delete observations with missing values.
^nob^ound. This option is available only after -relogit-, and only when the
true proportion of 1's is assumed to fall within a specified range. Suppose
the user typed ^pc(.2 .4)^ with relogit and then entered ^setx x1 mean^.
By default, setx would set the variable x1 equal to two values: the mean
of x1, assuming that the true proportion of ones is only 0.2, and the mean
of x1, allowing that the true proportion is as high as 0.4. Both values for
x1 will be passed to relogitq and used to calculate bounds on quantities of
interest. The ^nob^ound option overrides this procedure by setting each x
to a single value: the midpoint of its upper and lower bound. Thus, the
command ^setx x1 mean, nob^ound would set x1 equal to the following expression:
[(mean(x1)|tau=0.2) + (mean(x1)|tau=0.4)]/2, where tau represents the
presumed proportion of 1's in the population.
^keepmrt^ is a programmer's option that instructs ^setx^ to return the matrix
r(mrt_xc) or the matrices r(mrt_xcH) and r(mrt_xcL) without changing the
globals mrt_xc, mrt_xcL, mrt_xcH, mrt_vt, and mrt_seto.
Examples
--------
To list values that have already been set:
. ^setx^
To set each explanatory variable at its mean:
. ^setx mean^
To set each explanatory variable at its median, based on a sample in which
x3>12, all inherited conditions are ignored, and casewise deletion is
suppressed:
. ^setx median if x3>12, noinh nocw^
To set each explanatory variable to the value contained in the 15th
observation of the dataset
. ^setx [15]^
The command can also set each variable separately. For instance, to set x1
at its mean, x2 at its median, x3 at its minimum, x4 at its maximum,
x5 at its 25th percentile, x6 at ln(20), x7 at 2.5, and x8 equal to a local
macro called myval, type:
. ^setx x1 mean x2 median x3 min x4 max x5 p25 x6 ln(20) x7 2.5 x8 `myval'
^setx^ can also set values for groups of variables. To set x1 and x2 to
their means, x3 to its median, and x4 and x5 to their 25th percentiles, type:
. ^setx (x1 x2) mean x3 median (x4 x5) p25^
To change the value of x3 from its peviously chosen value to 5*5
. ^setx x3 5*5^
To set all variables except x10 at their means, and fix x10 at its 25th
percentile, call setx twice: once to set all variables at their means, and
a second time to change the value of x10 to its 25th percentile.
. ^setx mean^
. ^setx x10 p25^
Distribution
------------
^setx^ is part of CLARIFY, a suite of Stata programs for interpreting
statistical results, and is (C) Copyright, 1999, Michael Tomz, Jason
Wittenberg and Gary King, All Rights Reserved. You may copy and
distribute this program provided no charge is made and the copy is
identical to the original. To request an exception, please contact:
Michael Tomz <tomz@@fas.harvard.edu>
Department of Government, Harvard University
Littauer Center North Yard
Cambridge, MA 02138
We recommend that you distribute the current version of this program,
which is available from http://GKing.Harvard.Edu.
References
----------
Gary King, Michael Tomz, and Jason Wittenberg 1998. "Making
the Most of Statistical Analyses: Improving Interpretation and
Presentation." Paper prepared for presentation at the Annual
Meetings of the American Political Science Association,
Boston, MA (August).
Gary King and Langche Zeng. 1999a. "Logistic Regression in Rare
Events Data," Department of Government, Harvard University,
available from http://GKing.Harvard.Edu.
Gary King and Langche Zeng. 1999b. "Estimating Absolute, Relative,
and Attributable Risks in Case-Control Studies," Department of
Government, Harvard University, available from
http://GKing.Harvard.Edu.