
Error while calculating standard errors #13

Open
achinmay17 opened this issue Dec 20, 2023 · 3 comments


@achinmay17

Hi, I am trying to run a doubly robust staggered DID (S-DID) with an unbalanced panel and a varying base period; the control group is 'not_yet_treated'.
My code is the following:

    att_gt = ATTgt(data=diddata, cohort_name="course_month_end_date", base_period="varying", freq="M")
    att_gt.fit(formula=formula, est_method="dr", control_group=control_group, progress_bar=True)

However, I am getting the following error, which I am not able to understand:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[241], line 7
      4 diddata = diddata.reset_index().set_index(keys=['id','month_end_date'])
      6 att_gt = ATTgt(data=diddata, cohort_name="course_month_end_date", base_period='varying', freq='M')
----> 7 att_gt.fit(formula = formula, est_method='dr',control_group=control_group, progress_bar = False)

File ~/anaconda3/lib/python3.11/site-packages/differences/attgt/attgt.py:718, in ATTgt.fit(self, formula, weights_name, control_group, base_delta, est_method, as_repeated_cross_section, boot_iterations, random_state, alpha, cluster_var, split_sample_by, n_jobs, backend, progress_bar)
    688     res = get_att_gt(
    689         data=(
    690             self._data_matrix
   (...)
    714         ),
    715     )
    717     # standard errors & ci/cbands
--> 718     res = get_standard_errors(
    719         ntl=res,
    720         cluster_groups=cluster_groups,
    721         alpha=alpha,
    722         boot_iterations=boot_iterations,
    723         random_state=random_state,
    724         n_jobs_boot=n_jobs,
    725         backend_boot=backend,
    726         progress_bar=progress_bar,
    727         sample_name=s if s != "full_sample" else None,
    728         release_workers=s_idx == n_sample_names,
    729     )
    731     self._result_dict[s]["ATTgt_ntl"] = res
    733 self._fit_res = output_dict_to_dataframe(
    734     extract_dict_ntl(self._result_dict),
    735     stratum=bool(self._strata),
    736     date_map=self._map_datetime,
    737 )

File ~/anaconda3/lib/python3.11/site-packages/differences/attgt/attgt_cal.py:442, in get_standard_errors(ntl, cluster_groups, alpha, boot_iterations, random_state, backend_boot, n_jobs_boot, progress_bar, sample_name, release_workers)
    436     raise ValueError(
    437         "'boot_iterations' must be >= 0. "
    438         "If boot_iterations=0, analytic standard errors are computed"
    439     )
    441 # influence funcs + idx for not nan cols
--> 442 inf_funcs, not_nan_idx = stack_influence_funcs(ntl, return_idx=True)
    444 # create an empty array to populate with the standard errors
    445 se_array = np.empty(len(ntl))

File ~/anaconda3/lib/python3.11/site-packages/differences/attgt/utility.py:382, in stack_influence_funcs(ntl, return_idx)
    380     inf_funcs = inf_funcs.toarray()  # faster mboot if dense matrix
    381 else:
--> 382     inf_funcs = np.stack(
    383         [r.influence_func for r in ntl if r.influence_func is not None], axis=1
    384     )
    386 if return_idx:
    387     # indexes for the non-missing influence_func
    388     not_nan_idx = np.array(
    389         [i for i, r in enumerate(ntl) if r.influence_func is not None]
    390     )

File <__array_function__ internals>:200, in stack(*args, **kwargs)

File ~/anaconda3/lib/python3.11/site-packages/numpy/core/shape_base.py:460, in stack(arrays, axis, out, dtype, casting)
    458 arrays = [asanyarray(arr) for arr in arrays]
    459 if not arrays:
--> 460     raise ValueError('need at least one array to stack')
    462 shapes = {arr.shape for arr in arrays}
    463 if len(shapes) != 1:

ValueError: need at least one array to stack

From what I could understand of the package, it is failing while calculating the standard errors. It would be great if you could help with debugging. Thanks.

@bernardodionisi
Owner

Are all your cohorts very small? How unbalanced is the data? Would you be able to share some data to reproduce this error? A simulated dataset that contains the same entity-time structure and cohort composition should do. Thanks!
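One way to answer both questions before refitting is to count entities per cohort and measure how complete the panel is. A minimal pandas sketch on a toy panel (the column names `id`, `month_end_date`, and `course_month_end_date` are stand-ins matching the snippet above, not part of any API):

```python
import pandas as pd

# toy unbalanced panel: 3 entities, monthly dates, two one-entity cohorts
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2, 3],
    "month_end_date": pd.to_datetime([
        "2023-01-31", "2023-02-28", "2023-03-31",
        "2023-01-31", "2023-03-31", "2023-02-28",
    ]),
    "course_month_end_date": pd.to_datetime(
        ["2023-02-28"] * 3 + ["2023-03-31"] * 2 + [pd.NaT]
    ),
})

# entities per cohort (NaT = never treated): tiny cohorts can leave no
# valid group-time comparisons, so every influence function comes back None
cohort_sizes = (
    df.drop_duplicates("id")
      .groupby("course_month_end_date", dropna=False)["id"]
      .nunique()
)
print(cohort_sizes)

# share of entities observed in every period (1.0 = fully balanced panel)
obs_per_id = df.groupby("id")["month_end_date"].nunique()
n_periods = df["month_end_date"].nunique()
balance_share = (obs_per_id == n_periods).mean()
print(balance_share)
```

If every cohort has only a handful of entities, or the balance share is far below 1, that would be consistent with the empty-stack error above.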

@jonahnieuwenhuijzen

@bernardodionisi

> Are all your cohorts very small? How unbalanced is the data? Would you be able to share some data to reproduce this error? A simulated dataset that contains the same entity-time structure and cohort composition should do. Thanks!

I have the same problem (`ValueError: need at least one array to stack`); my data is also very unbalanced and the cohorts are small. Do you have a solution?

@bernardodionisi
Owner

Hi @jonahnieuwenhuijzen,

Have you tried different estimation methods via the est_method parameter? The default is dr-mle; you could try dr-ipt, which changes how the propensity scores are calculated. Could you also experiment with the other methods? Let me know if it helps.
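A minimal sketch of that experiment, looping over candidate methods until one fits without raising the stacking error. Here `first_working_method` and `fake_fit` are hypothetical scaffolding standing in for `att_gt.fit`, not part of the differences API:

```python
def first_working_method(fit_fn, candidates=("dr-mle", "dr-ipt")):
    """Try each est_method in turn; return (method, result) for the first
    one that fits, collecting the error message from each one that fails."""
    errors = {}
    for method in candidates:
        try:
            return method, fit_fn(est_method=method)
        except ValueError as exc:  # e.g. "need at least one array to stack"
            errors[method] = str(exc)
    raise RuntimeError(f"all estimation methods failed: {errors}")

# stand-in for att_gt.fit that only succeeds with 'dr-ipt'
def fake_fit(est_method):
    if est_method != "dr-ipt":
        raise ValueError("need at least one array to stack")
    return {"est_method": est_method}

method, result = first_working_method(fake_fit)
print(method)  # 'dr-ipt'
```

In practice `fit_fn` would wrap the real `att_gt.fit(...)` call with the remaining arguments fixed, so only `est_method` varies across attempts.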
