Feature: Can we trivially speed up things for categorical X by aggregating Y's within categories?

In the course of pursuing faster null fitting for categorical covariates, it occurred to me that we may be able to speed up fitting under the alternative using a trivial technique:

In any setting where X is discrete with p categories, **I think** we can immediately collapse Y from n x J to p x J by summing the rows of Y that share the same X. This may reduce computation, especially for large n, and it should work for any g().
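To make the proposed collapse concrete, here is a minimal NumPy sketch (Python rather than radEmu's R, purely for illustration). The helper name `collapse_by_design` and the toy data are hypothetical and not part of radEmu; it assumes rows are grouped by exact equality of their design rows.

```python
import numpy as np

def collapse_by_design(Y, X):
    """Sum the rows of Y that share an identical row of X.

    Hypothetical helper for illustration only (not a radEmu function).
    Returns the unique design rows, the aggregated counts, and the
    number of original rows that fell into each group.
    """
    X_unique, inverse = np.unique(X, axis=0, return_inverse=True)
    inverse = inverse.ravel()  # shape of `inverse` varies across NumPy versions
    Y_agg = np.zeros((X_unique.shape[0], Y.shape[1]), dtype=Y.dtype)
    np.add.at(Y_agg, inverse, Y)  # unbuffered add: repeated indices accumulate
    group_sizes = np.bincount(inverse, minlength=X_unique.shape[0])
    return X_unique, Y_agg, group_sizes

# Toy data: n = 5 samples, J = 3 taxa, intercept plus one binary covariate.
X = np.array([[1, 0], [1, 1], [1, 0], [1, 1], [1, 0]])
Y = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [1, 1, 1], [2, 2, 2]])

X_unique, Y_agg, group_sizes = collapse_by_design(Y, X)
print(X_unique)     # the p = 2 distinct design rows
print(Y_agg)        # p x J matrix of summed counts
print(group_sizes)  # how many samples were merged into each row
```

Returning the group sizes alongside the sums keeps the information that any later standard error adjustment might need.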

I can't remember whether radEmu scales poorly in n, but this could be worth a try. There's no need to keep rows distinct, because wherever they appear in the likelihood they are aggregated over common X's.
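A short derivation may make the aggregation claim easier to check. It assumes (this is an assumption, not something stated in this issue) an unpenalized Poisson log-linear model $\log E[Y_{ij}] = z_i + x_i^\top \beta^j$ with the sample effects $z_i$ profiled out; the symbols $\eta_{ij}$, $x_{(c)}$ and $\tilde{Y}_{cj}$ are introduced here for illustration. Any penalty term is not covered by this argument.

```latex
\begin{aligned}
\ell(\beta, z) &= \sum_{i=1}^{n} \sum_{j=1}^{J}
  \left[ Y_{ij}\,(z_i + \eta_{ij}) - \exp(z_i + \eta_{ij}) \right] + \text{const},
  \qquad \eta_{ij} = x_i^\top \beta^j, \\
\exp(\hat{z}_i) &= \frac{Y_{i\cdot}}{\sum_{j} \exp(\eta_{ij})},
  \qquad Y_{i\cdot} = \sum_{j} Y_{ij}, \\
\ell_p(\beta) &= \sum_{i=1}^{n}
  \left[ \sum_{j} Y_{ij}\,\eta_{ij} - Y_{i\cdot} \log \sum_{j} \exp(\eta_{ij}) \right] + \text{const}, \\
 &= \sum_{c=1}^{p}
  \left[ \sum_{j} \tilde{Y}_{cj}\, x_{(c)}^\top \beta^j
  - \tilde{Y}_{c\cdot} \log \sum_{j} \exp\!\left(x_{(c)}^\top \beta^j\right) \right] + \text{const},
  \qquad \tilde{Y}_{cj} = \sum_{i:\, x_i = x_{(c)}} Y_{ij}.
\end{aligned}
```

Under these assumptions the profile log-likelihood depends on Y only through the p x J matrix of within-category sums, and the constraint g() does not enter the likelihood at all, which is consistent with the claim that this works for any g().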

@svteichman would you have the bandwidth to investigate whether this is promising? It would cover fitting under both the null and the alternative (probably best tested separately), for any g(). I think we could first confirm that the fitted results are the same when Y is n x J as when Y is p x J (aggregated over the p categories).
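Before running full fits, a cheap first check is that the objective itself is unchanged by aggregation. The sketch below is in Python rather than radEmu's R and does not call radEmu; it assumes an unpenalized Poisson log-linear model with sample effects profiled out, and the function `profile_loglik` and the simulated data are hypothetical. Terms that do not involve B are dropped, so only the B-dependent part is compared.

```python
import numpy as np

def profile_loglik(Y, X, B):
    """B-dependent part of the profile log-likelihood (assumed model).

    Y: (m, J) counts, X: (m, q) design, B: (q, J) coefficients.
    Terms that do not depend on B are omitted.
    """
    eta = X @ B
    eta_max = eta.max(axis=1, keepdims=True)  # stabilize the log-sum-exp
    lse = eta_max[:, 0] + np.log(np.exp(eta - eta_max).sum(axis=1))
    return float((Y * eta).sum() - (Y.sum(axis=1) * lse).sum())

rng = np.random.default_rng(0)
n, J, p = 60, 5, 3

labels = np.arange(n) % p             # every category appears n / p times
X_full = np.eye(p)[labels]            # one-hot design, n x p
Y_full = rng.poisson(5.0, size=(n, J))

X_agg = np.eye(p)                     # one row per category, p x p
Y_agg = np.zeros((p, J), dtype=Y_full.dtype)
np.add.at(Y_agg, labels, Y_full)      # sum counts within each category

B = rng.normal(size=(p, J))           # arbitrary coefficient matrix
ll_full = profile_loglik(Y_full, X_full, B)
ll_agg = profile_loglik(Y_agg, X_agg, B)
print(np.isclose(ll_full, ll_agg))
```

Agreement at arbitrary B only shows that the two objectives match under the assumed model. It says nothing yet about standard errors or test statistics, so those would still need to be compared on real fits.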

There would need to be a standard error adjustment (for `I`? `Dy`?) to ensure the Wald test statistics are right. Presumably the same is true for the score test.

This is lower priority than #173.

Feature: Can we trivially speed up things for categorical X by aggregating Y's within categories? #174
