M-Estimate #3

@alexjwang

Describe the encoding method below. Attach any relevant links that reference the encoding method.
Very similar to Target Encoding. The only difference is that it has one tunable parameter (m), whereas the target encoder has two (min_samples_leaf and smoothing).
https://contrib.scikit-learn.org/categorical-encoding/mestimate.html
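To make the "weighted average" concrete, here is a minimal sketch of the m-estimate formula in plain Python: each category is mapped to (sum of targets + prior * m) / (count + m), where the prior is the global target mean. The function name `m_estimate_mapping` and the sample data are made up for illustration; this is not the category-encoders implementation, only an approximation of the idea.

```python
from collections import defaultdict


def m_estimate_mapping(categories, targets, m=1.0):
    """Return {category: (sum_of_targets + prior * m) / (count + m)}.

    Hypothetical helper for illustration; not part of category-encoders.
    """
    if len(categories) != len(targets):
        raise ValueError("categories and targets must have the same length")
    if not categories:
        raise ValueError("need at least one row")

    # The prior is the mean of the target over all rows.
    prior = sum(targets) / len(targets)

    # Accumulate the per-category target sum and row count.
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cat, y in zip(categories, targets):
        sums[cat] += y
        counts[cat] += 1

    # Larger m pulls every category closer to the prior.
    return {cat: (sums[cat] + prior * m) / (counts[cat] + m) for cat in counts}


cats = ["a", "a", "a", "b"]
ys = [1, 1, 0, 0]
mapping = m_estimate_mapping(cats, ys, m=1.0)
```

With m set to 0 the mapping reduces to the plain per-category mean, and as m grows each value approaches the global mean, which is why a single parameter is enough to control the smoothing.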

Describe the encoder class method. Any additional functions aside from the essential fit(), transform(), and get_features()? For example, Hashing Encoder has get_hash_method().
Similar to Target Encoding.

Describe the encoder primitive for use with Featuretools.
Should hold a mapping that encodes each value in the dataframe column as its corresponding weighted average.

Describe the use cases in which this encoder would be useful (what kinds of data, high-cardinality, etc.).
Useful for high-cardinality data, where one-hot encoding and other encoders that produce high-dimensional output are impractical. Works in the same situations as Target Encoding, and could be a useful alternative when Target Encoding's min_samples_leaf and smoothing parameters do not suit the situation.

Input type?
[Categorical]

Output type?
Numeric

List third party libraries required:
category-encoders

Describe encoding method's behavior with train, test, and new data.
Fit the encoder on the training data to learn the weighted averages. Use the test data to validate the encoding and the ML models. New data is encoded with the encoder that was fitted on the training data.
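The fit-on-train, reuse-on-new-data behavior could be sketched roughly as follows. The class `SimpleMEstimateEncoder` is hypothetical and written in plain Python; it assumes that a category not seen during fitting falls back to the global training mean, which is a design assumption of this sketch rather than something specified in this issue.

```python
class SimpleMEstimateEncoder:
    """Illustrative fit/transform sketch; not the category-encoders class."""

    def __init__(self, m=1.0):
        self.m = m
        self.prior_ = None   # global target mean, learned in fit()
        self.mapping_ = {}   # category -> encoded value, learned in fit()

    def fit(self, categories, targets):
        # Learn the prior and the per-category statistics from training data only.
        self.prior_ = sum(targets) / len(targets)
        stats = {}
        for cat, y in zip(categories, targets):
            total, count = stats.get(cat, (0.0, 0))
            stats[cat] = (total + y, count + 1)
        self.mapping_ = {
            cat: (total + self.prior_ * self.m) / (count + self.m)
            for cat, (total, count) in stats.items()
        }
        return self

    def transform(self, categories):
        if self.prior_ is None:
            raise RuntimeError("call fit() before transform()")
        # Assumption: unseen categories are encoded as the training prior.
        return [self.mapping_.get(cat, self.prior_) for cat in categories]


train_cats = ["red", "red", "blue", "blue", "blue"]
train_y = [1, 0, 1, 1, 1]
encoder = SimpleMEstimateEncoder(m=1.0).fit(train_cats, train_y)

# "green" never appeared in training, so it receives the prior.
new_encoded = encoder.transform(["red", "green"])
```

Because transform() only reads the stored mapping, test and new data never influence the learned averages, which keeps target information from leaking out of the training split.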

Test cases.
np.nan

Labels: New Method Idea (Proposal for new encoding method)
