Description:
The numerical values in the columns Goal, Pledged, and Backers appear to be highly skewed, which may indicate the presence of outliers.
Task: Check these columns for outliers and address them.
Visualize the distribution of the data: Create histograms and boxplots to check the distribution of the data and identify outliers.
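One possible way to produce these plots is sketched below. It is only an illustration under stated assumptions: matplotlib is assumed to be available, the helper name `plot_distributions` and the demo column names are hypothetical, and the histogram uses a `log1p` scale purely as a suggestion for heavily right-skewed values that may include zeros.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


def plot_distributions(df, columns):
    """Draw one histogram (log1p scale) and one boxplot (raw scale) per column."""
    fig, axes = plt.subplots(
        len(columns), 2, figsize=(10, 3 * len(columns)), squeeze=False
    )
    for row, column in zip(axes, columns):
        values = df[column].dropna().to_numpy()
        # log1p keeps zeros valid while compressing the long right tail
        row[0].hist(np.log1p(values), bins=50)
        row[0].set_title(f"{column}: histogram of log1p values")
        # Points drawn beyond the whiskers are the 1.5 * IQR outlier candidates
        row[1].boxplot(values)
        row[1].set_title(f"{column}: boxplot (raw scale)")
    fig.tight_layout()
    return fig


# Synthetic stand-in data; the real frame in this issue is `kickstarter`
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "goal": rng.lognormal(mean=8, sigma=2, size=500),
    "backers": rng.poisson(lam=30, size=500),
})
fig = plot_distributions(demo, ["goal", "backers"])
```

In the notebook this would presumably be called as `plot_distributions(kickstarter, numerical_features)`.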
Remove outliers: Apply a method to remove outliers, such as setting a threshold or using the Interquartile Range (IQR).
Delete outliers (e.g. using the IQR method):

    def remove_outliers(df, column):
        # Quartiles and interquartile range of the column
        Q1 = df[column].quantile(0.25)
        Q3 = df[column].quantile(0.75)
        IQR = Q3 - Q1
        # Keep only rows within 1.5 * IQR of the quartiles
        lower_bound = Q1 - 1.5 * IQR
        upper_bound = Q3 + 1.5 * IQR
        return df[(df[column] >= lower_bound) & (df[column] <= upper_bound)]

    for feature in numerical_features:
        kickstarter = remove_outliers(kickstarter, feature)
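One thing to check during evaluation: the loop filters one column at a time, so each column's bounds are computed on rows that already survived the earlier columns, and the result can depend on the column order. A possible variant, sketched here with hypothetical helper names (`iqr_bounds`, `remove_outliers_all`) and toy data, computes every column's bounds on the unfiltered frame and applies one combined mask. As in the original function, rows with missing values in a checked column are dropped.

```python
import pandas as pd


def iqr_bounds(series, k=1.5):
    """Return the (lower, upper) IQR fences for a numeric series."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def remove_outliers_all(df, columns, k=1.5):
    """Drop rows that fall outside the IQR fences in any of the columns.

    All fences are computed on the unfiltered frame, so the result
    does not depend on the order of `columns`.
    """
    keep = pd.Series(True, index=df.index)
    for column in columns:
        lower, upper = iqr_bounds(df[column], k)
        # between() is inclusive on both ends and False for NaN
        keep &= df[column].between(lower, upper)
    return df[keep]


# Toy data with one obvious outlier in "goal"; column names are hypothetical
demo = pd.DataFrame({
    "goal": [100, 200, 300, 400, 500, 100000],
    "backers": [1, 2, 3, 4, 5, 6],
})
cleaned = remove_outliers_all(demo, ["goal", "backers"])
```

Comparing `len(kickstarter)` before and after either approach shows how many rows are lost, which matters for columns as skewed as these.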
If the evaluation is positive, a function will be created and added to the base.jpyt
New ticket: deciding on best transformation for outliers
Essejran