Titanic dataset

* Explore: How does each feature relate to whether a passenger survived? Carry out the EDA in more detail than usual and explain the results.

Splitting: 80-20 train-test split, stratified on y, with random_state = 0.
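As a starting point for the EDA and the split, here is a minimal sketch. It assumes the column names of seaborn's `titanic` dataset (`sex`, `class`, `age`, `survived`) and uses a small random toy frame in place of the real data, so the printed rates carry no meaning. The helper `survival_rate_by` is a hypothetical name introduced only for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy stand-in for the Titanic data: random values, seaborn-style column names (assumption).
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "sex": rng.choice(["male", "female"], size=n),
    "class": rng.choice(["First", "Second", "Third"], size=n),
    "age": rng.normal(30, 12, size=n).round(),
    "survived": rng.integers(0, 2, size=n),
})


def survival_rate_by(frame, feature, target="survived"):
    """Survival rate and group size for each category of one feature."""
    return frame.groupby(feature)[target].agg(rate="mean", count="size")


print(survival_rate_by(df, "sex"))
print(survival_rate_by(df, "class"))

# 80-20 split, stratified on the target so both parts keep a similar survival ratio.
X = df.drop(columns="survived")
y = df["survived"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)
print(y_train.mean(), y_test.mean())
```

Doing the split before any deeper exploration and exploring only the training part is one way to keep test-set information out of later decisions.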

Preprocessing:

* Drop the deck column
* Fill in the missing values using a simple imputer
* One-hot encoding: sex, alone
* Ordinal encoding: class
* Binary encoding: embark town
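The preprocessing steps above could be sketched with a scikit-learn `ColumnTransformer`. Several things here are assumptions rather than part of the instructions: the seaborn-style column names (`deck`, `embark_town`, and so on), the imputation strategies (median for numeric columns, most frequent for embark town), and the class ordering Third < Second < First. scikit-learn has no built-in binary encoder, so `SimpleBinaryEncoder` is a hypothetical hand-rolled stand-in; a third-party package such as `category_encoders` may be preferable in practice.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder


class SimpleBinaryEncoder(TransformerMixin, BaseEstimator):
    """Hypothetical minimal binary encoder for one already-imputed column."""

    def fit(self, X, y=None):
        values = np.asarray(X).ravel()
        self.categories_ = sorted(set(values))
        # Codes start at 1; code 0 is reserved for categories unseen during fit.
        self.n_bits_ = max(1, len(self.categories_).bit_length())
        return self

    def transform(self, X):
        values = np.asarray(X).ravel()
        lookup = {cat: i + 1 for i, cat in enumerate(self.categories_)}
        codes = np.array([lookup.get(v, 0) for v in values], dtype=int)
        shifts = np.arange(self.n_bits_ - 1, -1, -1)
        # One column per bit, most significant bit first.
        return (codes[:, None] >> shifts) & 1


# Tiny illustrative frame (made-up rows, seaborn-style column names).
df = pd.DataFrame({
    "sex": ["male", "female", "female", "male", "male", "female"],
    "alone": [True, False, False, True, True, False],
    "class": ["Third", "First", "Second", "Third", "First", "Second"],
    "embark_town": ["Southampton", "Cherbourg", np.nan,
                    "Queenstown", "Southampton", "Southampton"],
    "age": [22.0, 38.0, np.nan, 35.0, 54.0, 27.0],
    "fare": [7.25, 71.28, 7.92, 8.05, 51.86, 11.13],
    "deck": [np.nan, "C", np.nan, np.nan, "E", np.nan],
})

preprocess = ColumnTransformer(
    transformers=[
        ("num", SimpleImputer(strategy="median"), ["age", "fare"]),
        ("onehot", OneHotEncoder(handle_unknown="ignore"), ["sex", "alone"]),
        ("ordinal", OrdinalEncoder(categories=[["Third", "Second", "First"]]), ["class"]),
        ("binary", Pipeline([
            ("impute", SimpleImputer(strategy="most_frequent")),
            ("encode", SimpleBinaryEncoder()),
        ]), ["embark_town"]),
    ],
    remainder="drop",      # any column not listed above is dropped
    sparse_threshold=0,    # always return a dense array
)

X = df.drop(columns="deck")  # drop deck explicitly, as the instructions ask
features = preprocess.fit_transform(X)
print(features.shape)
```

The output columns follow the transformer order: two numeric columns, four one-hot columns, one ordinal column, and two binary-code columns for the three towns.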

Model selection:

* Evaluation metric: F1 score
* Models: Logistic Regression, KNN, DecisionTreeClassifier, RandomForestClassifier
* Tune the hyperparameters of the 2 models that you think are the best
* Write a summary of the evaluation results and conclude which model is best for the Titanic dataset
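One possible shape for the model comparison is sketched below. It runs on synthetic data from `make_classification` rather than the Titanic features, picks the two candidates by mean cross-validated F1 (the instructions leave that choice to you), and uses small illustrative parameter grids that are assumptions, not recommended values.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import GridSearchCV, cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the preprocessed Titanic features.
X, y = make_classification(n_samples=300, n_features=8, n_informative=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

models = {
    "logreg": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "knn": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "tree": DecisionTreeClassifier(random_state=0),
    "forest": RandomForestClassifier(n_estimators=50, random_state=0),
}

# Baseline comparison: mean 5-fold cross-validated F1 on the training set only.
cv_f1 = {
    name: cross_val_score(model, X_train, y_train, cv=5, scoring="f1").mean()
    for name, model in models.items()
}

# Illustrative grids; make_pipeline names each step after its lowercased class.
param_grids = {
    "logreg": {"logisticregression__C": [0.1, 1.0, 10.0]},
    "knn": {"kneighborsclassifier__n_neighbors": [3, 5, 11]},
    "tree": {"max_depth": [3, 5, None]},
    "forest": {"max_depth": [3, 5, None]},
}

# Tune the two candidates with the highest baseline F1.
top_two = sorted(cv_f1, key=cv_f1.get, reverse=True)[:2]
tuned = {}
for name in top_two:
    search = GridSearchCV(models[name], param_grids[name], cv=5, scoring="f1")
    search.fit(X_train, y_train)
    tuned[name] = search

# Final check on the held-out test set, used once at the end.
test_f1 = {name: f1_score(y_test, s.predict(X_test)) for name, s in tuned.items()}

for name, score in cv_f1.items():
    print(f"{name}: CV F1 = {score:.3f}")
for name, score in test_f1.items():
    print(f"{name} (tuned): test F1 = {score:.3f}, params = {tuned[name].best_params_}")
```

The printed table can serve as the basis for the written summary; the conclusion itself should come from the scores on the real data.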

Where possible, use pipelines to avoid data leakage.
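To illustrate why a pipeline helps, the sketch below fits an imputer and a classifier together on the training split only. The data is synthetic, and the median strategy and step names (`impute`, `clf`) are assumptions made for the example. The point is that the imputer's statistics come from the training rows alone, so nothing about the test rows reaches the model before evaluation.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

# Synthetic numeric data with roughly 10% of the values set to missing.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = (X[:, 0] + rng.normal(scale=0.5, size=100) > 0).astype(int)
X[rng.random(X.shape) < 0.1] = np.nan

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Imputer and classifier are fitted together, on the training split only.
model = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(X_train, y_train)

# The medians the imputer learned match the training medians, not the full-data ones.
learned = model.named_steps["impute"].statistics_
train_medians = np.nanmedian(X_train, axis=0)
print(np.allclose(learned, train_medians))

# At prediction time the test rows are filled with those training medians.
predictions = model.predict(X_test)
```

The same pipeline object can be passed to `cross_val_score` or `GridSearchCV`, which then refit the imputer inside every fold.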