Data-Science-London-Scikit-learn

Objective

This project explores scikit-learn's classification and accuracy-estimation tools using a synthetic dataset from the Data Science London + Scikit-learn Kaggle competition, provided during a Data Science London meetup. The goal is to build a binary classifier that labels 9,000 test objects, each described by 40 real-valued features.

Dataset

This exercise utilizes a synthetic dataset with 40 features, representing objects from two distinct classes (labeled as 0 or 1). The training set consists of 1,000 samples, while the testing set contains 9,000 samples.
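The competition files are not part of this README, so the following is only a minimal sketch of data with the same shape, built with scikit-learn's `make_classification`. The generator settings (`n_informative=10`, `random_state=0`) and the variable names are assumptions for illustration, not the parameters used to create the actual competition data.

```python
from sklearn.datasets import make_classification

# Stand-in for the competition data (assumption: the real files are not used here).
# 10,000 synthetic samples, 40 features, two classes labeled 0 and 1.
X_all, y_all = make_classification(
    n_samples=10_000,
    n_features=40,
    n_informative=10,  # hypothetical choice, not taken from the competition
    n_classes=2,
    random_state=0,
)

# Mirror the split described above: 1,000 training rows, 9,000 test rows.
X_train, y_train = X_all[:1000], y_all[:1000]
X_test = X_all[1000:]

print(X_train.shape, X_test.shape)
```

With the real data, the same three arrays would instead be loaded from the competition's files.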

Model

The model chosen for this exercise is a Random Forest classifier.
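As a rough illustration, a Random Forest can be fitted and used for prediction with scikit-learn's `RandomForestClassifier` as sketched below. The data is a synthetic stand-in with the same shape as the competition set, and the hyperparameters (`n_estimators=100`, `random_state=0`) are assumptions; the notebook may use different settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data (assumption): 1,000 training and 9,000 test rows, 40 features.
X, y = make_classification(n_samples=10_000, n_features=40, random_state=0)
X_train, y_train, X_test = X[:1000], y[:1000], X[1000:]

# Hypothetical hyperparameters; 100 trees is scikit-learn's default.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

# One predicted label (0 or 1) per test object.
predictions = clf.predict(X_test)
print(predictions[:10])
```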

Accuracy estimation

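Accuracy was estimated with 5-fold cross-validation on the training set: the 1,000 training samples were split into five folds, and each fold served once as the validation set while the model was trained on the other four.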
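A minimal sketch of five-fold accuracy estimation with scikit-learn's `cross_val_score` might look like the following. The data is again a synthetic stand-in and the classifier settings are assumptions, so the printed numbers say nothing about the scores obtained on the competition data.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the 1,000-sample training set (assumption).
X_train, y_train = make_classification(n_samples=1000, n_features=40, random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0)

# cv=5 splits the training data into five folds; for a classifier,
# scikit-learn uses stratified folds when cv is an integer.
scores = cross_val_score(clf, X_train, y_train, cv=5, scoring="accuracy")

print("Per-fold accuracy:", scores)
print("Mean accuracy:", scores.mean())
```

Reporting the mean together with the per-fold values gives a sense of how much the estimate varies across folds.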

Citation

Ben Hamner and Will Cukierski. Data Science London + Scikit-learn. https://kaggle.com/competitions/data-science-london-scikit-learn, 2013. Kaggle.
