File | Description | Source link (with details) | Preprocessing applied | Label column |
---|---|---|---|---|
generated.csv | Automatically generated dataset containing data samples separated into very well-delineated categories. This can be considered a "best-case scenario" test case. | | | label |
defaults.csv | Defaults on credit card payments | UCI | Minor (column name reformatting) | defaulted |
winequality.csv | Quality ratings of Portuguese white wines | UCI | Added binarized label column `recommend` indicating quality >= 7 | recommend |
vehicles.csv | Recognizing vehicle type from its silhouette | OpenML | None | Class |
eeg.csv | EEG eye state measurements | OpenML | Dropped a few outlier rows | Class |

These can all be loaded using pandas:

    import pandas as pd
    dataset = pd.read_csv("file.csv")
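
Because each file names its label column differently (see the table above), it can be convenient to split features from labels at load time. The following is a minimal sketch, not part of the original datasets: the `LABEL_COLUMNS` mapping and the `load_features_and_label` helper are hypothetical names introduced here for illustration, and the code assumes each CSV keeps the file name and label column listed in the table.

```python
from pathlib import Path

import pandas as pd

# Label column for each file, copied from the table above.
LABEL_COLUMNS = {
    "generated.csv": "label",
    "defaults.csv": "defaulted",
    "winequality.csv": "recommend",
    "vehicles.csv": "Class",
    "eeg.csv": "Class",
}


def load_features_and_label(path):
    """Read one of the CSV files and split it into features X and label y.

    Hypothetical helper: assumes `path` ends in a file name listed in
    LABEL_COLUMNS and that the label column is present in the file.
    """
    path = Path(path)
    label_column = LABEL_COLUMNS[path.name]  # KeyError for unknown file names
    dataset = pd.read_csv(path)
    X = dataset.drop(columns=[label_column])  # every column except the label
    y = dataset[label_column]
    return X, y
```

For example, `X, y = load_features_and_label("winequality.csv")` would return the wine measurements in `X` and the binarized `recommend` column in `y`, provided the file is in the working directory.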