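ID3 is a machine learning decision tree classification algorithm. This package can build the tree with either of two split criteria: Information Gain or Gini Index. Classic ID3 selects splits by Information Gain; the Gini Index, better known from CART, is offered here as an alternative.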
- Version 1.0.0 - Information Gain Only
- Version 2.0.0 - Gini Index added
- Version 2.0.1 - Documentation Sorted
- Version 2.0.2 - All Sorted
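For reference, the two split criteria are usually defined as follows. These are the standard textbook definitions, not formulas taken from this package's source; here S is a set of training rows, p_c is the fraction of rows in S with class c, and S_v is the subset of S where feature A takes value v.

```latex
% Entropy of a set S and the information gain of splitting S on feature A
H(S) = -\sum_{c} p_c \log_2 p_c
\qquad
IG(S, A) = H(S) - \sum_{v \in \mathrm{values}(A)} \frac{|S_v|}{|S|} \, H(S_v)

% Gini impurity of a set S and the weighted Gini index of splitting S on feature A
G(S) = 1 - \sum_{c} p_c^{2}
\qquad
G_A(S) = \sum_{v \in \mathrm{values}(A)} \frac{|S_v|}{|S|} \, G(S_v)
```

A split is better when its information gain is higher or, equivalently for the Gini criterion, when its weighted Gini index is lower.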
Install directly from PyPI:
pip install classic-ID3-DecisionTree
Or clone the repository and install:
python3 setup.py install
- X_train: the training set array of features.
- y_train: the training set array of outcomes.
- dataset: the entire dataset.
- DecisionTreeClassifier(): initialise an instance of the decision tree classifier class.
- add_features(dataset, result_col_name): add the features to the model by passing the dataset. The model fetches the feature column names from it. The second parameter is the name of the outcome column.
- information_gain(X_train, y_train): build the decision tree using Information Gain.
- gini_index(X_train, y_train): build the decision tree using Gini Index.
- predict(X_test): predict the test set results.
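To make the two build options concrete, here is a minimal sketch of how a single split could be chosen under either criterion. It is not this package's implementation: the function names (`entropy`, `gini`, `weighted_impurity`, `best_feature`) and the toy data are hypothetical and only illustrate the general technique.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gini(labels):
    """Gini impurity of a list of class labels."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def weighted_impurity(rows, labels, feature, impurity):
    """Impurity remaining after splitting on one feature, weighted by subset size."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    return sum(len(g) / len(labels) * impurity(g) for g in groups.values())


def best_feature(rows, labels, impurity=entropy):
    """Index of the feature whose split leaves the lowest weighted impurity.

    With impurity=entropy this is the same as maximising information gain,
    because the entropy of the unsplit set is the same for every feature.
    """
    n_features = len(rows[0])
    return min(range(n_features),
               key=lambda f: weighted_impurity(rows, labels, f, impurity))


# Hypothetical toy data: two binary vote columns and a binary party label.
X_toy = [[1, 0], [1, 1], [0, 0], [0, 1]]
y_toy = [1, 1, 0, 0]

print(best_feature(X_toy, y_toy, entropy))  # 0: column 0 separates the labels perfectly
print(best_feature(X_toy, y_toy, gini))     # 0
```

A full tree builder would repeat this choice recursively on each subset until the labels are pure or no features remain.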
from classic_ID3_decision_tree import DecisionTreeClassifier
id3 = DecisionTreeClassifier()
id3.add_features(dataset, result_col_name)
id3.information_gain(X_train, y_train)
id3.gini_index(X_train, y_train)
y_pred = id3.predict(X_test)
Download dataset from here
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.model_selection import KFold
from classic_ID3_decision_tree import DecisionTreeClassifier

# Load the voting records
dataset = pd.read_csv('house-votes-84.csv')
rawdataset = pd.read_csv('house-votes-84.csv')

# Encode party and votes as integers; unknown votes ('?') are treated as 'n'
party = {'republican': 0, 'democrat': 1}
vote = {'y': 1, 'n': 0, '?': 0}
for col in dataset.columns:
    if col != 'party':
        dataset[col] = dataset[col].map(vote)
dataset['party'] = dataset['party'].map(party)

# Column 0 is the outcome (party); columns 1 to 16 are the features (votes)
X = dataset.iloc[:, 1:17].values
y = dataset.iloc[:, 0].values

# 5-fold split; after the loop the variables hold the last fold
kf = KFold(n_splits=5)
for train_index, test_index in kf.split(X, y):
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]

id3 = DecisionTreeClassifier()
id3.add_features(dataset, 'party')
print(id3.features)

# Build the tree with ONE of the two criteria
id3.information_gain(X_train, y_train)
# OR
# id3.gini_index(X_train, y_train)

y_pred = id3.predict(X_test)
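Once predictions are available, they can be compared against the held-out labels. The sketch below is one simple way to do that in plain Python; the helper names (`accuracy`, `confusion_counts`) and the demo labels are hypothetical, and it assumes `predict` returns one label per test row in the same encoding as `y_test`.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def confusion_counts(y_true, y_pred, positive=1):
    """Counts of true/false positives and negatives for a binary problem."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for t, p in zip(y_true, y_pred):
        if p == positive:
            counts["tp" if t == positive else "fp"] += 1
        else:
            counts["fn" if t == positive else "tn"] += 1
    return counts


# Hypothetical labels standing in for y_test and id3.predict(X_test).
y_test_demo = [1, 0, 1, 1, 0]
y_pred_demo = [1, 0, 0, 1, 1]

print(accuracy(y_test_demo, y_pred_demo))          # 0.6
print(confusion_counts(y_test_demo, y_pred_demo))  # {'tp': 2, 'fp': 1, 'tn': 1, 'fn': 1}
```

With the party encoding used above, `positive=1` corresponds to 'democrat'. To score every fold rather than only the last one, the training, prediction, and scoring steps would need to run inside the KFold loop.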
You can find the code on my GitHub.