safirmotiwala/ML-ID3-Decision-Tree-Classification-Library-PyPi

ID3 Decision Tree Algorithm

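This library provides an ID3-style decision tree classifier that can build the model with either of two split criteria: Information Gain or Gini Index. Classic ID3 selects splits by Information Gain; the Gini Index is the criterion usually associated with CART.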
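
As a rough illustration of the two criteria named above, the sketch below computes the impurity reduction of a candidate split using entropy (which gives Information Gain) and using Gini impurity. The function names `entropy`, `gini`, and `split_score` are illustrative only and are not part of this library's API; the toy data is made up.

```python
import numpy as np

def entropy(labels):
    """Shannon entropy (in bits) of a 1-D array of class labels."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def gini(labels):
    """Gini impurity of a 1-D array of class labels."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(1.0 - (p ** 2).sum())

def split_score(feature, labels, impurity):
    """Impurity reduction from splitting `labels` by the values of `feature`."""
    feature = np.asarray(feature)
    labels = np.asarray(labels)
    weighted = 0.0
    for value in np.unique(feature):
        subset = labels[feature == value]
        # Weight each branch's impurity by the share of samples it receives.
        weighted += len(subset) / len(labels) * impurity(subset)
    return impurity(labels) - weighted

# Toy data (made up for illustration): one binary feature, binary labels.
feature = np.array([1, 1, 0, 0])
labels = np.array([1, 1, 0, 0])
info_gain = split_score(feature, labels, entropy)  # entropy-based criterion
gini_gain = split_score(feature, labels, gini)     # Gini-based criterion
```

In a tree builder of this kind, the feature with the highest score would typically be chosen for the next split; how this library selects and breaks ties between features is not stated here.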

  • Version 1.0.0 - Information Gain Only
  • Version 2.0.0 - Gini Index added
  • Version 2.0.1 - Documentation Sorted
  • Version 2.0.2 - All Sorted

Installation

Install directly from PyPI:

pip install classic-ID3-DecisionTree

Or clone the repository and install from source:

python3 setup.py install

Parameters

* X_train

The training-set array of feature values.

* y_train

The training-set array of outcomes (class labels).

* dataset

The entire dataset, including the outcome column.

Methods

* DecisionTreeClassifier()

Create an instance of the Decision Tree Classifier class.

* add_features(dataset, result_col_name)

Register the features with the model by passing the dataset; the model reads the feature column names from it. The second parameter is the name of the outcome column.

* information_gain(X_train, y_train)

Build the decision tree using Information Gain.

* gini_index(X_train, y_train)

Build the decision tree using Gini Index.

* predict(X_test)

Predict the outcomes for the test-set features.

Documentation

1. Install the package

pip install classic-ID3-DecisionTree

2. Import the library

from classic_ID3_decision_tree import DecisionTreeClassifier

3. Create an instance of the Decision Tree Classifier class

id3 = DecisionTreeClassifier()

4. Add Column Features to the model

id3.add_features(dataset, result_col_name)

5. Build the Decision Tree Model using Information Gain

id3.information_gain(X_train, y_train)

OR

5. Build the Decision Tree Model using Gini Index

id3.gini_index(X_train, y_train)

6. Predict the Test Set Results

y_pred = id3.predict(X_test)
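
Once predictions are available, they can be compared with the held-out labels. The sketch below assumes that `predict` returns one label per test row, in the same order and encoding as `y_test`; that behaviour is an assumption, not something documented here. The helper `accuracy` and the demo arrays are illustrative and not part of the library.

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return float((y_true == y_pred).mean())

# Made-up labels for illustration; in practice pass y_test and the
# y_pred returned by id3.predict(X_test).
y_test_demo = [1, 0, 1, 1, 0]
y_pred_demo = [1, 0, 0, 1, 0]
score = accuracy(y_test_demo, y_pred_demo)  # 4 of the 5 labels match
```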


Example Code

0. Download the dataset

Download dataset from here

1. Import the dataset and preprocess it

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.model_selection import KFold

# Load the data twice: one copy to encode, one left untouched.
dataset = pd.read_csv('house-votes-84.csv')
rawdataset = pd.read_csv('house-votes-84.csv')

# Encode the categorical values as integers; '?' is treated as 'n'.
party = {'republican':0, 'democrat':1}
vote = {'y':1, 'n':0, '?':0}
for col in dataset.columns:
    if col != 'party':
        dataset[col] = dataset[col].map(vote)
dataset['party'] = dataset['party'].map(party)

# Column 0 is the outcome ('party'); columns 1 to 16 are the features.
X = dataset.iloc[:, 1:17].values
y = dataset.iloc[:, 0].values

# 5-fold split. Each iteration overwrites the previous one,
# so only the last fold's split remains after the loop.
kf = KFold(n_splits=5)
for train_index, test_index in kf.split(X,y):
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]
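
The loop above reassigns `X_train`, `X_test`, `y_train`, and `y_test` on every iteration, so only the last fold is left for training and prediction. A minimal sketch of scoring every fold instead is shown below. It uses plain NumPy with contiguous, unshuffled folds rather than `KFold`, and a hypothetical stand-in model (`majority_fit_predict`) so that it runs on its own; all function names and the demo data are illustrative and not part of the library.

```python
import numpy as np

def kfold_indices(n_samples, n_splits=5):
    """Yield (train_index, test_index) pairs over contiguous, unshuffled folds."""
    indices = np.arange(n_samples)
    for test_index in np.array_split(indices, n_splits):
        train_index = np.setdiff1d(indices, test_index)
        yield train_index, test_index

def majority_fit_predict(X_train, y_train, X_test):
    """Hypothetical stand-in model: always predicts the most common training label."""
    values, counts = np.unique(y_train, return_counts=True)
    return np.full(len(X_test), values[np.argmax(counts)])

def cross_val_accuracy(X, y, fit_predict, n_splits=5):
    """Return one accuracy per fold instead of keeping only the last split."""
    scores = []
    for train_index, test_index in kfold_indices(len(X), n_splits):
        y_pred = fit_predict(X[train_index], y[train_index], X[test_index])
        scores.append(float((y_pred == y[test_index]).mean()))
    return scores

# Made-up data for illustration: 10 samples, 2 binary features.
X_demo = np.array([[0, 1], [1, 1], [0, 0], [1, 0], [1, 1],
                   [0, 1], [1, 0], [0, 0], [1, 1], [0, 1]])
y_demo = np.array([1, 1, 0, 1, 1, 1, 0, 1, 1, 1])
fold_scores = cross_val_accuracy(X_demo, y_demo, majority_fit_predict)
mean_score = sum(fold_scores) / len(fold_scores)
```

To use this with the library, `fit_predict` could be replaced by a function that creates a new `DecisionTreeClassifier`, calls `add_features` and one of the build methods, and returns `predict(X_test)`. Whether a classifier instance can safely be reused across folds is not documented here, so creating a fresh instance per fold is the cautious assumption.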

2. Use the ID3 Library

from classic_ID3_decision_tree import DecisionTreeClassifier

id3 = DecisionTreeClassifier()
id3.add_features(dataset, 'party')
print(id3.features)

# Build the tree with one criterion: Information Gain ...
id3.information_gain(X_train, y_train)
# ... OR Gini Index (use this line instead of the one above)
# id3.gini_index(X_train, y_train)

y_pred = id3.predict(X_test)

Footnotes

You can find the code on my GitHub.

Connect with me on Social Media
