MLKit 🤖

MLKit is a Python library featuring machine learning algorithms implemented from scratch, inspired by the concepts and functionality of popular libraries like Scikit-Learn.

Please note that this repo is a work in progress, my first project of this magnitude, and its main purpose is educational.

Installation

To install the MLKit library, run the following command:

pip install git+https://github.com/b-ionut-r/mlkit.git@main

Usage

Each model follows the Scikit-Learn API (is endowed with .fit(), .predict(), .score() methods)

Regression

To use the linear regression model for regression tasks, follow the example below:

from mlkit.linear_model import LinearRegression
from mlkit.model_selection import train_test_split
from mlkit.preprocessing import MinMaxScaler

# Load your data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Initialize and train the model
reg_model = LinearRegression(n_iters = 25000, 
                             lr = 1e-3, 
                             reg = ("l2", 1e-3),
                             method = "gd",
                             batch_size = 128, # if method is "sgd"
                             scaler = MinMaxScaler(), 
                             verbose = 1000,
                             random_state = 42)
reg_model.fit(X_train, y_train)

# Evaluate the model
print(reg_model.score(X_test, y_test))

Binary Classification

To use the logistic regression model for binary classification, follow the example below:

from mlkit.linear_model import LogisticRegression
from mlkit.preprocessing import MinMaxScaler
from mlkit.model_selection import train_test_split

# Load your data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Initialize and train the model
cls_model = LogisticRegression(n_iters = 25000, 
                               lr = 1e-3, 
                               reg = ("l2", 1e-3), 
                               method = "gd",
                               batch_size = 128,
                               scaler = MinMaxScaler(), 
                               threshold = 0.5,
                               verbose = 1000,
                               random_state = 42)
cls_model.fit(X_train, y_train)

# Evaluate the model
print(cls_model.score(X_test, y_test))

Multiclass Classification

To use the logistic regression model for multiclass classification, follow the example below:

from mlkit.linear_model import MultiLogisticRegression
from mlkit.model_selection import train_test_split
from mlkit.preprocessing import MinMaxScaler, LabelEncoder


# Load your data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Initialize and train the model
cls_model = MultiLogisticRegression(n_iters = 25000, 
                                    lr = 1e-3, 
                                    reg = ("l2", 1e-3), 
                                    method = "gd",
                                    batch_size = 128,
                                    scaler = MinMaxScaler(), 
                                    label_encoder = LabelEncoder(), # one hot encodes label categories
                                    verbose = 1000,
                                    random_state = 42)
cls_model.fit(X_train, y_train)

# Evaluate the model
print(cls_model.score(X_test, y_test))

Decision Tree

To use the decision tree model for classification / regression tasks, follow the example below:

from mlkit.tree import DecisionTreeClassifier # or DecisionTreeRegressor
from mlkit.model_selection import train_test_split

# Load your data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Initialize and train the model
tree_model = DecisionTreeClassifier(criterion = "gini",
                                    splitter = "best",
                                    max_depth = 10, 
                                    min_samples_split = 5, 
                                    min_samples_leaf = 2,
                                    max_features = "sqrt",
                                    max_leaf_nodes = 100,
                                    min_info_gain = 0.1,
                                    random_state = 42)
tree_model.fit(X_train, y_train)

# Evaluate the model
print(tree_model.score(X_test, y_test))

# Plot the tree
tree_model.plot_tree(feature_names)

K-Nearest Neighbors

To use the K-Nearest Neighbors model for classification tasks, follow the example below:

from mlkit.neighbors import KNeighborsClassifier
from mlkit.model_selection import train_test_split
from mlkit.preprocessing import MinMaxScaler

# Load your data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Initialize and train the model
knn_model = KNeighborsClassifier(k = 5, 
                                 metric = 'euclidean',
                                 scaler = MinMaxScaler())
knn_model.fit(X_train, y_train)

# Evaluate the model
print(knn_model.score(X_test, y_test))

Demos

The repository contains several Jupyter notebooks demonstrating the usage of the MLKit library for various machine learning tasks:

demo_binary.ipynb: Binary classification using logistic regression.
demo_multiclass.ipynb: Multiclass classification using logistic regression.
demo_simple_reg.ipynb: Simple linear regression.
demo_multiple_reg.ipynb: Multiple linear regression.
demo_multilabel_reg.ipynb: Multilabel regression.
demo_kneighbors.ipynb: K-Nearest Neighbors algorithm.
demo_binary_tuner.ipynb: Hyperparameter tuning for binary classification.
demo_regression_tuner.ipynb: Hyperparameter tuning for regression.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Contributing

Contributions are welcome! Please open an issue or submit a pull request for any improvements or bug fixes.

Contact

For any questions or inquiries, please contact the repository owner.

Name		Name	Last commit message	Last commit date
Latest commit History 27 Commits
demos		demos
mlkit		mlkit
LICENSE		LICENSE
README.md		README.md
requirements.txt		requirements.txt
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

MLKit 🤖

Installation

Usage

Regression

Binary Classification

Multiclass Classification

Decision Tree

K-Nearest Neighbors

Demos

License

Contributing

Contact

About

Uh oh!

Releases

Packages

Uh oh!

Languages

License

b-ionut-r/mlkit

Folders and files

Latest commit

History

Repository files navigation

MLKit 🤖

Installation

Usage

Regression

Binary Classification

Multiclass Classification

Decision Tree

K-Nearest Neighbors

Demos

License

Contributing

Contact

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages