KMeans-Predictive-Modeling

K-Means Clustering is a Machine Learning Algorithm which creates a model based on the nearest-neighbour-choice.

The model in this repository has been implemented by k-means classification which predics the favourite sport in a state(in India).

Explanation

util_methods.py - Contains the linear algebra methods that are used in the k-means model.

fav_sport.py - Run this file to get output of the model. It contains the following methods:

major_cluster(labels) : Takes a list of input variables as parameter and returns a winner with most number of occurences.
knn_classify(k, labeled_points, new_point) : Driver algorithm which creates cluster based on the value of k.
plot_cities() : Plots the cities based on (latitude and longitude) with the appropriate markers of favourite sport.
classify_and_plot_grid() : Plots the graph of k-means clustering model!

Use

An attempt to make the code as readable as possible has been made.
You can modify the code to create your own k-means classification.
e.g. Instead of states(a list used in the fav_sport.py file), you can use languages which people speak throughout the world.
Feel free to reach out for any suggestions or help!

Inspiration

Joel Grus and his book: Data Science from Scratch

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

KMeans-Predictive-Modeling

Explanation

Use

Inspiration

Files

README.md

Latest commit

History

README.md

File metadata and controls

KMeans-Predictive-Modeling

Explanation

Use

Inspiration