Skip to content

Project Description: K-Means Clustering for Data Prediction and Visualization

Notifications You must be signed in to change notification settings

tremick/KMEANS-ALGORITHM-

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 

Repository files navigation

KMEANS-ALGORITHM-

Project Description: K-Means Clustering for Data Prediction and Visualization In this project, I utilized the K-means clustering algorithm to perform data prediction and visualization. The project consists of two parts:

  1. Synthetic Data Clustering:

I generated synthetic data using the make_blobs function from the scikit-learn library. The data includes 100 samples with four clusters and a specified cluster standard deviation. After applying K-means clustering with four clusters, the model successfully predicted cluster labels for each data point. The scatter plot visually represents the original data points and their assigned clusters. Additionally, the cluster centers are highlighted in red. 2. Iris Dataset Dimensionality Reduction and Clustering:

I utilized the Iris dataset, loaded using the seaborn library, to showcase the application of K-means clustering on real-world data. The dataset was preprocessed by one-hot encoding categorical variables. Principal Component Analysis (PCA) was employed for dimensionality reduction to visualize the data in a 2D space. K-means clustering with three clusters was applied to the reduced dataset. The scatter plot illustrates the clustered data points in the PCA-transformed space, with cluster centers highlighted in red. Technologies Used:

Python with libraries such as NumPy, Matplotlib, scikit-learn, seaborn, and pandas. Conclusion:

The project demonstrates the versatility of the K-means algorithm in both synthetic and real-world datasets. Visualization techniques aid in understanding the clustering results and provide insights into the structure of the data. This project serves as an illustrative example of implementing K-means clustering for data prediction and visualization using synthetic and real-world datasets.

About

Project Description: K-Means Clustering for Data Prediction and Visualization

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published