Py DS_Engineer Lab Report #06

Python Programming for Data Scientists & Engineers Lab #06

Lab #06-1 Linear Regression in TensorFlow

Lab #06-1-2 K-Means Clustering

A small dataset (23 people) with names, heights, and weights is used in this case. For simplicity when clustering such a fairly small dataset, a single iteration of K-Means clustering into 4 clusters was simulated. The resulting labels are assigned back to the data so that each person knows which T-shirt size they get. The company, in turn, can determine the quantity and size range to produce based on customers' weights and heights.
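As a rough illustration of what one K-Means iteration does, the sketch below runs a single assign-then-update step in plain Python. The records, the starting centroids, and the size labels (`S`, `M`, `L`, `XL`) are made-up values chosen for this example; they are not the lab's 23-person dataset or its actual initialization.

```python
import math

# Illustrative (name, height_cm, weight_kg) records.
# These are made-up values, NOT the lab's 23-person dataset.
people = [
    ("A", 150.0, 45.0),
    ("B", 152.0, 48.0),
    ("C", 163.0, 58.0),
    ("D", 165.0, 60.0),
    ("E", 174.0, 72.0),
    ("F", 176.0, 75.0),
    ("G", 186.0, 90.0),
    ("H", 188.0, 93.0),
]

# Hypothetical starting centroids, one per T-shirt size (4 clusters).
initial_centroids = {
    "S": (150.0, 45.0),
    "M": (165.0, 60.0),
    "L": (175.0, 75.0),
    "XL": (190.0, 95.0),
}


def assign(points, centroids):
    """Assignment step: label each person with the nearest centroid."""
    labels = {}
    for name, height, weight in points:
        labels[name] = min(
            centroids,
            key=lambda size: math.dist((height, weight), centroids[size]),
        )
    return labels


def update(points, labels, centroids):
    """Update step: move each centroid to the mean of its members."""
    new_centroids = {}
    for size, old in centroids.items():
        members = [(h, w) for name, h, w in points if labels[name] == size]
        if members:
            new_centroids[size] = (
                sum(h for h, _ in members) / len(members),
                sum(w for _, w in members) / len(members),
            )
        else:
            # Keep the old centroid if no point was assigned to it.
            new_centroids[size] = old
    return new_centroids


# One full iteration: assign labels, then recompute centroids.
labels = assign(people, initial_centroids)
centroids = update(people, labels, initial_centroids)

for name, height, weight in people:
    print(name, height, weight, "->", labels[name])
```

Heights and weights are used unscaled here, so both features contribute to the Euclidean distance in their raw units; standardizing them first would be a reasonable alternative.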

Lab #06-2 Spectral + Hierarchical Clustering

Spectral Clustering, a.k.a. Graph Clustering

For spatial data, a graph is induced from the distances between points. Spectral Clustering then looks at the eigenvectors of the Laplacian of that graph to attempt to find a good (low-dimensional) embedding of the graph into Euclidean space.

In other words, the technique finds a transformation of the graph that represents the manifold the data is assumed to lie on.

* Weaknesses : The method still partitions every point into a cluster, so noise points still pollute the clusters.

* Intuitive Parameters : The number of clusters must be specified, or a 'suitable' one has to be found by trying a range of parameter values.

* Stability : Slightly more stable than K-Means thanks to the transformation, but it still suffers from some of K-Means' issues.

* Performance : A slower algorithm, since spatial data do not come with a sparse graph (unless we prepare one ourselves).
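The graph-then-eigenvectors idea described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the lab's implementation: the points are made-up, the Gaussian kernel and its width `sigma` are assumed choices for building the graph, and the final split simply thresholds one eigenvector by sign, which only works for two clusters (in general a standard algorithm such as K-Means would be run on the embedding).

```python
import numpy as np

# Illustrative 2-D points forming two well-separated groups (made-up values).
points = np.array([
    [0.0, 0.0], [0.2, 0.1], [0.1, 0.3],
    [5.0, 5.0], [5.2, 5.1], [5.1, 4.8],
])

# Build a dense affinity graph from pairwise distances.
# The Gaussian kernel and sigma = 2.0 are assumptions for this sketch.
sigma = 2.0
sq_dists = np.sum((points[:, None, :] - points[None, :, :]) ** 2, axis=-1)
W = np.exp(-sq_dists / (2.0 * sigma ** 2))
np.fill_diagonal(W, 0.0)  # no self-loops

# Unnormalized graph Laplacian: L = D - W, with D the diagonal degree matrix.
D = np.diag(W.sum(axis=1))
L = D - W

# eigh is meant for symmetric matrices and returns eigenvalues in ascending order.
eigenvalues, eigenvectors = np.linalg.eigh(L)

# The eigenvector of the second-smallest eigenvalue gives a 1-D embedding
# of the graph; its sign separates the two groups in this simple case.
embedding = eigenvectors[:, 1]
labels = (embedding > 0).astype(int)

print(labels)
```

Because the sign of an eigenvector is arbitrary, which group receives label 0 and which receives label 1 may differ between runs or platforms; only the grouping itself is meaningful. The dense `W` matrix also shows why performance suffers on spatial data: every pair of points gets an edge unless a sparse graph is prepared beforehand.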

Hierarchical Clustering
