
Exploratory data analysis and clustering using PyTorch

Tasks performed

  • Inspecting dataset size and class distribution

  • Performing sanity checks

    • Checking for corrupted files
    • Checking file types
    • Checking image channels
  • Visualizing the dataset

  • Visualizing the distribution of channel pixels by class

  • Identifying and removing very dark and very light images

  • Identifying duplicates in our data

  • Transforming the images into a Feature Matrix

  • Using KMeans to cluster our images
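
The last four steps above (brightness filtering, duplicate detection, building a feature matrix, and KMeans clustering) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the notebook's actual code: the brightness thresholds, the helper names, the SHA-256 hashing used for exact duplicates, and the use of raw flattened pixels as features are all hypothetical choices, and the images are synthetic NumPy arrays rather than files loaded from disk.

```python
import hashlib

import numpy as np
from sklearn.cluster import KMeans

# Hypothetical thresholds on mean pixel intensity (0-255 scale); not taken from the notebook.
DARK_THRESHOLD = 20.0
LIGHT_THRESHOLD = 235.0


def brightness_mask(images, dark=DARK_THRESHOLD, light=LIGHT_THRESHOLD):
    """Return True for images that are neither very dark nor very light."""
    means = images.reshape(len(images), -1).mean(axis=1)
    return (means > dark) & (means < light)


def first_occurrence_mask(images):
    """Return True for the first copy of each image; byte-identical repeats are False."""
    seen = set()
    keep = np.zeros(len(images), dtype=bool)
    for i, img in enumerate(images):
        digest = hashlib.sha256(img.tobytes()).hexdigest()
        if digest not in seen:
            seen.add(digest)
            keep[i] = True
    return keep


def to_feature_matrix(images):
    """Flatten each image into one row and scale pixel values to [0, 1]."""
    return images.reshape(len(images), -1).astype(np.float32) / 255.0


def cluster_images(images, n_clusters=2, seed=0):
    """Cluster images with KMeans on their flattened pixels (an assumed feature choice)."""
    features = to_feature_matrix(images)
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    return model.fit_predict(features)


# Synthetic stand-in data: two brightness groups, one black image,
# one white image, and one exact duplicate of the first image.
rng = np.random.default_rng(0)
group_a = rng.integers(40, 90, size=(10, 8, 8, 3), dtype=np.uint8)
group_b = rng.integers(160, 210, size=(10, 8, 8, 3), dtype=np.uint8)
black = np.zeros((1, 8, 8, 3), dtype=np.uint8)
white = np.full((1, 8, 8, 3), 255, dtype=np.uint8)
duplicate = group_a[:1].copy()
images = np.concatenate([group_a, group_b, black, white, duplicate])

images = images[brightness_mask(images)]        # drops the black and white images
images = images[first_occurrence_mask(images)]  # drops the exact duplicate
labels = cluster_images(images, n_clusters=2)
```

Hashing only catches byte-identical copies; near-duplicates (resized or re-encoded images) would need a perceptual hash or a feature-space distance instead. Likewise, flattened pixels are the simplest possible feature matrix; embeddings from a pretrained network are a common alternative when clusters should reflect image content rather than overall brightness.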

Technologies applied

Python, scikit-learn, PyTorch, Jupyter