Exploratory data analysis of the Histopathologic Cancer Detection dataset from Kaggle

smrenato/histopathologic_cancer_detect

Exploratory data analysis and clustering using PyTorch

Tasks performed

  • Dataset size and class distribution

  • Sanity checks

    • Checking for corrupted files
    • Checking file types
    • Checking image channels
  • Visualizing the dataset

  • Visualizing the distribution of pixel values per channel, by class

  • Identifying very dark and very bright images and removing them

  • Identifying duplicates in our data

  • Transforming the images into a Feature Matrix

  • Using KMeans to cluster our images
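
Several of the steps above can be illustrated with a short sketch. This is not the notebook's actual code: the function names, the brightness thresholds (10 and 245), the 96x96 RGB patch size, and the synthetic images are all assumptions chosen for illustration. It shows one plausible way to flag very dark or very bright images by mean pixel value, find exact duplicates by hashing pixel data, and flatten the images into a feature matrix.

```python
import hashlib

import numpy as np


def flag_extreme_images(images, dark_thresh=10.0, bright_thresh=245.0):
    """Return indices of images whose mean pixel value is near black or near white.

    The thresholds are illustrative assumptions, not values from the repository.
    """
    means = images.reshape(len(images), -1).mean(axis=1)
    too_dark = np.flatnonzero(means < dark_thresh)
    too_bright = np.flatnonzero(means > bright_thresh)
    return too_dark, too_bright


def find_duplicates(images):
    """Group indices of pixel-identical images using an MD5 digest of the raw bytes."""
    seen = {}
    for idx, img in enumerate(images):
        digest = hashlib.md5(np.ascontiguousarray(img).tobytes()).hexdigest()
        seen.setdefault(digest, []).append(idx)
    return [idxs for idxs in seen.values() if len(idxs) > 1]


def to_feature_matrix(images):
    """Flatten (N, H, W, C) uint8 images into an (N, H*W*C) float32 matrix in [0, 1]."""
    return images.reshape(len(images), -1).astype(np.float32) / 255.0


# Synthetic stand-in for the real image patches (assumed 96x96 RGB).
rng = np.random.default_rng(0)
images = rng.integers(30, 220, size=(6, 96, 96, 3), dtype=np.uint8)
images[1] = 0          # an all-black image
images[2] = 255        # an all-white image
images[5] = images[0]  # an exact duplicate of image 0

too_dark, too_bright = flag_extreme_images(images)
duplicates = find_duplicates(images)
features = to_feature_matrix(images)
```

The resulting `features` matrix has one row per image, which is the shape that scikit-learn's `KMeans` expects in `fit`. Hashing only catches exact duplicates; near-duplicates would need a different approach, such as perceptual hashing.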

Technologies applied

Python, scikit-learn, PyTorch, Jupyter