NgKhaiPhu/Data-preprocessing

Data-preprocessing

This is my submission for Lab assignment 1 in the Data Mining class. The implementation consists of two sections: "Preprocessing" and "Introduction to numpy and pandas".

Preprocessing

Main features:

  • List attributes with missing data
  • Fill in missing data (with a constant, the mean, the median, or the mode)
  • Filter out rows and columns that have more than a given amount of missing data
  • Remove duplicated rows
  • Remove outliers
  • Calculate point combinations
  • Normalize data (min-max and Z-score)
  • Apply One-hot encoding
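
As a rough illustration of a few of the features above, here is a minimal sketch in plain Python. It is not the code from this repository: the toy table, the attribute names (`age`, `score`, `city`), and the helper names are all hypothetical, and the choice of the population standard deviation for Z-score is an assumption.

```python
from statistics import mean, median, mode, pstdev

# Hypothetical toy table: each row is a dict, and None marks a missing value.
rows = [
    {"age": 20, "score": 7.0, "city": "A"},
    {"age": None, "score": 9.0, "city": "B"},
    {"age": 40, "score": None, "city": "A"},
]


def missing_attributes(rows):
    """Return the attribute names that have at least one missing value."""
    return [k for k in rows[0] if any(r[k] is None for r in rows)]


def fill_missing(rows, attr, method="mean"):
    """Return a copy of rows with missing `attr` values filled by mean, median, or mode."""
    present = [r[attr] for r in rows if r[attr] is not None]
    fill = {"mean": mean, "median": median, "mode": mode}[method](present)
    return [{**r, attr: fill if r[attr] is None else r[attr]} for r in rows]


def min_max(values):
    """Rescale values linearly to [0, 1]. Assumes the values are not all equal."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def z_score(values):
    """Center on the mean and divide by the population standard deviation (an assumption)."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def one_hot(values):
    """Encode each categorical value as a 0/1 vector, one position per sorted category."""
    cats = sorted(set(values))
    return [[int(v == c) for c in cats] for v in values]


# Fill `age` with its mean and `score` with its median, then normalize `age`.
filled = fill_missing(fill_missing(rows, "age"), "score", method="median")
ages = [r["age"] for r in filled]

print(missing_attributes(rows))
print(min_max(ages))
print(z_score(ages))
print(one_hot([r["city"] for r in filled]))
```

Each helper returns a new list rather than modifying its input, so the steps can be chained in any order and the original table stays available for comparison.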

Introduction to numpy and pandas

  • Calculate the correlations between any given pairs of numeric attributes, and create a heat map for visualization
  • Use histogram charts to visualize the distribution of scores for every subject
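
The two tasks above can be sketched with pandas and numpy as follows. This is only an illustration under assumptions: the subject names, the scores, and the 0 to 10 score range are made up, and Pearson correlation (the default of `DataFrame.corr`) is assumed to be the measure used.

```python
import numpy as np
import pandas as pd

# Hypothetical scores on an assumed 0-10 scale; not data from this repository.
scores = pd.DataFrame(
    {
        "math": [2.0, 4.0, 6.0, 8.0],
        "physics": [3.0, 5.0, 7.0, 9.0],
        "literature": [9.0, 6.5, 6.0, 3.0],
    }
)

# Pairwise Pearson correlation between all numeric columns.
# The result is a square DataFrame indexed by column name on both axes.
corr = scores.corr()
print(corr)

# Histogram counts for one subject: 5 equal-width bins over the assumed 0-10 range.
counts, edges = np.histogram(scores["math"], bins=5, range=(0, 10))
print(counts, edges)
```

The `corr` matrix is the input a heat map needs, and `counts` with `edges` are what a histogram chart draws. A plotting library such as matplotlib can render both; the plotting calls are left out here to keep the sketch free of display dependencies.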

About

Different methods of data preprocessing
