Talk about data visualization in science. Based on my experience with the ratCAVE project and on suggested approaches in Python, I created a talk for my fellow MSNE students. The talk covers the main problems with using scatter plots for big, convolved data and explains how to address them.
What should we keep in mind when working with big datasets? In the case of scatter plots, there are 3 hyperparameters to watch:
- overplotting - avoid obscuring the data
- saturation - check how many overlapping points it takes for the plotted intensity to saturate
- undersampling - taking a subset might not be the answer
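The saturation point can be estimated before plotting. The sketch below is not taken from the talk: it assumes standard alpha blending of same-colored points, where `n` overlapping points with opacity `alpha` reach a combined opacity of `1 - (1 - alpha)**n`. The function names `stacked_opacity` and `points_to_saturate`, and the 0.99 threshold, are hypothetical choices for illustration.

```python
import math


def stacked_opacity(alpha: float, n: int) -> float:
    """Combined opacity of n overlapping points, each drawn with opacity alpha.

    Assumes standard alpha blending of identically colored points.
    """
    return 1.0 - (1.0 - alpha) ** n


def points_to_saturate(alpha: float, threshold: float = 0.99) -> int:
    """Smallest number of overlapping points whose combined opacity
    reaches `threshold` (an arbitrary cutoff for "visually saturated")."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be strictly between 0 and 1")
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - threshold) / math.log(1.0 - alpha))


# With alpha = 0.1, regions holding more than a few dozen overlapping
# points all look equally dark, so density differences beyond that are lost.
n_saturate = points_to_saturate(0.1)
```

Under these assumptions, `alpha = 0.1` saturates after 44 overlapping points. Lowering `alpha` pushes that limit up, but it also makes isolated points harder to see, which is why no single value works for every dataset.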
Alternatively, you can work with heatmaps, keeping the following problems in mind (1 hyperparameter):
- undersaturation
- pick the color map in accordance to the
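As a rough illustration of undersaturation, the sketch below bins synthetic points into a 2D histogram with NumPy and compares a linear intensity scale with a logarithmic one. It is not the code from the talk: the data is made up, the 0.05 visibility level is arbitrary, and treating the bin count (`bins`) as the one hyperparameter is an assumption of this example. `visible_fraction` is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (illustrative only): a tight, dense cluster
# on top of a sparse, wide background.
dense = rng.normal(0.0, 0.1, size=(100_000, 2))
sparse = rng.normal(0.0, 2.0, size=(1_000, 2))
points = np.vstack([dense, sparse])

bins = 100  # grid resolution, assumed here to be the single hyperparameter
counts, xedges, yedges = np.histogram2d(points[:, 0], points[:, 1], bins=bins)

# Linear scaling: the densest bin sets the scale, so sparse bins fade out.
linear = counts / counts.max()
# Log scaling: compresses the range so that sparse bins stay visible.
logged = np.log1p(counts) / np.log1p(counts.max())


def visible_fraction(img: np.ndarray, level: float = 0.05) -> float:
    """Fraction of non-empty bins whose scaled intensity is at least `level`."""
    occupied = counts > 0
    return float((img[occupied] >= level).mean())


linear_visible = visible_fraction(linear)
log_visible = visible_fraction(logged)
```

With the linear scale, most occupied bins fall below the visibility level and the background effectively disappears. The log scale keeps them visible at the cost of making absolute density differences harder to read.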
Presented on 01.06.2018 at the retreat for Master of Science in Neuroengineering students.
To run the Jupyter notebook as slides, I used:
The talk was based on the use of:
- pandas
- seaborn
- datashader
Thanks to:
- Nicholas A. Del Grosso - for supervision and inspiration for this talk
- Mohammad Bashiri - for feedback