This repo contains a collection of images that explain core data computing concepts like Apache Spark, file formats, Delta Lake and associated libraries.
Images with descriptive captions are a quick way to learn a lot about data computing. See these pages to learn more:
- Spark
- Delta Lake

PySpark:
- quinn - TODO
- chispa
- ceja - TODO
- mack
- farsante
- unicron - TODO

Scala Spark:
- spark-sbt.g8 - TODO
- bebe - TODO
- spark-daria
- spark-fast-tests

Pandas:
- beavis - TODO