A micro-cluster lab for experimenting with Dask and Spark (Python and Scala), based on Docker

aminelemaizi/micro-cluster-lab

Micro-Cluster Lab Using Docker, To Experiment With Spark & Dask on Yarn

For more details about this project, please refer to my article, where I explain the motivation behind it and how to recreate it yourself.

Project Folder Tree

├── docker-compose.yml
├── Dockerfile
├── confs
│   ├── config
│   ├── core-site.xml
│   ├── hdfs-site.xml
│   ├── mapred-site.xml
│   ├── requirements.req
│   ├── slaves
│   ├── spark-defaults.conf
│   └── yarn-site.xml
├── datasets
│   ├── alice_in_wonderland.txt
│   └── iris.csv
├── notebooks
│   ├── Bash-Interface.ipynb
│   ├── Dask-Yarn.ipynb
│   ├── Python-Spark.ipynb
│   └── Scala-Spark.ipynb
└── script_files
    └── bootstrap.sh

Create the base container image

docker build . -t cluster-base

Run the cluster or micro-lab

docker-compose up -d
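
Because `-d` starts the containers in the background, the services may need some time before they accept connections. The sketch below is one possible way to wait for them from Python, using only the standard library. The ports 8088 and 8888 are the ones mentioned in this README; the function names, the deadline, and the polling interval are illustrative assumptions, not part of the project.

```python
import socket
import time
from typing import Dict, Iterable


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def wait_for_ports(host: str, ports: Iterable[int],
                   deadline_s: float = 120.0,
                   poll_s: float = 2.0) -> Dict[int, bool]:
    """Poll until every port accepts connections or the deadline passes.

    Returns a mapping of port -> whether it was reachable at the end.
    """
    status = {port: False for port in ports}
    end = time.monotonic() + deadline_s
    while True:
        for port, reachable in status.items():
            if not reachable:
                status[port] = port_is_open(host, port)
        if all(status.values()) or time.monotonic() >= end:
            return status
        time.sleep(poll_s)
```

For example, `wait_for_ports("localhost", [8088, 8888])` should return a dictionary with both ports mapped to `True` once the Yarn UI and Jupyter are reachable. An open port only shows that something is listening; it does not guarantee that the service behind it has finished initializing.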

Yarn resource manager UI

Access the Yarn resource manager UI at the following link: http://localhost:8088/cluster/nodes
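
The same node list can also be read programmatically. The sketch below assumes that the ResourceManager REST API (`/ws/v1/cluster/nodes`) is served on the same port 8088 as the UI, which is the usual Hadoop setup but is not confirmed by this README. The helper names `parse_nodes` and `fetch_nodes` are hypothetical.

```python
import json
from typing import List, Optional, Tuple
from urllib.request import urlopen


def parse_nodes(payload: dict) -> List[Tuple[Optional[str], Optional[str]]]:
    """Extract (hostname, state) pairs from a /ws/v1/cluster/nodes response.

    The API wraps the list as {"nodes": {"node": [...]}}; "nodes" may be
    null when no NodeManager has registered yet, so both levels are guarded.
    """
    nodes = (payload.get("nodes") or {}).get("node") or []
    return [(n.get("nodeHostName"), n.get("state")) for n in nodes]


def fetch_nodes(base_url: str = "http://localhost:8088") -> List[Tuple[Optional[str], Optional[str]]]:
    """Query the ResourceManager (assumed to be at base_url) and parse its node list."""
    with urlopen(f"{base_url}/ws/v1/cluster/nodes", timeout=5) as resp:
        return parse_nodes(json.load(resp))
```

With the cluster running, calling `fetch_nodes()` should return one entry per registered node, each in a state such as `RUNNING`. The hostnames depend on how the containers are named in `docker-compose.yml`.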

Jupyter Notebook with starter notebooks

Access Jupyter Notebook at this link: http://localhost:8888/

Stopping the micro-lab

docker-compose down
