Building a docker image with Pyspark and a Azure Gen2 storage connection, so as to enable testing of your data lake with pyspark locally.
docker build -t sparklocal .
docker run -it sparklocal:latest /bin/bash
Type pyspark into the terminal and you should be promoted with the Spark UI at localhost:4040
pyspark
python test_spark.py
python test_adls.py
Apache Spark docker deployments