To set up the `spark-cluster` chart locally you need to:
- Install Minikube or Kubernetes on Docker Desktop for your OS.
  Supported Kubernetes versions: 1.11.0 - 1.18.0.

  ```bash
  minikube start --kubernetes-version=1.18.0 --cpus=12 --memory=14g
  ```
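
  If you want to confirm the local cluster came up before moving on, a quick optional check (assuming the default Minikube profile) is:

  ```bash
  # Show the status of the Minikube host, kubelet and apiserver
  minikube status
  ```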
- Install Kubectl
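
  As an optional sanity check that `kubectl` is installed and points at the Minikube cluster:

  ```bash
  # Verify the client binary is on PATH and can reach the API server
  kubectl version --client
  kubectl cluster-info
  ```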
- Install Helm and initialize it (for Helm 2.x):

  ```bash
  export TILLER_NAMESPACE=kube-system
  kubectl create -n kube-system -f scripts/cluster-admin.yaml
  kubectl create serviceaccount tiller --namespace kube-system
  kubectl create clusterrolebinding tiller-cluster-role --clusterrole=cluster-admin --serviceaccount=kube-system:tiller
  helm init --upgrade --service-account tiller
  ```
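
  To verify that Tiller is up before installing charts, an optional check (the label selector below matches the default `helm init` Deployment and may differ in customized setups):

  ```bash
  # Confirm the Tiller Pod is Running and that client and server versions match
  kubectl get pods --namespace kube-system -l app=helm,name=tiller
  helm version
  ```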
- Add and sync the `jahstreet` Helm repository:

  ```bash
  helm repo add jetstack https://charts.jetstack.io
  helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
  helm repo add jupyterhub https://jupyterhub.github.io/helm-chart
  helm repo add loki https://grafana.github.io/loki/charts
  helm repo add jahstreet https://jahstreet.github.io/helm-charts
  helm repo update
  ```
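
  To confirm the repositories synced, an optional check (Helm 2.x syntax; with Helm 3 use `helm search repo jahstreet` instead):

  ```bash
  # List the charts published in the jahstreet repository
  helm search jahstreet/
  ```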
- Run in a separate terminal:

  ```bash
  minikube tunnel
  ```
- Install the `cluster-base` chart:

  ```bash
  kubectl apply -f https://github.com/jetstack/cert-manager/releases/download/v0.15.2/cert-manager.crds.yaml
  helm upgrade --install cluster-base jahstreet/cluster-base --namespace kube-system
  ```
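
  You can optionally watch the `cluster-base` Pods come up before continuing:

  ```bash
  # Watch until the cluster-base Pods in kube-system reach the Running state
  kubectl get pods --watch --namespace kube-system
  ```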
- Check the Nginx Ingress Controller load balancer external IP:

  ```bash
  kubectl get service cluster-base-ingress-nginx-controller --namespace kube-system
  ```
- Add an entry to the hosts file:

  ```
  <load-balancer-external-IP> my-cluster.example.com
  ```
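
  On Linux/macOS the previous two steps can be scripted roughly as follows (a sketch, assuming the load balancer publishes an IP rather than a hostname and that appending to `/etc/hosts` with sudo is acceptable):

  ```bash
  # Extract the external IP assigned via `minikube tunnel` and map it to the demo hostname
  LB_IP=$(kubectl get service cluster-base-ingress-nginx-controller \
    --namespace kube-system \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
  echo "${LB_IP} my-cluster.example.com" | sudo tee -a /etc/hosts
  ```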
- Install the `spark-cluster` chart (NOTE: use the release name `spark-cluster`):

  ```bash
  helm upgrade --install spark-cluster --namespace spark-cluster jahstreet/spark-cluster \
      --timeout 600 \
      -f charts/spark-cluster/examples/custom-values-local.yaml
  ```
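
  To check that the release was created, an optional step (Helm 2.x syntax; with Helm 3 add `--namespace spark-cluster`):

  ```bash
  # Show the deployment status and the resources created by the release
  helm status spark-cluster
  ```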
- Installation may take some time; wait until the Pods are `Running`:

  ```bash
  kubectl get pods --watch --namespace spark-cluster
  ```
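
  If you prefer to block until everything is ready rather than watching, one option (assuming all Pods in the namespace are long-running services; Pods of completed Jobs never become Ready and would make this time out):

  ```bash
  # Block until every Pod in the namespace reports the Ready condition
  kubectl wait --namespace spark-cluster --for=condition=Ready pods --all --timeout=600s
  ```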
- Go to https://my-cluster.example.com/jupyterhub in your browser
- Enter login `admin` and password `admin`
- Spawn a `Jupyter` profile and you'll be redirected to your personal `Jupyter Notebook` once it's up and running
- You can find the Livy UI, with clickable links to the Spark UI, logs and debug info for the running Jupyter sessions, at https://my-cluster.example.com/livy
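
  The same information is available from the Livy REST API, for example (assuming Livy is exposed under the `/livy` path as above; `-k` skips TLS verification in case the local setup uses a self-signed certificate):

  ```bash
  # List the active Livy sessions with their states and application IDs
  curl -k https://my-cluster.example.com/livy/sessions
  ```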
- Try out the notebooks in the `examples/` folder
- Install the `spark-monitoring` chart:

  ```bash
  helm upgrade --install spark-monitoring --namespace monitoring jahstreet/spark-monitoring \
      --timeout 600 \
      -f charts/spark-monitoring/examples/custom-values-example.yaml
  ```

  Note: at present the `spark-monitoring` chart has to be installed with the release name `spark-monitoring` to the `monitoring` namespace in order to make the `Prometheus Pushgateway` service monitor work properly. Please refer to the `pushgateway` section of `charts/spark-monitoring/values.yaml` to change that.
- Installation may take some time; wait until the Pods are `Running`:

  ```bash
  kubectl get pods --watch --namespace monitoring
  ```
- Go to https://my-cluster.example.com/grafana in your browser
- Log in to Grafana with user `admin` and password `admin`
- Go to the `Explore` page via the corresponding tab on the left panel, select the `Loki` datasource and choose the Kubernetes labels to get Pod logs
- You can also find the pre-installed Grafana dashboards: `Spark Metrics` and `Cluster State Board`
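
  If you want to confirm the dashboards are provisioned without opening the UI, the Grafana HTTP API can be queried (assuming basic auth with the default `admin`/`admin` credentials and Grafana served under the `/grafana` path as above):

  ```bash
  # Search provisioned dashboards matching "Spark"
  curl -k -u admin:admin 'https://my-cluster.example.com/grafana/api/search?query=Spark'
  ```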