A guide on how to set up Jupyter with Pyspark painlessly on AWS EC2 clusters, with S3 I/O support
-
Updated Nov 3, 2017 - Jupyter Notebook
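For context, here is a minimal sketch (not the guide's exact code) of the S3 I/O such a setup enables from a notebook. It assumes Spark was started with the hadoop-aws (s3a) connector available; the bucket, paths, and credential values are placeholders.

```python
# Minimal sketch: PySpark S3 I/O via the s3a connector.
# Assumes Spark was launched with the hadoop-aws package, e.g.
#   pyspark --packages org.apache.hadoop:hadoop-aws:<version>
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("s3-io-example")
    # On EC2 the credentials can come from the instance profile instead;
    # the values below are placeholders, not real keys.
    .config("spark.hadoop.fs.s3a.access.key", "YOUR_ACCESS_KEY")
    .config("spark.hadoop.fs.s3a.secret.key", "YOUR_SECRET_KEY")
    .getOrCreate()
)

# Read a CSV from S3 and write it back as Parquet (bucket and paths are placeholders).
df = spark.read.csv("s3a://your-bucket/input/data.csv", header=True, inferSchema=True)
df.write.mode("overwrite").parquet("s3a://your-bucket/output/")
```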
Command line interface for spark cluster management app
This project provides end-to-end data processing and visualization of visa numbers in Japan using PySpark and Plotly. The Spark cluster is set up within a Docker container on Azure.
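As a sketch of the PySpark-to-Plotly handoff such a pipeline involves (not the project's actual code): aggregate in Spark, collect the small result to pandas, then plot. The file path and column names below are illustrative placeholders.

```python
# Hedged sketch: aggregate with PySpark, plot with Plotly.
from pyspark.sql import SparkSession
import plotly.express as px

spark = SparkSession.builder.appName("visa-viz").getOrCreate()

# Aggregate in Spark, then collect the small result to pandas for plotting.
# "visa_numbers.csv", "year", and "visas_issued" are placeholder names.
agg = (
    spark.read.csv("visa_numbers.csv", header=True, inferSchema=True)
    .groupBy("year")
    .sum("visas_issued")
    .withColumnRenamed("sum(visas_issued)", "visas_issued")
    .toPandas()
)

fig = px.line(agg, x="year", y="visas_issued", title="Visas issued per year")
fig.show()
```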
📓 Repository/tutorial for initializing Jupyter Notebook and a Spark cluster on Amazon EMR
Local Kubernetes-based ML setup
A Python library to submit Spark jobs to YARN clusters on different distributions (currently CDH and HDP)
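As an illustration of what such a library automates, here is a hedged sketch of wrapping spark-submit for YARN from Python. It assumes spark-submit is on PATH and that HADOOP_CONF_DIR/YARN_CONF_DIR point at the cluster configuration (typical on CDH/HDP edge nodes); the function name and arguments are illustrative, not the library's actual API.

```python
# Hedged sketch: one way to wrap spark-submit for a YARN cluster.
import subprocess

def submit_to_yarn(app_path, app_args=None, deploy_mode="cluster", conf=None):
    """Build and run a spark-submit command against a YARN cluster."""
    cmd = ["spark-submit", "--master", "yarn", "--deploy-mode", deploy_mode]
    for key, value in (conf or {}).items():
        cmd += ["--conf", f"{key}={value}"]
    cmd.append(app_path)
    cmd += app_args or []
    # check=True raises CalledProcessError if the submission fails.
    return subprocess.run(cmd, check=True)

# Example invocation (paths and settings are placeholders):
# submit_to_yarn("my_job.py", ["--date", "2017-11-03"],
#                conf={"spark.executor.memory": "4g"})
```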
This project creates a Hadoop and Spark cluster on AWS with Terraform
Performing various product review analyses on an Amazon dataset using Apache Spark and MongoDB
A collection of scripts to easily start HDFS and Spark clusters
Research on how to set up and use a Spark Standalone multi-node cluster.
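A minimal sketch of connecting a PySpark driver to such a standalone multi-node cluster; the host name "spark-master" and the memory setting are placeholders, while 7077 is the default standalone master port.

```python
# Hedged sketch: point a PySpark driver at a standalone cluster master.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .master("spark://spark-master:7077")   # placeholder host, default port
    .appName("standalone-cluster-check")
    .config("spark.executor.memory", "2g")
    .getOrCreate()
)

# A trivial distributed job to confirm the executors are reachable.
print(spark.sparkContext.parallelize(range(1000)).sum())
spark.stop()
```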
Spark on Kubernetes PoCs
Template for Spark Data Science Projects
Work done on AWS; gathers the steps for creating a Spark cluster on EC2.
Docker image to deploy a Spark cluster in containers
Spark cluster management with Docker
Terraform module to create Azure HDInsight, a managed, full-spectrum, open-source analytics service. This module creates Apache Hadoop, Apache Spark, Apache HBase, Interactive Query (Apache Hive LLAP), and Apache Kafka clusters.