A guide on how to set up Jupyter with Pyspark painlessly on AWS EC2 clusters, with S3 I/O support
-
Updated Nov 3, 2017 - Jupyter Notebook
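To make the headline guide concrete, here is a minimal sketch of what "Jupyter with PySpark and S3 I/O" typically boils down to: a SparkSession created in a notebook cell with the s3a connector on the classpath. The package version, bucket name, and paths are placeholders for illustration, not values taken from the guide itself.

```python
# Minimal sketch (not the guide's exact setup): a SparkSession built inside a
# Jupyter notebook cell with S3 read/write support via the s3a connector.
# Assumes the EC2 instances carry an IAM role granting S3 access; otherwise
# set fs.s3a.access.key / fs.s3a.secret.key explicitly.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("jupyter-pyspark-s3")
    # Placeholder coordinates: pin hadoop-aws to the Hadoop version bundled with your Spark build.
    .config("spark.jars.packages", "org.apache.hadoop:hadoop-aws:2.7.3")
    .getOrCreate()
)

# Read a CSV from S3 (bucket and key are placeholders)...
df = spark.read.csv("s3a://your-bucket/input/data.csv", header=True, inferSchema=True)
df.printSchema()

# ...and write the result back to S3 as Parquet.
df.write.mode("overwrite").parquet("s3a://your-bucket/output/")
```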
Command-line interface for a Spark cluster management app
This project provides an end-to-end pipeline for processing and visualizing visa numbers in Japan using PySpark and Plotly. The Spark clusters are set up within a Docker container on Azure.
📓 Repository/Tutorial for initializing a Jupyter Notebook and Spark cluster on Amazon EMR
Local Kubernetes-based ML setup
A Python library to submit Spark jobs to a YARN cluster across different distributions (currently CDH and HDP)
This project creates a Hadoop and Spark cluster on Amazon AWS with Terraform
A collection of scripts to easily start HDFS and Spark clusters
Performing various product review analyses on an Amazon dataset using Apache Spark and MongoDB
Research on setting up and using a Spark Standalone multi-node cluster.
Notes on AWS: the steps for creating a Spark cluster on EC2.
Docker image to deploy a Spark cluster in containers
Template for Spark Data Science Projects
Spark on Kubernetes PoCs
Spark cluster management with Docker
Terraform module to create Azure HDInsight, a managed, full-spectrum, open-source analytics service. This module creates Apache Hadoop, Apache Spark, Apache HBase, Interactive Query (Apache Hive LLAP), and Apache Kafka clusters.