gcp-dataproc

Here are 14 public repositories matching this topic...

aeronaut2001 / Movie-Rating-Analysis

Movie Rating Analysis using Apache Spark (pyspark)

apache-spark pyspark data-analytics gcp-dataproc

Updated Nov 8, 2023
Jupyter Notebook

nrohit78 / PigHive_StackExhangeData

Data is fetched from StackExchange, transformed using Pig, queried and stored in Hive. Additionally, the TF-IDF of the top 10 users is calculated using Hive.

hive pig tf-idf gcp-dataproc google-datap

Updated Nov 21, 2020
PigLatin

aeronaut2001 / Car-Insurance-Cold-Calls-Data-Analysis

Car Insurance Cold Calls Data Analysis using Apache Hive

hive gcp hdfs hql big-data-analytics apache-hadoop gcp-dataproc hql-joins

Updated Nov 8, 2023
HiveQL

visalvo / projectScalable

Project for Scalable and Cloud Programming Course - 2018/19 UNIBO.

scala spark mapbox-gl-js pagerank gcp-dataproc weighted-pagerank

Updated May 21, 2020
JavaScript

aeronaut2001 / Marketing-Campaign-Data-Analysis

Marketing Campaign Data Analysis Using Apache Spark (PySpark)

apache-spark pyspark hql apache-hive gcp-dataproc

Updated Nov 8, 2023
Jupyter Notebook

bug-data / Big_Data_First_Project

First project for Big Data course held at Roma Tre University

python spark hive hadoop bigdata jupyter-notebook gcp university-project hadoop-streaming gcp-storage gcp-compute gcp-dataproc roma-tre-university

Updated Jun 26, 2019
Jupyter Notebook

RickLeite / Hadoop-Google-DataProc-DIOstudy

Hadoop Google DataProc DIO study

hadoop google-cloud-platform gcp-cloud-functions gcp-dataproc digital-innovation-one

Updated Sep 4, 2021
Python

DenisOgr / sentiment-batch-stream-pipeline

nlp twitter spark sentiment-analysis pyspark gcp-cloud-functions gcp-storage gcp-dataproc gcp-app-engine

Updated May 25, 2021
Jupyter Notebook

ElhNour / large-scale-data-management-spark

Processes large amounts of data and implements complex data analyses using Spark. The dataset, made available by Google, describes a cluster of 12,500 machines and the activity on that cluster over 29 days.

spark gcp-dataproc large-scale-data-analytics

Updated Jan 13, 2023
Python

emanuelegiona / CC2019

Project for Cloud Computing course (A.Y. 2018/2019)

streaming apache-spark gcp python3 cloud-computing word-count sapienza-university gcp-dataproc

Updated Jan 28, 2020
Python

tansudasli / spark-sandbox

Apache Spark sandbox on GCP and Amazon EMR.

python apache-spark aws-emr gcp-dataproc

Updated Mar 4, 2020
Jupyter Notebook

prodriguezdefino / dataproc-workflowtemplate-cloudfunction

Implements a work queue for Dataproc Workflow Template executions

terraform gcp-cloud-functions gcp-dataproc

Updated Sep 28, 2020
HCL

askmrsinh / spark-stocksim

Monte Carlo stock simulation using Apache Spark.

apache-spark stock-market monte-carlo-simulation predictive-analytics spark-sql spark-mllib apache-commons-math gcp-dataproc

Updated Mar 27, 2020
Scala

prakashdontaraju / google-cloud-ecommerce

E-commerce GCP streaming pipeline: Cloud Storage, Compute Engine, Pub/Sub, Dataflow, Apache Beam, BigQuery and Tableau. GCP batch pipeline: Cloud Storage, Dataproc, PySpark, Cloud Spanner and Tableau.

Updated Mar 9, 2022
Python
