A tool for scale and performance testing of HDFS with a specific focus on the NameNode.
Updated Jan 11, 2024 - Java
IBIS is a workflow-creation engine that abstracts the Hadoop internals of ingesting RDBMS data.
Cloud-based SQL engine using Spark, where data is accessible as a JDBC/ODBC data source via the Spark Thrift Server.
I installed Hadoop on a virtual machine, and all assignments were performed on Ubuntu. Refer to this repo when completing the Hadoop assignments. A stable internet connection is recommended while working through them.
Toy Hadoop cluster combining various SQL-on-Hadoop variants
This repository contains a simple Hadoop-like (MapReduce) distributed computing platform implemented in Java. It extends a UIUC course project that was awarded best Java implementation, and it is open-sourced for reference.
Code samples, summaries, cheatsheets and other study material for Hadoop MapReduce and Apache Spark
A stored reference copy of a comprehensive guide to installing Hadoop on Windows
The goal of this project is to identify flood-prone areas and estimate the probability of flooding in counties on a future date, using Spark MLlib.
This repo contains the steps for setting up a single-node Hadoop 3.2.1 cluster on Ubuntu 20.04 LTS
Twitter data analysis using Hadoop (HDFS), Flume, MapReduce, and Hive. Sentiment analysis of tweets related to the Indian election is also performed using the AFINN dictionary.
Set up a Hadoop cluster manually and automatically
WQD7008 Parallel and Distributed Computing Project
Single-node Hadoop Docker image modeled on an EMR 5.25.0 cluster, with Amazon Linux, Hadoop 2.8.5, and Hive 2.3.5
MapReduce Python Example
PageRank algorithm implemented in Java on the MapReduce framework
Data Analytics Laboratory
Product recommendation system built on an Amazon product dataset using the Apache Spark framework