keyhong/journal-ha-hadoop

Journal HA Hadoop Cluster

Description

https://keyhong.github.io/2024/02/07/Hadoop/hadoop3-docker-installation/

Components

| Component | Version | Description |
| --- | --- | --- |
| Ubuntu | 22.04 | The modern, open source operating system on Linux |
| Python | 3.10 | An interpreted, object-oriented, high-level programming language with dynamic semantics |
| Hadoop | 3.3.6 | A framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models |
| ZooKeeper | 3.9.1 | A centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services |
| Hive | 3.1.3 | A distributed, fault-tolerant data warehouse system |
| Spark | 3.4.2 | A multi-language engine for executing data engineering, data science, and machine learning |

Getting Started

First, build the Docker image. The build downloads the open-source packages directly into the Ubuntu image, so it may take some time depending on your network speed.

# Build a Docker image using a Makefile.
$ make build

Note: If a download from the CDN server (Kakao mirror server) fails, run make build again to resume from the point of failure.
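If the mirror is flaky, the re-run can be automated. The following is a minimal sketch, not part of this repository: `retry` is a hypothetical helper name, and the attempt count is an arbitrary choice. It simply re-invokes the given command until it succeeds or the attempts run out.

```shell
# Hypothetical helper (not provided by this repository):
# run a command up to N times, stopping at the first success.
retry() {
  max_attempts=$1
  shift
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    # "$@" is the command to run, e.g. `make build`.
    if "$@"; then
      return 0
    fi
    echo "Attempt $attempt of $max_attempts failed: $*" >&2
    attempt=$((attempt + 1))
  done
  # Every attempt failed.
  return 1
}

# Usage (from the repository root):
#   retry 3 make build
```

The helper adds no delay between attempts; if the mirror needs time to recover, a `sleep` before the next attempt may help.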

Then access the lakehouse services.
