Learning how to tame Big Data with Hadoop and related technologies
- Hadoop
- MapReduce
- Pig
- Spark
- Spark SQL
- Using MLlib in Spark 2.0
- Hive
- Sqoop
- HBase
- Cassandra
- MongoDB
- Hadoop is an open source software platform for distributed storage and distributed processing of very large datasets on computer clusters built from commodity hardware
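The distributed processing model at Hadoop's core can be sketched locally in plain Python. This is a minimal, single-machine illustration of the map → shuffle/sort → reduce flow (no Hadoop required; the function names are illustrative, not part of any Hadoop API):

```python
from itertools import groupby
from operator import itemgetter

def mapper(line):
    # Map phase: emit a (word, 1) pair for each word in a line
    for word in line.lower().split():
        yield (word, 1)

def reducer(word, counts):
    # Reduce phase: sum all counts emitted for a single word
    return (word, sum(counts))

def map_reduce(lines):
    # Shuffle/sort: group mapped pairs by key, as Hadoop does between phases
    mapped = sorted(pair for line in lines for pair in mapper(line))
    return [reducer(word, (count for _, count in group))
            for word, group in groupby(mapped, key=itemgetter(0))]

print(map_reduce(["the quick brown fox", "the lazy dog"]))
# → [('brown', 1), ('dog', 1), ('fox', 1), ('lazy', 1), ('quick', 1), ('the', 2)]
```

On a real cluster, the map and reduce functions run in parallel across many machines and the framework handles the shuffle; the logic per key is the same.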
- Why Hadoop?
- Data is too big
- Vertical scaling isn't an option
- Disk seek times
- Hardware failures
- Processing times
- Horizontal scaling is linear
- You can do much more than just batch processing
- Download VirtualBox from https://www.virtualbox.org/
- Download an image of Hadoop to run on VirtualBox
- The HDP (Hortonworks Data Platform) 2.5 Sandbox is preferred because it boots up faster than newer versions
	- Download from https://hortonworks.com/downloads/#sandbox
- Import the image into VirtualBox
- Once it boots up, you will have a CentOS instance with Hadoop up and running
- You can use the CLI; it also has a browser interface
- Ambari is available to easily navigate and manage the different systems running on Hadoop
- Go to http://localhost:8888
- Launch the dashboard and log in to Ambari
- Username: maria_dev
- Password: maria_dev
- Troubleshooting:
- Enable virtualization in your BIOS
- Disable Hyper-V acceleration in Windows