(Image: Hadoop elephant logo)

Hadoop Tutorial

This tutorial is for people who are interested in learning about big data and want to build a Hadoop Distributed File System (HDFS) on a Mac or Windows system.

The project includes an overview and four tutorial parts:

Overview

The overview gives a brief introduction to the project. It describes the software (Docker) and the technologies (Hadoop, MapReduce) that the tutorial uses.

https://github.com/Hadoop-bigdata/Hadoop/blob/master/Overview.md

First part: Installation

https://github.com/Hadoop-bigdata/Hadoop/blob/master/Hadoop-Installation.md

1. Install Docker
	
2. Pull Hadoop Image in Docker

Second part: Setting up Environment

https://github.com/Hadoop-bigdata/Hadoop/blob/master/Hadoop-Environment.md

3. Update Image

4. Create new Hadoop Image

Third part: Total Purchases

https://github.com/Hadoop-bigdata/Hadoop/blob/master/Hadoop-MapReduce.md

5. Download the Purchases Dataset

6. Edit the Map and Reduce Functions

7. Run MapReduce in Hadoop
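
As a rough illustration of what the map and reduce functions in this part do, here is a minimal sketch that totals purchases per store in plain Python. It is not the code from Hadoop-MapReduce.md. The input layout (six tab-separated fields with the store in the third column and the cost in the fifth) and the names `map_purchases`, `reduce_totals`, and `run_local` are assumptions made for this example.

```python
from itertools import groupby


def map_purchases(lines):
    """Map step: emit (store, cost) for each well-formed line.

    Assumed layout (hypothetical): date, time, store, item, cost, payment,
    separated by tabs.
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 6:
            continue  # skip malformed records
        store, cost = fields[2], fields[4]
        try:
            yield store, float(cost)
        except ValueError:
            continue  # skip records whose cost is not a number


def reduce_totals(sorted_pairs):
    """Reduce step: sum the costs for each store.

    The pairs must already be sorted by store, which is what the
    shuffle/sort phase provides between map and reduce.
    """
    for store, group in groupby(sorted_pairs, key=lambda pair: pair[0]):
        yield store, sum(cost for _, cost in group)


def run_local(lines):
    """Simulate map -> sort -> reduce in a single process for testing."""
    mapped = sorted(map_purchases(lines), key=lambda pair: pair[0])
    return dict(reduce_totals(mapped))
```

Calling `run_local` on a few sample lines is a quick way to check the logic before running the job on the cluster, where the mapper and reducer would normally live in separate scripts that read from standard input.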

Fourth part: Natural Language Processing on Amazon Reviews

https://github.com/Hadoop-bigdata/Hadoop/blob/master/Hadoop-NLP.md

8. Download the Amazon Review Dataset

9. Edit the Map and Reduce Functions

10. Run MapReduce in Hadoop based on VADER sentiment

11. Run MapReduce in Hadoop based on bag of words

Workflow of MapReduce with bag of words

(Figure: screenshot of the bag-of-words MapReduce workflow, 2017-12-20)
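
The bag-of-words workflow can be sketched in a few lines of Python: the map step splits each review into words and emits a count of 1 per word, and the reduce step adds up the counts for each word. This is only an illustration under simple assumptions (lowercasing and a letters-and-apostrophes tokenizer). The names `map_words`, `reduce_counts`, and `bag_of_words` are hypothetical and not taken from Hadoop-NLP.md.

```python
import re
from collections import Counter

# Assumed tokenizer: runs of lowercase letters and apostrophes.
TOKEN_RE = re.compile(r"[a-z']+")


def map_words(reviews):
    """Map step: emit (word, 1) for every token in every review."""
    for text in reviews:
        for word in TOKEN_RE.findall(text.lower()):
            yield word, 1


def reduce_counts(pairs):
    """Reduce step: add up the counts emitted for each word.

    A Counter stands in here for Hadoop's shuffle and per-key reduce.
    """
    counts = Counter()
    for word, n in pairs:
        counts[word] += n
    return counts


def bag_of_words(reviews):
    """Run the map and reduce steps locally on an iterable of review texts."""
    return reduce_counts(map_words(reviews))
```

For example, `bag_of_words(["Great phone, great price", "Price is ok"])` counts "great" and "price" twice each and every other word once.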

Workflow of MapReduce with VADER sentiment

(Figure: screenshot of the VADER sentiment MapReduce workflow, 2017-12-20)
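
One way to picture the sentiment workflow: the map step scores each review and emits the score keyed by product, and the reduce step averages the scores per product. The sketch below makes several assumptions that are not confirmed by this repository: reviews arrive as one JSON object per line with `asin` and `reviewText` fields, the compound score from the third-party `vaderSentiment` package is used, and the function names are hypothetical. The scorer is passed in as a parameter so the map and reduce logic can be tried without the package installed.

```python
import json
from collections import defaultdict
from functools import lru_cache


@lru_cache(maxsize=1)
def _analyzer():
    # Imported lazily so the rest of the module works without the package.
    from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
    return SentimentIntensityAnalyzer()


def vader_compound(text):
    """Return VADER's compound score (between -1 and 1) for the text."""
    return _analyzer().polarity_scores(text)["compound"]


def map_sentiment(lines, score=vader_compound):
    """Map step: emit (product_id, sentiment_score) per review line.

    Assumes JSON lines with 'asin' and 'reviewText' keys (hypothetical).
    """
    for line in lines:
        try:
            record = json.loads(line)
            yield record["asin"], score(record["reviewText"])
        except (json.JSONDecodeError, KeyError, TypeError):
            continue  # skip malformed or incomplete records


def reduce_mean(pairs):
    """Reduce step: average the scores emitted for each product."""
    totals = defaultdict(lambda: [0.0, 0])  # product -> [sum, count]
    for product, value in pairs:
        totals[product][0] += value
        totals[product][1] += 1
    return {product: s / n for product, (s, n) in totals.items()}
```

With `vaderSentiment` installed, `reduce_mean(map_sentiment(open_file))` uses the real scorer. For a quick check without it, pass any function of the review text as `score`.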

Appendix: Docker Operation Commands

The appendix lists the basic Docker commands that help you connect different containers.

https://github.com/Hadoop-bigdata/Hadoop/blob/master/Docker-Operation.md

Picture Reference

https://user-images.githubusercontent.com/26347639/34073834-0167b5fe-e271-11e7-8974-0f4850969a7b.png
