Average Calculation Problem with MapReduce Approach

This is an example program that calculates the average of a list of numbers using MapReduce on the Hadoop framework.

NOTES:

The program should run on a Hadoop cluster. The configuration in the pom file is set for Hadoop 2.10; change it to match your Hadoop version.

Package the program as a runnable jar and run it with the following arguments:

Input file location
Output folder location
Maximum number of mapper classes you expect to split the problem into (optional, default = 10)
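
The argument handling described above could look roughly like the sketch below. The class name `JobArgs`, its field names, and the validation rules are hypothetical and are not taken from the repository; only the three arguments and the default of 10 come from this README.

```java
// Hypothetical helper: the repository may parse its arguments differently.
class JobArgs {
    static final int DEFAULT_MAX_CLASSES = 10; // default stated in this README

    final String inputFile;
    final String outputFolder;
    final int maxClasses;

    JobArgs(String inputFile, String outputFolder, int maxClasses) {
        this.inputFile = inputFile;
        this.outputFolder = outputFolder;
        this.maxClasses = maxClasses;
    }

    // Expects: <input file> <output folder> [max mapper classes]
    static JobArgs parse(String[] args) {
        if (args.length < 2 || args.length > 3) {
            throw new IllegalArgumentException(
                "Usage: <input file> <output folder> [max mapper classes]");
        }
        int maxClasses = DEFAULT_MAX_CLASSES;
        if (args.length == 3) {
            maxClasses = Integer.parseInt(args[2]);
            // Assumption: a class count below 1 is treated as invalid.
            if (maxClasses < 1) {
                throw new IllegalArgumentException("max mapper classes must be at least 1");
            }
        }
        return new JobArgs(args[0], args[1], maxClasses);
    }
}
```

When the third argument is omitted, `maxClasses` falls back to 10.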

The input file is a UTF-8 text file containing a list of numbers, one number per line. Numbers can be either int or double.
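
As an illustration of the input format, the sketch below reads such lines into doubles, so that int and double values are handled the same way. The class name `NumberLines` is hypothetical, and skipping blank lines is an assumption; the repository may treat them differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the repository.
class NumberLines {
    // Converts one-number-per-line text into a list of doubles.
    static List<Double> parse(List<String> lines) {
        List<Double> values = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // assumption: blank lines are ignored
            }
            // parseDouble accepts both "3" and "4.5"
            values.add(Double.parseDouble(trimmed));
        }
        return values;
    }
}
```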

Development Approach

The code includes a Mapper, a Combiner and a Reducer. The Mapper splits the list of numbers into at most the given number of classes (default 10) and hands them over to the Combiner. The Combiner collapses the classes it receives into a single key called 'Average'. These 'Average' keys are then reduced by the Reducer to print the final output.
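
The flow described above can be simulated in plain Java, without Hadoop types, to make the three steps concrete. This is only a sketch under stated assumptions: the class name `AverageSketch`, the round-robin assignment of numbers to classes, and the (sum, count) pairs carried between steps are all illustrative choices, not the repository's actual Mapper, Combiner and Reducer.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java simulation of the map, combine and reduce steps (hypothetical names).
class AverageSketch {
    // Partial result: a running sum and how many numbers contributed to it.
    static final class SumCount {
        final double sum;
        final long count;
        SumCount(double sum, long count) { this.sum = sum; this.count = count; }
        SumCount plus(SumCount other) {
            return new SumCount(sum + other.sum, count + other.count);
        }
    }

    // Map step: spread the numbers over at most maxClasses keys.
    // Assumption: round-robin by position; the real Mapper may pick classes differently.
    static Map<Integer, List<SumCount>> map(List<Double> numbers, int maxClasses) {
        Map<Integer, List<SumCount>> byClass = new HashMap<>();
        for (int i = 0; i < numbers.size(); i++) {
            byClass.computeIfAbsent(i % maxClasses, k -> new ArrayList<>())
                   .add(new SumCount(numbers.get(i), 1));
        }
        return byClass;
    }

    // Combine step: collapse each class to one partial under the single key "Average".
    static Map<String, List<SumCount>> combine(Map<Integer, List<SumCount>> byClass) {
        List<SumCount> partials = new ArrayList<>();
        for (List<SumCount> group : byClass.values()) {
            SumCount total = new SumCount(0.0, 0);
            for (SumCount sc : group) {
                total = total.plus(sc);
            }
            partials.add(total);
        }
        Map<String, List<SumCount>> out = new HashMap<>();
        out.put("Average", partials);
        return out;
    }

    // Reduce step: merge all partials for "Average" and divide the sum by the count.
    static double reduce(List<SumCount> partials) {
        SumCount total = new SumCount(0.0, 0);
        for (SumCount sc : partials) {
            total = total.plus(sc);
        }
        if (total.count == 0) {
            throw new IllegalArgumentException("no numbers to average");
        }
        return total.sum / total.count;
    }

    static double average(List<Double> numbers, int maxClasses) {
        return reduce(combine(map(numbers, maxClasses)).get("Average"));
    }
}
```

Carrying the count alongside the sum keeps the result exact even when the classes hold different numbers of values; averaging per-class averages directly would not.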
