a-Imantha/average-calculation-map-reduce

Average Calculation Problem with a MapReduce Approach

This is an example program that calculates the average of a list of numbers using MapReduce on the Hadoop framework.

NOTES:

The program should run on a Hadoop cluster. The pom file is configured for Hadoop 2.10; change that to match your Hadoop version if needed.

Package the program as a runnable jar and run it with the following arguments:

  • Input file location
  • Output folder location
  • Maximum number of classes the mapper should split the problem into (optional, default = 10)
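
The argument handling described above could look roughly like the following. This is a minimal sketch, not the repository's actual driver code: the class name `JobArgs`, its field names, and the validation rules are hypothetical and chosen only for illustration.

```java
// Hypothetical helper: the class and field names are assumptions, not taken from the repository.
final class JobArgs {
    static final int DEFAULT_MAX_CLASSES = 10; // default stated in the argument list above

    final String inputPath;   // input file location
    final String outputPath;  // output folder location
    final int maxClasses;     // maximum number of classes for the mapper

    JobArgs(String inputPath, String outputPath, int maxClasses) {
        this.inputPath = inputPath;
        this.outputPath = outputPath;
        this.maxClasses = maxClasses;
    }

    /** Parses the two required arguments and the optional third one. */
    static JobArgs parse(String[] args) {
        if (args.length < 2 || args.length > 3) {
            throw new IllegalArgumentException(
                "Usage: <input file> <output folder> [max classes]");
        }
        int maxClasses = args.length == 3
            ? Integer.parseInt(args[2])
            : DEFAULT_MAX_CLASSES;
        if (maxClasses < 1) {
            // Rejecting values below 1 is an assumption made for this sketch.
            throw new IllegalArgumentException("max classes must be at least 1");
        }
        return new JobArgs(args[0], args[1], maxClasses);
    }
}
```

Omitting the third argument leaves `maxClasses` at 10, matching the default listed above.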

The input file is a UTF-8 text file containing a list of numbers, one number per line. Numbers can be either int or double.
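
As an illustration of this input format, the sketch below reads each line as a `double`, which covers both integer and decimal values. The class name `NumberLines` is hypothetical, and skipping blank lines is an assumption; the repository may handle them differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only.
final class NumberLines {
    /** Converts input lines (one number per line) to doubles. */
    static List<Double> parse(List<String> lines) {
        List<Double> values = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // assumption: blank lines are ignored
            }
            // Double.parseDouble accepts both "3" and "4.5".
            values.add(Double.parseDouble(trimmed));
        }
        return values;
    }
}
```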

Development Approach

The code includes a Mapper, a Combiner and a Reducer. The Mapper splits the list of numbers into at most the given number of classes (default 10) and hands them over to the Combiner. The Combiner collapses the classes it receives into a single key called 'Average'. These 'Average' keys are then reduced by the Reducer to print the final output.
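
The three stages can be sketched in plain Java, without the Hadoop API, to show how the data flows. This is a simplified simulation under stated assumptions, not the repository's implementation: the class `AverageSketch`, the round-robin assignment of numbers to classes, and the `SumCount` pair are all hypothetical. Carrying a (sum, count) pair is assumed here because averaging partial averages directly gives a wrong result when the partials cover different numbers of values.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java simulation of the flow; all names here are hypothetical.
final class AverageSketch {
    /** Partial result: a running sum plus the number of values behind it. */
    static final class SumCount {
        final double sum;
        final long count;
        SumCount(double sum, long count) { this.sum = sum; this.count = count; }
        SumCount plus(SumCount other) {
            return new SumCount(sum + other.sum, count + other.count);
        }
    }

    /** Map step: assign each number to one of at most maxClasses class keys. */
    static Map<Integer, List<SumCount>> map(List<Double> values, int maxClasses) {
        Map<Integer, List<SumCount>> classes = new LinkedHashMap<>();
        for (int i = 0; i < values.size(); i++) {
            int key = i % maxClasses; // assumption: round-robin assignment
            classes.computeIfAbsent(key, k -> new ArrayList<>())
                   .add(new SumCount(values.get(i), 1));
        }
        return classes;
    }

    /** Combine step: collapse all classes into one partial for the 'Average' key. */
    static SumCount combine(Map<Integer, List<SumCount>> classes) {
        SumCount total = new SumCount(0.0, 0);
        for (List<SumCount> group : classes.values()) {
            for (SumCount partial : group) {
                total = total.plus(partial);
            }
        }
        return total;
    }

    /** Reduce step: merge the 'Average' partials and divide exactly once. */
    static double reduce(List<SumCount> partials) {
        SumCount total = new SumCount(0.0, 0);
        for (SumCount partial : partials) {
            total = total.plus(partial);
        }
        if (total.count == 0) {
            throw new IllegalArgumentException("no input numbers");
        }
        return total.sum / total.count;
    }
}
```

In this sketch, each call to `combine` stands in for one combiner run over one input split, and `reduce` receives one partial per split. Dividing only in the final step keeps the result correct even when splits contain different numbers of values.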
