GitHub - serhangursoy/MapReduceImageSimilarity: This project aims to use MapReduce for finding image similarity in large dataset.

Finding Similar Images to an Input Image in a Large Dataset with MapReduce

This project uses MapReduce to find images similar to an input image in a large dataset.
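To make the idea concrete, here is a minimal single-process sketch of a map phase and a reduce phase for a nearest-image search. It is not the code in this repository and it does not use Hadoop. The class name `SimilaritySketch`, the plain `double[]` feature vectors, the Euclidean distance, and the top-k selection are all assumptions made for illustration; the white paper describes the approach actually used.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only: a map/reduce-style nearest-image search in plain Java. */
final class SimilaritySketch {

    /** Map phase: for one dataset image, emit (image name, Euclidean distance to the query). */
    static Map.Entry<String, Double> map(String name, double[] features, double[] query) {
        if (features.length != query.length) {
            throw new IllegalArgumentException("feature vectors must have the same length");
        }
        double sum = 0.0;
        for (int i = 0; i < query.length; i++) {
            double diff = features[i] - query[i];
            sum += diff * diff;
        }
        return new AbstractMap.SimpleEntry<>(name, Math.sqrt(sum));
    }

    /** Reduce phase: keep the k emitted pairs with the smallest distance. */
    static List<Map.Entry<String, Double>> reduce(List<Map.Entry<String, Double>> emitted, int k) {
        List<Map.Entry<String, Double>> sorted = new ArrayList<>(emitted);
        sorted.sort(Map.Entry.comparingByValue());
        return new ArrayList<>(sorted.subList(0, Math.min(k, sorted.size())));
    }

    public static void main(String[] args) {
        double[] query = {0.0, 0.0};
        List<Map.Entry<String, Double>> emitted = new ArrayList<>();
        // Made-up feature vectors; a real job would read them from the input files.
        emitted.add(map("a.jpg", new double[] {0.0, 0.0}, query));
        emitted.add(map("b.jpg", new double[] {3.0, 4.0}, query));
        emitted.add(map("c.jpg", new double[] {1.0, 0.0}, query));
        for (Map.Entry<String, Double> e : reduce(emitted, 2)) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

In a real Hadoop job the map calls would run in parallel across the cluster, and only the small list of candidates would reach the reducer.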

White Paper

You can find the white paper in the Paper folder of this repository. It describes our approaches and results.

Prerequisites

You need Java to compile the JAR. To run this code properly, you need your own AWS account with S3 and EMR access. You also need a decent internet connection to check the status of the clusters.

Running

To use this project, follow these steps:

- Clone this project.
- Make sure you have Maven installed. If you are using IntelliJ, it should be included by default.
- Build the JAR by running the "mvn package" command in the project directory.
- Create your cluster from the AWS EMR Control Panel.
- Once your cluster is up and running, go to your S3 bucket and upload your JAR.
- Open the AWS EMR Control Panel again, add a new step, select the JAR you just uploaded, and add arguments in the following form.
Example: GistCompare s3://com-rosettahub-default-xxxxx/MapReduce/input/ s3://com-rosettahub-default-xxxxx/output 20000 com-rosettahub-default-xxxxx
- Once everything is done, there should be a final file in the MapReduce folder. You can review the similarities there.
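The step arguments can be sanity-checked locally before the step is submitted. The following is a minimal sketch, not code from this repository. The class `StepArguments` and its field names are hypothetical, and it assumes the token order from the example: a name (`GistCompare`, presumably the main class), an S3 input path, an S3 output path, a number, and a bucket name. This README does not say what the number controls, so the sketch only checks that it is a positive integer.

```java
/** Hypothetical helper: checks an EMR step argument line before it is submitted. */
final class StepArguments {
    final String entryName;      // first token, e.g. "GistCompare" (assumed to be the main class)
    final String inputPath;      // S3 input location
    final String outputPath;     // S3 output location
    final int numericParameter;  // meaning not documented in the README
    final String bucketName;     // bucket name given as the last token

    private StepArguments(String entryName, String inputPath, String outputPath,
                          int numericParameter, String bucketName) {
        this.entryName = entryName;
        this.inputPath = inputPath;
        this.outputPath = outputPath;
        this.numericParameter = numericParameter;
        this.bucketName = bucketName;
    }

    /** Splits the line on whitespace and validates each of the five tokens. */
    static StepArguments parse(String line) {
        String[] tokens = line.trim().split("\\s+");
        if (tokens.length != 5) {
            throw new IllegalArgumentException("expected 5 tokens, got " + tokens.length);
        }
        requireS3(tokens[1], "input path");
        requireS3(tokens[2], "output path");
        int number;
        try {
            number = Integer.parseInt(tokens[3]);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("numeric argument is not an integer: " + tokens[3], e);
        }
        if (number <= 0) {
            throw new IllegalArgumentException("numeric argument must be positive: " + number);
        }
        return new StepArguments(tokens[0], tokens[1], tokens[2], number, tokens[4]);
    }

    private static void requireS3(String value, String label) {
        if (!value.startsWith("s3://")) {
            throw new IllegalArgumentException(label + " must start with s3://: " + value);
        }
    }

    public static void main(String[] args) {
        StepArguments parsed = parse(
            "GistCompare s3://com-rosettahub-default-xxxxx/MapReduce/input/ "
            + "s3://com-rosettahub-default-xxxxx/output 20000 com-rosettahub-default-xxxxx");
        System.out.println(parsed.inputPath + " -> " + parsed.outputPath);
    }
}
```

Catching a malformed argument line this way is cheaper than waiting for a cluster step to fail.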

Built With

MapReduce - Main framework
Hadoop - Used for HDFS and general MapReduce Framework
Maven - Dependency Management

Authors

Serhan Gürsoy - Architecture Engineer - Github
Ege Yosunkaya - Architecture Engineer - Github
Ömer Faruk Karakaya - Architecture Engineer - Github
Musab Erayman - Architecture Engineer - Github

See also the list of contributors who participated in this project.
