Red Flag Deals Hot Deals Scraper

Note

I started this project during my first internship. I have learned a lot since then and have decided to rewrite it in Go. This project is no longer maintained; the replacement is located here: https://github.com/gordonpn/hot-flag-deals.

Description

This project scrapes the content of the Red Flag Deals Hot Deals Forums, keeps track of all interesting and relevant deals, and archives all other deals. All relevant deals are emailed daily to a mailing list and then archived. A front-end at https://gordon-pn.com/deals shows the current relevant deals.

Motivation

Red Flag Deals aggregates deals on its front page, but the Hot Deals Forums are community driven: anybody can post a deal there. That is where this project comes in. It scrapes the Hot Deals Forums several times per day and displays the deals on a front-end.

With this project, I saved myself the chore of checking the (messy) forum a few times a day while still being aware of the good deals posted by the community.


Screenshots

Daily email

Template made by @tiffzeng

Technologies

Maven: Dependency management
Bootstrap: CSS framework for front-end
jQuery: JavaScript library for the front-end
Javalin: Java web framework for the back-end
Spring Framework: Some dependency injection, along with Thymeleaf for email templates
jsoup: Library to parse HTML documents

Prerequisites

Java 8+
Apache Maven 3.6+

Installation

Clone the master branch into your workspace.

Compile and package using Maven.

mvn clean compile package

Configuration

Edit configuration.json to suit your needs. You must set your Gmail address and password as environment variables. In my case, the production machine ran Linux and the test machines ran macOS and Windows. Those settings come from ConfigurationLoader.java.
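As an illustration of the environment-variable requirement, here is a minimal sketch of reading the two credentials and failing fast when one is missing. The variable names GMAIL_ADDRESS and GMAIL_PASSWORD and the MailCredentials class are assumptions made for this example; the names the project actually expects are defined in ConfigurationLoader.java.

```java
import java.util.Map;

public class MailCredentials {
    // Hypothetical variable names; check ConfigurationLoader.java for the real ones.
    static final String USER_VAR = "GMAIL_ADDRESS";
    static final String PASSWORD_VAR = "GMAIL_PASSWORD";

    final String address;
    final String password;

    private MailCredentials(String address, String password) {
        this.address = address;
        this.password = password;
    }

    // Reads both values from the given environment map and
    // throws if either one is missing or blank.
    static MailCredentials fromEnvironment(Map<String, String> env) {
        return new MailCredentials(require(env, USER_VAR), require(env, PASSWORD_VAR));
    }

    private static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Missing environment variable: " + name);
        }
        return value;
    }

    public static void main(String[] args) {
        try {
            MailCredentials credentials = fromEnvironment(System.getenv());
            System.out.println("Sending digest as " + credentials.address);
        } catch (IllegalStateException e) {
            System.err.println(e.getMessage());
        }
    }
}
```

Passing the environment in as a map, rather than calling System.getenv inside the method, keeps the check easy to exercise on any of the Linux, macOS, or Windows machines without touching the real shell environment.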

Usage

The main class com.rfdhd.scraper.App is used for scraping the forum.

java -cp *.jar com.rfdhd.scraper.App

The main class com.rfdhd.scraper.DigestCreator is used for sending the daily digest email. It uses the content of dailyDigest.json as its source.

java -cp *.jar com.rfdhd.scraper.DigestCreator

The main class com.rfdhd.scraper.Start is used to start the back-end, which responds to HTTP requests.

java -cp *.jar com.rfdhd.scraper.Start

Use case

The Scraper and the DigestCreator are both automated in Jenkins in order to have the most up-to-date information on deals.

Roadmap/Todo

Phase 1

Use the jsoup library to scrape data correctly.
Save all the scraped data in a map.
Save the unfiltered map into scrapings.json.
Try to read scrapings.json.
Remove duplicates before saving again.
Utility class to calculate information from a map.
Filter the raw map using the utility class.
Save the filtered map into currentLinks.json.
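
The map-handling steps of Phase 1 can be sketched roughly as follows. This is not the project's code: the Deal class, its fields, the choice of the thread URL as the map key, and the "at least the average number of votes" filter rule are all assumptions made for illustration. Scraping with jsoup and reading or writing the JSON files are left out so the example runs with the standard library alone.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PhaseOneSketch {
    // Hypothetical shape of one scraped thread; the real fields may differ.
    static final class Deal {
        final String title;
        final int votes;

        Deal(String title, int votes) {
            this.title = title;
            this.votes = votes;
        }
    }

    // Merges freshly scraped deals into the previously saved ones.
    // Keying by thread URL (an assumption) means a thread seen twice is
    // stored once, with the newer scrape replacing the older entry.
    static Map<String, Deal> merge(Map<String, Deal> saved, Map<String, Deal> scraped) {
        Map<String, Deal> merged = new LinkedHashMap<>(saved);
        merged.putAll(scraped);
        return merged;
    }

    // Utility calculation: average vote count across all deals (0 for an empty map).
    static double averageVotes(Map<String, Deal> deals) {
        if (deals.isEmpty()) {
            return 0.0;
        }
        long total = 0;
        for (Deal deal : deals.values()) {
            total += deal.votes;
        }
        return (double) total / deals.size();
    }

    // Assumed filter rule: keep deals with at least the average number of votes.
    static Map<String, Deal> filter(Map<String, Deal> deals) {
        double threshold = averageVotes(deals);
        Map<String, Deal> kept = new LinkedHashMap<>();
        for (Map.Entry<String, Deal> entry : deals.entrySet()) {
            if (entry.getValue().votes >= threshold) {
                kept.put(entry.getKey(), entry.getValue());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, Deal> saved = new LinkedHashMap<>();
        saved.put("/thread-1", new Deal("Headphones", 4));
        saved.put("/thread-2", new Deal("Monitor", 30));

        Map<String, Deal> scraped = new LinkedHashMap<>();
        scraped.put("/thread-2", new Deal("Monitor", 35)); // same thread, newer vote count
        scraped.put("/thread-3", new Deal("Keyboard", 9));

        Map<String, Deal> all = merge(saved, scraped); // would be saved to scrapings.json
        Map<String, Deal> relevant = filter(all);      // would be saved to currentLinks.json
        System.out.println(all.size() + " deals, " + relevant.size() + " relevant");
    }
}
```

With the sample data in main, the merged map holds three threads with an average of 16 votes, so only the 35-vote thread passes the filter and the program prints "3 deals, 1 relevant".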
