granola-scraper

granola-scraper is a Node.js script for AWS Lambda that scrapes granola-related data from Google Express and stores it in DynamoDB. It will work in conjunction with aws-api to provide a GraphQL endpoint.

Local Testing

1. Install serverless: `npm install serverless -g`
2. Install local DynamoDB: `serverless dynamodb install`
3. In a separate command prompt, start the local server: `serverless dynamodb start --migrate` (pulls table settings from serverless.yml)
4. Run the function: `serverless invoke local -f granolaScraper`

Framework

granola-scraper originally used aws-sam for local testing and deployment, but aws-sam was ultimately abandoned because of the large number of incompatibilities. That work is preserved for historical reference in the aws_sam branch.

Use Cases

    What            When        Where                       View Type
1.  All Granola     today       everywhere                  List of many items
2.  Bear Naked      today       walmart, shoprite, target   List of many items
3.  Specific item   today       walmart                     Product page
4.  Specific item   all month   everywhere                  Product page
5.  Chocolate       all month   everywhere                  Detailed list of many items
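
The "When" and "Where" columns above can be read as filters over price records. The following is a minimal sketch, not the project's actual code: `UseCaseQuery`, `PriceRecord`, `inScope`, and the ISO date format are all assumptions made for illustration, and matching on the "What" column is left out because it would need brand and flavor data from productInfo.

```typescript
// Hypothetical shapes: none of these names come from the project itself.
type When = "today" | "month";
type Where = "everywhere" | string[];

interface UseCaseQuery {
  when: When;
  where: Where;
}

interface PriceRecord {
  date: string; // assumed ISO format, e.g. "2018-03-14"
  url: string;
  vendor: string;
  price: number;
}

// Returns true when a record falls inside the query's date and vendor scope.
function inScope(record: PriceRecord, query: UseCaseQuery, today: string): boolean {
  const dateOk =
    query.when === "today"
      ? record.date === today
      : record.date.slice(0, 7) === today.slice(0, 7); // same "YYYY-MM"
  const vendorOk =
    query.where === "everywhere" || query.where.indexOf(record.vendor) !== -1;
  return dateOk && vendorOk;
}

// Use case 2: today, at walmart, shoprite, or target.
const useCase2: UseCaseQuery = {
  when: "today",
  where: ["walmart", "shoprite", "target"],
};
```

Use cases 1, 4, and 5 would use `where: "everywhere"`, and use cases 4 and 5 would use `when: "month"`.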

Table Design

2018-W1_2018-W52 (data range)

date (primary)
url (secondary)
vendor
description
price
regPrice
value

productInfo

url (primary)
brand
name
flavor
size
isNew?
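
The two tables above could be modelled in TypeScript roughly as follows. This is a sketch under assumptions: the README lists field names only, so every type here is a guess, and `isoWeekKey` is a hypothetical helper that assumes the week labels in "2018-W1_2018-W52" follow ISO 8601 week numbering.

```typescript
// Item in the weekly price table. Field types are assumed, not confirmed.
interface PriceItem {
  date: string; // primary key
  url: string; // secondary key
  vendor: string;
  description: string;
  price: number;
  regPrice: number;
  value: number; // meaning not specified in the README
}

// Item in the productInfo table. Field types are assumed, not confirmed.
interface ProductInfo {
  url: string; // primary key
  brand: string;
  name: string;
  flavor: string;
  size: string;
  isNew?: boolean; // listed with a "?" in the README, modelled here as optional
}

// Hypothetical helper: maps a date to a label such as "2018-W1",
// assuming ISO 8601 weeks (Monday start, week 1 contains the first Thursday).
function isoWeekKey(date: Date): string {
  const d = new Date(Date.UTC(date.getUTCFullYear(), date.getUTCMonth(), date.getUTCDate()));
  const day = d.getUTCDay() || 7; // Monday = 1 ... Sunday = 7
  d.setUTCDate(d.getUTCDate() + 4 - day); // shift to the Thursday of this week
  const yearStart = Date.UTC(d.getUTCFullYear(), 0, 1);
  const week = Math.ceil(((d.getTime() - yearStart) / 86400000 + 1) / 7);
  return `${d.getUTCFullYear()}-W${week}`;
}
```

Under ISO numbering the last days of December can belong to week 1 of the following year, which matters if one table is meant to cover exactly one year of weeks.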

TODO

Optimization

~~reduce timeout (most wasted time)~~
~~move binary to s3 bucket~~

Readability

refactor into lifecycle phases (setup, scrape, cleanup)
use winston/loggly
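
One possible shape for the lifecycle refactor listed above is sketched here. `Phases` and `runLifecycle` are hypothetical names, and the real setup, scrape, and cleanup steps are not shown; the sketch only illustrates running the phases in order with cleanup guaranteed.

```typescript
// Hypothetical phase signatures; the real scraper's phases will differ.
interface Phases<Ctx, Result> {
  setup: () => Promise<Ctx>;
  scrape: (ctx: Ctx) => Promise<Result>;
  cleanup: (ctx: Ctx) => Promise<void>;
}

// Runs the phases in order; cleanup runs even if scrape throws.
async function runLifecycle<Ctx, Result>(phases: Phases<Ctx, Result>): Promise<Result> {
  const ctx = await phases.setup();
  try {
    return await phases.scrape(ctx);
  } finally {
    await phases.cleanup(ctx);
  }
}
```

Keeping cleanup in a `finally` block means resources opened during setup are released whether or not scraping succeeds.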

Documentation

find out why tar/headless_shell works

linux