DSC180B Capstone Project on Graph Data Analysis
Project Website: https://nhtsai.github.io/graph-rec/
Amazon Product Recommendation using a graph neural network approach.
- dask
- pandas
- torch
- torchtext
- dgl
Amazon Product Dataset from Professor Julian McAuley (link)
- Product Reviews (5-core)
- Product Metadata
- Product Image Features
The graph is a heterogeneous, bipartite user-product graph, connected by reviews.
- Product Nodes (`ASIN`)
  - Features: `title`, `price`, image representation
- User Nodes (`reviewerID`)
- Edges (`user`, `reviewed`, `product`) and (`product`, `reviewed-by`, `user`)
  - Features: `helpful`, `overall`
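The graph layout described above can be sketched in plain Python. This is a minimal illustration, not the project's preprocessing code: the review records are made up, the `asin` key name is an assumption about the raw review files, and `build_edges` is a hypothetical helper. It maps each `reviewerID` and product ID to an integer node index and emits one edge list per direction, keyed by a (source type, relation, destination type) triple.

```python
# Hypothetical review records; the key names are assumptions for illustration.
reviews = [
    {"reviewerID": "U1", "asin": "P1", "overall": 5.0, "helpful": [2, 3]},
    {"reviewerID": "U1", "asin": "P2", "overall": 3.0, "helpful": [0, 0]},
    {"reviewerID": "U2", "asin": "P1", "overall": 4.0, "helpful": [1, 1]},
]


def build_edges(reviews):
    """Index users and products, then build both directed edge lists."""
    user_ids, product_ids = {}, {}
    src, dst, edge_feats = [], [], []
    for r in reviews:
        # Assign the next free integer index the first time an ID is seen.
        u = user_ids.setdefault(r["reviewerID"], len(user_ids))
        p = product_ids.setdefault(r["asin"], len(product_ids))
        src.append(u)
        dst.append(p)
        edge_feats.append({"helpful": r["helpful"], "overall": r["overall"]})
    edges = {
        # Each review yields one edge in each direction.
        ("user", "reviewed", "product"): (src, dst),
        ("product", "reviewed-by", "user"): (dst, src),
    }
    return user_ids, product_ids, edges, edge_feats


user_ids, product_ids, edges, edge_feats = build_edges(reviews)
print(edges[("user", "reviewed", "product")])
```

Keeping the two relations as separate edge lists makes the bipartite structure explicit: every edge runs between a user and a product, never between two nodes of the same type.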
We use an unsupervised PinSage model (adapted from DGL).
- `name`: model configuration name
- `random-walk-length`: maximum number of traversals for a single random walk, default: 2
- `random-walk-restart-prob`: termination probability after each random walk traversal, default: 0.5
- `num-random-walks`: number of random walks to try for each given node, default: 10
- `num-neighbors`: number of neighbors to select for each given node, default: 3
- `num-layers`: number of sampling layers, default: 2
- `hidden-dims`: dimension of product embedding, default: 64 or 128
- `batch-size`: batch size, default: 64
- `num-epochs`: number of training epochs, default: 500
- `batches-per-epoch`: number of batches per training epoch, default: 512
- `num-workers`: number of workers, default: 3 or (#cores - 1)
- `lr`: learning rate, default: 3e-4
- `k`: number of recommendations, default: 500
- `model-dir`: directory of existing model to continue training
- `existing-model`: filename of existing model to continue training, default: null
- `id-as-features`: use id as features, makes model transductive
- `eval-freq`: evaluates model on validation set when `epoch % eval-freq == 0`, also evaluates model after last training epoch
- `save-freq`: saves model when `epoch % save-freq == 0`, also saves model after last training epoch
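To show how the four random-walk parameters interact, here is a minimal pure-Python sketch of random-walk neighbor sampling on a toy user-product graph. It is an illustration under stated assumptions, not the sampler this project or DGL uses. The adjacency dictionaries and `sample_neighbors` are hypothetical, one "traversal" is taken to mean a product → user → product hop, and dropping the seed product from its own neighbor list is a choice made for this sketch.

```python
import random
from collections import Counter

# Hypothetical toy adjacency built from "reviewed" / "reviewed-by" edges.
product_to_users = {"P1": ["U1", "U2"], "P2": ["U1"], "P3": ["U2"], "P4": ["U3"]}
user_to_products = {"U1": ["P1", "P2"], "U2": ["P1", "P3"], "U3": ["P4"]}


def sample_neighbors(product, walk_length=2, restart_prob=0.5,
                     num_walks=10, num_neighbors=3, rng=None):
    """Pick the products visited most often by short random walks."""
    rng = rng or random.Random(0)
    visits = Counter()
    for _ in range(num_walks):                 # num-random-walks
        current = product
        for _ in range(walk_length):           # random-walk-length
            users = product_to_users.get(current, [])
            if not users:
                break
            user = rng.choice(users)
            products = user_to_products.get(user, [])
            if not products:
                break
            current = rng.choice(products)
            visits[current] += 1
            if rng.random() < restart_prob:    # random-walk-restart-prob
                break
    visits.pop(product, None)  # sketch-specific choice: exclude the seed itself
    # num-neighbors: keep the most frequently visited products.
    return [p for p, _ in visits.most_common(num_neighbors)]


neighbors_p1 = sample_neighbors("P1")
neighbors_p4 = sample_neighbors("P4")
print(neighbors_p1, neighbors_p4)
```

In this sketch, a higher `restart_prob` or a shorter `walk_length` keeps walks close to the seed product, while more walks give steadier visit counts at a higher sampling cost.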