Name		Name	Last commit message	Last commit date
parent directory ..
README.md		README.md
main.py		main.py
main_sampling.py		main_sampling.py
model.py		model.py
model_sampling.py		model_sampling.py
utils.py		utils.py

README.md

DGL Implementation of the CARE-GNN Paper

This DGL example implements the CAmouflage-REsistant GNN (CARE-GNN) model proposed in the paper Enhancing Graph Neural Network-based Fraud Detectors against Camouflaged Fraudsters. The author's codes of implementation is here.

NOTE: The sampling version of this model has been modified according to the feature of the DGL's NodeDataLoader. For the formula 2 in the paper, rather than using the embedding of the last layer, this version uses the embedding of the current layer in the previous epoch to measure the similarity between center nodes and their neighbors.

Example implementor

This example was implemented by Kay Liu during his SDE intern work at the AWS Shanghai AI Lab.

Dependencies

Python 3.7.10
PyTorch 1.8.1
dgl 0.7.1
scikit-learn 0.23.2

Dataset

The datasets used for node classification are DGL's built-in FraudDataset. The statistics are summarized as followings:

Amazon

Nodes: 11,944
Edges:
- U-P-U: 351,216
- U-S-U: 7,132,958
- U-V-U: 2,073,474
Classes:
- Positive (fraudulent): 821
- Negative (benign): 7,818
- Unlabeled: 3,305
Positive-Negative ratio: 1 : 10.5
Node feature size: 25

YelpChi

Nodes: 45,954
Edges:
- R-U-R: 98,630
- R-T-R: 1,147,232
- R-S-R: 6,805,486
Classes:
- Positive (spam): 6,677
- Negative (legitimate): 39,277
Positive-Negative ratio: 1 : 5.9
Node feature size: 32

How to run

To run the full graph version and use early stopping, in the care-gnn folder, run

python main.py --early-stop

If want to use a GPU, run

python main.py --gpu 0

To train on Yelp dataset instead of Amazon, run

python main.py --dataset yelp

To run the sampling version, run

python main_sampling.py

Performance

The result reported by the paper is the best validation results within 30 epochs, and the table below reports the val and test results (same setting in the paper except for the random seed, here seed=717).

Dataset		Amazon	Yelp
Metric (val / test)	Max Epoch	30	30
AUC (val/test)	paper reported	0.8973 / -	0.7570 / -
	DGL full graph	0.8849 / 0.8922	0.6856 / 0.6867
	DGL sampling	0.9350 / 0.9331	0.7857 / 0.7890
Recall (val/test)	paper reported	0.8848 / -	0.7192 / -
	DGL full graph	0.8615 / 0.8544	0.6667/ 0.6619
	DGL sampling	0.9130 / 0.9045	0.7537 / 0.7540

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

caregnn

caregnn

README.md

DGL Implementation of the CARE-GNN Paper

Example implementor

Dependencies

Dataset

How to run

Performance

Files

caregnn

Directory actions

More options

Directory actions

More options

Latest commit

History

caregnn

Folders and files

parent directory

README.md

DGL Implementation of the CARE-GNN Paper

Example implementor

Dependencies

Dataset

How to run

Performance