forked from IBM/AMLSim

This project provides a multi-agent based simulator that generates banking transaction data together with a set of known money laundering patterns. Contributions are welcome, since such data sets are critical to advancing the detection of money laundering activities.


Important: Please use the "master" branch for practical use and testing. Other branches such as "new-schema" are outdated and unstable. The Wiki pages are still under construction, and some of them have not caught up with the latest implementation. Please refer to this README.md instead.

AMLSim

This project aims to build a multi-agent simulator for anti-money laundering (AML) and to share synthetically generated data, so that researchers can design and implement new algorithms over a common data set.

Dependencies

  • Java 8 (download all jar files and copy them to the jars directory; see also jars/README.md)
  • Python 3.7 (The following packages can be installed with pip3 install -r requirements.txt)
    • numpy
    • networkx==1.11 (We do not support version 2.* due to performance issues for large graphs)
    • matplotlib==2.2.3 (The latest version is not compatible)
    • pygraphviz
    • powerlaw
    • python-dateutil
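
Because two of the packages above are pinned to exact versions, it can help to compare an existing environment against the pins before running anything. The following is a minimal sketch, not part of AMLSim: the names `PINNED`, `parse_freeze`, and `pin_mismatches` are hypothetical, and the input is assumed to be the `name==version` lines printed by `pip3 freeze`.

```python
# Exact pins taken from the dependency list above.
PINNED = {"networkx": "1.11", "matplotlib": "2.2.3"}


def parse_freeze(text):
    """Parse `pip3 freeze` style output into {package_name: version}.

    Lines without "==" (editable installs, URL requirements) are skipped.
    """
    installed = {}
    for line in text.splitlines():
        name, sep, version = line.strip().partition("==")
        if sep:
            installed[name.lower()] = version
    return installed


def pin_mismatches(installed):
    """Return {package: (required, found)} for pinned packages that differ.

    A pinned package that is not installed is reported with found=None.
    """
    problems = {}
    for name, required in PINNED.items():
        found = installed.get(name)
        if found != required:
            problems[name] = (required, found)
    return problems
```

An empty result from `pin_mismatches(parse_freeze(freeze_output))` means both pinned packages match; the unpinned packages in the list are not checked.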

Directory Structure

See the Wiki page "Directory Structure" for details.

Introduction for Running AMLSim

See the Wiki page "Quick Introduction to AMLSim" for details.

1. Generate transaction CSV files from parameter files (Python)

Before running the Python script, check and edit the configuration file conf.json.

{
//...
  "input": {
    "directory": "paramFiles/1K",  // Parameter directory
    "schema": "schema.json",  // Configuration file of output CSV schema
    "accounts": "accounts.csv",  // Account list parameter file
    "alert_patterns": "alertPatterns.csv",  // Alert list parameter file
    "degree": "degree.csv",  // Degree sequence parameter file
    "transaction_type": "transactionType.csv",  // Transaction type list file
    "is_aggregated_accounts": true  // Whether the account list represents aggregated (true) or raw (false) accounts
  },
//...
}
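
A quick sanity check of the "input" section can catch a wrong path before the generator runs. The sketch below is illustrative only and rests on two assumptions that the README does not state: that the real conf.json is plain JSON (the `//` comments above being explanatory), and that every file named in the section is resolved relative to `directory`. The names `load_conf`, `FILE_KEYS`, and `missing_input_files` are hypothetical.

```python
import json
import os

# Keys of the "input" section that name files, as shown in the excerpt above.
FILE_KEYS = ("schema", "accounts", "alert_patterns", "degree", "transaction_type")


def load_conf(path):
    """Load conf.json, assuming it is plain JSON without comments."""
    with open(path) as f:
        return json.load(f)


def missing_input_files(conf, base_dir="."):
    """Return the paths named in conf["input"] that do not exist on disk.

    Assumes each file lives inside conf["input"]["directory"], taken
    relative to base_dir (for example the repository root).
    """
    section = conf["input"]
    param_dir = os.path.join(base_dir, section["directory"])
    missing = []
    for key in FILE_KEYS:
        path = os.path.join(param_dir, section[key])
        if not os.path.isfile(path):
            missing.append(path)
    return missing
```

An empty list from `missing_input_files(load_conf("conf.json"))` suggests that all parameter files are in place under these assumptions.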

Then run the transaction graph generator script.

cd /path/to/AMLSim
python3 scripts/transaction_graph_generator.py conf.json

2. Build and launch the transaction simulator (Java)

Parameters for the simulator are defined in the "general" section of conf.json.

{
  "general": {
      "random_seed": 0,  // Seed of random number
      "simulation_name": "sample",  // Simulation name (identifier)
      "total_steps": 720,  // Total simulation steps
      "base_date": "2017-01-01"  // Date corresponding to step 0 (the start date of the simulation)
  },
//...
}
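
To see how `base_date` and `total_steps` relate, the sketch below maps a step index to a calendar date. It assumes that one simulation step corresponds to one day, which this README does not state explicitly; `step_to_date` and `days_per_step` are hypothetical names used for illustration.

```python
from datetime import date, timedelta


def step_to_date(base_date, step, days_per_step=1):
    """Return the calendar date of a simulation step.

    base_date is an ISO string such as "2017-01-01" (the date of step 0).
    days_per_step=1 is an assumption, not a documented AMLSim setting.
    """
    start = date.fromisoformat(base_date)
    return start + timedelta(days=step * days_per_step)
```

Under that assumption, the sample settings (`total_steps` of 720 starting from 2017-01-01) would cover steps 0 through 719, and `step_to_date("2017-01-01", 719)` gives the date of the final step.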

Compile the Java files (if not yet compiled) and launch the simulator.

sh scripts/build_AMLSim.sh
sh scripts/run_AMLSim.sh conf.json

3. Convert the raw transaction log file

The file names of the output data are defined in the "output" section of conf.json.

{
//...
  "output": {
    "directory": "outputs",  // Output directory
    "accounts": "accounts.csv",  // Account list CSV
    "transactions": "transactions.csv",  // All transaction list CSV
    "cash_transactions": "cash_tx.csv",  // Cash transaction list CSV
    "alert_members": "alert_accounts.csv",  // Alerted account list CSV
    "alert_transactions": "alert_transactions.csv",  // Alerted transaction list CSV
    "sar_accounts": "sar_accounts.csv",    // SAR account list CSV
    "party_individuals": "individuals-bulkload.csv",
    "party_organizations": "organizations-bulkload.csv",
    "account_mapping": "accountMapping.csv",
    "resolved_entities": "resolvedentities.csv",
    "transaction_log": "tx_log.csv",
    "counter_log": "tx_count.csv",
    "diameter_log": "diameter.csv"
  },
//...
}

python3 scripts/convert_logs.py conf.json
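
The commands from steps 1 to 3 can be chained in a small driver so that a failure in one stage stops the run. This is a sketch rather than a script shipped with AMLSim: `pipeline_commands` and `run_pipeline` are hypothetical names, and the commands are copied from the steps above on the assumption that they are run from the repository root.

```python
import subprocess


def pipeline_commands(conf_path="conf.json"):
    """Return the commands of steps 1 to 3 in execution order."""
    return [
        ["python3", "scripts/transaction_graph_generator.py", conf_path],
        ["sh", "scripts/build_AMLSim.sh"],
        ["sh", "scripts/run_AMLSim.sh", conf_path],
        ["python3", "scripts/convert_logs.py", conf_path],
    ]


def run_pipeline(conf_path="conf.json", repo_dir="."):
    """Run each stage from repo_dir, stopping at the first failure."""
    for cmd in pipeline_commands(conf_path):
        print("+", " ".join(cmd))
        # check=True raises CalledProcessError on a non-zero exit status.
        subprocess.run(cmd, cwd=repo_dir, check=True)
```

Calling `run_pipeline("conf.json", "/path/to/AMLSim")` would run the generator, the build, the simulator, and the log conversion in sequence.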

4. Export statistical information of the output data to image files (optional)

python3 scripts/visualize/plot_distributions.py conf.json

5. Validate alert transaction subgraphs by comparison with the parameter file (optional)

python3 scripts/validation/validate_alerts.py conf.json

Remove all log and generated image files from the outputs directory and the temporary directory

sh scripts/clean_logs.sh
