Performance tests for graph databases (e.g. Neo4j, Memgraph, ArangoDB) The goal for this project is to create independent set of tests for different graph db providers.
Graphomance supports following graph databases:
- Neo4j 5.x
- Memgraph 2.x
- ArangoDB 3.11.x
Datasets used in performance tests:
This is crime investigation dataset, containing street-level crime in Greater Manchester. Dataset is originally published by data.gov.uk and transformed into independent dataset by Neo4j (see: https://github.com/neo4j-graph-examples/pole)
POLE stands for Person, Object, Location and Event. Dataset contains 61521 nodes and 104862 relationships.
Dataset is loaded to Neo4j and Memgraph automatically on docker-compose up
stage.
This is a synthetic fraud detection dataset generated and published by Neo4j on github repository at https://github.com/neo4j-graph-examples/fraud-detection/blob/main/README.adoc Queries used in test scenarios follow examples or they are inspired by documentation page
Fraud dataset contains 332973 nodes and 980857 relationships.
Dataset is loaded to Neo4j and Memgraph automatically on docker-compose up
stage.
Rename .env.local
file to .env
and run docker-compose up -d
. You may need to wait few minutes until all services are up and datasets data are loaded.
Before running all tests, make sure the host OS is not busy. Verify with CPU utilization that all services are just on standby. Assuming all graph database services are up and running, you can run all tests with just one script command:
./run.sh
However, you can also run group of tests individually, with gradle task. However, some environment variable are required then.
Following command runs all available tests:
DB_TYPE=<DB_TYPE> URL=bolt://localhost:<BOLT_PORT> gradle test -i --rerun
Replace DB_TYPE with one of possible db providers: NEO4J, MEMGRAPH
For example, to run all tests for Neo4j use this command:
DB_TYPE=NEO4J URL=bolt://localhost:7687 gradle test -i --rerun
Similarly, this command run all tests for Memgraph (please note a non-standard bolt port)
DB_TYPE=MEMGRAPH URL=bolt://localhost:7688 gradle test -i --rerun
Finally, the command to run all ArangoDB tests:
DB_NAME=test DB_TYPE=ARANGODB URL=localhost:8529 gradle test -i --rerun
The last step of the process combines all individual test results into master CSV file. File results/master-results.csv
contains all test results merged per each database provider. It can be used to generate chart. Master file can be regenerated after completing all results with the following command:
gradle combineResults
NEO4J_VERSION=5.12.0-enterprise
MEMGRAPH_VERSION=2.11.1
ARANGODB_VERSION=3.11.3
Here you can find MASTER RESULTS FILE
You can find individual test results in following sub folders:
- Properties on edged are disabled by default. You have run Memgraph with explicit setting
--storage-properties-on-edges=true
- POINT type is not supported in Memgraph, in which geolocation should be stored in the node with properties
lng
andlat
- LOAD CSV syntax is very similar, however different:
** file name following
WITH HEADER
(memgraph) vs opposite place,WITH HEADERS
following file name (Neo4j) - Different syntax for creating indexes:
** Neo4j syntax:
create index for (n:Location) ON (n.address);
** Memgraph syntax:create index on :Location(address);