Dasty is a performant dynamic taint analysis tool for the detection of prototype pollution gadgets in Node.js applications. It is a prototype implementation of the approach described in my Master's thesis and the paper. The implementation is based on instrumentation with NodeProf by Sun et al.
Dasty reports flows from pollutable prototype properties to potentially dangerous function calls. Concretely, Dasty defines as sources all object property accesses (dot or bracket notation) that reach `Object.prototype`. Sinks are defined as all Node.js API calls except the functions of the `assert` module. A recorded flow contains the code locations of the source, the sink, and the code flow, i.e. all operations the value was propagated through. Dasty also records flows that trigger universal gadgets (see the paper here).
The flows are stored in a MongoDB database and can be exported as SARIF files for convenient analysis.
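For intuition, here is a minimal hypothetical gadget (function and property names are invented for illustration): a property read that falls through to `Object.prototype` is a source, and the `child_process` call it reaches is a sink.

```js
const { execSync } = require('child_process');

// Hypothetical gadget for illustration only.
// 'shellCmd' is not an own property of opts, so the lookup falls through
// the prototype chain to Object.prototype (a Dasty source), and the polluted
// value reaches execSync, a Node.js API call (a Dasty sink).
function runTask(opts) {
    const cmd = opts.shellCmd || 'echo ok';
    return execSync(cmd).toString();
}

// Simulate the pollution an attacker could achieve elsewhere:
Object.prototype.shellCmd = 'touch /tmp/pwned';
runTask({});
```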
Dasty can be run on a specific file or leverage the test suites of an application as a basis for the analysis. In addition, it provides a pipeline to automatically install and analyze npm packages.
A complete analysis consists of three phases:
- A pre-analysis that determines whether the analyzed package is intended to be used on the server by looking for Node.js API calls. The analysis only continues if this is the case. If not needed, this filtering can be skipped by specifying the `--noPre` flag.
- The unintrusive taint analysis runs the provided application once. It aims to record all gadgets that do not rely on control flow alteration.
- Finally, the forced branch execution taint analysis selectively conducts multiple runs in which the control flow of the program is altered based on polluted prototype properties. Through forced branch execution, Dasty is able to detect flows that rely on multiple polluted properties (sketched below).
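To illustrate why forcing branches matters, consider a hypothetical two-property gadget (all names invented): the sink is guarded by a condition that itself depends on a polluted property, so a single unintrusive run never reaches it.

```js
const { execSync } = require('child_process');

// Hypothetical two-property gadget (illustrative names only).
// The sink is only reachable when the branch condition is also polluted;
// the unintrusive run takes the false branch, so only a forced branch
// execution run uncovers the flow into execSync.
function maybeRun(opts) {
    if (opts.useShell) {          // polluted property #1 controls the branch
        execSync(opts.shellCmd);  // polluted property #2 reaches the sink
    }
}

maybeRun({}); // without pollution (or forced execution) the sink is never hit
```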
Dasty utilizes NodeProf, which is built on top of the Truffle Instrumentation Framework and GraalVM. If you encounter any problems during the installation process, please refer to their documentation.
- build-essential
- python3
The current implementation requires Node 18.12.1 installed via nvm and the `NVM_DIR` environment variable to be set. If you want to use another installation, you need to adapt the path to the node executable in `node.py`, `node`, and `npm`. However, we do not recommend using a version other than 18.12.1.
wget -qO- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.3/install.sh | bash
nvm install 18.12.1
1. Install mx
git clone https://github.com/graalvm/mx.git
export PATH=/path/to/mx/:$PATH
2. Get the required JDK
mx fetch-jdk --java-distribution labsjdk-ce-19
export JAVA_HOME=/path/to/labsjdk-ce-19-jvmci-23.0-b04
3. Install NodeProf
mkdir nodeprof-graalvm && cd nodeprof-graalvm
git clone https://github.com/KTH-LangSec/nodeprof.js.git
cd nodeprof.js
mx sforceimports
mx build
export GRAAL_NODE=/path/to/nodeprof-graalvm/graal/sdk/latest_graalvm_home/bin/node
export NODEPROF_HOME=/path/to/nodeprof-graalvm/nodeprof.js/
1. Clone Dasty and install its dependencies
git clone https://github.com/pmoosi/Dasty.git
cd /path/to/dasty
npm install
2. Install MongoDB on your system. You can follow the installation guide here.
3. Configure `pipeline/db/conn.js` to point to your MongoDB instance.
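As a rough sketch, such a connection module could look as follows, assuming the official `mongodb` driver; the URL, database name, and export shape are placeholders, so check the actual `pipeline/db/conn.js` for what the pipeline expects:

```js
// Minimal sketch of a MongoDB connection module (assumed structure;
// adapt to what pipeline/db/conn.js actually exports).
const { MongoClient } = require('mongodb');

const client = new MongoClient('mongodb://localhost:27017'); // placeholder URL

async function getDb() {
    await client.connect();
    return client.db('dasty'); // placeholder database name
}

module.exports = { getDb };
```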
To analyze a package, run the `index.js` file in the `pipeline` directory:
node index.js [flags] <pkgName>
node index.js --fromFile [flags] /path/to/packages-list
The results are stored in MongoDB. To export them as SARIF files, run:
node index.js --sarif [flags] [pkgName]
Depending on the flags, an export can create up to four files per package/application:

- `<name>.sarif` contains the found flows.
- `<name>-exceptions.sarif` contains potential pollutions that might have caused an exception/crash.
- `<name>-all-taints` contains all sources and code flows, regardless of whether they reach a sink. These are only recorded in the unintrusive run.
- `<name>-branched-on` contains all sources that flowed into a conditional.
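Since SARIF is a standardized JSON format, the exported files can be inspected with any JSON tooling. A small sketch (the file name is an assumption) that prints each reported flow using standard SARIF 2.1.0 fields:

```js
// Sketch: list the flows from an exported SARIF file.
// Field names follow the SARIF 2.1.0 standard; 'express.sarif' is an example.
const fs = require('fs');

const sarif = JSON.parse(fs.readFileSync('express.sarif', 'utf8'));
for (const run of sarif.runs) {
    for (const result of run.results ?? []) {
        const steps = result.codeFlows?.[0]?.threadFlows?.[0]?.locations ?? [];
        console.log(`${result.message.text} (${steps.length} propagation steps)`);
    }
}
```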
Analyze a specific package:
node index.js express
Analyze a list of packages and skip already analyzed ones (stored in `pipeline/package-data/already-analyzed.txt`):
node index.js --fromFile --skipDone /path/to/packages-list
Ignore previous pre-analysis results and force a full analysis:
node index.js --force express
Analyze a specific file:
node index.js --execFile /path/to/file.js <name>
Export all SARIF data of the last analysis for a specific package:
node index.js --sarif --allTaints --out /path/to/sarif.sarif express
Export all SARIF data of the last analysis for all analyzed packages:
node index.js --sarif --allTaints --outDir /path/to/sarif-dir/
| Flag | Description |
|---|---|
| `--fromFile` | Use a list of packages from a file containing one package name per line (see the example below). |
| `--skipTo <package_name>` | Skip to the specified package in the provided list. |
| `--skipToLast` | Skip to the last analyzed package. |
| `--skipDone` | Skip all already analyzed packages. |
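A packages-list file for `--fromFile` is plain text with one package name per line, for example:

```
express
lodash
minimist
```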
| Flag | Description |
|---|---|
| `--forceBranchExec` | Enable forced branch execution. |
| `--onlyForceBranchExec` | Only do forced branch execution runs. Requires at least one previous unintrusive run. |
| `--execFile <file_path>` | Specify a file instead of a package to run. |
| `--noForIn` | Disable the `for..in` injection run. |
| `--onlyPre` | Only run the pre-analysis phase. |
| `--noPre` | Skip the pre-analysis phase. |
| `--force` | Force analysis (ignore previous pre-analysis results). |
| `--collPrefix <prefix>` | Specify a prefix for the MongoDB collection containing the results for the run. |
| `--forceBranchExecCollPrefix <prefix>` | Specify where the forced branch execution information should be obtained. Use only if it deviates from the current collection prefix. |
| `--processNr <n>` | Specify a unique number to run multiple analyses in parallel. |
| `--forceProcess` | Force the process to run. |
| `--forceSetup` | Force the setup phase of a package. Usually the setup is skipped when the package is already present. |
| `--sinkAnalysis` | Enable a run identifying special sinks that trigger universal gadgets. |
| `--onlySinkAnalysis` | Run only the special sink analysis. |
| Flag | Description |
|---|---|
| `--sarif` | Output results in SARIF format. |
| `--out <output_file_path>` | Specify the path and file name for the output file. |
| `--outDir <output_dir_path>` | Specify the directory for the output files. |
| `--allTaints` | Export all injected taints in addition to the flows (in a separate file). |
| `--exportRuns <n>` | Export the last `<n>` runs; injections from previous runs are skipped (default: 1). Only works for flows and exceptions, not for all taints and branchings. |
Information about analyzed packages is located in `pipeline/package-data`:

- `already-analyzed.txt`: Contains all already analyzed packages (used for `--skipDone`).
- `last-analyzed.txt`: Contains the name of the last analyzed package (used for `--skipToLast`).
- `nodejs-packages.txt`: Contains all package names that passed the pre-analysis.
- `frontend-packages.txt`: Contains all package names that failed the pre-analysis.
- `err-packages.txt`: Contains all package names that crashed during the pre-analysis (with an uncaught exception). Note that if a Node.js API call was encountered, the package is still added to `nodejs-packages.txt`.
- `non-instrumented-packages.txt`: Contains all package names that ran but were never instrumented.
- `filtered-packages`: Contains package names that were filtered out due to their name.
If a package appears in `nodejs-packages.txt`, `frontend-packages.txt`, `err-packages.txt`, or `non-instrumented-packages.txt`, the pre-analysis is skipped (if `--force` is not set).
If you encounter problems running the analysis, you might want to try changing the pipeline filters, which are in place to avoid unnecessary instrumentation and analysis runs. If the analyzed package matches any of the filters, it won't be analyzed. The filters are defined in two locations:

- `DONT_ANALYSE` in `pipeline/index.js` specifies package names that are not analyzed at all (i.e. known uninteresting packages); a hypothetical sketch follows below.
- `node.py` defines different allow- and blocklists specifying which processes are run and instrumented.
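As a hypothetical illustration (the entries and exact structure are invented; check `pipeline/index.js` for the real definition):

```js
// Hypothetical sketch of the name-based blocklist; the real DONT_ANALYSE
// in pipeline/index.js may use a different structure and entries.
const DONT_ANALYSE = [
    'eslint',   // example: tooling packages that are not gadget targets
    'webpack',  // example
];
```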
If a package is filtered out by the pre-analysis phase, you can try skipping the pre-analysis with the `--noPre` flag.