This program executes and monitors another program, recording its inputs and outputs using $LD_PRELOAD.
These inputs and outputs can be joined in a provenance graph.
The provenance graph tells us where a particular file came from.
The provenance graph can help us re-execute the program, containerize it, turn it into a workflow, or tell us which version of the data the program used.
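To make the idea concrete, here is a minimal sketch of how recorded inputs and outputs could be joined into a provenance graph and then queried for a file's origins. The record format, the function names (build_graph, ancestors), and the file names are all hypothetical and chosen for illustration; this is not PROBE's actual data model or API.

```python
from collections import defaultdict

# Hypothetical records: (process, files it read, files it wrote).
# This is NOT PROBE's real log format, only an illustration.
RECORDS = [
    ("clean.py", ["raw.csv"], ["clean.csv"]),
    ("plot.py", ["clean.csv", "style.json"], ["figure.png"]),
]


def build_graph(records):
    """Map each node to the set of nodes it was directly derived from."""
    parents = defaultdict(set)
    for process, inputs, outputs in records:
        for path in inputs:
            parents[process].add(path)  # the process depends on its inputs
        for path in outputs:
            parents[path].add(process)  # each output depends on the process
    return parents


def ancestors(parents, node):
    """Return everything that `node` transitively depends on."""
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


if __name__ == "__main__":
    graph = build_graph(RECORDS)
    # Everything figure.png came from, directly or indirectly.
    print(sorted(ancestors(graph, "figure.png")))
```

Walking the graph backwards from an output answers "where did this file come from?"; walking it forwards from an input would answer "what needs to be re-executed if this file changes?".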
Containers are a special case. To install PROBE in a container, see ./docs/containers.md.
- Install Nix with flakes. This can be done on any Linux (including Ubuntu, Red Hat, Arch Linux, not just NixOS), macOS, or even Windows Subsystem for Linux. See the Determinate Nix Installer documentation for more details.
  curl -fsSL https://install.determinate.systems/nix | sh -s -- install
- Log in again, or activate Nix in the current shell:
  export PATH="${PATH}:/nix/var/nix/profiles/default/bin"
- Optionally, use our public binary cache to speed up the installation.
  nix profile add --accept-flake-config nixpkgs#cachix
  cachix use charmonium
- Install PROBE.
  nix profile add github:charmoniumQ/PROBE
  probe --help
- Now you should be able to run probe record [-f] [-o probe_log] <cmd...>. The -f flag is needed to overwrite a pre-existing probe_log. For example:
  probe record ./script.py --foo bar.txt
  probe export debug-text
- To upgrade PROBE, run
  nix profile upgrade PROBE && probe --version
- To uninstall PROBE and Nix, follow the steps here.
The simplest invocation of the probe CLI is:
probe record <CMD...>
This will run <CMD...> under the benevolent supervision of libprobe, outputting the probe record to a temporary directory. When the process exits, probe will transcribe the record directory and write a probe log file named probe_log in the current directory.
If you run this again, it will fail with an error saying that the output file already exists. To fix this, pass -o <PATH> to write the log to a different file, or pass -f to overwrite the previous log.
probe record does not pass your command through a shell. Any subshell or environment substitutions will still be performed by your own shell before the arguments are passed to probe, but probe will not understand flow-control statements like if and for, shell builtins like cd, or shell aliases and functions.
If you need these, you can either write a shell script and invoke probe record on that, or else run:
probe record bash -c '<SHELL_CODE>'
Any flag after the first positional argument is treated as an argument to the command, not to probe.
This creates a file called probe_log. If you already have that file from a previous recording, pass -f to probe record to overwrite it.
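The argument-ordering rule is easy to get wrong when probe record is driven from a script. The sketch below builds the argument list in Python so that probe's own flags always come before the first positional argument. The helper names (probe_record_argv, shell_command) are hypothetical; only the -f and -o flags and the bash -c form described here are assumed.

```python
import shlex


def probe_record_argv(command, output=None, overwrite=False):
    """Build an argv list for `probe record`.

    probe's own flags go before the first positional argument, because
    anything after it is passed through to the recorded command.
    """
    argv = ["probe", "record"]
    if overwrite:
        argv.append("-f")
    if output is not None:
        argv += ["-o", output]
    return argv + list(command)


def shell_command(shell_code):
    """Wrap shell code so that bash, not probe, interprets it."""
    return ["bash", "-c", shell_code]


if __name__ == "__main__":
    argv = probe_record_argv(
        shell_command('cd /tmp && for f in *.txt; do wc -l "$f"; done'),
        overwrite=True,
    )
    # shlex.join shows how the command would be typed at a shell prompt.
    print(shlex.join(argv))
```

To actually record, the list can be handed to subprocess.run(argv, check=True), which requires probe to be on the PATH. Note that a -f placed inside command (for example ["ls", "-f"]) stays with the recorded command and does not affect probe.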
If you get tired of typing probe record ... in front of every command you wish to record, consider recording your entire shell session:
$ probe record bash
bash$ ls -l
bash$ # do other commands
bash$ exit
$ probe dump
<dumps history for entire bash session>
That's a huge work in progress.
Try exporting to different formats.
probe py export dataflow-graph
# Others:
probe py export --help
- Follow the installation steps above to install Nix.
- Acquire the source code:
  git clone https://github.com/charmoniumQ/PROBE && cd PROBE
- Run
  nix develop
  This will leave you in a Nix development shell with all the tools you need to develop and build PROBE. It is like a virtualenv in that it is isolated from your system's pre-existing tools. In the development shell, we all have the same version of Python with all the same packages. You can exit it by typing exit.
- From within the development shell, type
  just compile
  This compiles the Rust, C, and generated-Python components. If you hack on any of them, run just compile again before continuing.
- The manually-written Python scripts should already be added to the $PYTHONPATH. You should be able to edit them in place.
- Run probe <args...> or python -m probe_py.manual.cli <args...> to invoke the Rust or Python code respectively.
- Before submitting a PR, run
  just pre-commit
  which will run the pre-commit checks.
- libprobe: Library that implements interposition (C, Make, Python; happens to be manual and code-gen).
  - libprobe/include: Headers that will be used by the Rust wrapper to read PROBE data.
  - libprobe/src: Main C sources of libprobe.
  - libprobe/generator: Python and C-template code-generator.
  - libprobe/generated: (Generated, not committed to Git) output of code-generation.
  - libprobe/Makefile: Makefile that runs all of libprobe; run just compile-cli to invoke.
- cli-wrapper: (Cargo workspace) code that wraps libprobe.
  - cli-wrapper/cli: (Cargo crate) main CLI.
  - cli-wrapper/lib: (Cargo crate) supporting library functions.
  - cli-wrapper/macros: (Cargo crate) supporting macros; they use structs from libprobe/include to create Rust structs and Python dataclasses.
  - cli-wrapper/frontend.nix: Nix code that builds the Cargo workspace; gets included in flake.nix.
- probe_py: Python code that implements analysis of PROBE data (happens to be manual and code-gen); should be added to $PYTHONPATH by nix develop.
  - probe_py/probe_py: Main package to be imported or run.
  - probe_py/pyproject.toml: Definition of main package and dependencies.
  - probe_py/tests: Python unittests, i.e., from probe_py import foobar; test_foobar(). Run just test-py.
- tests: End-to-end opaque-box tests. They will be run with Pytest, but they will not test Python directly; they should always subprocess.run(["probe", ...]). Additionally, some tests have to be manually invoked.
- docs: Documentation and papers.
- benchmark: Programs and infrastructure for benchmarking.
  - benchmark/REPRODUCING.md: Read this first!
- flake.nix: Nix code that defines packages and the devshell.
- setup_devshell.sh: Helps instantiate Nix devshell.
- Justfile: "Shortcuts" for defining and running common commands (e.g., just --list).
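As an illustration of the convention for the tests directory (always go through the probe executable rather than importing Python code), an end-to-end test might look roughly like the sketch below. The test name, the helper record_argv, and the recorded command are hypothetical; only probe record and its -o flag are taken from this document, and the check that the log exists at the -o path is an assumption about the CLI's behavior.

```python
import pathlib
import shutil
import subprocess
import tempfile


def record_argv(log_path, command):
    """argv for recording `command` into `log_path` via the documented -o flag."""
    return ["probe", "record", "-o", str(log_path), *command]


def test_record_creates_log():
    """Hypothetical opaque-box test: it only talks to PROBE through its CLI."""
    if shutil.which("probe") is None:
        # Under Pytest, pytest.skip(...) would be the idiomatic choice here.
        print("probe not on PATH; skipping")
        return
    with tempfile.TemporaryDirectory() as tmp:
        log_path = pathlib.Path(tmp) / "probe_log"
        # check=True makes a non-zero exit status fail the test.
        subprocess.run(record_argv(log_path, ["ls", "-l"]), check=True)
        # Assumption: the log is written to the path given with -o.
        assert log_path.exists()


if __name__ == "__main__":
    test_record_creates_log()
```

Writing the log into a temporary directory keeps the test from colliding with a probe_log left over from an earlier run, so -f is not needed.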