Skip to content

How it works

Simon Hundsdorfer edited this page Mar 8, 2024 · 1 revision

How RustyRTS works internally

RustyRTS provides two ways of selecting tests:

  • cargo rustyrts dynamic instruments all binaries to trace which functions are executed during the tests
  • cargo rustyrts static creates a directed dependency graph via static analysis

The following sections describe in detail how RustyRTS detects which functions have changed and which tests are affected by that.

Checksums

RustyRTS (both static and dynamic) keeps track of modifications to the code by calculating and comparing checksums of MIR Bodys, which correspond to functions. When the checksum of a Body differs between old and new revision, it is considered changed.

Furthermore, the checksums of ConstAllocations corresponding to static var or static mut var are contributing to the checksum of every Body that accesses the respective variable. This enables dynamic RustyRTS to recognize changes in compile-time evaluation, where instrumentation for tracing is not possible. In static RustyRTS this allows to restrict the analysis to functions that are relevant at runtime (i.e. not only used in compile-time evaluation).

Lastly, the checksums of VtblEntrys (vtable entries) are contributing to the checksum of the function that they are pointing to. Assume a vtable entry that was pointing to a function a) in the old revision is now pointing to a different function b) in the new revision. Because static RustyRTS is working entirely on the graph data of the new revision, it is sufficient to consider function b) changed, as long as there is a continuous path from a corresponding test to function b). Dynamic RustyRTS is comparing traces originating from the old revision, which is why function a) would be considered changed. Because static RustyRTS can distinguish whether a function is called via dynamic or static dispatch, these additional checksums of vtable entries only contribute in the case of dynamic dispatch.

Dynamic

Dynamic RustyRTS collects traces containing the names of all functions that are called during the execution of a test. Some helper functions and global variables are used to obtain those traces:

  • a lock-free linked list (using nodes compiled into the binary as statics) for collecting names of traced functions (technically a set, since every node can appear at most once)
  • trace(input: &'static str, ..) is used to prepend input to the list, if it is not contained
  • pre_test() which clears the list
  • post_test(test_name: &str) which writes the content of the list, i.e. the traces to a file identified by the name of the test, where the traces can be inspected in the subsequent run

Further, on unix-like systems only:

  • post_main() which appends the content of the list to a file identified by the PPID of the currently running process
  • in both post_test(test_name: &str) and post_main() traces in files identified by the PID of the process (i.e. the PPID of any child process), are appended to the list before exporting the traces

During compilation, the MIR is modified, automatically generating MIR code that does not reflect in source code. Dynamic RustyRTS injects function calls into certain MIR Bodys:

  • a call to trace(<fn_name>) at the beginning of every MIR Body
  • a call to pre_test() at the beginning of every test function
  • a call to post_test(<test_name>) at the end of every test function

On unix-like systems only:

  • a call to post_main() at the end of every main function

Calls to post_test(test_name: &str) and post_main() are injected in such a way, that as long as the process terminates gracefully (i.e. either by exiting or by unwinding) the traces are written to the filesystem. A process crashing will result in the traces not being written!

On unix-like systems, a special test runner is used to fork for every test case, thus isolating the tests in their own process. Forking ensures that traces do not intermix, when executing tests in parallel. When executing tests sequentially, forking is not necessary and can be omitted.

During the subsequent run, the traces are compared to the set of changed Bodys. If these two sets overlap, the corresponding test is considered affected.

Static

Static RustyRTS analyzes the MIR during compilation, without modifying it, to build a (directed) dependency graph. The way this is done is derived from the algorithm used for monomorphization in rustc.

Edges are created according to the following criteria:

  1. EdgeType::Call function -> callee function (static dispatch)

  2. EdgeType::Contained function -> contained closure

    1. EdgeType::Unsize function -> function in the vtable of a type that is converted into a dynamic trait object (unsized coercion)
    1. EdgeType::Unsize function -> function in the vtable of a type that is converted into a dynamic trait object (unsized coercion) + !dyn
  3. EdgeType::Drop function -> destructor (drop() function) of types that are dropped (manually or automatically)

    1. EdgeType::Static function -> accessed static variable
    1. EdgeType::Static static variable -> static variable that is pointed to
    1. EdgeType::FnPtr static variable (see above) -> function that is pointed to
    1. EdgeType::ReifyPtr function -> function that is coerced to a function pointer
    1. EdgeType::ClosurePtr function -> closure that is coerced to a function pointer

The suffix !dyn is used to distinguish static and dynamic dispatch. Checksums from vtable entries only contribute to the function they are pointing to with suffix !dyn.

When there is a path from a test to a changed Body, the test is considered affected.

Clone this wiki locally