
Implement a custom test runner for buck test #145

Open
vmagro opened this issue Jul 20, 2021 · 4 comments

@vmagro
Contributor

vmagro commented Jul 20, 2021

buck test has provisions for running with a custom test runner (see https://buck.build/files-and-dirs/buckconfig.html#test.external_runner)

Benefits

A custom test runner would give us a lot of control over how tests are run

  • failing tests can be retried
  • consistently failing tests can be marked as disabled (the disabled list can be stored in a text file in the repo)
  • test parallelism can be controlled (e.g. one process per test case, up to $(nproc) at a time)

Implementation details

  • The buck config docs linked above describe the input that buck itself provides.
  • The test runner would need to parse the buck-provided JSON file, and then have a switch based on the type to determine how to execute the test.
  • The test runner should support pretty HTML output like we have today (e8bf770)
    • this can be done by implementing an output format understood by the workflow we use now https://github.com/dorny/test-reporter#supported-formats (jest-junit is probably the best documented)
    • alternatively, you can re-implement the functionality of that test reporter to create a status check with the GitHub API and render some HTML however you want
  • The automatic disabler (the test state machine) can be implemented as a periodically scheduled GitHub Action that looks at the past N days of test workflow runs and marks tests that have failed 3 times in a row (or some other threshold) as disabled. It can either submit a PR against a text file in the repo, or use a DB in AWS to track state.
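A minimal sketch of the "switch based on the type" idea from the implementation notes above. The field names here (`kind`, `command`) are illustrative placeholders, not the exact schema of the buck-provided JSON file, and the accepted type strings are assumptions:

```rust
use std::process::Command;

// Hypothetical shape of one entry in the buck-provided test spec file.
// Field names are illustrative, not buck's actual JSON schema.
struct TestSpec {
    target: String,       // e.g. "//antlir:test-common"
    kind: String,         // test type, e.g. "pyunit" or "rust"
    command: Vec<String>, // program plus arguments to execute
}

// Dispatch on the test type to decide how (or whether) to execute it.
fn build_command(spec: &TestSpec) -> Option<Command> {
    match spec.kind.as_str() {
        "pyunit" | "rust" => {
            let (program, args) = spec.command.split_first()?;
            let mut cmd = Command::new(program);
            cmd.args(args);
            Some(cmd)
        }
        // Unsupported types are skipped in the first revision.
        _ => None,
    }
}
```

Returning `Option` lets the runner report "unsupported test type" for anything it does not yet know how to execute, rather than failing the whole run.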

Desired features

Features can be added in stages; the bare minimum for the first revision is a test runner that supports Python unit tests and outputs a single pass/fail for the test target based on the return code obtained from just running the pex.

  1. Parse the buck test JSON, run each target, and produce an output entry for the entire test target based on the return code
  2. Parse the individual test target output to produce output entries for each individual test case
  3. List tests out of the test target binaries before running, and run them in parallel
  4. CLI flag to retry failed tests (this can be passed to the runner through buck test as described in the docs)
  5. Runner should not execute disabled test cases unless given a special --run-disabled flag
  6. GitHub Actions workflow to read old test result artifacts and submit a PR / write to an AWS DB to mark consistently failing tests as disabled
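Stage 1 above (a single pass/fail per target from the exit code) can be sketched in a few lines; this assumes nothing beyond the standard library:

```rust
use std::process::Command;

#[derive(Debug, PartialEq)]
enum TestResult {
    Pass,
    Fail,
}

// Run the test binary and derive one pass/fail for the whole target
// from its exit status, as described in stage 1.
fn run_target(program: &str, args: &[&str]) -> std::io::Result<TestResult> {
    let status = Command::new(program).args(args).status()?;
    Ok(if status.success() {
        TestResult::Pass
    } else {
        TestResult::Fail
    })
}
```

Later stages refine this by splitting the target into individual test cases, but the exit-code check remains the base signal.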
@vmagro
Contributor Author

vmagro commented Jul 28, 2021

Like I said on Zoom, just write your code with cargo new/build/run as if it were a normal Rust project, and I can convert your PR into the proper BUCK targets along with any dependencies you add (it's just tedious, and I've done so many of them that I'm very fast at it by now).

Some Rust crates that are likely to be useful:

@baioc
Contributor

baioc commented Aug 1, 2021

Features can be added in stages; the bare minimum for the first revision is a test runner that supports Python unit tests and outputs a single pass/fail for the test target based on the return code obtained from just running the pex.

  1. Parse the buck test JSON, run each target, and produce an output entry for the entire test target based on the return code

I created a draft PR at #147 with this first stage (just not yet with the pretty html output).
Since it's my first time dealing with Rust outside tutorials, all critique is appreciated.

  3. List tests out of the test target binaries before running, and run them in parallel

When dealing with Python tests I can imagine parsing source code looking for test-annotated functions, but I'm not sure how I would start in the case of binaries. Any ideas on how we could do this?

  4. CLI flag to retry failed tests (this can be passed to the runner through buck test as described in the docs)

Does something like a --max-retries option, which defaults to 0, sound good?
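The --max-retries policy proposed here is simple to express: rerun a failing test up to that many extra times, and count the target as passing if any attempt passes. A sketch, with the test execution abstracted as a closure:

```rust
// Rerun a failing test up to `max_retries` extra times (so at most
// max_retries + 1 attempts total). With the proposed default of 0,
// the test runs exactly once.
fn run_with_retries<F>(mut run_once: F, max_retries: u32) -> bool
where
    F: FnMut() -> bool, // true = the attempt passed
{
    for _ in 0..=max_retries {
        if run_once() {
            return true;
        }
    }
    false
}
```

Stopping on the first success keeps retries cheap for tests that are merely flaky rather than consistently broken.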

  5. Runner should not execute disabled test cases unless given a special --run-disabled flag
  6. GitHub Actions workflow to read old test result artifacts and submit a PR / write to an AWS DB to mark consistently failing tests as disabled

The order of these last two is probably switched ;)

@vmagro
Contributor Author

vmagro commented Aug 2, 2021

When dealing with Python tests I can imagine parsing source code looking for test-annotated functions, but I'm not sure how I would start in the case of binaries. Any ideas on how we could do this?

Ah, I'm sorry, I totally thought that this was a standard part of Python and not Facebook-specific.
The way buck does this (and reusing the code buck already has is most likely the easiest path forward) is with a custom Python test runner here. If you buck run a python_unittest target, for example //antlir:test-common, you can see the binary gives you a lot of options to list the test cases inside; --list-tests --list-format=buck seems the easiest to parse. That will work when you just execute the binary that buck gives in the test runner JSON file.

Rust test binaries have a --list flag which lists tests and benchmarks, but we don't care about benchmarks, so you can just support the unit tests.
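For the Rust side, a test binary's --list output prints one `module::case: test` line per unit test (benchmark lines end in `: benchmark` instead, and a summary line follows), so extracting the test names is a small sketch like this:

```rust
// Parse the output of a Rust test binary's `--list` flag, keeping only
// unit tests. Lines ending in ": benchmark" and the trailing summary
// line (e.g. "2 tests, 1 benchmark") do not match and are dropped.
fn parse_rust_test_list(output: &str) -> Vec<String> {
    output
        .lines()
        .filter_map(|line| line.strip_suffix(": test"))
        .map(str::to_owned)
        .collect()
}
```

The same shape works for the Python side once the `--list-tests --list-format=buck` output format is pinned down; only the line-matching rule changes.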

Does something like a --max-retries option, which defaults to 0, sound good?

Perfect

The order of these last two is probably switched ;)

Hmm, it could go either way, right :) depending on how you implement the disabling. I would argue that implementing --run-disabled is actually probably easier to do first, with some mechanism to manually disable tests (either writing to a DB in AWS or a file in the repo).
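The --run-disabled behavior discussed here reduces to a filter over the listed tests: a set of disabled names (loaded from, say, a text file in the repo, one name per line) is excluded unless the flag is passed. A sketch:

```rust
use std::collections::HashSet;

// Filter the tests to run: disabled tests are skipped unless the
// user passed --run-disabled. The disabled set would be loaded from
// whatever manual-disable mechanism is chosen (repo file or AWS DB).
fn select_tests<'a>(
    all: &'a [String],
    disabled: &HashSet<String>,
    run_disabled: bool,
) -> Vec<&'a String> {
    all.iter()
        .filter(|name| run_disabled || !disabled.contains(*name))
        .collect()
}
```

Doing the filtering in the runner (rather than in buck) keeps the disable list format private to the runner, so swapping a repo text file for an AWS-backed store later would not change anything upstream.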

@baioc
Contributor

baioc commented Aug 16, 2021

Just to keep you updated, the draft at #147 is now only missing the auto-disabler and items 5 and 6, which I'll implement as discussed last week.

facebook-github-bot pushed a commit that referenced this issue Aug 19, 2021
Summary:
Custom buck test runner for #145

This PR provides an external test runner which is currently able to list out
individual unit tests from Python and Rust binaries coming from buck tests,
execute them in parallel and report results in JUnit's XML format, which should
be CI-compatible (although integration with the [test
reporter](https://github.com/dorny/test-reporter) is yet to be tested).

It can also retry tests and will silently ignore those manually marked for
exclusion (for instance, those with a `"disabled"` label).

The next step after this is implementing the rest of
#145, namely a way to keep
track of state between test runs on the CI so as to automatically disable some
tests.

NOTE by vmagro: I am (temporarily) duplicating the `ci.yml` workflow to use
this external runner, until it's at feature parity with the internal test
runner that is already being used

Pull Request resolved: #147

Test Plan: Was able to run this with `--config test.external_runner` on my desktop and got nice looking output from each test case

Reviewed By: zeroxoneb

Differential Revision: D30372373

Pulled By: vmagro

fbshipit-source-id: 33a3aea346375a0a60b4cbc687e642f381f33d1f