Distributed Parallel Tests on CI systems

Phileas Lebada edited this page Mar 3, 2021 · 6 revisions

Running parallel builds across multiple machines

Use the --only-group option if you want parallel_tests to handle grouping your tests but need to run them across multiple machines. With this option, parallel_tests splits your tests (based on file size) into the number of groups you specify with the -n option, but runs only the group or groups named by --only-group. Files are grouped by file size so that the grouping is consistent across machines.

For example, suppose you have access to many small (single-core) machines to run your test suite. Running the tests in parallel on one machine makes little sense, since a single core probably won't speed things up much. (Travis recommends using 2 processes for their 1.5 cores per box.) In this scenario you can use the --only-group option to spread the tests across several machines, with each machine running a slightly different command:

Machine one:

parallel_test test -n 6 --only-group 1,2

Machine two:

parallel_test test -n 6 --only-group 3,4

Machine three:

parallel_test test -n 6 --only-group 5,6

Of course it's up to you to collect and aggregate the results of the tests at this point.
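
One way to aggregate results is to have every machine record its exit code and let a final step check them all. The sketch below assumes a hypothetical layout in which each machine writes its exit code to a file such as results/machine1.status; neither the directory nor the aggregate_status helper is part of parallel_tests.

```shell
#!/bin/sh
# Sketch only: the results/ layout and *.status files are hypothetical,
# not something parallel_tests creates for you.
# Each machine is assumed to have run something like:
#   parallel_test test -n 6 --only-group 1,2; echo "$?" > results/machine1.status

# aggregate_status DIR
# Returns 0 if every *.status file in DIR contains 0, 1 if any run failed,
# and 2 if no status files were found at all.
aggregate_status() {
  agg_dir="$1"
  agg_failed=0
  agg_seen=0
  for agg_file in "$agg_dir"/*.status; do
    # An unmatched glob stays literal, so skip names that do not exist.
    [ -e "$agg_file" ] || continue
    agg_seen=1
    agg_code="$(cat "$agg_file")"
    if [ "$agg_code" != "0" ]; then
      echo "FAILED: $(basename "$agg_file" .status) exited with ${agg_code}"
      agg_failed=1
    fi
  done
  if [ "$agg_seen" -eq 0 ]; then
    echo "no status files found in ${agg_dir}" >&2
    return 2
  fi
  return "$agg_failed"
}
```

A final CI step could then call `aggregate_status results || exit 1` once the status files from all machines have been collected in one place.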

Note that enabling the --only-group option means that EVERY group is treated as being the first parallel test (and TEST_ENV_NUMBER is blank). This is because if you're running the tests on separate machines, there's no need to configure a database, etc. for every test group.
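
To illustrate why a blank TEST_ENV_NUMBER matters, here is a minimal sketch that assumes the common convention of appending TEST_ENV_NUMBER to the test database name. The prefix myapp_test and the db_name_for helper are hypothetical and only serve the illustration.

```shell
#!/bin/sh
# Sketch only: "myapp_test" is a hypothetical database name prefix.

# db_name_for TEST_ENV_NUMBER
# Prints the database name a test process would use for that value.
db_name_for() {
  printf 'myapp_test%s\n' "$1"
}

db_name_for ""    # blank TEST_ENV_NUMBER: prints myapp_test
db_name_for "2"   # a second local process would print myapp_test2
```

With a blank value, every machine uses the same unsuffixed database name, so only that one database has to exist on each machine.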

Travis CI Support

The --only-group option makes it extremely easy to parallelize your builds on Travis CI. Simply specify your matrix and script like so (example with rspec):

...
env:
  - "TEST_GROUP=1"
  - "TEST_GROUP=2"
  - "TEST_GROUP=3"
  - "TEST_GROUP=4"
  - "TEST_GROUP=5"
  - "TEST_GROUP=6"

script:
  - bundle exec parallel_test spec/ -n 6 --only-group $TEST_GROUP --group-by filesize --type rspec

Now parallel_tests will take care of grouping your tests for you and will run one group per build worker. You can also assign multiple groups to a worker, for example TEST_GROUP=1,2, to run 2 processes at once on Travis (workers have ~1.5 CPUs).

For more details, including code coverage and runtime logging, see "Big and Fast Tests: Taking our Travis build from 4 hours to 13 minutes".

GitLab CI Example

With GitLab CI's parallel feature, an advanced parallel_tests/rspec setup can be fairly straightforward.

.gitlab-ci.yml file:

parallel_tests:
  parallel: 8
  variables:
    # If you use GitLab's `parallel` keyword
    # (https://docs.gitlab.com/ee/ci/yaml/#parallel), this setting controls
    # how many test processes run in parallel inside each container,
    # instead of one process per CPU core.
    # Fallback/default: PARALLEL_STEPS=2,
    # which means two test processes run per container.
    PARALLEL_STEPS: ""
  image: ruby
  stage: test
  services:
    - name: postgres:13
      alias: postgresql
  script:
    - |
      set -x

      # CI_NODE_INDEX is a GitLab variable holding the 1-based index
      # of the node this job is running on; it is only set
      # when you are using https://docs.gitlab.com/ee/ci/yaml/#parallel,
      # so fall back to 1 (the only node) when it is missing
      INDEX="${CI_NODE_INDEX:-1}"
      # CI_NODE_TOTAL is always set by the GitLab CI runner:
      # it is the number of parallel instances of this job,
      # or 1 if 'parallel' is not configured
      TOTAL="${CI_NODE_TOTAL:-1}"
      # how many parallel steps should run per CI node
      steps="${PARALLEL_STEPS:-2}"
      # the index, corrected to count from 0 instead of 1
      index="$((${INDEX}-1))"


      # first and last group number assigned to this node
      START="$(($steps * $index + 1))"
      STOP="$(($steps * $index + $steps))"

      # generate the comma-separated list of groups to run on this node,
      # e.g. "1,2" on the first node and "3,4" on the second
      groups="$(seq -s, "${START}" 1 "${STOP}")"
      # the total number of groups, potentially spread across multiple hosts
      TOTAL_JOBS="$(( ${TOTAL} * ${steps} ))"

      rake parallel:create["${steps}"]
      rake parallel:rake[db:structure:load,"${steps}"]
      rake parallel:seed["${steps}"]
      export DB_SEED_ALREADY_DONE=1
      rake parallel:rake[sphinx:parallel_setup,"${steps}"]

      parallel_rspec -n "${TOTAL_JOBS}" --only-group "${groups}" ./spec
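
The group arithmetic in the script above can be checked locally without GitLab. The following is a minimal sketch: groups_for_node is a hypothetical helper, not part of parallel_tests, and it builds the list with a plain loop rather than seq so that the output has no trailing separator on systems where seq behaves differently.

```shell
#!/bin/sh
# Sketch only: groups_for_node is a hypothetical helper for illustration.

# groups_for_node NODE_INDEX STEPS
# NODE_INDEX is 1-based (like CI_NODE_INDEX); STEPS is the number of
# groups each node runs. Prints a comma-separated list for --only-group.
groups_for_node() {
  node_index="$1"
  steps="$2"
  g=$(( steps * (node_index - 1) + 1 ))   # first group for this node
  stop=$(( steps * node_index ))          # last group for this node
  groups=""
  while [ "$g" -le "$stop" ]; do
    groups="${groups:+${groups},}${g}"
    g=$(( g + 1 ))
  done
  printf '%s\n' "$groups"
}

# Print the plan for 3 nodes running 2 groups each (6 groups in total).
for node in 1 2 3; do
  echo "node ${node}: --only-group $(groups_for_node "$node" 2)"
done
```

For 3 nodes with 2 steps each this yields the groups 1,2 then 3,4 then 5,6, which matches the three-machine example at the top of this page.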
