
Reduce overhead in BorgJob's subprocess management #2419

Open
rzeigler wants to merge 5 commits into borgbase:master from rzeigler:optimize-borg-job-io

@rzeigler rzeigler commented Mar 5, 2026

Description

Modify BorgJob.run so that it only wakes up when there is data to read from the subprocess or the subprocess has exited. Furthermore, avoid reading from file descriptors that do not have any data ready.

Related Issue

#2413

Motivation and Context

The existing implementation spins continually looking for data and always reads from both file descriptors. The former makes more system calls than necessary, and the latter continually raises and catches exceptions. Since select tells us which file descriptors have data to read, we can simply sleep indefinitely until a file descriptor is ready. Additionally, EOF counts as data to be read, and it arrives as the subprocess shuts down.
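For illustration, here is a minimal sketch of the approach described above. It is not the code in this PR: `drain_process` is a hypothetical helper, the 4096-byte read size is arbitrary, and it assumes a POSIX system where `select` works on pipes. It blocks in `select` with no timeout, reads only from descriptors reported ready, and treats an empty read as EOF.

```python
import os
import select
import subprocess
import sys


def drain_process(cmd):
    """Run cmd and collect its output, waking only when a pipe is readable.

    Hypothetical helper for illustration; not part of BorgJob.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    # Map each open pipe's file descriptor to the stream it belongs to.
    open_pipes = {proc.stdout.fileno(): "stdout", proc.stderr.fileno(): "stderr"}
    chunks = {"stdout": [], "stderr": []}

    # Loop until both pipes have reported EOF.
    while open_pipes:
        # No timeout argument: sleep until at least one pipe is readable.
        ready, _, _ = select.select(list(open_pipes), [], [])
        for fd in ready:
            data = os.read(fd, 4096)
            if data:
                chunks[open_pipes[fd]].append(data)
            else:
                # An empty read is EOF: the write end of this pipe was closed.
                del open_pipes[fd]

    returncode = proc.wait()
    proc.stdout.close()
    proc.stderr.close()
    return returncode, b"".join(chunks["stdout"]), b"".join(chunks["stderr"])


if __name__ == "__main__":
    child = "import sys; print('out'); print('err', file=sys.stderr); sys.exit(3)"
    print(drain_process([sys.executable, "-c", child]))
```

Because an exiting subprocess closes its ends of the pipes, the EOF reads double as the wake-up for process exit, so no polling interval is needed in this sketch.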

I hypothesized that this would yield at least some efficiency gain, though not necessarily fewer system calls, because there may be data to read while we are still handling the raised exception. To test this, I created the profiling/borg_job_overhead.py script, which attempts to exercise only the Borg job directly, and ran it 10 times under Linux's perf tool. Each run creates a repo and 1 archive from my local photo collection, which is approximately 80k items.

The results are detailed in the following 2 gists:

before
after

The highest task-clock measurements afterwards are similar to the lowest task-clock measurements before. Some differences are surprising: for instance, there appear to be more instructions executed and a higher wall-clock time afterwards. I believe this may be because more useful work is actually being done. Also of interest, there tended to be about half as many CPU migrations after the refactoring, which I believe is a good thing since the non-shared L1/L2 caches are more likely to stay warm.

Finally, some caveats: given that the BorgJob class calls out to the running application singleton, I'm unsure how much of the measurements here come from partially bootstrapping the UI. I am also a novice at performance engineering, so I am not an expert in interpreting these results.

How Has This Been Tested?

Screenshots (if appropriate):

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • I have read the CONTRIBUTING guide.
  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

I provide my contribution under the terms of the license of this repository and I affirm the Developer Certificate of Origin.
