-
-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Increase connection pool to allow for up to 70% speed increase on bundle install
#9087
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
Edouard-chin
wants to merge
1
commit into
ruby:master
Choose a base branch
from
Shopify:ec-connection-pool
base: master
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
+99
−8
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Edouard-chin
commented
Nov 17, 2025
|
|
||
| RSpec.describe Bundler::Fetcher::GemRemoteFetcher do | ||
| describe "Parallel download" do | ||
| it "download using multiple connections from the pool" do |
Contributor
Author
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without this patch, this test would deadlock
a1ca21a to
3110ae7
Compare
- ### TL;DR Bundler is heavily limited by the connection pool which manages a single connection. By increasing the number of connection, we can drastiscally speed up the installation process when many gems need to be downloaded and installed. ### Benchmark There are various factors that are hard to control such as compilation time and network speed but after dozens of tests I can consistently get aroud 70% speed increase when downloading and installing 472 gems, most having no native extensions (on purpose). ``` # Before bundle install 28.60s user 12.70s system 179% cpu 23.014 total # After bundle install 30.09s user 15.90s system 281% cpu 16.317 total ``` You can find on this gist how this was benchmarked and the Gemfile used https://gist.github.com/Edouard-chin/c8e39148c0cdf324dae827716fbe24a0 ### Context A while ago in ruby#869, Aaron introduced a connection pool which greatly improved Bundler speed. It was noted in the PR description that managing one connection was already good enough and it wasn't clear whether we needed more connections. Aaron also had the intuition that we may need to increase the pool for downloading gems and he was right. > We need to study how RubyGems uses connections and make a decision > based on request usage (e.g. only use one connection for many small > requests like bundler API, and maybe many connections for > downloading gems) When bundler downloads and installs gem in parallel https://github.com/ruby/rubygems/blob/4f85e02fdd89ee28852722dfed42a13c9f5c9193/bundler/lib/bundler/installer/parallel_installer.rb#L128 most threads have to wait for the only connection in the pool to be available which is not efficient. ### Solution This commit modifies the pool size for the fetcher that Bundler uses. RubyGems fetcher will continue to use a single connection. The bundler fetcher is used in 2 places. 1. When downloading gems https://github.com/ruby/rubygems/blob/4f85e02fdd89ee28852722dfed42a13c9f5c9193/bundler/lib/bundler/source/rubygems.rb#L481-L484 2. When grabing the index (not the compact index) using the `bundle install --full-index` flag. https://github.com/ruby/rubygems/blob/4f85e02fdd89ee28852722dfed42a13c9f5c9193/bundler/lib/bundler/fetcher/index.rb#L9 Having more connections in 2) is not any useful but tweaking the size based on where the fetcher is used is a bit tricky so I opted to modify it at the class level. I fiddle with the pool size and found that 5 seems to be the sweet spot at least for my environment.
3110ae7 to
6063fd9
Compare
Contributor
Author
Member
|
🙏 Thanks for detailed information. I would like to merge this until 4.0.0 final. But I'm in a situation where I don't have time to reviews. |
Contributor
Author
|
Thanks for all your work , no rush at all. Just wanted to provide this information. |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.





What was the end-user or developer problem that led to this PR?
TL;DR
Bundler is heavily limited by the connection pool which manages a single connection. By increasing the number of connection, we can drastically speed up the installation process when many gems need to be downloaded and installed.
Benchmark
There are various factors that are hard to control such as compilation time and network speed but after dozens of tests I can consistently get aroud 70% speed increase when downloading and installing 472 gems, most having no native extensions (on purpose).
You can find on this gist how this was benchmarked and the Gemfile used https://gist.github.com/Edouard-chin/c8e39148c0cdf324dae827716fbe24a0
Context
A while ago in #869, Aaron introduced a connection pool which greatly improved Bundler speed. It was noted in the PR description that managing one connection was already good enough and it wasn't clear whether we needed more connections. Aaron also had the intuition that we may need to increase the pool for downloading gems and he was right.
When bundler downloads and installs gem in parallel most threads have to wait for the only connection in the pool to be available which is not efficient.
rubygems/bundler/lib/bundler/installer/parallel_installer.rb
Lines 122 to 124 in 4f85e02
What is your fix for the problem, implemented in this PR?
This commit modifies the pool size for the fetcher that Bundler uses. RubyGems fetcher will continue to use a single connection.
The bundler fetcher is used in 2 places.
rubygems/bundler/lib/bundler/source/rubygems.rb
Lines 481 to 484 in 4f85e02
bundle install --full-indexflag.rubygems/bundler/lib/bundler/fetcher/index.rb
Line 9 in 4f85e02
Having more connections in 2) is not any useful but tweaking the size based on where the fetcher is used is a bit tricky so I opted to modify it at the class level.
I fiddle with the pool size and found that 5 seems to be the sweet spot at least for my environment.
Make sure the following tasks are checked