Skip to content

Commit 3110ae7

Browse files
committed
Increase connection pool to allow for up to 70% speed increase:
- ### TL;DR Bundler is heavily limited by the connection pool which manages a single connection. By increasing the number of connection, we can drastiscally speed up the installation process when many gems need to be downloaded and installed. ### Benchmark There are various factors that are hard to control such as compilation time and network speed but after dozens of tests I can consistently get aroud 70% speed increase when downloading and installing 472 gems, most having no native extensions (on purpose). ``` # Before bundle install 28.60s user 12.70s system 179% cpu 23.014 total # After bundle install 30.09s user 15.90s system 281% cpu 16.317 total ``` You can find on this gist how this was benchmarked and the Gemfile used https://gist.github.com/Edouard-chin/c8e39148c0cdf324dae827716fbe24a0 ### Context A while ago in #869, Aaron introduced a connection pool which greatly improved Bundler speed. It was noted in the PR description that managing one connection was already good enough and it wasn't clear whether we needed more connections. Aaron also had the intuition that we may need to increase the pool for downloading gems and he was right. > We need to study how RubyGems uses connections and make a decision > based on request usage (e.g. only use one connection for many small > requests like bundler API, and maybe many connections for > downloading gems) When bundler downloads and installs gem in parallel https://github.com/ruby/rubygems/blob/4f85e02fdd89ee28852722dfed42a13c9f5c9193/bundler/lib/bundler/installer/parallel_installer.rb#L128 most threads have to wait for the only connection in the pool to be available which is not efficient. ### Solution This commit modifies the pool size for the fetcher that Bundler uses. RubyGems fetcher will continue to use a single connection. The bundler fetcher is used in 2 places. 1. When downloading gems https://github.com/ruby/rubygems/blob/4f85e02fdd89ee28852722dfed42a13c9f5c9193/bundler/lib/bundler/source/rubygems.rb#L481-L484 2. When grabing the index (not the compact index) using the `bundle install --full-index` flag. https://github.com/ruby/rubygems/blob/4f85e02fdd89ee28852722dfed42a13c9f5c9193/bundler/lib/bundler/fetcher/index.rb#L9 Having more connections in 2) is not any useful but tweaking the size based on where the fetcher is used is a bit tricky so I opted to modify it at the class level. I fiddle with the pool size and found that 5 seems to be the sweet spot at least for my environment.
1 parent 4f85e02 commit 3110ae7

File tree

8 files changed

+95
-8
lines changed

8 files changed

+95
-8
lines changed

bundler/lib/bundler/fetcher/gem_remote_fetcher.rb

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -5,6 +5,12 @@
55
module Bundler
66
class Fetcher
77
class GemRemoteFetcher < Gem::RemoteFetcher
8+
def initialize(*)
9+
super
10+
11+
@pool_size = 5
12+
end
13+
814
def request(*args)
915
super do |req|
1016
req.delete("User-Agent") if headers["User-Agent"]
Lines changed: 56 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,56 @@
1+
# frozen_string_literal: true
2+
3+
require "rubygems/remote_fetcher"
4+
require "bundler/fetcher/gem_remote_fetcher"
5+
require_relative "../../support/artifice/helpers/artifice"
6+
require "bundler/vendored_persistent.rb"
7+
8+
RSpec.describe Bundler::Fetcher::GemRemoteFetcher do
9+
describe "Parallel download" do
10+
it "download using multiple connections from the pool" do
11+
require_relative "../../support/artifice/helpers/endpoint"
12+
concurrent_ruby_path = Dir[scoped_base_system_gem_path.join("gems/concurrent-ruby-*/lib/concurrent-ruby")].first
13+
$LOAD_PATH.unshift(concurrent_ruby_path)
14+
require "concurrent-ruby"
15+
16+
require_rack_test
17+
responses = []
18+
19+
latch1 = Concurrent::CountDownLatch.new
20+
latch2 = Concurrent::CountDownLatch.new
21+
previous_client = Gem::Request::ConnectionPools.client
22+
dummy_endpoint = Class.new(Endpoint) do
23+
get "/foo" do
24+
latch2.count_down
25+
latch1.wait
26+
27+
responses << "foo"
28+
end
29+
30+
get "/bar" do
31+
responses << "bar"
32+
33+
latch1.count_down
34+
end
35+
end
36+
37+
Artifice.activate_with(dummy_endpoint)
38+
Gem::Request::ConnectionPools.client = Gem::Net::HTTP
39+
40+
first_request = Thread.new do
41+
subject.fetch_path("https://example.org/foo")
42+
end
43+
second_request = Thread.new do
44+
latch2.wait
45+
subject.fetch_path("https://example.org/bar")
46+
end
47+
48+
[first_request, second_request].each(&:join)
49+
50+
expect(responses).to eq(["bar", "foo"])
51+
ensure
52+
Artifice.deactivate
53+
Gem::Request::ConnectionPools.client = previous_client
54+
end
55+
end
56+
end

lib/rubygems/remote_fetcher.rb

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -82,6 +82,7 @@ def initialize(proxy = nil, dns = nil, headers = {})
8282
@proxy = proxy
8383
@pools = {}
8484
@pool_lock = Thread::Mutex.new
85+
@pool_size = 1
8586
@cert_files = Gem::Request.get_cert_files
8687

8788
@headers = headers
@@ -338,7 +339,7 @@ def proxy_for(proxy, uri)
338339

339340
def pools_for(proxy)
340341
@pool_lock.synchronize do
341-
@pools[proxy] ||= Gem::Request::ConnectionPools.new proxy, @cert_files
342+
@pools[proxy] ||= Gem::Request::ConnectionPools.new proxy, @cert_files, @pool_size
342343
end
343344
end
344345
end

lib/rubygems/request/connection_pools.rb

Lines changed: 4 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -7,11 +7,12 @@ class << self
77
attr_accessor :client
88
end
99

10-
def initialize(proxy_uri, cert_files)
10+
def initialize(proxy_uri, cert_files, pool_size = 1)
1111
@proxy_uri = proxy_uri
1212
@cert_files = cert_files
1313
@pools = {}
1414
@pool_mutex = Thread::Mutex.new
15+
@pool_size = pool_size
1516
end
1617

1718
def pool_for(uri)
@@ -20,9 +21,9 @@ def pool_for(uri)
2021
@pool_mutex.synchronize do
2122
@pools[key] ||=
2223
if https? uri
23-
Gem::Request::HTTPSPool.new(http_args, @cert_files, @proxy_uri)
24+
Gem::Request::HTTPSPool.new(http_args, @cert_files, @proxy_uri, @pool_size)
2425
else
25-
Gem::Request::HTTPPool.new(http_args, @cert_files, @proxy_uri)
26+
Gem::Request::HTTPPool.new(http_args, @cert_files, @proxy_uri, @pool_size)
2627
end
2728
end
2829
end

lib/rubygems/request/http_pool.rb

Lines changed: 11 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -9,12 +9,14 @@
99
class Gem::Request::HTTPPool # :nodoc:
1010
attr_reader :cert_files, :proxy_uri
1111

12-
def initialize(http_args, cert_files, proxy_uri)
12+
def initialize(http_args, cert_files, proxy_uri, pool_size)
1313
@http_args = http_args
1414
@cert_files = cert_files
1515
@proxy_uri = proxy_uri
16-
@queue = Thread::SizedQueue.new 1
17-
@queue << nil
16+
@pool_size = pool_size
17+
18+
@queue = Thread::SizedQueue.new @pool_size
19+
setup_queue
1820
end
1921

2022
def checkout
@@ -31,7 +33,8 @@ def close_all
3133
connection.finish
3234
end
3335
end
34-
@queue.push(nil)
36+
37+
setup_queue
3538
end
3639

3740
private
@@ -44,4 +47,8 @@ def setup_connection(connection)
4447
connection.start
4548
connection
4649
end
50+
51+
def setup_queue
52+
@pool_size.times { @queue.push(nil) }
53+
end
4754
end

test/rubygems/test_gem_request_connection_pools.rb

Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -148,4 +148,16 @@ def test_thread_waits_for_connection
148148
end
149149
end.join
150150
end
151+
152+
def test_checkouts_multiple_connections_from_the_pool
153+
uri = Gem::URI.parse("http://example/some_endpoint")
154+
pools = Gem::Request::ConnectionPools.new nil, [], 2
155+
pool = pools.pool_for uri
156+
157+
pool.checkout
158+
159+
Thread.new do
160+
assert_not_nil(pool.checkout)
161+
end.join
162+
end
151163
end

tool/bundler/test_gems.rb

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -11,6 +11,7 @@
1111
gem "rb_sys"
1212
gem "fiddle"
1313
gem "rubygems-generate_index", "~> 1.1"
14+
gem "concurrent-ruby"
1415
gem "psych"
1516
gem "etc", platforms: [:ruby, :windows]
1617
gem "open3"

tool/bundler/test_gems.rb.lock

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -4,6 +4,7 @@ GEM
44
base64 (0.3.0)
55
builder (3.3.0)
66
compact_index (0.15.0)
7+
concurrent-ruby (1.3.5)
78
date (3.5.0)
89
date (3.5.0-java)
910
etc (1.4.6)
@@ -59,6 +60,7 @@ PLATFORMS
5960
DEPENDENCIES
6061
builder (~> 3.2)
6162
compact_index (~> 0.15.0)
63+
concurrent-ruby
6264
etc
6365
fiddle
6466
open3
@@ -75,6 +77,7 @@ CHECKSUMS
7577
base64 (0.3.0) sha256=27337aeabad6ffae05c265c450490628ef3ebd4b67be58257393227588f5a97b
7678
builder (3.3.0) sha256=497918d2f9dca528fdca4b88d84e4ef4387256d984b8154e9d5d3fe5a9c8835f
7779
compact_index (0.15.0) sha256=5c6c404afca8928a7d9f4dde9524f6e1610db17e675330803055db282da84a8b
80+
concurrent-ruby (1.3.5) sha256=813b3e37aca6df2a21a3b9f1d497f8cbab24a2b94cab325bffe65ee0f6cbebc6
7881
date (3.5.0) sha256=5e74fd6c04b0e65d97ad4f3bb5cb2d8efb37f386cc848f46310b4593ffc46ee5
7982
date (3.5.0-java) sha256=d6876651299185b935e1b834a353e3a1d1db054be478967e8104e30a9a8f1127
8083
etc (1.4.6) sha256=0f7e9e7842ea5e3c3bd9bc81746ebb8c65ea29e4c42a93520a0d638129c7de01

0 commit comments

Comments
 (0)