Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

This gem is not thread safe, can you put a warning in the README? #131

Open
samuelpismel opened this issue Nov 17, 2016 · 6 comments
Open

Comments

@samuelpismel
Copy link

In the source code of lib/ethon/curl.rb is clear that the code is not thread-safe.
Can you put a warning in the readme?

ethon/lib/ethon/curl.rb

Lines 50 to 85 in ab052b6

# This function is not thread safe. You must not call it when any other thread in
# the program (i.e. a thread sharing the same memory) is running. This doesn't just
# mean no other thread that is using libcurl. Because curl_global_init() calls
# functions of other libraries that are similarly thread unsafe, it could conflict with
# any other thread that uses these other libraries.
#
# @raise [ Ethon::Errors::GlobalInit ] If Curl.global_init fails.
def init
@@curl_mutex.synchronize {
if not @@initialized
raise Errors::GlobalInit.new if Curl.global_init(GLOBAL_ALL) != 0
@@initialized = true
Ethon.logger.debug("ETHON: Libcurl initialized") if Ethon.logger
end
}
end
# This function releases resources acquired by curl_global_init.
# You should call curl_global_cleanup once for each call you make to
# curl_global_init, after you are done using libcurl.
# This function is not thread safe. You must not call it when any other thread in the
# program (i.e. a thread sharing the same memory) is running. This doesn't just
# mean no other thread that is using libcurl. Because curl_global_cleanup calls functions of other
# libraries that are similarly thread unsafe, it could conflict with
# any other thread that uses these other libraries.
# See the description in libcurl of global environment requirements
# for details of how to use this function.
def cleanup
@@curl_mutex.synchronize {
if @@initialized
Curl.global_cleanup()
@@initialized = false
Ethon.logger.debug("ETHON: Libcurl cleanup") if Ethon.logger
end
}
end

# This function is not thread safe. You must not call it when any other thread in
# the program (i.e. a thread sharing the same memory) is running. This doesn't just
# mean no other thread that is using libcurl. Because curl_global_init() calls
# functions of other libraries that are similarly thread unsafe, it could conflict with
# any other thread that uses these other libraries.
#
# @raise [ Ethon::Errors::GlobalInit ] If Curl.global_init fails.
def init
  @@curl_mutex.synchronize {
    if not @@initialized
      raise Errors::GlobalInit.new if Curl.global_init(GLOBAL_ALL) != 0
      @@initialized = true
      Ethon.logger.debug("ETHON: Libcurl initialized") if Ethon.logger
    end
  }
end

# This function releases resources acquired by curl_global_init.
# You should call curl_global_cleanup once for each call you make to
# curl_global_init, after you are done using libcurl.
# This function is not thread safe. You must not call it when any other thread in the
# program (i.e. a thread sharing the same memory) is running. This doesn't just
# mean no other thread that is using libcurl. Because curl_global_cleanup calls functions of other
# libraries that are similarly thread unsafe, it could conflict with
# any other thread that uses these other libraries.
# See the description in libcurl of global environment requirements
# for details of how to use this function.
def cleanup
  @@curl_mutex.synchronize {
    if @@initialized
      Curl.global_cleanup()
      @@initialized = false
      Ethon.logger.debug("ETHON: Libcurl cleanup") if Ethon.logger
    end
  }
end
@hanshasselberg
Copy link
Member

@samuelpismel thanks for reporting. You are right. Although the code you point out is thread safe (if there is no bug) since invoking of init is synchronised. But you are still correct, since there are gotchas we cannot protect the gem user from like controlling the same hydra from different threads which can cause problems.

Would you like to make a PR with the changes to the Readme you have in mind?

@hanshasselberg
Copy link
Member

(sorry for the late answer)

@DanDobrick
Copy link

DanDobrick commented Nov 20, 2018

@samuelpismel thanks for reporting. You are right. Although the code you point out is thread safe (if there is no bug) since invoking of init is synchronised. But you are still correct, since there are gotchas we cannot protect the gem user from like controlling the same hydra from different threads which can cause problems.

Is there a list of such problems? If Typhoeus is threadsafe for basic get/post/etc usage it would make sense to mention this in any README update.

@dbackeus
Copy link

Can someone give an example of how this can cause issues in practice?

@esse
Copy link

esse commented Feb 10, 2022

@dbackeus we had an issue when running requests using puma with Concurrent::Promise - puma threads sometimes got deadlocked forever.

Ditching Typhoeus in favour of Net:HTTP fixed our issue.

@uvlad7
Copy link

uvlad7 commented Sep 4, 2024

@dbackeus @DanDobrick sorry for necroposting, I wanted to create a new issue, but found this.
As libcurl states, handles can be shared between threads, but must not be used at the same time. You might think it's not a problem in Ruby due to GVL, but Curl.easy_perform releases it, so it's possible for another thread to call it too.

Simple example:

require 'ethon'

Ethon.logger.level = :debug
curl = Ethon::Easy.new(url: 'https://example.com/?q=1')
curl2 = Ethon::Easy.new(url: 'https://example.com/?q=2')

curl.perform
Ethon.logger.debug { "Starting threads" }
2.times.flat_map do
  [
    Thread.new do
      curl.perform
    end,
    Thread.new do
      curl2.perform
    end,
  ]
end.each(&:join)

produces

D, [2024-09-04T14:23:12.294458 #2952480] DEBUG -- : ETHON: Libcurl initialized
D, [2024-09-04T14:23:12.878454 #2952480] DEBUG -- : ETHON: performed EASY effective_url=https://example.com/?q=1 response_code=200 return_code=ok total_time=0.582519
D, [2024-09-04T14:23:12.878571 #2952480] DEBUG -- : Starting threads
D, [2024-09-04T14:23:12.886400 #2952480] DEBUG -- : ETHON: performed EASY effective_url=https://example.com/?q=1 response_code=200 return_code=ok total_time=0.582519
D, [2024-09-04T14:23:12.887194 #2952480] DEBUG -- : ETHON: performed EASY effective_url=https://example.com/?q=2 response_code=0 return_code=failed_init total_time=0.00203
D, [2024-09-04T14:23:12.996166 #2952480] DEBUG -- : ETHON: performed EASY effective_url=https://example.com/?q=1 response_code=200 return_code=ok total_time=0.582519
D, [2024-09-04T14:23:13.455464 #2952480] DEBUG -- : ETHON: performed EASY effective_url=https://example.com/?q=2 response_code=0 return_code=failed_init total_time=0.00203

As you can see, Curl.init is performed in the main - and the only at the moment - thread, but the second handle still fails. Change the number of threads to 1, and everything works fine.

So, Typhoeus is most likely thread-safe for simple Typhoeus.get/Typhoeus.post/etc, when you don't reuse handles, Typhoeus::Request is also appears to be safe, but things become tricky with Hydra, as it uses curl_multi handle and Hydra.run uses
select under the hood, which releases GVL.

I wasn't able to reproduce the problem (except on_complete callbacks called only once) with curb, but it uses multi for Easy#perform, and although it releases GVL, I'm not sure if this is OK or just less likely to fail.

require 'curb'
require 'logger'

logger = Logger.new(STDOUT)
logger.level = Logger::DEBUG

logger_proc = proc do |c|
  logger.debug { "CURB: performed EASY effective_url=#{c.last_effective_url} response_code=#{c.response_code} return_code=#{Curl::Easy.error(c.last_result).first.name[/Curl::Err::Curl(.*)/, 1].downcase} total_time=#{c.total_time}" }
end

curl = Curl::Easy.new('https://example.com/?q=1')
curl.on_failure { }
# curl.on_complete(&logger_proc)
curl2 = Curl::Easy.new('https://example.com/?q=2')
curl2.on_failure { }
# curl2.on_complete(&logger_proc)

curl.perform
logger_proc[curl]
logger.debug { 'Starting threads' }
2.times.flat_map do
  [
    Thread.new do
      curl.perform
      logger_proc[curl]
    end,
    Thread.new do
      curl2.perform
      logger_proc[curl2]
    end,
  ]
end.each(&:join)
D, [2024-09-04T14:47:12.661456 #2966038] DEBUG -- : CURB: performed EASY effective_url=https://example.com/?q=1 response_code=200 return_code=ok total_time=0.512427
D, [2024-09-04T14:47:12.661530 #2966038] DEBUG -- : Starting threads
D, [2024-09-04T14:47:12.833620 #2966038] DEBUG -- : CURB: performed EASY effective_url=https://example.com/?q=1 response_code=200 return_code=ok total_time=0.171178
D, [2024-09-04T14:47:12.833737 #2966038] DEBUG -- : CURB: performed EASY effective_url=https://example.com/?q=1 response_code=200 return_code=ok total_time=0.171178
D, [2024-09-04T14:47:13.173195 #2966038] DEBUG -- : CURB: performed EASY effective_url=https://example.com/?q=2 response_code=200 return_code=ok total_time=0.509828
D, [2024-09-04T14:47:13.273281 #2966038] DEBUG -- : CURB: performed EASY effective_url=https://example.com/?q=2 response_code=200 return_code=ok total_time=0.509828

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants