Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ReusableGoroutinesPool: Fix datarace on Close #607

Merged
merged 2 commits into from
Oct 11, 2024

Conversation

julienduchesne
Copy link
Member

@julienduchesne julienduchesne commented Oct 10, 2024

Fixes this datarace in Mimir: grafana/mimir#9585, but I figured this went here. Both Go and Close functions are non-blocking, so this shouldn't deadlock

Q: Should this fail open instead? Have new Go calls start new goroutines when the pool is closed. Was that the intented behavior of the struct, or is a panic OK?

Without the mutex:

julienduchesne@triceratops dskit % go test ./concurrency -race
==================
WARNING: DATA RACE
Write at 0x00c00001e240 by goroutine 103:
  runtime.recvDirect()
      /opt/homebrew/opt/go/libexec/src/runtime/chan.go:388 +0x7c
  github.com/grafana/dskit/concurrency.(*ReusableGoroutinesPool).Close()
      /Users/julienduchesne/Repos/dskit/concurrency/worker.go:40 +0x1d4
  github.com/grafana/dskit/concurrency.TestReusableGoroutinesPool_Race()
      /Users/julienduchesne/Repos/dskit/concurrency/worker_test.go:89 +0x1c4
  testing.tRunner()
      /opt/homebrew/opt/go/libexec/src/testing/testing.go:1690 +0x184
  testing.(*T).Run.gowrap1()
      /opt/homebrew/opt/go/libexec/src/testing/testing.go:1743 +0x40

Previous read at 0x00c00001e240 by goroutine 106:
  runtime.chansend1()
      /opt/homebrew/opt/go/libexec/src/runtime/chan.go:157 +0x2c
  github.com/grafana/dskit/concurrency.(*ReusableGoroutinesPool).Go()
      /Users/julienduchesne/Repos/dskit/concurrency/worker.go:29 +0x48
  github.com/grafana/dskit/concurrency.TestReusableGoroutinesPool_Race.func1()
      /Users/julienduchesne/Repos/dskit/concurrency/worker_test.go:82 +0xfc

Goroutine 103 (running) created at:
  testing.(*T).Run()
      /opt/homebrew/opt/go/libexec/src/testing/testing.go:1743 +0x5e0
  testing.runTests.func1()
      /opt/homebrew/opt/go/libexec/src/testing/testing.go:2168 +0x80
  testing.tRunner()
      /opt/homebrew/opt/go/libexec/src/testing/testing.go:1690 +0x184
  testing.runTests()
      /opt/homebrew/opt/go/libexec/src/testing/testing.go:2166 +0x6e0
  testing.(*M).Run()
      /opt/homebrew/opt/go/libexec/src/testing/testing.go:2034 +0xb74
  go.uber.org/goleak.VerifyTestMain()
      /Users/julienduchesne/go/pkg/mod/go.uber.org/goleak@v1.2.0/testmain.go:53 +0x44
  github.com/grafana/dskit/concurrency.TestMain()
      /Users/julienduchesne/Repos/dskit/concurrency/limited_concurrency_singleflight_test.go:16 +0x130
  main.main()
      _testmain.go:85 +0x114

Goroutine 106 (running) created at:
  github.com/grafana/dskit/concurrency.TestReusableGoroutinesPool_Race()
      /Users/julienduchesne/Repos/dskit/concurrency/worker_test.go:74 +0x1ac
  testing.tRunner()
      /opt/homebrew/opt/go/libexec/src/testing/testing.go:1690 +0x184
  testing.(*T).Run.gowrap1()
      /opt/homebrew/opt/go/libexec/src/testing/testing.go:1743 +0x40
==================
--- FAIL: TestReusableGoroutinesPool_Race (0.01s)
    testing.go:1399: race detected during execution of test
FAIL
FAIL    github.com/grafana/dskit/concurrency    3.240s
FAIL

Copy link
Contributor

@seizethedave seizethedave left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As Go() is in the hot path of a bunch of services, what do you think about making the mutex an RWLock so that there's not an additional layer of serialization there? Go() can take the reader lock, Close() would take the exclusive.

@julienduchesne
Copy link
Member Author

As Go() is in the hot path of a bunch of services, what do you think about making the mutex an RWLock so that there's not an additional layer of serialization there? Go() can take the reader lock, Close() would take the exclusive.

Done. Also, I've made it spawn new goroutines if the channel is closed rather than panic. It seems like that's what @colega intended with the lib (#606 (comment))

@julienduchesne julienduchesne requested a review from a team October 11, 2024 16:38
@seizethedave seizethedave self-requested a review October 11, 2024 16:52
Copy link
Contributor

@seizethedave seizethedave left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LG2M!

@julienduchesne julienduchesne force-pushed the julienduchesne/reusable-pool branch from 0adccf4 to 2e255e5 Compare October 11, 2024 16:54
Without the mutex:
```
julienduchesne@triceratops dskit % go test ./concurrency -race
==================
WARNING: DATA RACE
Write at 0x00c00001e240 by goroutine 103:
  runtime.recvDirect()
      /opt/homebrew/opt/go/libexec/src/runtime/chan.go:388 +0x7c
  github.com/grafana/dskit/concurrency.(*ReusableGoroutinesPool).Close()
      /Users/julienduchesne/Repos/dskit/concurrency/worker.go:40 +0x1d4
  github.com/grafana/dskit/concurrency.TestReusableGoroutinesPool_Race()
      /Users/julienduchesne/Repos/dskit/concurrency/worker_test.go:89 +0x1c4
  testing.tRunner()
      /opt/homebrew/opt/go/libexec/src/testing/testing.go:1690 +0x184
  testing.(*T).Run.gowrap1()
      /opt/homebrew/opt/go/libexec/src/testing/testing.go:1743 +0x40

Previous read at 0x00c00001e240 by goroutine 106:
  runtime.chansend1()
      /opt/homebrew/opt/go/libexec/src/runtime/chan.go:157 +0x2c
  github.com/grafana/dskit/concurrency.(*ReusableGoroutinesPool).Go()
      /Users/julienduchesne/Repos/dskit/concurrency/worker.go:29 +0x48
  github.com/grafana/dskit/concurrency.TestReusableGoroutinesPool_Race.func1()
      /Users/julienduchesne/Repos/dskit/concurrency/worker_test.go:82 +0xfc

Goroutine 103 (running) created at:
  testing.(*T).Run()
      /opt/homebrew/opt/go/libexec/src/testing/testing.go:1743 +0x5e0
  testing.runTests.func1()
      /opt/homebrew/opt/go/libexec/src/testing/testing.go:2168 +0x80
  testing.tRunner()
      /opt/homebrew/opt/go/libexec/src/testing/testing.go:1690 +0x184
  testing.runTests()
      /opt/homebrew/opt/go/libexec/src/testing/testing.go:2166 +0x6e0
  testing.(*M).Run()
      /opt/homebrew/opt/go/libexec/src/testing/testing.go:2034 +0xb74
  go.uber.org/goleak.VerifyTestMain()
      /Users/julienduchesne/go/pkg/mod/go.uber.org/goleak@v1.2.0/testmain.go:53 +0x44
  github.com/grafana/dskit/concurrency.TestMain()
      /Users/julienduchesne/Repos/dskit/concurrency/limited_concurrency_singleflight_test.go:16 +0x130
  main.main()
      _testmain.go:85 +0x114

Goroutine 106 (running) created at:
  github.com/grafana/dskit/concurrency.TestReusableGoroutinesPool_Race()
      /Users/julienduchesne/Repos/dskit/concurrency/worker_test.go:74 +0x1ac
  testing.tRunner()
      /opt/homebrew/opt/go/libexec/src/testing/testing.go:1690 +0x184
  testing.(*T).Run.gowrap1()
      /opt/homebrew/opt/go/libexec/src/testing/testing.go:1743 +0x40
==================
--- FAIL: TestReusableGoroutinesPool_Race (0.01s)
    testing.go:1399: race detected during execution of test
FAIL
FAIL    github.com/grafana/dskit/concurrency    3.240s
FAIL
```
@julienduchesne julienduchesne force-pushed the julienduchesne/reusable-pool branch from 2e255e5 to 0313a72 Compare October 11, 2024 20:35
@julienduchesne julienduchesne merged commit cf9b0bc into main Oct 11, 2024
5 checks passed
@julienduchesne julienduchesne deleted the julienduchesne/reusable-pool branch October 11, 2024 21:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants