Skip to content

Comments

Copies references to data structures to avoid extra ARC traffic.#42

Merged
saeta merged 3 commits intomasterfrom
threadpool-reduce-arc-traffic
Jun 10, 2020
Merged

Copies references to data structures to avoid extra ARC traffic.#42
saeta merged 3 commits intomasterfrom
threadpool-reduce-arc-traffic

Conversation

@saeta
Copy link
Owner

@saeta saeta commented May 25, 2020

Note: this change (by itself) does not significantly reduce ARC traffic, but in concert
with UnmanagedBuffer (#41), we see a ~15x performance improvement in
the parallelFor benchmark.

Performance for NonBlockingThreadPool: parallel for, one level:

master (b5ab4b6)                                   336813.5 ns  ±  65.93 %   6526        
managed buffers (9e31153)                         5062117.0 ns  ±  16.01 %    280         
this change (1ae770d)                              210097.0 ns  ±  58.77 %   9347        
tmp branch with managed buffers + this change       22078.0 ns  ± 942.16 %   8510        

Note: the variability in timing increases substantially as performance improves. This is (I believe) due to the lower frequency of stealing.

Note: this change (by itself) does not reduce ARC traffic, but in concert
with `UnmanagedBuffer` (#41), we see a ~15x performance improvement in
the parallelFor benchmark.
@saeta saeta requested a review from dabrahams May 25, 2020 07:25
@saeta
Copy link
Owner Author

saeta commented May 25, 2020

#33

@dabrahams
Copy link
Collaborator

#33

A couple of words describing the relationship you're establishing by mentioning that issue here would be appreciated.

Copy link
Collaborator

@dabrahams dabrahams left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have to say, the phrasing of the change description was rather non-obvious to me on its own. I think I roughly understand what you're doing but it's also very non-obvious why it makes to store shared data structures in something called PerThreadState. Maybe you could at least segregate those properties or comment them to make it clear what's happening. I approve of this change as a step along the path of progress, but I think once things settle down a bit it would be a good for us to do a full walkthrough of all of this code. In the meantime, how about turning on thread sanitizer for your tests of parallel components? ;-)


func spin() -> Task? {
let spinCount = pool.threads.count > 0 ? Constants.spinCount / pool.threads.count : 0
let spinCount = totalThreadCount > 0 ? Constants.spinCount / totalThreadCount : 0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change looks suspect. Elsewhere you have replaced pool.totalThreadCount with totalThreadCount but here you're replacing pool.threads.count which is presumably a different quantity (if it isn't, I question the value of pool.totalThreadCount!)

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good eye! This should be the number of worker threads and not the totalThreadCount. Thanks!

@saeta saeta merged commit 659b82b into master Jun 10, 2020
@saeta saeta deleted the threadpool-reduce-arc-traffic branch June 10, 2020 03:59
@dabrahams
Copy link
Collaborator

@saeta You didn't address my suggestion to turn on tsan. Building parallel stuff and not testing with tsan seems really super inadvisable. At least file an issue?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants