
Conversation

@djspiewak (Member)

This shifts to a fully bespoke implementation of parTraverseN and its related combinators. There are a few things left to clean up, such as a few more tests and some comparative benchmarks, but early results are very promising. In particular, the failure case from #4434 appears to be around two to three orders of magnitude faster with this implementation (which makes sense, since it handles early abort correctly). Kudos to @SystemFw for the core idea which makes this possible.

One of the things I'm doing here is giving up entirely on universal fairness and focusing only on in-batch fairness. A simpler way of saying this is that we are hardened against head-of-line blocking, both for actions and for cancelation.

Fixes #4434
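
For readers skimming the review threads below, here is a minimal, hypothetical sketch of the shape being described: an n-permit semaphore bounds concurrency, a Deferred (called preempt here, mirroring the diff) publishes the first failure, and completion is a race between that failure and reacquiring all n permits. None of this is the PR's actual code, and the comment about completing preempt before releasing the permit anticipates the race-condition thread further down.

import cats.effect.IO
import cats.effect.std.{Semaphore, Supervisor}
import cats.syntax.all._

// Hypothetical sketch only; the names (boundedForeach, preempt) are illustrative.
def boundedForeach[A](n: Int)(as: List[A])(f: A => IO[Unit]): IO[Unit] =
  Supervisor[IO](await = false).use { supervisor =>
    Semaphore[IO](n.toLong).flatMap { sem =>
      IO.deferred[Throwable].flatMap { preempt =>
        val spawnAll = as.traverse_ { a =>
          // publish the failure *before* the permit is released, so the
          // "acquire all n permits" branch below can never miss an error
          val task = sem.permit.surround(
            f(a).handleErrorWith(e => preempt.complete(e).void))
          supervisor.supervise(task)
        }

        val awaitDone =
          IO.race(preempt.get, sem.acquireN(n.toLong)) *>
            preempt.tryGet.flatMap {
              case Some(e) => IO.raiseError(e)
              case None    => IO.unit
            }

        // leaving the Supervisor scope cancels any fiber still running, which is
        // what makes the early abort on failure cheap
        spawnAll *> awaitDone
      }
    }
  }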

@djspiewak (Member Author)

Performance is a mixed bag, though I think it's possible to do better here. It's a little bit slower than the previous implementation on the happy path, but it's several orders of magnitude faster on the error path, so I'll call that a win.

Before

[info] Benchmark                             (cpuTokens)  (size)   Mode  Cnt    Score    Error  Units
[info] ParallelBenchmark.parTraverse               10000    1000  thrpt   10  292.924 ±  1.496  ops/s
[info] ParallelBenchmark.parTraverseN              10000    1000  thrpt   10  277.978 ±  1.280  ops/s
[info] ParallelBenchmark.parTraverseNCancel        10000    1000  thrpt   10    0.006 ±  0.001  ops/s
[info] ParallelBenchmark.traverse                  10000    1000  thrpt   10   48.015 ±  0.016  ops/s

After

[info] Benchmark                             (cpuTokens)  (size)   Mode  Cnt    Score   Error  Units
[info] ParallelBenchmark.parTraverse               10000    1000  thrpt   10  293.834 ± 1.152  ops/s
[info] ParallelBenchmark.parTraverseN              10000    1000  thrpt   10  233.868 ± 0.309  ops/s
[info] ParallelBenchmark.parTraverseNCancel        10000    1000  thrpt   10    7.859 ± 0.014  ops/s
[info] ParallelBenchmark.traverse                  10000    1000  thrpt   10   48.059 ± 0.014  ops/s

@djspiewak (Member Author)

So I haven't golfed the failure down yet, but it really looks like we're hitting a bug in Scala.js, probably stemming from the "null safe" test. @durban, you may be amused.

I think we could just remove the null-safe test now, since we're not using an ArrayBuffer internally, but it's kind of a neat surprise.

@durban (Contributor) commented Jul 23, 2025

Well, "amused" is one word for it :-) So it's not a bug in Scala.js, as in, it behaves as documented: dereferencing null is undefined behavior in Scala.js (LOL, what? Seriously.), so literally any behavior is "behaving as documented". Apparently scalaJSLinkerConfig could be configured to behave properly for nulls. But removing that very specific test is also fine I think.

@djspiewak (Member Author)

Well that's fun. I actually thought we had some special checking for when the cur0 action became null in the runloop, but apparently not.

@durban (Contributor) commented Jul 23, 2025

@djspiewak (Member Author)

Ahhhhhhh that makes sense. Okay, by that token, I think it's fair to say that a lot of our combinators just aren't null-safe and that's how it's going to be. :P

@durban (Contributor) commented Aug 9, 2025

It's annoying, because on the JVM they are null-safe. (The test passed before; it just failed on JS.) We'd have to do something like this (everywhere) to make it work on JS:

def combinator(fa: F[A], ...) = {
  if (fa eq null) throw new NullPointerException
}

Which is (1) annoying, (2) very redundant, except on Scala.js, and (3) apparently has performance problems in Scala.js (or maybe that's only the linker setting?).

I don't propose we do this. There is a Scala.js linker setting which fixes the problem; on the JVM and in Scala Native it works by default.
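
For reference, the linker setting being alluded to looks roughly like the following in an sbt build. This is a from-memory sketch, not taken from this repository, and the exact names (withNullPointers, CheckedBehavior) are an assumption that should be verified against the Scala.js documentation for the version in use.

// build.sbt (sketch; verify against your Scala.js version)
import org.scalajs.linker.interface.CheckedBehavior

scalaJSLinkerConfig ~= { config =>
  config.withSemantics(_.withNullPointers(CheckedBehavior.Compliant))
}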

*/
def parTraverseN[T[_]: Traverse, A, B](n: Int)(ta: T[A])(f: A => F[B]): F[T[B]] = {
require(n >= 1, s"Concurrency limit should be at least 1, was: $n")

Contributor:
Scaladoc needs an update above.
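
For orientation, here is a hedged usage sketch of the combinator quoted above; the object name, numbers, and syntax import are illustrative and not taken from this PR. At most four of the hundred effects run concurrently.

import scala.concurrent.duration._

import cats.effect.{IO, IOApp}
import cats.effect.syntax.all._

// Illustrative only: run 100 small effects with a concurrency limit of 4.
object ParTraverseNDemo extends IOApp.Simple {
  def run: IO[Unit] =
    (1 to 100).toList
      .parTraverseN(4)(i => IO.sleep(10.millis).as(i * 2))
      .flatMap(doubled => IO.println(doubled.sum))
}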

*/
def parTraverseN_[T[_]: Foldable, A, B](n: Int)(ta: T[A])(f: A => F[B]): F[Unit] = {
require(n >= 1, s"Concurrency limit should be at least 1, was: $n")

Contributor:
Scaladoc above.

case None =>
  F.uncancelable { poll =>
    F.deferred[Outcome[F, E, B]] flatMap { result =>
      val action = poll(sem.acquire) >> f(a)
Contributor:
Is this intentionally >> and not *>? So that evaluating the pure f is also restricted by the semaphore? (In my opinion it doesn't need to be, but it's okay that it is.)

Member Author:
It's intentional. I should probably comment it as such. I think most users probably believe that even the pure part of the function is parallelized (and subject to the semaphore).
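
To make the by-name distinction concrete, here is a small, self-contained illustration (not the PR's code; the names are made up). With >>, the right-hand side is evaluated only after the left effect completes, so the non-suspended part of f runs while the permit is held; with *>, it runs immediately, while the IO value is being built.

import cats.effect.{IO, IOApp}
import cats.effect.std.Semaphore

object ByNameDemo extends IOApp.Simple {
  // the println stands in for the "pure part" of the function being traversed
  def f(a: Int): IO[Int] = {
    println(s"building the effect for $a")
    IO.pure(a + 1)
  }

  def run: IO[Unit] =
    Semaphore[IO](1).flatMap { sem =>
      val lazyRhs  = sem.acquire >> f(1) // f(1) runs only once the permit is acquired
      val eagerRhs = sem.acquire *> f(2) // f(2) runs right here, while building this value
      lazyRhs *> sem.release *> eagerRhs *> sem.release
    }
}

Running it prints the message for 2 before the message for 1, showing that only the >> version keeps the non-suspended work behind the semaphore.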

result
  .get
  .flatMap(_.embed(F.canceled *> F.never))
  .onCancel(fiber.cancel)
Contributor:
When is this onCancel necessary? Wouldn't the guaranteeCase below cancel everything under supervision anyway?

Member Author:
I think it's not necessary. I've been building this up a bit incrementally so there's some overlapping logic I need to deduplicate due to the number of corner cases this function has.

// we block until it's all done by acquiring all the permits
F.race(preempt.get *> cancelAll, sem.acquire.replicateA_(n)) *>
  // if we hit an error or self-cancelation in any effect, resurface it here
  // note that we can't lose errors here because of the permits: we know the fibers are done
Contributor:
I think there may be a race here:

  1. The very last task fails with an error, and releases its permit (sem.release above in wrapped).
  2. Acquiring all the permits here wins the F.race here (just above).
  3. Just below we preempt.tryGet, and read None, and complete with F.unit.
  4. The task completes preempt with the error (above in wrapped).
  5. (But no one will see that any more.)

Member Author:
I think this is a good point. I do this in both implementations. My thinking was that it increases parallelism somewhat (releasing the permit as early as possible), but it does open up this race condition. I'll fix it in both.

@durban (Contributor) commented Aug 9, 2025

(Just some context about the null test you've removed: I added that on my old branch because I'd previously had a bug in my implementation where it didn't correctly handle f(a) being null. I don't remember exactly, but I think it just ignored it. So I added the test to make sure we see the NPE. As I said, I think it's fine to remove it here.)

@mr-git (Contributor) commented Sep 29, 2025

Could this fix hit 3.6.4?

@djspiewak (Member Author)

> Could this fix hit 3.6.4?

There are a couple of failing tests related to early termination that I'm still trying to track down. I'm trying to find the spare time needed to push on it. Help definitely welcome! Otherwise I'll probably get to it within the next few weeks. Sorry :(

@domaspoliakas (Contributor)

I see that the last CI run is green; do you mean that you want to reintroduce the tests removed in commit 599b790?
