Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow for int64 and bigints in indexed for loops #876

Open
5 tasks done
cartermp opened this issue Jun 6, 2020 · 12 comments
Open
5 tasks done

Allow for int64 and bigints in indexed for loops #876

cartermp opened this issue Jun 6, 2020 · 12 comments

Comments

@cartermp
Copy link
Member

cartermp commented Jun 6, 2020

Title of Suggestion

I propose we allow for loops of the following form:

// int64
for y = 0L to x do ()

// bigint
for y = 0I to x do ()

The existing way of approaching this problem in F# is to use the in..do syntax:

// int64
for y in 0L..x do ()

// bigint
for y in 0I..x do ()

Though this does make some things awkward because the indexer style may be preferred in some situations.

Pros and Cons

The advantages of making this adjustment to F# are:

  • Consistency
  • No enumerator allocation and usage

The disadvantages of making this adjustment to F# are:

  • It's work

Extra information

These kinds of loops work in C#, so it's certainly possible to also allow them in F#.

Estimated cost (XS, S, M, L, XL, XXL): S-M

Related suggestions: #55

Affidavit (please submit!)

Please tick this by placing a cross in the box:

  • This is not a question (e.g. like one you might ask on stackoverflow) and I have searched stackoverflow for discussions of this issue
  • I have searched both open and closed suggestions on this site and believe this is not a duplicate
  • This is not something which has obviously "already been decided" in previous versions of F#. If you're questioning a fundamental design decision that has obviously already been taken (e.g. "Make F# untyped") then please don't submit it.

Please tick all that apply:

  • This is not a breaking change to the F# language design
  • I or my company would be willing to help implement and/or test this
@Happypig375

This comment has been minimized.

@abelbraaksma
Copy link
Member

abelbraaksma commented Jun 7, 2020

I'm sure he meant:

// bigint
for y = 0I to x do ()

The range version for in is extremely slow with this scenario as I believe it creates an enumerable (which ik sure we could detect and optimize away, but that's another suggestion), while for to creates optimized code. I'd welcome to extend the language to allow other primitives and bigint there.

@cartermp
Copy link
Member Author

cartermp commented Jun 7, 2020

Fixed

Yes, the current form allocates and uses an enumerator, which results in about a 6x performance degradation compared to a loop or tail recursive function. Note that, as is often the case in F#, some compiler trickery is played and when using ints the enumerator stuff gets optimized away. But not with longs or bigints.

@kerams
Copy link

kerams commented Nov 4, 2022

Implementation-wise I assume this would mean using TOp.IntegerForLoop for int64 and bigint (though to be fair I don't know in what world you'd ever expect to loop more than Int64.MaxValue times).

Yes, the current form allocates and uses an enumerator, which results in about a 6x performance degradation compared to a loop or tail recursive function

The allocation cost is amortized away with increasing iteration count, no? If you're going to use many short-lived for loops, you don't need support for larger ranges in the first place. Of course, each iteration is faster when you don't have to use the enumerator, so that would still be a win for tight loops with billions of iterations.

@abelbraaksma
Copy link
Member

abelbraaksma commented Nov 7, 2022

The allocation cost is amortized away with increasing iteration count, no?

Yes. Instead, It is the overhead of a non-inlineable method call that is the actual cost here. A simple inc vs MoveNext() is the big win here.

Implementation-wise, as long as the value of the iteration is not used, this could be implemented with int32 for most cases, but I’m not sure at what point the compiler knows that a loop var is unused in the following block.

Though more likely than not, a user choosing bigint is gonna do so because of downstream usability of that type. Either way, any solution based on int will be a significant improvement over the enumerator.

@kerams
Copy link

kerams commented Nov 7, 2022

as long as the value of the iteration is not used, this could be implemented with int32 for most cases

This could only work if start and finish were constants. Otherwise you can't tell at compile time if the range fits into an int32.

Though more likely than not, a user choosing bigint is gonna do so because of downstream usability of that type.

You mean in the body of the loop? You might as well use int32 in the loop and then convert it to a biginteger. Shouldn't be any slower than having bigintegers in the loop (without an enumerator, that is), incremented in the background.

@abelbraaksma
Copy link
Member

Otherwise you can't tell at compile time if the range fits into an int32.

It doesn’t have to fit. But you’re right, the extra checks and corner cases that arise are probably not worth the effort.

You might as well use int32 in the loop and then convert it to a biginteger. Shouldn't be any slower than having bigintegers in the loop

Not sure. That depends on how bigint is implemented. Unless it has changed over time (it probably has), conversion was more expensive than (partial) mutation. To find out, we’d probably need to test.

But thinking again about that approach, I wonder if it’s worth the effort. If a programmer wants speed over convenience, they’d like use some way of double nested int32 loop anyway.

@brianrourkeboll
Copy link

brianrourkeboll commented Feb 2, 2025

One option would be to treat SynExpr.For as SynExpr.ForEach over a range internally. The optimizations from dotnet/fsharp#16650, etc., would then automatically kick in for all primitive integral times.

I have tried this out locally and it does indeed work.

A couple of callouts:

  • We did not optimize System.Numerics.BigInteger ranges in Better integral range lowering: start..finish, start..step..finish dotnet/fsharp#16650, etc. Treating SynExpr.For as SynExpr.ForEach internally would let for n = 1I to 10I do … compile, but it would still use the same enumerable approach as the for-each version. We could add that optimization separately and ideally apply it to both for and for-each loops.
  • With this approach, downto would not work for unsigned types, just as, e.g., for n in 10uy .. -1uy .. 1uy do … is not valid. We could of course translate downto as subtracting $1$ instead of adding $-1$, but that would require new codegen and would differ from the behavior of ranges. What do we want to do here?

@T-Gro
Copy link

T-Gro commented Feb 3, 2025

The start..finish optimization applied to bigint would resolve the use case by making for y in 0I..x do () efficient.

I am not sure I like universally treating For like ForEach for the unexpected impact on un-optimized code, and for possible unforeseen impact this might have on tooling.

@brianrourkeboll
Copy link

The start..finish optimization applied to bigint would resolve the use case by making for y in 0I..x do () efficient.

Right, I guess it depends whether this suggestion was really more about (1) applying the optimizations that existed at the time for for-loops but not for for-each loops to more types, or (2) allowing more types to use the for-loop syntax.

I would say that even though we have solved (1) for int64, (2) is still a good idea, because right now for-loops only work with int32—except in computation expressions, where they can work with any type.

The start..finish optimization applied to bigint would resolve the use case by making for y in 0I..x do () efficient.

I am not sure I like universally treating For like ForEach for the unexpected impact on un-optimized code

I don't think there would be any performance impact on any existing code, since for-loops currently only allow int32 anyway.

and for possible unforeseen impact this might have on tooling.

We would need to default to int32 in the absence of other type information for backwards compatibility in scenarios like

let f x y = for n = x to y do

I can't think of any other interactions this would have with tooling, though, since, again, the same syntax is already supported in computation expressions with any type.

The only question I would have would be what error message to show if we disallowed unsigned types with downto (which would be necessary if we reused the for-each over range optimizations).

@brianrourkeboll
Copy link

@T-Gro This is what I meant: dotnet/fsharp#18301.

@T-Gro
Copy link

T-Gro commented Feb 10, 2025

Ok that makes sense - you still try tryTcStartAndFinishAsInt32 as before, and only fallback to foreach to newly supported types. Got it.
(My initial fear was from converting everything to foreach, incl. int32 etc.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants