@JamesWrigley
Member

On 1.12 this prevents the majority of invalidations from ChunkSplitters (dependency of OhMyThreads), going from:

1-element Vector{SnoopCompile.MethodInvalidations}:
 inserting enumerate(c::ChunkSplitters.Internals.AbstractChunks) @ ChunkSplitters.Internals ~/.julia/packages/ChunkSplitters/SGtAq/src/internals.jl:123 invalidated:
   backedges: 1: superseding enumerate(iter) @ Base.Iterators iterators.jl:191 with MethodInstance for enumerate(::Any) (242 children)

To:

1-element Vector{SnoopCompile.MethodInvalidations}:
 inserting enumerate(c::ChunkSplitters.Internals.AbstractChunks) @ ChunkSplitters.Internals /path/to/julia-depot/packages/ChunkSplitters/SGtAq/src/internals.jl:123 invalidated:
   backedges: 1: superseding enumerate(iter) @ Base.Iterators iterators.jl:191 with MethodInstance for enumerate(::Any) (3 children)

Cthulhu showed that most invalidations were coming from this method in TOML.jl:

print_inline_table(f::Union{Nothing, Function}, io::IO, value::AbstractDict, sorted::Bool) @ TOML.Internals.Printer ~/.julia/juliaup/julia-1.12.1+0.x64.linux.gnu/share/julia/stdlib/v1.12/TOML/src/print.jl:120
120 function print_inline_table(f::Function::MbyFunc, io::IOBuffer::IO, value::AbstractDict::AbstractDict, sorted::Bool::Bool)::Core.Const(nothing)
121     vkeys::Any = collect(keys(value::AbstractDict)::Any)::Any
122     if sorted
123         sort!(vkeys::Any)
124     end
125     Base.print::Core.Const(print)(io, "{")
126     for (i::Int64, k::Any) in enumerate(vkeys::Any)::Union{ChunkSplitters.Internals.Enumerate, Base.Iterators.Enumerate}::Union{Nothing, Tuple{Tuple{Int64, Any}, Int64}, Tuple{Tuple{Int64, Any}, Tuple{Int64, Any}}}
127         v::Any = value::AbstractDict[k::Any]::Any
128         (i::Int64 != 1)::Bool && Base.print::Core.Const(print)(io, ", ")
129         printkey(io::IOBuffer, [String(k::Any)::Any]::Vector)
130         Base.print::Core.Const(print)(io, " = ")
131         printvalue(f::Function, io::IOBuffer, v::Any, sorted::Bool)
132     end
133     Base.print::Core.Const(print)(io, "}")
134 end

vkeys is inferred as Any, so the call on line 126 is compiled as enumerate(::Any). That compiled code is invalidated when ChunkSplitters defines its own Base.enumerate method: https://github.com/JuliaFolds2/ChunkSplitters.jl/blob/715991816504a40cbbe21cf057f29f1b82a7c349/src/internals.jl#L123
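
For illustration only, here is a minimal sketch of the situation described above, using only Base. `Holder`, `first_pair` and `MyChunks` are hypothetical names that stand in for the untyped `vkeys`, the TOML printing loop and the ChunkSplitters type; they are not taken from either package. The inferred types that get printed may differ between Julia versions, so no expected output is given.

```julia
# Hypothetical container whose field is typed `Any`, mimicking `vkeys::Any`.
struct Holder
    items::Any
end

# The compiler only knows `h.items::Any`, so this compiles a call to `enumerate(::Any)`.
first_pair(h::Holder) = first(enumerate(h.items))

# What inference concludes while Base's generic method is the only applicable one.
@show Base.return_types(first_pair, (Holder,))

# Hypothetical package type that adds its own `enumerate` method.
struct MyChunks
    n::Int
end
Base.enumerate(c::MyChunks) = ((i, i) for i in 1:c.n)

# The new method also applies to `enumerate(::Any)`, so code compiled under the
# single-method assumption has to be re-inferred.
@show Base.return_types(first_pair, (Holder,))
```

Both methods remain usable through the same call site: `first_pair(Holder(["a", "b"]))` goes through Base's method, while `first_pair(Holder(MyChunks(2)))` goes through the new one.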

It could be fixed by asserting vkeys::Vector, but I figured it's better to type-assert collect(itr) itself, since that's guaranteed to return a Vector anyway.

@jakobnissen
Member

I don't think it's guaranteed to return a Vector - if the iterator has a shape, it can return another Array type.

@JamesWrigley
Member Author

Oops, yes you're quite right:

julia> collect(Iterators.product(1:3, 1:4))
3×4 Matrix{Tuple{Int64, Int64}}:
 (1, 1)  (1, 2)  (1, 3)  (1, 4)
 (2, 1)  (2, 2)  (2, 3)  (2, 4)
 (3, 1)  (3, 2)  (3, 3)  (3, 4)

Changed it to Array in bdba131.
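
A small sketch of the distinction, using only Base: a shaped iterator collects to a `Matrix`, while the keys of a `Dict` collect to a `Vector`. The variable names are illustrative.

```julia
# A shaped iterator: `collect` preserves the 3×4 shape.
shaped = collect(Iterators.product(1:3, 1:4))

# An iterator with a length but no shape: `collect` returns a 1-d array.
flat = collect(keys(Dict("a" => 1, "b" => 2)))

@show shaped isa Vector   # false: it is a Matrix
@show shaped isa Array    # true
@show flat isa Vector     # true

# `vec` gives a 1-d view of the same data when a Vector is really needed.
@show vec(shaped) isa Vector   # true
```

So an assertion of `::Vector` on `collect(itr)` in general would throw for the first case, whereas `::Array` holds for both.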

@LilithHafner
Member

This would be very nice. Then I'd finally be able to turn collections into Arrays. I think many folks would appreciate this. However, it is a wee bit breaking. Let's see how breaking it is:

@nanosoldier runtests()

@JamesWrigley
Member Author

Hmm, doesn't the docstring already guarantee that it returns an Array?

collect(iterator)

Return an Array of all items in a collection or iterator.

@jakobnissen
Member

jakobnissen commented Oct 23, 2025

See Jeff's comment here: #50051

I strongly disagree with him - more abstraction and looser contracts don't always make functions better - they make them harder to get correct and to make performant.
See also the discussion in #47777, which suggests that collect was originally designed to just return a Vector, but then got retconned.

The issues with collect go deeper, though. Famously, it's one of the functions in Base that explicitly relies on type inference, and whose return type depends on inference. Julia's policy on whether that is allowed is quite vague, since on one hand, the core devs explicitly treat inference as an implementation detail, but on the other hand, the use of inference for return types makes some inference changes (such as the proposed reduction of world splitting) obviously breaking. It feels like one of those areas that never really got designed; Julia just evolved to this point.
So one might ask: if type inference can change the type from Vector{Any} to Vector{Foo}, can it also change the type from Vector{Foo} to OtherVector{Foo}?

Also see: https://discourse.julialang.org/t/new-package-collects-jl-meant-to-improve-upon-and-generalize-the-interface-of-collect/129468

@JamesWrigley
Member Author

Looks like this is a breaking change so we can't do it, at least not in a minor release. Example from the tests:

julia> using OffsetArrays

julia> x = zeros(5);

julia> a = OffsetArray(x, ntuple(Returns(-1), ndims(x)));

julia> collect(Broadcast.instantiate(Broadcast.broadcasted(+, a, 5)))
5-element OffsetArray(::Vector{Float64}, 0:4) with eltype Float64 with indices 0:4:
 5.0
 5.0
 5.0
 5.0
 5.0

Asserting it to be an AbstractArray would also solve the invalidations. Hope there aren't any edge-cases with that 😅 Alternatively, if folks think it's safer I could move the assertion to TOML.jl instead?

@jakobnissen
Member

Better to move it to the caller, IMO, since the caller has more context (e.g. in TOML, I'd expect it really is always a Vector). I'm also skeptical that we, the Julia community, should cover the Julia codebase in ::ReturnType by playing invalidation whack-a-mole until the end of time. This problem needs to be solved for real (by disabling world splitting).

@JamesWrigley changed the title from "Type-assert the return type of collect(itr)" to "Type-assert the return type of collect(...) in TOML" on Oct 23, 2025
@JamesWrigley
Member Author

Fair enough, moved it in 58498f9.

> I'm also skeptical that we, the Julia community, should cover the Julia codebase in ::ReturnType by playing invalidation whack-a-mole until the end of time. This problem needs to be solved for real (by disabling world splitting).

Agreed, see also #59888 (comment). I personally don't intend to try fixing all invalidations, just the most egregious cases in packages I use to help TTFX until it's fixed properly.
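
As a rough sketch of what a caller-side assertion can look like (not the actual change in 58498f9): `gather_keys` is a hypothetical helper, and asserting `Vector` rather than the looser `AbstractVector` is an assumption made here for illustration.

```julia
# Hypothetical stand-in for the key-gathering step of a printing routine;
# this is not the TOML.jl source.
function gather_keys(value::AbstractDict, sorted::Bool)
    # `collect(keys(value))` is inferred abstractly for an arbitrary AbstractDict,
    # so the assertion narrows the type for everything that follows.
    vkeys = collect(keys(value))::Vector
    sorted && sort!(vkeys)
    return vkeys
end

# Inspect the inferred return type; the exact result may vary by Julia version.
@show Base.return_types(gather_keys, (AbstractDict, Bool))

@show gather_keys(Dict("b" => 1, "a" => 2), true)
```

With the assertion in place, a later loop over `enumerate(vkeys)` is compiled against a known array type instead of `enumerate(::Any)`. The trade-off is that an `AbstractDict` whose keys do not collect to a `Vector` would throw a `TypeError` at the assertion.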

@JamesWrigley
Member Author

I believe the test failure is unrelated.

@KristofferC added the "backport 1.12" label (Change should be backported to release-1.12) on Oct 23, 2025
@JeffBezanson
Member

Can we assume that collect(keys(::AbstractDict)) will always return AbstractVector or even Vector?

@JamesWrigley
Member Author

Realistically that's probably always the case, but I don't see anything in the docs that guarantees it.
