Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add sort for NTuples #54494

Merged
merged 20 commits into from
Dec 8, 2024
Merged

Add sort for NTuples #54494

merged 20 commits into from
Dec 8, 2024

Conversation

LilithHafner
Copy link
Member

Fixes #54489

This is partially a reland of #46104, but without the controversial sort(x) = sort!(copymutable(x)) and with some extensibility improvements.

…ble and DefaultUnstable dispatchable, and add tests
@LilithHafner LilithHafner added sorting Put things in order re-land This relands a PR that was previously merged but was later reverted. labels May 16, 2024
@LilithHafner LilithHafner changed the title Add sort for NTuples, make it extensible by packages, make DefaultStable and DefaultUnstable dispatchable, and add tests Add sort for NTuples, make it extensible by packages, and make DefaultStable and DefaultUnstable dispatchable May 16, 2024
base/sort.jl Outdated Show resolved Hide resolved
base/sort.jl Outdated Show resolved Hide resolved
base/sort.jl Outdated Show resolved Hide resolved
Co-authored-by: Lars Göttgens <lars.goettgens@rwth-aachen.de>
@KristofferC
Copy link
Member

What does the "make it extensible by packages" part mean?

@LilithHafner
Copy link
Member Author

One can specialize on the alg keyword argument and on the eltype of the NTuple with this implementation.

With the last implementation one could only specialize on eltype of the NTuple.

@nsajko nsajko added feature Indicates new feature / enhancement requests needs news A NEWS entry is required for this change labels Nov 3, 2024
nsajko

This comment was marked as outdated.

@LilithHafner LilithHafner removed the needs news A NEWS entry is required for this change label Nov 3, 2024
Copy link
Contributor

@nsajko nsajko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remaining issues:

  • Most of the PR is not really related to adding support for sorting tuples. It'd be nice if the refactoring of Sort, user extensibility changes, and the feature addition were separated into multiple PRs.
  • Instead of hardcoding a cutoff (9), use Base.Any32.
  • The PR is incomplete without support for heterogeneous tuples, and can't be said to fix Feature request: Non allocating sort for tuple #54489. I understand that the restriction to homogeneous tuples is backed by a triage decision, but IMO this is an unnatural, unprecedented and arbitrary limitation. Tuple is a heterogeneous collection type, and I sometimes need to sort heterogeneous tuples. Yeah, that's not a type stable operation in general, so what? That's just the nature of tuples.

Alternative PR here: #56425

@LilithHafner
Copy link
Member Author

  • Most of the PR is not really related to adding support for sorting tuples. It'd be nice if the refactoring of Sort, user extensibility changes, and the feature addition were separated into multiple PRs.

I don't want to add a new, non-extensible feature. If it can't be extended by users, the feature is incomplete. Implementing a non-extensible sorting system and then retrofitting extensibility has been a significant problem in the past that I don't want to recreate.

  • Instead of hardcoding a cutoff (9), use Base.Any32.

I benchmarked the best place to put the cutoff and a good value is 9; 32 is way too high. See the comment linked to in the source code next to the choice of cutoff for the exact benchmarks I ran to determine that 9 is a good choice. Using 32 would result in multiple seconds of compile time.

That issue requests "Would it makes sense to add fast NTuple sort to Base?". This PR adds fast NTuple sort to Base.

  • I understand that the restriction to homogeneous tuples is backed by a triage decision, but IMO this is an unnatural, unprecedented and arbitrary limitation. Tuple is a heterogeneous collection type, and I sometimes need to sort heterogeneous tuples. Yeah, that's not a type stable operation in general, so what? That's just the nature of tuples.

It wouldn't be breaking to add non-NTuple sort to Base later. It would be easy to retrofit this PR to support non-homogeneous tuples if there were motivating usecases that come up, but I'm hesitant to block this feature on a request for an additional feature. When have you needed to sort heterogeneous tuples?

@mcabbott mcabbott mentioned this pull request Nov 8, 2024
@LilithHafner LilithHafner added the triage This should be discussed on a triage call label Nov 9, 2024
@oscardssmith
Copy link
Member

from triage:

Most of the PR is not really related to adding support for sorting tuples. It'd be nice if the refactoring of Sort, user extensibility changes, and the feature addition were separated into multiple PRs.

Triage agrees this should be split into two PRs

Instead of hardcoding a cutoff (9), use Base.Any32.

the compile time implications aren't great for this, but either way this isn't a triage decision.

The PR is incomplete without support for heterogeneous tuples

That sounds like a followup PR, but it's fine to merge homogenous tuple support separately

@LilithHafner LilithHafner removed the triage This should be discussed on a triage call label Nov 21, 2024
LilithHafner added a commit that referenced this pull request Nov 26, 2024
…ut internals (#56661)

Previously, `DEFAULT_STABLE` was a giant chain of algorithms reflecting
the full sorting dispatch system. Now, it's simply `DefaultStable()`.
This has a few minor impacts:

Previously, the public binding `Base.Sort.DEFAULT_STABLE` documented
non-public implementation details of sorting dispatch in its extended
help with a caviat that they are internal. Now,
`Base.Sort.DEFAULT_STABLE` refers to the non-public binding
`Base.Sort.DefaultStable` and implementation details are documented
there with a warning that they are non-public.

Previously, dispatching on `Base.Sort.DEFAULT_STABLE` required writing
`::typeof(Base.Sort.DEFAULT_STABLE)` whereas now one could alternatively
dispatch on the (internal) type `Base.Sort.DefaultStable`.

Previously `Base.Sort.DEFAULT_STABLE === Base.Sort.DEFAULT_UNSTABLE` so
when writing sorting algorithms for custom collections it was impossible
to determine if the user asked for a stable algorithm. Now
`DEFAULT_STABLE` is `DefaultStable()` and `DEFAULT_UNSTABLE` is
`DefaultUnstable()`. Both the algorithms expand to the same large chain
of algorithms `_DEFAULT_ALGORITHMS_FOR_VECTORS` but it is possible to
intercept them before that happens.

`Base.Sort.DEFAULT_STABLE` now prints as `DefaultStable()` instead of

```julia-repl
julia> Base.Sort.DEFAULT_STABLE
Base.Sort.SubArrayOptimization(
    Base.Sort.MissingOptimization(
        Base.Sort.BoolOptimization(
            Base.Sort.Small{10}(
                Base.Sort.InsertionSortAlg(),
                Base.Sort.IEEEFloatOptimization(
                    Base.Sort.IsUIntMappable(
                        Base.Sort.Small{40}(
                            Base.Sort.InsertionSortAlg(),
                            Base.Sort.CheckSorted(
                                Base.Sort.ComputeExtrema(
                                    Base.Sort.ConsiderCountingSort(
                                        Base.Sort.CountingSort(),
                                        Base.Sort.ConsiderRadixSort(
                                            Base.Sort.RadixSort(),
                                            Base.Sort.Small{80}(
                                                Base.Sort.InsertionSortAlg(),
                                                Base.Sort.ScratchQuickSort(missing, missing,
                                                    Base.Sort.InsertionSortAlg()))))))),
                        Base.Sort.StableCheckSorted(
                            Base.Sort.ScratchQuickSort(missing, missing,
                                Base.Sort.InsertionSortAlg()))))))))
```

Factored out of #54494 at Triage's request (the git history reflects
this history).

---------

Co-authored-by: Lars Göttgens <lars.goettgens@rwth-aachen.de>
@LilithHafner LilithHafner changed the title Add sort for NTuples, make it extensible by packages, and make DefaultStable and DefaultUnstable dispatchable Add sort for NTuples Nov 26, 2024
@LilithHafner
Copy link
Member Author

IMO this is ready to merge.

base/sort.jl Outdated Show resolved Hide resolved
base/sort.jl Show resolved Hide resolved
base/sort.jl Show resolved Hide resolved
@LilithHafner LilithHafner requested a review from nsajko December 8, 2024 14:16
Copy link
Contributor

@nsajko nsajko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the triage decision, I think this should be merged.

NB: if there's really no desire for sorting heterogeneous tuples, and assuming in the future we'll be able to count on escape analysis to allow stack allocation of small Memory values, it could make sense to use something like the code here, avoiding recursion. That's probably a future PR.

@LilithHafner LilithHafner merged commit e0656ac into master Dec 8, 2024
5 of 7 checks passed
@LilithHafner LilithHafner deleted the lh/sort-tuple branch December 8, 2024 21:48
end
# Folks who want to hack internals can define a new _sort(x::NTuple, ::TheirAlg, o::Ordering)
# or _sort(x::NTuple{N, TheirType}, ::DefaultStable, o::Ordering) where N
function _sort(x::NTuple, a::Union{DefaultStable, DefaultUnstable}, o::Ordering, kw)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this should be public if you want people to overload it

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"hack internals" sounds like it shouldn't be public?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lilith mentioned earlier that a motivation for the design was allowing people to dispatch on the eltype and algorithm. I guess they changed their mind given the comment, but I still think it'd be nice if things like this were made to be dispatchable API when possible. It's a very annoying downside of kwargs in public APIs.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The _sort! method is designed to be extensible and will eventually be made public (maybe renamed to sort_impl! or something). However, as a complicated API, I wanted to give it a couple of years of usage within Base and packages that choose to hack internals before making it public. IMO it's better to have good features a while from now than have to choose between sub-optimal API and breaking changes.

stevengj pushed a commit that referenced this pull request Jan 2, 2025
This is partially a reland of #46104, but without the controversial `sort(x) = sort!(copymutable(x))` and with some extensibility improvements. Implements #54489.
@@ -1482,8 +1482,9 @@ InitialOptimizations(next) = SubArrayOptimization(
`DefaultStable` is an algorithm which indicates that a fast, general purpose sorting
algorithm should be used, but does not specify exactly which algorithm.

Currently, it is composed of two parts: the [`InitialOptimizations`](@ref) and a hybrid of
Radix, Insertion, Counting, Quick sorts.
Currently, when sorting short NTuples, this is an unrolled mergesort, and otherwise it is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there be a compat notice indicating what versions of Julia support NTuple sorting?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, though it should live in ?sort, not ?Base.Sort.DEFAULT_STABLE

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature Indicates new feature / enhancement requests re-land This relands a PR that was previously merged but was later reverted. sorting Put things in order
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Feature request: Non allocating sort for tuple
7 participants