From d2b2efa5f0983dedffdfa3143ffd721a208449ba Mon Sep 17 00:00:00 2001 From: Shreyas Khandekar <60454060+ShreyasKhandekar@users.noreply.github.com> Date: Tue, 17 Sep 2024 17:33:10 -0700 Subject: [PATCH 1/3] Update Sort module documentation Update the sort module documentation based on the work done in the sort module stabilization subteam. On a high level - Added a short intro with a few examples of using `proc sort`; before this change it was a bit weird to just see Comparators as the first thing in the sort module docs. - Edited the comparators section to mention the new prescribed way to create and use custom comparators using the newly created interfaces. - Went over the rest of the doc to update other things that have changed like argument names This includes work done in the following PRs: chapel-lang/chapel#25863 chapel-lang/chapel#25852 chapel-lang/chapel#25821 chapel-lang/chapel#25817 chapel-lang/chapel#25813 chapel-lang/chapel#25807 chapel-lang/chapel#25705 chapel-lang/chapel#25703 chapel-lang/chapel#25699 chapel-lang/chapel#25698 chapel-lang/chapel#25586 Signed-off-by: Shreyas Khandekar <60454060+ShreyasKhandekar@users.noreply.github.com> --- modules/standard/List.chpl | 6 ++ modules/standard/Sort.chpl | 172 +++++++++++++++++++++++++++---------- 2 files changed, 135 insertions(+), 43 deletions(-) diff --git a/modules/standard/List.chpl b/modules/standard/List.chpl index 29ff0f66c076..a2bce3ee6bd6 100644 --- a/modules/standard/List.chpl +++ b/modules/standard/List.chpl @@ -40,6 +40,12 @@ - clear - sort + .. warning:: + + :proc:`list.sort` is deprecated - please use the + :proc:`sort(x: list)` procedure from the + :mod:`Sort` module instead + Additionally, all references to list elements are invalidated when the list is deinitialized.
diff --git a/modules/standard/Sort.chpl b/modules/standard/Sort.chpl index 480100d2486b..1092a234fd96 100644 --- a/modules/standard/Sort.chpl +++ b/modules/standard/Sort.chpl @@ -24,7 +24,45 @@ // TODO -- performance test sort routines and optimize (see other TODO's) /* -Supports standard algorithms for sorting data. +The sort module provides functions to sort arrays and lists. It is designed to +be flexible and efficient, allowing the user to define custom comparators to +sort any data type, as long as the comparator implements the appropriate +sorting interface. + +The simples way to sort an array is to call the :proc:`sort` function on the +array. The sort function will use a default comparator to sort the array in +ascending order. + +.. code-block:: chapel + + var Array = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]; + + sort(Array); + + // This will output: 1, 1, 2, 3, 3, 4, 5, 5, 5, 6, 9 + writeln(Array); + + +The sort function can also accept a region argument to sort a subset of an +array. This is offered as an optimization over using an array slice which may +have performance overhead. + + +.. code-block:: chapel + + var Array = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]; + + // Sort only the elements in the range 1..5 + // Same as sort(Array[1..5]); + sort(Array, region=1..5); + + // This will output: 3, 1, 1, 4, 5, 9, 2, 6, 5, 3, 5 + writeln(Array); + + +The sort function can also be called on a list, be stable or unstable, and +accept a custom comparator. +See the :proc:`sort(x: list)` function for details. .. _comparators: @@ -32,37 +70,57 @@ Comparators ----------- Comparators allow sorting data by a mechanism other than the -default comparison operations between array elements.
+ +The :proc:`sort` function can accept a comparator argument, which defines how +the data is sorted. If no comparator is passed, the default comparator is +used. -Comparators need to include at least one of the following methods: +Reverse sorting is handled by the :record:`ReverseComparator`. +See :ref:`Reverse Comparator` for details. - * ``key(a)`` -- see `The .key method`_ - * ``compare(a, b)`` -- see `The .compare method`_ - * ``keyPart(a, i)`` -- see `The .keyPart method`_ -See the section below for discussion of each of these methods. +To use a custom comparator, define a record or a class which implements the +appropriate sorting interface. -A comparator can contain both ``compare`` and ``keyPart`` methods. In that -event, the sort algorithm will use whichever is appropriate for the algorithm -and expect that they have consistent results. +Comparators need to implement one, and only one, of the following interfaces +as well as at least one of their associated methods: + + * :interface:`keyComparator` -- see `The keyComparator interface`_ + * :interface:`relativeComparator` -- see `The relativeComparator interface`_ + * :interface:`keyPartComparator` -- see `The keyPartComparator interface`_ + +See the section below for discussion of each of these interfaces and methods. + +*Future:* + + Provide a unified ``sortComparator`` interface, which can represent an + exclusive or (XOR) of the three interfaces above. + + +The keyComparator interface +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The ``keyComparator`` interface is used to sort data by a key value. Records +implementing this interface must define a ``key`` method. It is an error for a comparator to contain a ``key`` method as well as one of -the other methods. +the other methods that are part of the ``relativeComparator`` or +``keyPartComparator`` interfaces. 
+ The .key method -~~~~~~~~~~~~~~~ +*************** -The ``key(a)`` method accepts 1 argument, which will be an element from the +The ``key(elt)`` method accepts 1 argument, which will be an element from the array being sorted. The default key method would look like this: .. code-block:: chapel - proc DefaultComparator.key(a) { - return a; + proc DefaultComparator.key(elt) { + return elt; } @@ -73,11 +131,11 @@ elements, the user can define a comparator with a key method as follows: var Array = [-1, -4, 2, 3]; - // Empty record serves as comparator - record Comparator { } + // Empty record serves as comparator, implements the keyComparator interface + record Comparator : keyComparator { } // key method maps an element to the value to be used for comparison - proc Comparator.key(a) { return abs(a); } + proc Comparator.key(elt) { return abs(elt); } var absComparator: Comparator; @@ -86,56 +144,63 @@ elements, the user can define a comparator with a key method as follows: // This will output: -1, 2, 3, -4 writeln(Array); -The return type of ``key(a)`` must support the ``<`` +The return type of ``key(elt)`` must support the ``<`` operator, which is used by the base compare method of all sort routines. If the ``<`` operator is not defined for the return type, the user may define it themselves like so: .. code-block:: chapel - operator <(a: returnType, b: returnType): bool { + operator <(x: returnType, y: returnType): bool { ... } +The relativeComparator interface +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The ``relativeComparator`` interface is used to sort data by comparing two +elements directly. Records implementing this interface must define a +``compare`` method. + The .compare method -~~~~~~~~~~~~~~~~~~~ +******************* -The ``compare(a, b)`` method accepts 2 arguments, which will be 2 elements from +The ``compare(x, y)`` method accepts 2 arguments, which will be 2 elements from the array being sorted. 
The return value should be a numeric signed type -indicating how a and b compare to each other. The conditions between ``a`` and -``b`` should result in the following return values for ``compare(a, b)``: +indicating how x and y compare to each other. The conditions between ``x`` and +``y`` should result in the following return values for ``compare(x, y)``: ============ ========== Return Value Condition ============ ========== - ``> 0`` ``a > b`` - ``0`` ``a == b`` - ``< 0`` ``a < b`` + ``> 0`` ``x > y`` + ``0`` ``x == y`` + ``< 0`` ``x < y`` ============ ========== The default compare method for a signed integral type can look like this: .. code-block:: chapel - proc DefaultComparator.compare(a, b) { - return a - b; + proc DefaultComparator.compare(x, y) { + return x - y; } The absolute value comparison example from above can alternatively be -implemented with a compare method: +implemented with a ``relativeComparator`` as follows: .. code-block:: chapel var Array = [-1, -4, 2, 3]; // Empty record serves as comparator - record Comparator { } + record Comparator : relativeComparator { } // compare method defines how 2 elements are compared - proc Comparator.compare(a, b) { - return abs(a) - abs(b); + proc Comparator.compare(x, y) { + return abs(x) - abs(y); } var absComparator: Comparator; @@ -145,8 +210,20 @@ implemented with a compare method: // This will output: -1, 2, 3, -4 writeln(Array); + +The keyPartComparator interface +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The keyPartComparator interface defines how a comparator should sort parts of +a key using the ``keyPart`` method. This is used for certain sort algorithms. +Records implementing this interface must define a ``keyPart`` method. + +A comparator implementing this interface can contain both ``compare`` and +``keyPart`` methods. In that event, the sort algorithm will use whichever is +appropriate for the algorithm and expect that they have consistent results. 
+ The .keyPart method -~~~~~~~~~~~~~~~~~~~ +******************* A ``keyPart(elt, i)`` method returns *parts* of key value at a time. This interface supports radix sorting for variable length data types, such as @@ -220,19 +297,23 @@ To reverse the sort order of a user-defined comparator, pass the user-defined comparator to the initializer of the module-defined :record:`ReverseComparator` record, which can be passed to the sort function. +For this example, we will reverse the absolute value comparison from above +using the ``relativeComparator`` interface, although the same can be done with +the ``keyComparator`` interface. + .. code-block:: chapel var Array = [-1, -4, 2, 3]; // Empty record serves as comparator - record Comparator { } + record Comparator : relativeComparator{ } // compare method defines how 2 elements are compared - proc Comparator.compare(a, b) { - return abs(a) - abs(b); + proc Comparator.compare(x, y) { + return abs(x) - abs(y); // ascending order } - var absReverseComparator: ReverseComparator(Comparator); + var absReverseComparator: ReverseComparator(Comparator); // reverse order sort(Array, comparator=absReverseComparator); @@ -240,6 +321,7 @@ comparator to the initializer of the module-defined writeln(Array); */ + module Sort { private use List; @@ -637,7 +719,8 @@ The choice of sorting algorithm used is made by the implementation. .. note:: This function currently either uses a parallel radix sort or a parallel - improved quick sort. The algorithms used will change over time. + improved quick sort. For stable sort, it uses timsort. + The algorithms used will change over time. It currently uses parallel radix sort if the following conditions are met: @@ -683,7 +766,7 @@ proc sort(ref x: [], comparator:? = new DefaultComparator(), Sort the elements in the list ``x``. After the call, ``x`` will store elements in sorted order. -See the :proc:`sort` declared just above for details. +See :proc:`sort` declared above for details. .. 
warning:: @@ -765,7 +848,8 @@ The choice of sorting algorithm used is made by the implementation. .. note:: This function currently either uses a parallel radix sort or a parallel - improved quick sort. The algorithms used will change over time. + improved quick sort. For stable sort, use :proc:`sort` with ``stable=true``. + The algorithms used will change over time. It currently uses parallel radix sort if the following conditions are met: @@ -806,6 +890,8 @@ proc sort(ref Data: [?Dom] ?eltType, comparator:?rec=defaultComparator, chpl_check_comparator(comparator, eltType); if stable { + // TODO: we already have a stable sort, but it is not called here + // maybe we should call it here, even though this one is deprecated // TODO: implement a stable merge sort with parallel merge // TODO: create an in-place merge sort for the stable+minimizeMemory case // TODO: create a stable variant of the radix sort From 852cf0a2b75402eaaabf9267290dc35bdfd6bf6f Mon Sep 17 00:00:00 2001 From: Shreyas Khandekar <60454060+ShreyasKhandekar@users.noreply.github.com> Date: Wed, 18 Sep 2024 10:33:51 -0700 Subject: [PATCH 2/3] Changes based on review Part 1 Signed-off-by: Shreyas Khandekar <60454060+ShreyasKhandekar@users.noreply.github.com> --- modules/standard/Sort.chpl | 37 +++++++++++++++++++------------------ 1 file changed, 19 insertions(+), 18 deletions(-) diff --git a/modules/standard/Sort.chpl b/modules/standard/Sort.chpl index 1092a234fd96..7fbc2f8104d3 100644 --- a/modules/standard/Sort.chpl +++ b/modules/standard/Sort.chpl @@ -24,13 +24,14 @@ // TODO -- performance test sort routines and optimize (see other TODO's) /* -The sort module provides functions to sort arrays and lists. It is designed to +This module supports standard algorithms for sorting data. +It is designed to be flexible and efficient, allowing the user to define custom comparators to sort any data type, as long as the comparator implements the appropriate sorting interface. 
-The simples way to sort an array is to call the :proc:`sort` function on the -array. The sort function will use a default comparator to sort the array in +The simplest way to sort an array is to call the :proc:`sort` function on the +array. The sort function will use the default comparator to sort the array in ascending order. .. code-block:: chapel @@ -132,14 +133,14 @@ elements, the user can define a comparator with a key method as follows: var Array = [-1, -4, 2, 3]; // Empty record serves as comparator, implements the keyComparator interface - record Comparator : keyComparator { } + record absComparator : keyComparator { } // key method maps an element to the value to be used for comparison - proc Comparator.key(elt) { return abs(elt); } + proc absComparator.key(elt) { return abs(elt); } - var absComparator: Comparator; + var absoluteComparator: absComparator; - sort(Array, comparator=absComparator); + sort(Array, comparator=absoluteComparator); // This will output: -1, 2, 3, -4 writeln(Array); @@ -196,16 +197,16 @@ implemented with a ``relativeComparator`` as follows: var Array = [-1, -4, 2, 3]; // Empty record serves as comparator - record Comparator : relativeComparator { } + record absComparator : relativeComparator { } // compare method defines how 2 elements are compared - proc Comparator.compare(x, y) { + proc absComparator.compare(x, y) { return abs(x) - abs(y); } - var absComparator: Comparator; + var absoluteComparator: absComparator; - sort(Array, comparator=absComparator); + sort(Array, comparator=absoluteComparator); // This will output: -1, 2, 3, -4 writeln(Array); @@ -218,8 +219,8 @@ The keyPartComparator interface defines how a comparator should sort parts of a key using the ``keyPart`` method. This is used for certain sort algorithms. Records implementing this interface must define a ``keyPart`` method. -A comparator implementing this interface can contain both ``compare`` and -``keyPart`` methods. 
In that event, the sort algorithm will use whichever is +A comparator implementing this interface can optionally also provide a +``compare`` method. In that event, the sort algorithm will use whichever is appropriate for the algorithm and expect that they have consistent results. The .keyPart method @@ -253,7 +254,7 @@ This ``keyPart`` method supports sorting tuples of 2 integers: proc keyPart(elt: 2*int, i: int) { if i > 1 then - return (-1, 0); + return (keyPartStatus.pre, 0); // second value is not used return (keyPartStatus.returned, elt(i)); } @@ -306,14 +307,14 @@ the ``keyComparator`` interface. var Array = [-1, -4, 2, 3]; // Empty record serves as comparator - record Comparator : relativeComparator{ } + record absComparator : relativeComparator { } // compare method defines how 2 elements are compared - proc Comparator.compare(x, y) { + proc absComparator.compare(x, y) { return abs(x) - abs(y); // ascending order } - var absReverseComparator: ReverseComparator(Comparator); // reverse order + var absReverseComparator: ReverseComparator(absComparator); // reverse order sort(Array, comparator=absReverseComparator); @@ -719,7 +720,7 @@ The choice of sorting algorithm used is made by the implementation. .. note:: This function currently either uses a parallel radix sort or a parallel - improved quick sort. For stable sort, it uses timsort. + improved quick sort. For stable sort, it uses Timsort. The algorithms used will change over time.
It currently uses parallel radix sort if the following conditions are met: From 28641b9a05c4244c84976149fef38b783427cf6c Mon Sep 17 00:00:00 2001 From: Shreyas Khandekar <60454060+ShreyasKhandekar@users.noreply.github.com> Date: Wed, 18 Sep 2024 11:57:00 -0700 Subject: [PATCH 3/3] Feedback from review Part 2 Signed-off-by: Shreyas Khandekar <60454060+ShreyasKhandekar@users.noreply.github.com> --- modules/standard/Sort.chpl | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/modules/standard/Sort.chpl b/modules/standard/Sort.chpl index 7fbc2f8104d3..0e017afd247b 100644 --- a/modules/standard/Sort.chpl +++ b/modules/standard/Sort.chpl @@ -105,10 +105,10 @@ The keyComparator interface The ``keyComparator`` interface is used to sort data by a key value. Records implementing this interface must define a ``key`` method. -It is an error for a comparator to contain a ``key`` method as well as one of -the other methods that are part of the ``relativeComparator`` or -``keyPartComparator`` interfaces. - +Today, it is an error for a comparator implementing the ``keyComparator`` +interface to contain a ``key`` method as well as one of the other methods +that are part of the ``relativeComparator`` or ``keyPartComparator`` +interfaces. This restriction might be lifted in future releases. The .key method ***************