Replies: 3 comments 14 replies
-
You've given multisets a prominent place here (and multimaps in rebellion), which suggests that you or others find them useful a lot more than I have. Can you give some examples of use cases for them that you've run into? |
Beta Was this translation helpful? Give feedback.
-
Where is a builder needed? I can imagine using a mutable collection at first and then creating an immutable copy of it once all the elements are in place. Would a builder be more efficient or something?
Views are a bit unusual to me. Even though I might have an example in mind. I've lately been thinking of views as a compelling technique for working with strings and other data of arbitrary size without doing so many copies. (In particular, I'm thinking of using views in Punctaffy so that I can iterate over the faces of a higher-dimensional snippet without constructing each face as I go along. With a view approach, the faces could be cursors referring to landmarks in the snippet's representation, rather than being fully copied snippets of their own.) Perhaps builders are justified for the same reason: Once you're done with a builder, you can get an immutable collection from it without doing a deep copy of its contents; you just move its whole internal buffer over. |
Beta Was this translation helpful? Give feedback.
-
Il giorno 3 gennaio 2022, alle ore 13:08, Ross Angle
Assuming that we can use mutable collections as builders this way, what advantage do we get by adding a dedicated builder type to the mix? It sounds like @jackfirth's saying there are some collection types that have builders that are even more efficient than mutable collecitons since they don't have to allow read access until the collection is fully built, but I've had trouble coming up with an example of that.
For example, a binary heap can be built at once in O(n) time, in contrast to inserting elements one by one in a mutable data structure which requires O(n log n) time.
|
Beta Was this translation helpful? Give feedback.
-
Rhombus needs proper collections, such as lists and sets. This discussion is an informal summary of my design thoughts thus far and the direction I'm heading in.
Terminology
When speaking of collections generally and not specific Racket APIs, here's what I mean by the following terms:
Background
Today Racket has the following collection abstractions:
prop:sequence
struct type property, notracket/generic
) that's similar to the definition of sequence I used above, except Racket sequences allow each "element" to be an arbitrary number of values, similar to how functions can return multiple values by using the(values ...)
form.'(1 2 3 . 4)
. Their existence mainly arose as an artifact of Racket's representation of lists as cons pairs without any enforcement that thecdr
of the last cons pair is null. They have all the problems of Racket lists and a handful more and come with very few advantages to make up for it.mlist
s — mutable linked lists that exist mostly as a legacy API for porting Scheme code. Like ordinary racket lists, random access is linear.gvector
s, or growable vectors — mutable contiguous lists that support efficient insertion at the end of the list via amortization.racket/generic
) for mutable and immutable sets. The default implementation is an immutable set backed by a hash table.Racket lists, improper lists, vectors,
mlist
s,gvector
s, and sets can only be manipulated generically through theprop:sequence
protocol andracket/sequence
APIs. This protocol only allows iterating list elements one at a time, so determining the size of a list can't be done efficiently through the sequence APIs.Racket's list implementations also force the user to choose between efficient random access and immutability. In theory one could use immutable vectors, but all of Racket's vector APIs produce mutable vectors by default. Even if you stick with mutable vectors, you're still forced to choose between vectors and
gvector
s. The latter supports insertion, but the former is better supported. For example, there's nogvector-sort!
operation likevector-sort!
, so if you want to collect an unknown number of elements one at a time and then sort them, you have to copy the elements from a gvector into a vector partway through.Proposed interfaces
My general proposal for Rhombus is that we define a few collection interfaces and offer multiple implementations. To start with, I propose this interface hierarchy:
To address mutability, I propose that each collection type —
List
,Set
, andMultiset
— come in four flavors: an immutable interface, a mutable interface, a read-only view interface, and a write-only builder interface. The division of operations between these interfaces would look like this:builder.build()
operation creates an immutable collection from a builder. Builders offer the benefits of immutable collections without the performance costs of persistent updates, provided users are fine with performing all of their insertions in the collection upfront.For lists, this results in an interface hierarchy like so:
Further thoughts
There's a lot more I've put thought into but haven't written down here yet. Most of these I've explored or implemented in my Rebellion package, so I'll include links to the relevant bits for each item:
rebellion/collection/sorted-set
for examples.rebellion/collection/multidict
for examples.list.sublist(3, 8)
should return aList
implementation backed by the original list.map.keys()
would similarly return aSet
implementation backed by the map. Views make sorted collections far more powerful; seesorted-subset
in Rebellion for an example. Read-only views of mutable collections are another example of usefulness here.[3, 8)
,[10, 14]
, and(18, 25)
) and maps from ranges to values. Seerebellion/collection/range-set
for examples.rebellion/streaming/transducer
) and reducers (seerebellion/streaming/reducer
).for
comprehensions.Entry
API.Beta Was this translation helpful? Give feedback.
All reactions