Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Normative: Switch to class + iterable semantics. #42

Closed
wants to merge 2 commits into from
Closed

Conversation

Jack-Works
Copy link
Member

@Jack-Works Jack-Works commented Sep 15, 2020

close #41, close #22, close #17

I used the approach I have described in 3.1.b (non-global non-constructible class + creator function) of #41.

The biggest change to the developers is that they have to add a .values() before they use the range with Iterator Helpers.

- Number.range(0, 5).take(2)
+ Number.range(0, 5).values().take(2)

Please choose "hide white space changes" when viewing the diff, which will make the review easier.

Other changes:

  • Add a new getter: .type
  • Rename inclusiveEnd to isInclusiveEnd

@devsnek
Copy link
Member

devsnek commented Sep 15, 2020

should be new Number.Range() to avoid confusion.

@Jack-Works
Copy link
Member Author

should be new Number.Range() to avoid confusion.

Please read #41 for explanation

@devsnek
Copy link
Member

devsnek commented Sep 15, 2020

You could also do NumberRange.from() or Number.Range.from or something like that as well. Not following existing patterns does nothing except make it harder for people to learn the language.

@Jack-Works
Copy link
Member Author

Now the underlying class is "NumericRange" and works for both number and bigint. It requires a "type" to indicates it is a number range or bigint range.
That's what be strange to have Number.Range but actually points to NumericRange. (Or maybe subclassing it to provide the first parameter implicitly?)

Another approach is your said, but not a "from", it needs to distinguish number and bigint so maybe it needs each method for each number type.

@@ -77,6 +73,7 @@ for (const i of BigInt.range(0n, 43n)) console.log(i) // 0n to 42n

// With iterator helper proposal
Number.range(0, Infinity)
.values()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is an example of poor ergonomics caused by this change. Why is this needed?

Copy link
Member Author

@Jack-Works Jack-Works Sep 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is required cause we changed to iterable because iterator helpers doesn't work with iterable classes.

This is the most elegant way in the current limitations and constraints.

Iterator.from(range).take

Or

range[Symbol.iterator]()

is worth than range.values()

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The most elegant way is to return an iterator directly. This is strictly less elegant/ergonomic, as a tradeoff for whatever benefits are perceived from using an iterable (which i still don’t agree with).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no change in the main for ... of ... case. For the Iterator Helper's case, it is a bit longer but I have tried to make the inconvenient minimal.
A chained call values() can be written easily (compare than Iterator.from or [Symbol.iterator]()). With auto-complete in code editors, it will be even easier to write.

It's acceptable for me to have this extra cost to prevent the #17 problem. And as I said in #41 (comment), the iterator way is preventing to add helper methods in an elegant way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"easily" is subjective, but i think it's pretty objective that "you have to type nothing" is easier than "you have to type something".

I don't see #17 as a problem, and a returned iterator would inherit from some kind of RangeIteratorPrototype, on which helper methods could be added precisely as easily as with a class (which could override/shadow IteratorPrototype methods), so that's an utter non-issue.

@devsnek
Copy link
Member

devsnek commented Sep 15, 2020

you have to pass numeric values for the start and end, seems pretty unambiguous to me.

@Jack-Works
Copy link
Member Author

you have to pass numeric values for the start and end, seems pretty unambiguous to me.

Do you mean to infer the type of the NumericRange from the start? (end can be infinity in bigint).

@devsnek
Copy link
Member

devsnek commented Sep 16, 2020

that's one option, yes

@domenic
Copy link
Member

domenic commented Sep 20, 2020

This is really disappointing. Many people made strong comments in #17 explaining how the idiomatic pattern for the language is to return an iterator. I'm sad to see this change and think that in practice this makes the proposal much less useful and usable. In fact, I'd say that this change would make the proposal not worth including in the language, since it causes the complexity-vs.-benefit tradeoff to become too high. (Introducing new or deprecated notions, like hidden/non-constructible/new-less classes, is a big part of this.)

I'd urge you to reconsider, and keep the existing semantics, per the cogent arguments from @devsnek, @tabatkins, @ljharb, and others in #17.

@Jack-Works Jack-Works marked this pull request as draft September 21, 2020 01:51
@hax
Copy link
Member

hax commented Sep 21, 2020

@domenic I think there is a disagreement that whether the idiomatic pattern for the language is to return an iterator. And as the research @Jack-Works did almost all programming languages choose expose iterable-like semantic instead of iterator-like semantic for range-like API. We need a very strong reason to deviate from that.

@devsnek
Copy link
Member

devsnek commented Sep 21, 2020

those languages don't use js's iterator patterns, it seems undesirable to draw a comparison with them.

@hax
Copy link
Member

hax commented Sep 22, 2020

@devsnek What languages? What's the difference??

@devsnek
Copy link
Member

devsnek commented Sep 22, 2020

@hax the table of languages you linked to in your comment above. this is something we went over a lot in the iterator helper proposal, with people wanting it to be for iterables instead. There are lots of closed issues on that repo (for example issue 8) with info on this.

@hax
Copy link
Member

hax commented Sep 22, 2020

I don't see there are any essential concept difference between js iterator vs for example python iterator.

@devsnek
Copy link
Member

devsnek commented Sep 22, 2020

I'm not an expert on python's iteration design, but I do know that range() in python is a class, not a function (what i was suggesting above).

@hax
Copy link
Member

hax commented Sep 22, 2020

Python do not need new keyword, so it could have same surface syntax like global function. Personally I'm fine with new Number.Range(). I think @Jack-Works try to minimize the userland code, so he try to make Number.range() as a factory method which returns a Number.Range instance.

Anyway , the key point here is range-like things are iterable-like not iterator-like in most other languages, and I don't think there is any essential differences of iterable/iterator concept between those languages and JS. If anyone think there are, pls explain.

@devsnek
Copy link
Member

devsnek commented Sep 22, 2020

If anyone think there are, pls explain.

I'm not going to rewrite this here, but I encourage you to look through the closed issues on https://github.com/tc39/proposal-iterator-helpers because this stuff is discussed in depth.

@domenic
Copy link
Member

domenic commented Sep 22, 2020

Yes. I think this difference is really well established in the language now, and was also covered in #17.

@hax
Copy link
Member

hax commented Sep 22, 2020

To be clear, I'm not arguing whether iterator helpers should be iterable helpers (I'm ok that iterator helpers use iterators), I'm asking for what reason range should be iterator, especially range-like thing in most other languages are iterable not iterator.

@domenic
Copy link
Member

domenic commented Sep 22, 2020

I keep referring you to #17, but maybe it would help to refer you to some specific comments, which explain why JS is different from other languages. Some of my favorites are the first paragraph of #17 (comment), the second paragraph of #17 (comment), the explanation that (unlike other langauges) generators return iterators in #17 (comment), and more.

@Jack-Works
Copy link
Member Author

It seems like we ended in an endless argue in two sides 😥

Sometimes I think Oh why not have an auto-clone @@iterator or make it return a higher order function with @@iterator on it.
If we're in the user land, it really a creative way to resolve this problem. But we cannot have it in the language...
It's extremely hard to choose one route cause both routes seems reasonable to me and both routes have their own fatal problems.

Is questionnaire or research deus ex machina? I don't know. Never mind, just some nonsense.

@Jack-Works
Copy link
Member Author

I'd like to ping the research call members maybe @codehag? Do we have any on-going research/questionnaire about which route is less error-prone? (though I believe the iterator one is error-prone, I cannot prove it) 👀

@codehag
Copy link

codehag commented Jan 29, 2021

Sure! Let's unblock this, one way or the other. I am a little slow to respond, sorry if i missed previous pings. Next one is in exactly a month. Does the time on the calendar work for you or should we shift?

@Jack-Works
Copy link
Member Author

Is there any public survey for normal developers has been published?
I think that will help us to figure out how developers thought on this topic.

@codehag
Copy link

codehag commented Jan 29, 2021

We've done a few surveys, and there are a number precedents, but the tool should fit the question, and the question needs to be well scoped. If, for example, you want to test error proneness of the two approaches (doing something similar to A/B testing), we could design something around that. On the other hand, if you want to see which facets of a given design stand out, you can do something like Felienne's research on block based languages: https://www.researchgate.net/publication/336187404_XLBlocks_a_Block-based_Formula_Editor_for_Spreadsheet_Formulas -- this uses the CDN framework which has a well defined survey for running such studies.

So, those are two starting points. But what we really need is a good understanding of the disagreement here. I only have a high level understanding so far (it looks like language consistency and ergonomics issues, with ergonomics being broadly defined?), I would need to dig a bit deeper to be of any help there, so that we know what we are actually investigating. We want to be careful not to encode our biases into a study.

For example, if we take the assertion that "error-proneness concern is coming from the refactoring hazard" made at the beginning of the thread in (i think) #17: That is something we can check, but we have to also verify that the example is realistic -- a survey run with an encoded bias will return for you the bias that you want. It is easy to make the numbers say what you want, but this would be poor methodology. A way to avoid this, is when designing the study - try to disprove your hypothesis. Another tool for building something like this is "what would convince me otherwise".

@codehag
Copy link

codehag commented Jan 29, 2021

Having read some of the issues in a bit more depth, I am not sure that this will be addressed by the two studies kinds I listed above. One part of that is because it looks like the study, if applied to this draft, would be testing something that is outside of the categories that would be accepted by the committee. So it doesn't make sense to do that. The other issue is that it looks like the use-case for reusable number.range hasn't been proved, it was just suggested as part of the design space. Testing alternatives without proving the use case might be getting ahead of ourselves. Determining the importance of it may help resolve this issue. Would it be right to say that that is the determining question?

Let's discuss this in depth in the research call so we know what we are really trying to answer.

@Jack-Works
Copy link
Member Author

A personal recap with my preferring.

  1. Egnormics in normal use cases (No problem)

This is no longer a problem for any route we have.
For iterators Number.range(a, b) is good.
For iterables we can make Number.range(...) a shortcut to new NumberRange(...) so they are exactly the same.

  1. Egnormics with Iterator helper (No problem)

This is no longer a problem for any route we have.
Iterators work with iterator helper naturally so no problem.
Though range[Symbol.iterator]() is a big problem, hax (IIRC) suggests to add .values() (like Array#values) so there is no problem either.
.values() are very easy to write (w/ or wo/ the editor with LSP support).

  1. Refactoring hazard (!!!)

For iterables there is no problem.

Developer might want to refactor this code because there is duplication.

for (const a of Number.range(0, 10)) {
    // 0 to 10
}
for (const a of Number.range(0, 10)) {
    // 0 to 10
}

But once you did that, the program brokes.

const range = Number.range(0, 10)
for (const a of range) {
    // 0 to 10
}
for (const a of range) {
    // No! The range has already consumed! Nothing yielded.
}

The correct way to do that is:

const range = () => Number.range(0, 10)
for (const a of range()) {
    // 0 to 10
}
for (const a of range()) {
    // Got 0 to 10 again
}
  1. Extensibility

We need to consider how to add new utils on the return value (iterator or a class instance) in the future.

=============

This is what problems we have I can recall from the previous discussions and with my personal ideas (and as you can see I'm preferring iterables currently due to reasons 3 and 4).

I think the key point is to investigate how will people use the range and if they will go into the refactor trap I mentioned before. If so, I think the iterator way is error-prone.

@ljharb
Copy link
Member

ljharb commented Jan 30, 2021

In general, that is a not a safe refactor to make, and it would be a bad idea imo to encourage people to falsely think it’s a safe refactor. Iterators simply are not reusable generically, even if some are.

@domenic
Copy link
Member

domenic commented Jan 30, 2021

Exactly. If you replaced Number.range(0, 10) in your example with someMap.values(), it would also not be a safe refactor. Making Number ranges reusable would hurt people's intuitions for the rest of the language.

@Jack-Works
Copy link
Member Author

So the problem is that we need to make sure if devs will do this in wild, that is what we need to investigate. I'm not encouraging devs to do refactor like this but they may do it.

@hax
Copy link
Member

hax commented Jan 31, 2021

@domenic

If you replaced Number.range(0, 10) in your example with someMap.values()

As I explained many times, people normally don't need to factor out someMap.values() because someMap already be a reusable point. On the other side, people likely to factor out Number.range(from, start) to a variable, because there is no reusable point, u need to repeat yourself every time.

The simple test is, if someone don't want repeat code, how will they do?

Do they write let r = Number.range(0, 10) or let r = () => Number.range(0, 10)? My experiences tell the the former, especially this is how things work in many other languages.

@codehag
Copy link

codehag commented Jan 31, 2021

I would like to have a better understanding of the usecase -- this is the research that I was proposing to @Jack-Works -- it's a loose corpus study -- in case you want to get an idea of how that works. I used python in the context of mozilla to validate it as the language has had this feature for a long time.

The example of Mozilla's python code is not representative, but as it reuses a number of python libraries and is easily searchable, nor is this exhaustive, I just did this in a few hours -- I think it gives us a decent idea of what we are working with.

Some on the fly vocabulary:
Bare assignment -- Ranges that have not been wrapped or cast to another type.
Cast -- Ranges that have been wrapped or cast into another type.
Throw-away -- Ranges that are not assigned, but re-initialized each time even when the same value might be used.

In a review of searchfox, I found that the usage of range as a throw away value exceeded the use of "bare range" being assigned to a value significantly. Total usage of range was appx. 1500. There were 112 bare range assignments, which I used as a good indicator of potential problematic reuse. This is compared to throw-away usage which was 1197, and finally there is the category of cast range, which accounts for ~170 cases. Bare range assignment accounts for about 7%. Total assignment of range for potential reuse, cast or bare, numbers about 300, or appx. 11%. I came across cases where a range was passed as an argument, but for most of these it was single use, or as a cast and I didn't investigate deeply. This is prior to verification of true reuse.

I categorized different kinds of bare usage, as well as true reuse according to frequency. In this segment I focus exclusively on Bare range assignement that does not qualify as true reuse can be categorized as follows:

True reuse occurs in the following forms:

  • reuse of range cast into another type. Can be done with throw-away. -- Accounting for ~170 cases, I did not verify all off them, but every one that I looked at was a true reuse. This is the largest category of range reuse.
  • reuse in a test. Cannot be done with throw-away. The most common case for bare re-use, counting appx 48 cases of the total 1400 uses of range. As these are test cases, and most of them are testing the implementation of iterators themselves, I do not consider this to be an important use case.
  • reuse within a function. Cannot be done with throw-away. -- closest approximation to the above discussion. Two such uses, discussed in more detail below.

Vague:

I would like to raise one specific instance I found of bare range reuse. The second use case of true bare reuse, involves code where a list of columns is replaced by a range of columns, and then that range or list is cast back into a list. This appears to be a case where the language was too flexible, and the reusability of range actually poses a risk of error proneness due to a conflation of types on the same variable. This is something that we might avoid by not making range reusable.

To summarize -- Reuse is rare in the case of ranges, and it appears that the refactoring hazard case is unrealistic (raised just before by @hax). Designing for this use case at the cost of language consistency is therefore not advisable. Additionally, the reusability of range may be error prone as shown by the conflation of types on the same variable. Finally, it highlights that we are spending time discussing a use-case that happens very infrequently, having been found in it's potential hazard form < 10 times in a codebase with a sample of ~1500 uses of range in python.

I would like to focus on one of the findings was that range would sometimes be reused in the following form:

list(range(10))
enumerate(range(10))
iter(range(10))
etc.

This form of cast reuse, as noted above, significantly outnumbered true bare range reuse. This is very interesting, and perhaps my most important finding so far. In a language that provides this capability of bare range reuse, it is largely not used.

I would emphasize that–given that reuse is rare, when it is reused it is usually cast to another type, and that the most common form is as a throw away value–designing for that use-case at the expense of language consistency is a poor language design choice. We would be prioritizing a very rare use over the learnability of the language.

However, we can still make the uncommon case meaningful. In fact, the JavaScript language already has idioms that support this. The best format for this has already been suggested, in my opinion:

let x = Iterator.from(Number.range(0, 10));
let y = Array.from(Number.range(0, 10)); // probably rare, but possible

Works very nicely for reuse and addresses most of the use cases. It reflects the reuse of range in a meaningful way -- we had a "range", and it can be an iterator, or an array -- whatever the use-case for the developer. For example if they needed an array specifically, or were simply more comfortable with it. Number.range would function equally in both cases, while maintaining itself as effectively a throw away value (the common case). In fact, this form would more closely mirror what python programmers tend to do than what is being proposed here, which is shoehorn range into being an iterator with an irregular syntax. There is much to be lost in trying to force JavaScript to be something it is not.

This is verbose, but terse code isn't better code -- it is often more confusing and prone to errors, and does not do self-teaching to the programmer. By using Iterator.from, we can communicate what is being built while maintaining existing categorizations of behaviour in the language -- the only difference is that it is longer. There is no need to invent something new for this proposal. It undermines an otherwise good proposal.

My observation here is that we are stuck not on usability as usability does not exist without a use case, but on aesthetics of what now appears to be an uncommon case. Aesthetics, which very important, is not the same as usability and has less impact without it. I believe we will be more productive if we let this go. We are talking in circles about a hypothetical situation that is rare even in languages that had this built in from the start.

There may be reason to continue discussing ergonomics and communication of intent, but it is not this refactoring hazard. I would like it to pivot it towards readability and communication, away from this semantics discussion. I believe we should think more creatively of how to communicate to programmers rather than privilege the rare case over the common one. It is an exercise we will likely need to do anyway: think about how we are naming this. range is a noun here. We could change to a verb such as count or iterate. This would be a more productive direction than going in circles. By using a verb, the consumption of the value will become more self-evident.

Edit: Not an expert here, so replaced iterable with throw-away to be more specific.

@Jack-Works
Copy link
Member Author

Thanks, @codehag, this is very useful information to know, I guess the refactoring hazard is not a problem for iterator way now.

range used across a language boundary actually surprise me 😜

I need to mention that Python actually using the iterable semantics so reuse is actually no bugs.

x = range(0, 5)
x.__iter__()
>>> <range_iterator object at 0x0000021487858A10>
x.__class__
>>> <class 'range'>

I'd like to do a temp check, what semantics are you support (in a general design pattern, not the design details), exactly? Just in case to prevent we already reached an agreement but we don't know.

  • 🚀 For Iterator (for things like Number.range(0, 10).map(x => x + 5))
  • 👀 For iterables (for things like new NumericRange(0, 10).values().map(x => 5))

@Jack-Works
Copy link
Member Author

It's ok for me and hax if we rename this to Iterator.range (Iterator.numericRange) to make it super clear about it is an iterator

@Andrew-Cottrell
Copy link

Andrew-Cottrell commented Feb 1, 2021

It's ok for me and hax if we rename this to Iterator.range (Iterator.numericRange) to make it super clear about it is an iterator

They could also be renamed to Number.rangeIterator and BigInt.rangeIterator for the same clarity; and this has an obvious continuation to Date.rangeIterator, etc. if considered useful in a future proposal.


I think I may have misunderstood. You were asking about renaming this proposal's name rather than any spec'd API?

@Alhadis
Copy link

Alhadis commented Nov 15, 2021

What's the rationale for returning an iterator instead of a plain Array? Is it performance? Or is all this just so users can return an infinite range? Because the latter seems like an edge-case to me: I honestly can't foresee any realistic scenarios where one would need an infinite range that isn't part of a generator. Consider the example in the readme:

README.md example:Simplified equivalent:
function* even(){
	for(const i of Number.range(0, Infinity))
		if(i % 2 === 0) yield i;
}
function* even(){
	for(let i = 0; ; ++i)
		if(i % 2 === 0) yield i;
}

Conversely, returning a pre-populated list of values is pretty much what I'd expect a method named .range() to return:

const digits = Number.range(0, 9);
1 === digits[1];
9 === digits.pop();

It's also objectively more ergonomic, given the majority of users will probably need to convert the result to an array anyway:

const uppercase = Number.range(65, 90).map(n => String.fromCharCode(n));
const lowercase = Number.range(97, 122).map(n => String.fromCharCode(n));
// Versus:
const uppercase = Array.from(Number.range(65, 90)).map(n => String.fromCharCode(n));
const lowercase = Array.from(Number.range(97, 122)).map(n => String.fromCharCode(n));

@devsnek
Copy link
Member

devsnek commented Nov 15, 2021

you should open a new issue

@Alhadis
Copy link

Alhadis commented Nov 26, 2021

@devsnek Done. See #53.

@Jack-Works Jack-Works closed this Dec 3, 2022
@Jack-Works Jack-Works deleted the class branch December 3, 2022 03:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
8 participants