Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use web spec best practices #450

Closed
zolkis opened this issue Aug 15, 2023 · 49 comments
Closed

Use web spec best practices #450

zolkis opened this issue Aug 15, 2023 · 49 comments

Comments

@zolkis
Copy link
Collaborator

zolkis commented Aug 15, 2023

Originally posted by @inexorabletash in #446 (comment)

Drive by comment: the first step here (If a or b is not an instance of MLOperand, then throw a "TypeError"...) is unnecessary, and I see this repeated throughout the newly introduced algorithms for methods ("operations" in Web IDL). The Web IDL bindings already take care validating input types and throwing TypeError.

Further, even in cases where an explicit error must be thrown, the intrinsic type TypeError should be thrown, not a DOMException of type "TypeError" (which is not present in https://webidl.spec.whatwg.org/#idl-DOMException-error-names)

(Issue created to document this change).

@zolkis zolkis changed the title Unnecessary checks in algorithms Unnecessary checks in algorithms, use TypeError simple exception Aug 15, 2023
zolkis added a commit to zolkis/webnn that referenced this issue Aug 15, 2023
Signed-off-by: Zoltan Kis <zoltan.kis@intel.com>
@inexorabletash
Copy link
Member

Another example:

in constant() - If type is undefined, let type be "float32". - this is handled automatically by having the default in the Web IDL.

@inexorabletash
Copy link
Member

Also (and this is not new), dictionary members are more normally references in prose using the infra ordered map syntax: options["foo"]

By the time an algorithm "sees" the dictionary, it's gone through the bindings layer and is an ordered map - https://webidl.spec.whatwg.org/#idl-dictionaries

@inexorabletash
Copy link
Member

throw a "TypeError" simple exception

Just write: throw a TypeError.

In bikeshed markdown:

... [=exception/throw=] a {{TypeError}}.

@inexorabletash
Copy link
Member

And (this new is in PR 446) the entire "check linear options" algorithm is unnecessary; the dictionary bindings for ECMAScript take care of all of that, including default values.

@inexorabletash
Copy link
Member

(apologies for random feedback; I may have more time later this week to do a more thorough pass)

zolkis added a commit to zolkis/webnn that referenced this issue Aug 15, 2023
Signed-off-by: Zoltan Kis <zoltan.kis@intel.com>
@inexorabletash
Copy link
Member

Not new: In "create MLActivation" steps

If name is undefined or null, throw a "TypeError" and abort these steps.

It looks like all call sites are internal to the spec and so this could just be an assert.

And this is more off topic, but: none of the call sites for "create MLActivation" actually pass builder - that should be fixed at the call sites. I also don't see any of the call sites passing init-steps but maybe I'm missing something.

@inexorabletash
Copy link
Member

Not new: in input(name, descriptor):

If name is undefined or an empty string, then throw a TypeError and stop.

name is defined as DOMString; passing literal undefined from script will yield the string "undefined". So this can be reduced to just a test for an empty string.

(This is a gotcha for web developers since an error would be better, but that's how the Web IDL bindings work.)

@inexorabletash
Copy link
Member

Not new: in conv2d(input, filter, options)

If options is undefined, let options be an empty object.

The options argument is defined as optional MLConv2dOptions options = {} so it will always be present.

If options.padding is undefined, ...

It's more correct to write:

If options["padding"] exists, ...

in Bikeshed markdown:

If |options|[{{MLConv2dOptions/"padding"}}] [=map/exists=] ...

@inexorabletash
Copy link
Member

Not new:

Let ... be the first argument.
Let ... be the second argument.

This is unnecessary, arguments are named. I see 5 uses of "first argument", and 2 of "second argument". Just drop these.

zolkis added a commit to zolkis/webnn that referenced this issue Aug 16, 2023
…d by WebIDL bindings

Signed-off-by: Zoltan Kis <zoltan.kis@intel.com>
zolkis added a commit to zolkis/webnn that referenced this issue Aug 16, 2023
…ove default inits

Signed-off-by: Zoltan Kis <zoltan.kis@intel.com>
@inexorabletash
Copy link
Member

Not new: Similar to a nitpick above, for "create MLOperand", the second step is:

If desc is not an object that implements MLOperandDescriptor ...

Since this is called internally with this everywhere, the validation is unnecessary.

Also, total nitpick, I noticed that some of the call sites are phrased as:

... the result of invoking the create MLOperand steps given this ...

and some are:

... the result of invoking the create MLOperand steps with this ...

This applies to other invocations as well ("with" vs. "given") - ideally the phrasing would be consistent, but there's not consistency in phrasing across specs so either is fine.

@inexorabletash
Copy link
Member

Optional style suggestion:

If you rename the algorithms from (for example) "create MLOperand" to "create an MLOperand" (which is better grammar anyway), then you can use simplified phrasing when invoking algorithms.

Before:

Let operand be the result of invoking the create MLOperand steps given this and descriptor.

After:

Let operand be the result of creating an MLOperand given this and descriptor.

Note that Bikeshed is smart enough to link "creating an MLOperand" to a definition named "create an MLOperand". More and more specs are using the shorter form - less text is generally more readable.

@inexorabletash
Copy link
Member

Not new: There are a couple checks for "... is not an an object that implements MLOperandDescriptor ..." - in both cases, the value is supplied from an internal slot or explicitly passed via the algorithm, so this validation seems unnecessary.

@inexorabletash
Copy link
Member

New (mostly) in 446: When initializing lists:

... set it to [0, 0, 0, 0].

Technically at this point the Web IDL sequence has become a Infra list and should be initialized using this literal syntax:

... set it to « 0, 0, 0, 0 ».

Super nitpicky but it can make it clearer to implementers that by this point we're talking about abstract types, not a Javascript Array.

@inexorabletash
Copy link
Member

Meta: if reporting style issues that I don't think should hold up PR 446 here is working for you, I'm happy to keep doing so. But I'm also happy to leave comments directly on the PR, file separate issues, etc.

@inexorabletash
Copy link
Member

Not new: There are a few places where a record is being iterated, with:

For each key -> value of ...

This should use a proper right arrow (→) instead of the digraph, and can be written in markdown as:

[=map/For each=] |key| &rarr; |value| of ...

Or use a literal → in the source if you don't mind non-ASCII.

@inexorabletash
Copy link
Member

Not new: Many places use "... throw ... and stop." or "... throw ... and abort these steps."

Per https://infra.spec.whatwg.org/#algorithm-control-flow this isn't necessary - throwing terminates the algorithm.

I know it's tempting to be explicit, but shorter is usually better!

zolkis added a commit to zolkis/webnn that referenced this issue Aug 17, 2023
Signed-off-by: Zoltan Kis <zoltan.kis@intel.com>
@zolkis
Copy link
Collaborator Author

zolkis commented Aug 17, 2023

Thanks a lot for all these suggestions, they are much appreciated!
Listing them in this issue makes sense, at least we have all suggestions in one place for future reference. Actually we should move the other similar suggestions here, too.

So far I applied all of these, except for:

  • the dict members vs ordered map syntax (let's discuss this in a call, as I find it easier/more concise to write and read the dict members syntax);
  • and removing the "and stop" clauses, which I will do later (just too much and mostly manual work). Will need less busy time with lots of caffein motivators. :)

@zolkis
Copy link
Collaborator Author

zolkis commented Aug 17, 2023

Originally posted by @inexorabletash in #446 (comment)

Regarding using TypeError vs. a DataError DOMException:

  1. If you mix and match exception types, the order of the checks becomes important. From experience (e.g. with https://w3c.github.io/IndexedDB/#add-or-put) this has some benefit (e.g. developers can ask a more specific question on StackOverflow) but in practice it turns into an implementation burden - implementations must align on the sequence of input validation steps exactly.
  2. For modern specs where we expect non-web-browser implementations to exist (e.g. for Node.js) then the preference is to use JS-intrinsic types (TypeError, RangeError) everywhere, rather than DOMException. See https://fetch.spec.whatwg.org/ https://encoding.spec.whatwg.org/ etc.

Either way, you'll want Web Platform Tests that exercise each failure path and validate the correct exception type is thrown.

@inexorabletash
Copy link
Member

the dict members vs ordered map syntax

Yeah, we can discuss. But queuing up thoughts, it helps to distinguish when you're referencing an interface member - usually only via named slots - vs. a dictionary member. And when things get wordy like: options["bias"].[[descriptor]]["dimensions"] it usually helps to use a temporary, e.g "Let bias be options["bias"]"

@inexorabletash
Copy link
Member

Some more things that can be linkified to make the meaning clear:

  • ... be the size of ...[=list/size=]
    • And so far as I can tell, all of the places that use .length are referring to the size of a list (what a sequence<> turns into) and should use [=list/size=] instead.
  • link enum members when you can, e.g. "nchw" as {{MLInputOperandLayout/"nchw"}} if possible
  • link rank everywhere if you can, e.g. in a number between 0 and the rank of input

@inexorabletash
Copy link
Member

inexorabletash commented Aug 17, 2023

reshape() is defined as:

 MLOperand reshape(MLOperand input, sequence<unsigned long?> newShape);

The steps have:

If newShape is a scalar number, set outputShape to « 1 ».
Otherwise, if newShape is an array of unsigned long:

That won't apply - newShape will always be a list, where the items are unsigned long or null.

EDIT to add:

Same thing applies to transpose() and permutation; this definition:

  sequence<unsigned long> permutation;

...means this is unnecessary:

If options.permutation is not a sequence of unsigned long, then throw a TypeError and stop.

And constant() has:

 MLOperand constant(double value, optional MLOperandType type = "float32");

... which means neither of these are needed:

If value is not a number, then throw a TypeError and stop.
Otherwise, if type is not one of MLOperandType, then throw a TypeError and stop.

gru() is defined as:

  MLRecurrentNetworkDirection direction = "forward";
  MLGruWeightLayout layout = "zrn";

... which means these are not necessary:

If options.direction is not one of MLRecurrentNetworkDirection, then throw a TypeError and stop.
If options.layout is not one of MLGruWeightLayout, then throw a TypeError and stop.
... and ditto for gruCell().

There are similar unnecessary enum value checks throughout that can all be removed.

(The whole point of Web IDL is that these checks are taken care of for you automatically!)

@inexorabletash
Copy link
Member

In the "create an MLOperand", "copy an MLOperand", and "create an MLActivation" steps there are lines:

Let thing be a new object.

This should reference the specific type, e.g. "new MLOperand"

Right now, the term "object" is linking to FileAPI, which is probably not desired. It's worth looking at the Terms defined by reference section Bikeshed generates for you, to ensure you're getting the dependencies you expect (e.g. you probably aren't meaning to use CSS4's definition for "number")

inexorabletash added a commit to inexorabletash/webnn that referenced this issue Dec 19, 2023
* MLConvTranspose2dFilterOperandLayout
* MLConv2dFilterOperandLayout

For webmachinelearning#450
inexorabletash added a commit to inexorabletash/webnn that referenced this issue Dec 19, 2023
inexorabletash added a commit to inexorabletash/webnn that referenced this issue Dec 19, 2023
* MLDeviceType
* MLGruWeightLayout
* MLInterpolationMode
* MLLstmWeightLayout
* MLOperandDataType
* MLPaddingMode
* MLRecurrentNetworkDirection
* MLRoundingType

For webmachinelearning#450
inexorabletash added a commit to inexorabletash/webnn that referenced this issue Dec 19, 2023
inexorabletash added a commit to inexorabletash/webnn that referenced this issue Dec 19, 2023
Rather than `"opname"` just use "opname" consistently, which was more
common. These are internal identifiers, not exposed to script.

In the future it might be worth introducing a typographic convention
and/or just a markup/markdown style for op names, just to make it
easier to distinguish them.

For webmachinelearning#450
inexorabletash added a commit to inexorabletash/webnn that referenced this issue Dec 19, 2023
inexorabletash added a commit to inexorabletash/webnn that referenced this issue Dec 19, 2023
With Bikeshed and autolinks you can just write "result of xxxing"

For webmachinelearning#450
inexorabletash added a commit to inexorabletash/webnn that referenced this issue Dec 19, 2023
If Bikeshed [=autolinks=] are used instead of HTML <a>markup</a> then
Bikeshed can do a better job of consistency checking.

For webmachinelearning#450
@inexorabletash
Copy link
Member

I lied - I put up a PR. But I'm about to go out-of-office so I'll let the editors make use of it or wait until January.

Happy Solstice, everyone!

inexorabletash added a commit to inexorabletash/webnn that referenced this issue Jan 16, 2024
inexorabletash added a commit to inexorabletash/webnn that referenced this issue Jan 17, 2024
inexorabletash added a commit to inexorabletash/webnn that referenced this issue Jan 17, 2024
In algorithms that mint MLActivations and MLOperands, construct
the actual type not a generic "object". Also, use Infra wording
around optional paramters to those algorithms.

For webmachinelearning#450
inexorabletash added a commit to inexorabletash/webnn that referenced this issue Jan 17, 2024
inexorabletash added a commit to inexorabletash/webnn that referenced this issue Jan 18, 2024
"Let ... be ..." is reserved for declaring variables.

For webmachinelearning#450
inexorabletash added a commit to inexorabletash/webnn that referenced this issue Jan 18, 2024
In algorithms that mint MLActivations and MLOperands, construct
the actual type not a generic "object". Also, use Infra wording
around optional paramters to those algorithms.

For webmachinelearning#450
wchao1115 pushed a commit that referenced this issue Jan 20, 2024
In algorithms that mint MLActivations and MLOperands, construct
the actual type not a generic "object". Also, use Infra wording
around optional paramters to those algorithms.

For #450
@inexorabletash
Copy link
Member

inexorabletash commented Feb 17, 2024

My skim of this issue - looking to see if we can close it yet - turns up these potentially unresolved issues:

Otherwise I think @zolkis and I have tackled it all, and so we can close this out now or soon.

inexorabletash added a commit to inexorabletash/webnn that referenced this issue Feb 20, 2024
For operations defined in bulk - element-wise binary, unary, and
logical ops, plus pooling and reduction ops - make the styling
consistent:

- A single <div class=algorithm-steps> around all definitions
- A separate <div algorithm> around each op's steps.

This was already done for the element-wise ops so already the majority
style. Pooling ops now have just a single "Algorithm" watermark
(instead of 3), and reduction ops now have a watermark instead of
none.

For webmachinelearning#450
inexorabletash added a commit to inexorabletash/webnn that referenced this issue Feb 20, 2024
For operations defined in bulk - element-wise binary, unary, and
logical ops, plus pooling and reduction ops - make the styling
consistent:

- A single <div class=algorithm-steps> around all definitions
- A separate <div algorithm> around each op's steps.

This was already done for the element-wise ops so already the majority
style. Pooling ops now have just a single "Algorithm" watermark
(instead of 3), and reduction ops now have a watermark instead of
none.

For webmachinelearning#450
@inexorabletash
Copy link
Member

Closing out now that outstanding issues are tracked elsewhere.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants