Initial draft of outlining the larger problem space #4

Merged (12 commits) on Oct 13, 2020

@domenic commented Oct 5, 2020 (Collaborator)

The biggest uncertainty I have about this is how I've doubled down on prerendering, whereas Jeremy's original doc seems to want to treat prerendering and prefetching on almost-equal footing. Thoughts welcome.

I highly recommend reviewing this with the "Split" diff mode, or better yet looking at the rendered output, since the README.md diff is just a mess.

/cc @domfarolino @yoavweiss @kinu @ianvollick

@mfalken left a comment (Collaborator)

This is great, just had some non-blocking comments as I read it.


__Credentialed requests__ allow fetching the right resource for the user, but the web is increasingly moving toward a model whereby an origin cannot trigger credentialed requests to another user except as part of a committed action (i.e., a navigation). Forward-looking prerendering should prefer to use uncredentialed requests to reduce the risk of identifying the user when doing so is undesirable, and should also deny access to unpartitioned storage under those circumstances.
The tradeoff is that we now require opt-in from the referring page. The user agent cannot just heuristically prefetch or prerender any same-origin links that it sees; doing so would have bad consequences for links like `<a href="/logout">`.
Collaborator

I don't quite understand why opt-in is required for same-origin. If there's no opt-in, could it make sense to do automatic prerendering for a same-origin link and just use the same restrictions as cross-origin prerendering?

I guess we assume if the site bothered to put opt-in on the content response, it would have bothered to put opt-in on the referring page if it were worthwhile, so there's no need to try automatic prerendering?

Collaborator Author

I think you're right; we could do same-origin prerendering with only destination-side opt-in. (Would we also omit credentials? Hmm.)

Since this section is supposed to be comprehensive, I think I should mention that, even if we end up deciding to build something that sticks to the simpler rule I propose here. (I.e., sticks to triggering-page opt-in for same-origin, destination-page opt-in for cross-origin.)

Collaborator

Destination side opt-in only is a problem because you don't know if the GET will have side effects, which is especially acute in the credentialed (same-origin) case.

Collaborator Author

Why would a side-effecting page opt-in?

Collaborator

You cannot tell if a destination page has opted in until after you have already fetched it.

Collaborator Author

So we could do same-origin destination-side only opt-in as long as we applied the same restrictions that we do for cross-origin, i.e. no credentials/storage access/etc. Right? I'll push a commit mentioning that possibility, but saying that we're not currently pursuing it.

@kinu

I'm not sure that mentioning the same-origin destination-side opt-in case is important or necessary at this point; it may just confuse readers. One thing that's clear is that we'll start with a referrer-side API (either implicit or explicit).

@kinu

Ah I saw the new text below. I'm good with the current framing.

Collaborator

Right, but I'd prefer to err on the side of not sending a GET unless we have some sort of indication that it's not going to have side effects. Being uncredentialed lowers the bar, but I'm not sure I'm ready to lower the bar to zero.

Collaborator

The framing as an Aside looks good to me, thanks.

README.md Outdated

### Document policy
As in the last example, based on its heuristics or informed by prerender triggers, the browser can attempt to prerender these linked-to news articles. Since they are cross-origin, however, the process is more restricted. The initial fetch, as well as any subresource fetches, are performed without credentials. If the response that comes back does not have the opt-in, then the result is discarded.
Collaborator

Is opt-in needed if there were no credentials when the prerender started?

Collaborator Author

This is still unclear to me. There's talk of "adaptive prerendering" which I have a hard time understanding, and could use some help incorporating into this document.

In particular, a model that tries to prerender without opt-in runs into the following issues:

  • Does that break impression tracking?
  • What happens if the page tries to write to storage?
    • If it fails, then we're likely to break pages. So an opt-in would be better.
    • If it goes to partitioned storage, then we have to figure out how to join the partitioned storage and the unpartitioned storage after transition to a real top-level browsing context. Which seems basically impossible to do automatically. So an opt-in, plus custom page logic to handle the transition, makes more sense.

Collaborator

I'm also unsure how we want to reason about adaptive prerendering long-term, and what we can/cannot do without modification.

@kinu

@kinu, Oct 8, 2020

Note: in a world where everyone opts in, we don't need this.

My reasoning is that there shouldn't be a serious conflict as long as the prerender can proceed without requiring cookies (the site can still opt out and drop the request). For actions like prefetching, this feels even more plausible and valid. (One of the previous prefetch discussions linked from the opt-in explicitly mentions this case.)

No storage writes should happen: they will be deferred as much as possible, or, if synchronous access happens, the prerender can bail out.

The exception is first-party cookies: they will be written back and merged. This is possible (and we have code for it) because we're talking about no-cookies/no-storage cases.
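
For illustration only, the deferral and bail-out behavior described in this comment could be modelled roughly as follows. The class name `PrerenderStorageGate`, its methods, and the use of a `Map` as the backing store are all assumptions made for this sketch and do not come from the explainers. Cookie write-back and merging are not modelled.

```javascript
// Hypothetical model: storage access during a prerender that started without
// cookies or storage. None of these names come from the explainers.
class PrerenderStorageGate {
  constructor(backingStore) {
    this.backingStore = backingStore; // a Map standing in for real storage
    this.pending = [];                // writes deferred until activation
    this.state = 'prerendering';      // 'prerendering' | 'activated' | 'bailed-out'
  }

  // Asynchronous writes can be deferred, so queue them while prerendering.
  writeAsync(key, value) {
    if (this.state === 'activated') {
      this.backingStore.set(key, value);
    } else if (this.state === 'prerendering') {
      this.pending.push([key, value]);
    }
    // After a bail-out the prerender is gone, so the write is dropped.
  }

  // Synchronous access cannot be deferred: abandon the prerender instead.
  readSync(key) {
    if (this.state === 'activated') {
      return this.backingStore.get(key);
    }
    this.state = 'bailed-out';
    this.pending = [];
    return undefined;
  }

  // Called when the user actually navigates to the prerendered page.
  // Returns false if the gate is no longer in the 'prerendering' state.
  activate() {
    if (this.state !== 'prerendering') {
      return false;
    }
    this.state = 'activated';
    for (const [key, value] of this.pending) {
      this.backingStore.set(key, value);
    }
    this.pending = [];
    return true;
  }
}
```

In this model a page that only performs deferrable writes survives until activation, while a single synchronous read discards the prerender and its queued writes.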

@kinu

The adaptive mode is an exploration of what we can do without conflicting with what we want to be doing. There therefore shouldn't be much that needs to be incorporated explicitly for adaptive cases; the one exception is the first main resource response. If the UA has no cookies, the necessity of a response opt-in feels a bit slim.

@kinu

Sorry, I struggled to come up with possible alternative text (and the reasonable behavior). I think I'm also okay with dropping this eventually, but I'm not yet fully comfortable saying that "we require opt-ins for all responses, otherwise the UA does nothing".

What do we think of tweaking this as follows:

'... are performed without credentials. This means that the target site may get uncredentialed requests even when there are credentials stored in the UA. If the response that comes back does not have the opt-in in such cases, the result is discarded.'
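
As a rough sketch of the behavior this wording describes, the following models the cross-origin fetch with the Fetch API's `credentials: 'omit'` option and discards any response that lacks an opt-in. A user agent would do this internally rather than through script, so this is only an illustration. The header name `x-example-prerender-opt-in` and the function `speculativeCrossOriginFetch` are hypothetical; the actual opt-in mechanism is whatever opt-in.md defines.

```javascript
// Hypothetical opt-in signal, invented for this sketch. The real mechanism is
// described in opt-in.md and may not be a response header at all.
const OPT_IN_HEADER = 'x-example-prerender-opt-in';

// `fetchImpl` is injectable so the logic can be exercised outside a browser.
async function speculativeCrossOriginFetch(url, fetchImpl = globalThis.fetch) {
  // Send no cookies or HTTP authentication, even if the user agent has
  // credentials stored for the target site.
  const response = await fetchImpl(url, { credentials: 'omit' });

  // Without the opt-in, the result is discarded.
  if (!response.headers.has(OPT_IN_HEADER)) {
    return null;
  }
  return response;
}
```

The target site therefore sees an uncredentialed request either way; only opted-in responses are kept for a later activation.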

Collaborator

kinu's suggestion works for me.

Collaborator Author

Incorporated; thanks!

@@ -1,168 +1,156 @@
# Alternate Loading Modes
# Prerendering, revamped
Collaborator

Yeah, I'm uncertain how heavily we want to lean into prerendering specifically. About the first 3/5 of it is relevant to prefetching alone.

Collaborator

One option is to use a more general term like "speculation" instead, that captures dns-prefetch, preconnect, prefetch, and prerender. Obviously some pieces are prerender-specific. But obviously the more general a term we use the less clear it will be at first glance what we mean.

Collaborator

I also think the ladder of speculation is interesting. If there are too many links that would be okay to prerender and we're not sure, I think we should be free to prefetch the 3 most likely options instead, for instance.
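
One way to picture this fallback is the sketch below. The candidate shape `{ url, likelihood }`, the threshold value, and the function name `planSpeculation` are assumptions for illustration; only the idea of prefetching the three most likely links when no single link is a confident choice comes from the comment above.

```javascript
// Hypothetical input: each candidate is { url, likelihood } with likelihood in [0, 1].
function planSpeculation(candidates, { prerenderThreshold = 0.8, maxPrefetches = 3 } = {}) {
  // Rank a copy so the caller's array is left untouched.
  const ranked = [...candidates].sort((a, b) => b.likelihood - a.likelihood);
  const best = ranked[0];

  // Confident about one link: spend the resources on prerendering it.
  if (best !== undefined && best.likelihood >= prerenderThreshold) {
    return [{ url: best.url, action: 'prerender' }];
  }

  // Otherwise step down the ladder and prefetch the most likely few.
  return ranked.slice(0, maxPrefetches).map(({ url }) => ({ url, action: 'prefetch' }));
}
```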

Collaborator Author

My take was that we should focus this around prerendering, and say that we might capture some value for prefetching etc. along the way. But it's easier to paint a coherent picture as to why these all belong as part of the same effort, if that effort is prerendering.

@kinu

I'm also interested in how one mode for prerendering can fall back to another mode for prefetching, though I kind of agree that putting all these stories together around a concrete feature name (i.e., prerendering) seems to make this really tractable and understandable.

Collaborator Author

I've pushed a commit that expands the "Prefetching" section to explain this a bit more.



Since existing web pages are unlikely to behave well with these restrictions today, and it is impractical for user agents to distinguish such pages, we propose a lightweight way for a page to declare that it is prepared for this and will, if necessary, upgrade itself when it gains access to unpartitioned storage and other privileges.
* [**Prerendered content opt-in**](./opt-in.md), which allows pages to opt in to being prerendered by other cross-origin pages.
Collaborator

It does confuse the issue here a bit, but FWIW the reason the opt-in is described a little funny is to allow room for fenced frames to have a separate loading mode with different restrictions if they need that in the future.

Collaborator Author

Hmm, what do you mean? In particular, the opt-in description doesn't seem strange to me.

Collaborator

Mainly it's a little abstract because it wants to allow the possibility of other loading modes with restrictions other than those for prerendering, in the future.


Some text from Jeremy's doc:

First, consider the __fetch__ of the resource. User agents would ideally prefetch the content in a way that does not identify the user. For example, the user agent could:
Collaborator

I'm not sure about separating this so much from the discussion of what the consequences of this are. opt-in.md still discusses it a little, but in a way that might be too cursory to be useful to understand without clicking through.

Collaborator Author

The idea is that this will eventually grow to a more detailed explanation of the changes and how they flow through the system. As such, these seem like the kernel of such a detailed description, and I moved them here.

My intention for opt-in.md is that you're supposed to arrive at that document taking for granted that an opt-in is needed, and if you want to find out why, then you'll have to click through to fetch.md or maybe README.md. (Probably some of https://github.com/WICG/portals#privacy-threat-model-and-restrictions needs to make its way here...)


It might be possible for the user agent to augment author declarations with a list of origins or documents known to behave well, as a browser feature. This would depend on identifying such sites in a fairly reliable way and providing a mechanism for users to reload if they observe brokenness. However, such a mechanism would have limitations and would necessarily not behave the same in all browsers. The web platform should provide a way to more predictably get the desired behavior.
and then decorate certain `<a>` elements with `class="high-likelihood-prerender"`.
Collaborator

Noting, of course, that it would be totally legitimate to split off selector support into a followup, if we're worried about it. (My doc describes it because I do think it might be useful, and because it demonstrates extensibility.)

Collaborator Author

Yeah, in general I'm not sure about these "scenarios" sections. They're currently more detailed than the sub-documents they link to. Maybe when we have the full explainer, we can change this to reference a "potential feature", linking to an explanation of it as being an extension point.

@kinu left a comment

This is looking great. I'm still reading but pushing some comments


* send a request without credentials (e.g., no `Cookie` or `Authorization` request header)
* establish the connection from a different client IP address (e.g., using a proxy server or virtual private network, if available)
* use a previously fetched response, including one previously fetched by a third party if it can be authenticated
@kinu

The 'if it can be authenticated' part was not too clear to me; it might be easier to follow if a quick example were given.

Collaborator

This is a handwave about caching, including SXG caching.

README.md Outdated
* send a request without credentials (e.g., no `Cookie` or `Authorization` request header)
* establish the connection from a different client IP address (e.g., using a proxy server or virtual private network, if available)
* use a previously fetched response, including one previously fetched by a third party if it can be authenticated
This repository contains a set of explainers and (eventually) specifications which, combined, give a rigorous model for performing such prerendering and prefetching of content, in an interoperably-implementable way.
@kinu

Maybe it could also briefly mention that each part is designed to be composable, so that a different combination of parts could be used for other use cases (so that some of these can be used for fenced frames, etc.).


README.md Outdated

__Credentialed requests__ allow fetching the right resource for the user, but the web is increasingly moving toward a model whereby an origin cannot trigger credentialed requests to another user except as part of a committed action (i.e., a navigation). Forward-looking prerendering should prefer to use uncredentialed requests to reduce the risk of identifying the user when doing so is undesirable, and should also deny access to unpartitioned storage under those circumstances.
Although these explainers focus largely on prerendering, we expect some of the work they produce to be useful for _prefetching_ as well. Prefetching currently exists in [`<link rel="prefetch">`](https://w3c.github.io/resource-hints/#dfn-prefetch), but as with prerendering, it is underspecified, and its current implementations have potential privacy issues which will require some work to address.
@kinu

The large missing part in prefetch is cross-origin navigation (which is what I'm hoping this work can address or subsume), while same-origin subresource prefetches are in a relatively okay state (though there are still various underspecified parts there too, yes).

],
"disallow": [
{
"action": "prefetch",
@kinu

Can "action" have multiple actions? (Related to the resolved comment thread, but I'm a bit concerned about introducing ordering or inclusion relationships between multiple actions)

Collaborator

It could (though alternatively you could just repeat the whole rule).

I would really like to keep some form of ordering because I really think we should be able to prefetch but not prerender if we have network bandwidth available but don't want to waste CPU/memory, or if we're uncertain between various possible outgoing links. It's definitely possible to make this less footgun-y.
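
A minimal sketch of the two ideas in this comment, under stated assumptions: that `action` could be either a single string or an array (equivalent to repeating the whole rule), and that actions sit on an ordered ladder so a user agent can step down from prerender to prefetch. The ladder order follows the list mentioned earlier in the review (dns-prefetch, preconnect, prefetch, prerender). The function names and the `urls` field used in the usage note are hypothetical and not part of the proposed rule format.

```javascript
// Assumption: actions form a ladder from cheapest to most expensive.
const LADDER = ['dns-prefetch', 'preconnect', 'prefetch', 'prerender'];

// Assumption: `action` is one string or an array of strings. Expanding an
// array is equivalent to repeating the whole rule once per action.
function expandRule(rule) {
  const actions = Array.isArray(rule.action) ? rule.action : [rule.action];
  return actions.map((action) => ({ ...rule, action }));
}

// Pick the most expensive rung the user agent is currently willing to pay for.
function ceilingFor({ bandwidthAvailable, cpuAndMemoryAvailable }) {
  if (!bandwidthAvailable) {
    return 'preconnect';
  }
  return cpuAndMemoryAvailable ? 'prerender' : 'prefetch';
}

// Downgrade a requested action so that it never exceeds the ceiling.
function capAction(requested, ceiling) {
  const requestedRung = LADDER.indexOf(requested);
  const ceilingRung = LADDER.indexOf(ceiling);
  if (requestedRung === -1 || ceilingRung === -1) {
    throw new RangeError(`unknown action: ${requestedRung === -1 ? requested : ceiling}`);
  }
  return LADDER[Math.min(requestedRung, ceilingRung)];
}
```

For example, `expandRule({ action: ['prefetch', 'prerender'], urls: ['/next'] })` yields two single-action rules, and a prerender request is capped to a prefetch when bandwidth is available but CPU and memory are not.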

@kinu, Oct 9, 2020

Sure, though ordering is a hard problem in general. In my ideal mental model, prerendering can be decomposed into { connect, fetch-main, parse-html, fetch-sub, run-javascripts, render ... } or something similar, but that may be too detailed.
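
To make that decomposition concrete, here is a small sketch in which each action is treated as "run every stage up to and including a given one". The stage names are taken from the comment above; the mapping from actions to stages and the helper names are assumptions for illustration only.

```javascript
// Stage names come from the comment above; everything else is hypothetical.
const STAGES = ['connect', 'fetch-main', 'parse-html', 'fetch-sub', 'run-javascripts', 'render'];

// Assumption: each action stops after a particular stage.
const LAST_STAGE = new Map([
  ['preconnect', 'connect'],
  ['prefetch', 'fetch-main'],
  ['prerender', 'render'],
]);

// An action runs every stage up to and including its last stage.
function stagesFor(action) {
  const last = LAST_STAGE.get(action);
  if (last === undefined) {
    throw new RangeError(`unknown action: ${action}`);
  }
  return STAGES.slice(0, STAGES.indexOf(last) + 1);
}

// One action subsumes another if it runs all of the other's stages.
function subsumes(outer, inner) {
  const outerStages = stagesFor(outer);
  return stagesFor(inner).every((stage) => outerStages.includes(stage));
}
```

Under this model the inclusion relationship between actions falls out of the stage list instead of being declared separately, which is one possible way to keep ordering from becoming a footgun.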

Collaborator

Yeah, @domenic and I are discussing some of the options here in the internal doc (which will probably be published to this repo shortly).

@kinu commented Oct 13, 2020

Added one more suggestion / possible alternative text for the opt-in requirement, but otherwise this LGTM. Thanks!

README.md Outdated

The largest drawback here is one of adoption. In order to make as much content as possible available for uncredentialed prerendering, we would like to make it as easy as possible for authors to mark eligible content. We have heard from developers that many of them find it much easier to deploy changes that only affect content than changes which also require server behavior changes, even relatively straightforward ones. For example, these may be managed by different teams or not be possible at all. One key example here is that GitHub Pages doesn't allow users to set response headers.
As in the last example, based on its heuristics or informed by prerender triggers, the browser can attempt to prerender these linked-to news articles. Since they are cross-origin, however, the process is more restricted. The initial fetch, as well as any subresource fetches, are performed without credentials. If the response that comes back does not have the opt-in, then the result is discarded.

In practice, do we expect browsers to prerender all links and drop the ones that don't opt in? Or to use some magical heuristics to guess opt-ins before sending out requests?

Collaborator Author

That'd be up to the UA.

@mfalken left a comment (Collaborator)

Really lucid explanations, thanks for writing this up.


@@ -25,4 +25,4 @@ if (!document.loadingMode || document.loadingMode.type === 'default') {
}
```

Script can also observe this by using APIs particular to the behavior they are interested in. For instance, the [`document.hasStorageAccess()`][storage-access] can be used in supporting browsers to observe whether unpartitioned storage is available.
Script can also observe this by using APIs particular to the behavior they are interested in. For instance, the [`document.hasStorageAccess()`][https://github.com/privacycg/storage-access] API can be used in supporting browsers to observe whether unpartitioned storage is available.
Collaborator

nit: This broke because it was previously written as a reference link instead of an inline link (the bottom of the .md linked out). If you'd rather write this inline, I think the correct Markdown syntax is [link text](url) with parentheses.

Collaborator Author

Oops, thanks, fixed.

* deny scripted access to unpartitioned storage, such as cookies and IndexedDB
* deny permission to invoke `window.alert`, autoplay audio, and other APIs inappropriate at this time

JS API probably belongs here (maybe it should use page visibility API instead):
Collaborator

Yeah, possibly. I think the utility of a dedicated API basically depends on how many different kinds of modes there are that authors need to distinguish.

@domenic commented Oct 13, 2020 (Collaborator Author)

I don't have write access, so @jeremyroman, would you mind merging?

@jeremyroman jeremyroman merged commit 6a3a1bd into WICG:gh-pages Oct 13, 2020
5 participants