Initial draft of outlining the larger problem space #4

domenic · 2020-10-05T21:11:31Z

The biggest uncertainty I have about this is how I've doubled down on prerendering, whereas Jeremy's original doc seems to want to treat prerendering and prefetching on almost-equal footing. Thoughts welcome.

I highly recommend reviewing this with the "Split" diff mode, or better yet looking at the rendered output, since the README.md diff is just a mess.

/cc @domfarolino @yoavweiss @kinu @ianvollick

mfalken

This is great, just had some non-blocking comments as I read it.

README.md

mfalken · 2020-10-06T15:06:34Z

README.md


-__Credentialed requests__ allow fetching the right resource for the user, but the web is increasingly moving toward a model whereby an origin cannot trigger credentialed requests to another user except as part of a committed action (i.e., a navigation). Forward-looking prerendering should prefer to use uncredentialed requests to reduce the risk of identifying the user when doing so is undesirable, and should also deny access to unpartitioned storage under those circumstances.
+The tradeoff is that we now require opt-in from the referring page. The user agent cannot just heuristically prefetch or prerender any same-origin links that it sees; doing so would have bad consequences for links like `<a href="/logout">`.


I don't quite understand why opt-in is required for same-origin. If there's no opt-in, could it make sense to do automatic prerendering for a same-origin link and just use the same restrictions as cross-origin prerendering?

I guess we assume if the site bothered to put opt-in on the content response, it would have bothered to put opt-in on the referring page if it were worthwhile, so there's no need to try automatic prerendering?

I think you're right; we could do same-origin prerendering with only destination-side opt-in. (Would we also omit credentials? Hmm.)

Since this section is supposed to be comprehensive, I think I should mention that, even if we end up deciding to build something that sticks to the simpler rule I propose here. (I.e., sticks to triggering-page opt-in for same-origin, destination-page opt-in for cross-origin.)

Destination side opt-in only is a problem because you don't know if the GET will have side effects, which is especially acute in the credentialed (same-origin) case.

Why would a side-effecting page opt-in?

You cannot tell if a destination page has opted in until after you have already fetched it.

So we could do same-origin destination-side only opt-in as long as we applied the same restrictions that we do for cross-origin, i.e. no credentials/storage access/etc. Right? I'll push a commit mentioning that possibility, but saying that we're not currently pursuing it.

I'm not sure whether mentioning the same-origin destination-side opt-in case is important or necessary at this point; it may just confuse readers. One thing that's clear is that we'll start with a referrer-side API (either implicit or explicit).

Ah I saw the new text below. I'm good with the current framing.

Right, but I'd prefer to err on the side of not sending a GET unless we have some sort of indication that it's not going to have side effects. Being uncredentialed lowers the bar, but I'm not sure I'm ready to lower the bar to zero.

The framing as an Aside looks good to me, thanks.
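For illustration, here is a minimal sketch of the simpler rule discussed in this thread: triggering-page opt-in for same-origin targets, destination-page opt-in for cross-origin targets. The function name, parameter names, and return shape are all hypothetical and do not come from the explainers.

```javascript
// Hypothetical helper; nothing here is a proposed API.
// Decides whether a speculative GET may be sent, and under what restrictions.
function speculationPolicy({ referrerOrigin, targetUrl, referrerOptedIn }) {
  const sameOrigin = new URL(targetUrl).origin === referrerOrigin;

  if (sameOrigin) {
    // Same-origin fetches are credentialed, so a GET such as /logout could
    // have side effects. Nothing is sent unless the triggering page opted in.
    return referrerOptedIn
      ? { fetch: true, credentials: "include", needsDestinationOptIn: false }
      : { fetch: false };
  }

  // Cross-origin: fetch without credentials, and keep the result only if the
  // response turns out to carry the destination-side opt-in.
  return { fetch: true, credentials: "omit", needsDestinationOptIn: true };
}
```

The same-origin branch returns before any request is made, which reflects the point above that a destination-side opt-in can only be observed after the fetch has already happened.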

mfalken · 2020-10-06T15:12:52Z

README.md


-### Document policy
+As in the last example, based on its heuristics or informed by prerender triggers, the browser can attempt to prerender these linked-to news articles. Since they are cross-origin, however, the process is more restricted. The initial fetch, as well as any subresource fetches, are performed without credentials. If the response that comes back does not have the opt-in, then the result is discarded.


Is opt-in needed if there were no credentials when the prerender started?

This is still unclear to me. There's talk of "adaptive prerendering" which I have a hard time understanding, and could use some help incorporating into this document.

In particular, a model that tries to prerender without opt-in runs into the following issues:

Does that break impression tracking?

What happens if the page tries to write to storage?

If it fails, then we're likely to break pages. So an opt-in would be better.

If it goes to partitioned storage, then we have to figure out how to join the partitioned storage and the unpartitioned storage after transition to a real top-level browsing context. Which seems basically impossible to do automatically. So an opt-in, plus custom page logic to handle the transition, makes more sense.

I'm also unsure how we want to reason about adaptive prerendering long-term, and what we can/cannot do without modification.

@kinu

Note, in a world where everyone opts-in we don't need this.

My reasoning is that it shouldn't cause a serious conflict as long as it can proceed without requiring cookies (while the site can still opt out and drop the request). For actions like prefetching this feels even more plausible and valid. (One of the previous prefetch discussions linked from the opt-in explicitly mentions this case.)

No storage writes should happen, they will be deferred as much as possible, or if sync access happens the prerender can bail out.

The exception is 1p cookies: they will be written back and merged, and this is possible (and we have code) because we're talking about no-cookies/no-storage cases.

Adaptive prerendering is an exploration of what we can do without conflicting with what we want to be doing. Therefore there shouldn't be much that needs to be incorporated explicitly for adaptive cases; the one exception is the first main resource response. If the UA has no cookies, it feels like the necessity of a response opt-in is a bit slim.

Sorry, I had to struggle to come up with a possible alternative text (and the reasonable behavior). I think I'm also okay to drop this eventually, but not yet fully comfortable saying that "we require opt-ins for all responses, otherwise UA does nothing".

What do we think about tweaking this as follows:

'... are performed without credentials. This means that the target site may get uncredentialed requests even when there are credentials stored in UA. If the response that comes back does not have the opt-in in such cases, the result is discarded.'

kinu's suggestion works for me.

Incorporated; thanks!
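As a rough sketch of the behavior in the agreed wording: the request is built without credentials even if the UA holds credentials for the target, and the result is discarded when the response lacks the opt-in. The header name below is a made-up placeholder (the real opt-in mechanism is described in opt-in.md), and both function names are hypothetical.

```javascript
// Placeholder name only; the actual opt-in signal is defined elsewhere.
const HYPOTHETICAL_OPT_IN_HEADER = "x-hypothetical-prerender-opt-in";

// Options for the initial cross-origin fetch. `credentials: "omit"` is the
// standard fetch() option that suppresses cookies and HTTP authentication.
function buildPrerenderRequestInit() {
  return { method: "GET", credentials: "omit" };
}

// `responseHeaders` is any object with a has(name) method, e.g. a Headers
// instance or a Map keyed by lower-case header names.
function keepPrerenderedResponse(responseHeaders) {
  // No opt-in on the response means the prerender result is thrown away.
  return responseHeaders.has(HYPOTHETICAL_OPT_IN_HEADER);
}
```

In a browser-like environment this would be used roughly as `fetch(url, buildPrerenderRequestInit())` followed by `keepPrerenderedResponse(response.headers)`.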

browsing-context.md

jeremyroman · 2020-10-06T20:53:56Z

README.md

@@ -1,168 +1,156 @@
-# Alternate Loading Modes
+# Prerendering, revamped


Yeah, I'm uncertain how heavily we want to lean into prerendering specifically. About the first 3/5 of it is relevant to prefetching alone.

One option is to use a more general term like "speculation" instead, that captures dns-prefetch, preconnect, prefetch, and prerender. Obviously some pieces are prerender-specific. But obviously the more general a term we use the less clear it will be at first glance what we mean.

I also think the ladder of speculation is interesting. If there are too many links that would be okay to prerender and we're not sure, I think we should be free to prefetch the 3 most likely options instead, for instance.

My take was that we should focus this around prerendering, and say that we might capture some value for prefetching etc. along the way. But it's easier to paint a coherent picture as to why these all belong as part of the same effort, if that effort is prerendering.

I'm also interested in how one mode for prerendering can fall back to another mode for prefetching, while I kinda agree that putting all these stories together around a concrete feature name (i.e. prerendering) seems to make this really tractable and understandable.

I've pushed a commit that expands the "Prefetching" section to explain this a bit more.

jeremyroman · 2020-10-06T21:07:29Z

README.md


-__Credentialed requests__ allow fetching the right resource for the user, but the web is increasingly moving toward a model whereby an origin cannot trigger credentialed requests to another user except as part of a committed action (i.e., a navigation). Forward-looking prerendering should prefer to use uncredentialed requests to reduce the risk of identifying the user when doing so is undesirable, and should also deny access to unpartitioned storage under those circumstances.
+The tradeoff is that we now require opt-in from the referring page. The user agent cannot just heuristically prefetch or prerender any same-origin links that it sees; doing so would have bad consequences for links like `<a href="/logout">`.


You cannot tell if a destination page has opted in until after you have already fetched it.

jeremyroman · 2020-10-06T21:08:56Z

README.md


-Since existing web pages are unlikely to behave well with these restrictions today, and it is impractical for user agents to distinguish such pages, we propose a lightweight way for a page to declare that it is prepared for this and will, if necessary, upgrade itself when it gains access to unpartitioned storage and other privileges.
+* [**Prerendered content opt-in**](./opt-in.md), which allows pages to opt in to being prerendered by other cross-origin pages.


It does confuse the issue here a bit, but FWIW the reason the opt-in is described a little funny is to allow room for fenced frames to have a separate loading mode with different restrictions if they need that in the future.

Hmm, what do you mean? In particular, the opt-in description doesn't seem strange to me.

Mainly it's a little abstract because it wants to allow the possibility of other loading modes with restrictions other than those for prerendering, in the future.

jeremyroman · 2020-10-06T21:14:54Z

fetch.md

+
+Some text from Jeremy's doc:
+
+First, consider the __fetch__ of the resource. User agents would ideally prefetch the content in a way that does not identify the user. For example, the user agent could:


I'm not sure about separating this so much from the discussion of what the consequences of this are. opt-in.md still discusses it a little, but in a way that might be too cursory to be useful to understand without clicking through.

The idea is that this will eventually grow to a more detailed explanation of the changes and how they flow through the system. As such, these seem like the kernel of such a detailed description, and I moved them here.

My intention for opt-in.md is that you're supposed to arrive at that document taking for granted that an opt-in is needed, and if you want to find out why, then you'll have to click through to fetch.md or maybe README.md. (Probably some of https://github.com/WICG/portals#privacy-threat-model-and-restrictions needs to make its way here...)

README.md

jeremyroman · 2020-10-06T21:20:33Z

README.md


-It might be possible for the user agent to augment author declarations with a list of origins or documents known to behave well, as a browser feature. This would depend on identifying such sites in a fairly reliable way and providing a mechanism for users to reload if they observe brokenness. However, such a mechanism would have limitations and would necessarily not behave the same in all browsers. The web platform should provide a way to more predictably get the desired behavior.
+and then decorate certain `<a>` elements with `class="high-likelihood-prerender"`.


Noting, of course, that it would be totally legitimate to split off selector support into a followup, if we're worried about it. (My doc describes it because I do think it might be useful, and because it demonstrates extensibility.)

Yeah, in general I'm not sure about these "scenarios" sections. They're currently more detailed than the sub-documents they link to. Maybe when we have the full explainer, we can change this to reference a "potential feature", linking to an explanation of it as being an extension point.

jeremyroman · 2020-10-06T21:22:10Z

README.md


-### Document policy
+As in the last example, based on its heuristics or informed by prerender triggers, the browser can attempt to prerender these linked-to news articles. Since they are cross-origin, however, the process is more restricted. The initial fetch, as well as any subresource fetches, are performed without credentials. If the response that comes back does not have the opt-in, then the result is discarded.


I'm also unsure how we want to reason about adaptive prerendering long-term, and what we can/cannot do without modification.

@kinu

README.md

opt-in.md

kinu

This is looking great. I'm still reading but pushing some comments

kinu · 2020-10-07T13:59:48Z

fetch.md

+
+* send a request without credentials (e.g., no `Cookie` or `Authorization` request header)
+* establish the connection from a different client IP address (e.g., using a proxy server or virtual private network, if available)
+* use a previously fetched response, including one previously fetched by a third party if it can be authenticated


'if it can be authenticated' this part was not too clear to me, might be easier to follow if some quick example is given

This is a handwave about caching, including SXG caching.

kinu · 2020-10-07T14:52:25Z

README.md

@@ -1,168 +1,156 @@
-# Alternate Loading Modes
+# Prerendering, revamped


I'm also interested in how one mode for prerendering can fall back to another mode for prefetching, while I kinda agree that putting all these stories together around a concrete feature name (i.e. prerendering) seems to make this really tractable and understandable.

kinu · 2020-10-07T14:58:57Z

README.md

-* send a request without credentials (e.g., no `Cookie` or `Authorization` request header)
-* establish the connection from a different client IP address (e.g., using a proxy server or virtual private network, if available)
-* use a previously fetched response, including one previously fetched by a third party if it can be authenticated
+This repository contains a set of explainers and (eventually) specifications which, combined, give a rigorous model for performing such prerendering and prefetching of content, in an interoperably-implementable way.


Maybe it can also briefly mention that each part is designed to be composable, so that a different combination of parts can be used for other use cases, or something like that (so that some of these can be used for fenced frames etc.).

kinu · 2020-10-08T03:03:08Z

README.md


-__Credentialed requests__ allow fetching the right resource for the user, but the web is increasingly moving toward a model whereby an origin cannot trigger credentialed requests to another user except as part of a committed action (i.e., a navigation). Forward-looking prerendering should prefer to use uncredentialed requests to reduce the risk of identifying the user when doing so is undesirable, and should also deny access to unpartitioned storage under those circumstances.
+The tradeoff is that we now require opt-in from the referring page. The user agent cannot just heuristically prefetch or prerender any same-origin links that it sees; doing so would have bad consequences for links like `<a href="/logout">`.


I'm not sure whether mentioning the same-origin destination-side opt-in case is important or necessary at this point; it may just confuse readers. One thing that's clear is that we'll start with a referrer-side API (either implicit or explicit).

kinu · 2020-10-08T03:06:33Z

README.md


-__Credentialed requests__ allow fetching the right resource for the user, but the web is increasingly moving toward a model whereby an origin cannot trigger credentialed requests to another user except as part of a committed action (i.e., a navigation). Forward-looking prerendering should prefer to use uncredentialed requests to reduce the risk of identifying the user when doing so is undesirable, and should also deny access to unpartitioned storage under those circumstances.
+The tradeoff is that we now require opt-in from the referring page. The user agent cannot just heuristically prefetch or prerender any same-origin links that it sees; doing so would have bad consequences for links like `<a href="/logout">`.


Ah I saw the new text below. I'm good with the current framing.

kinu · 2020-10-08T03:08:08Z

README.md


-__Credentialed requests__ allow fetching the right resource for the user, but the web is increasingly moving toward a model whereby an origin cannot trigger credentialed requests to another user except as part of a committed action (i.e., a navigation). Forward-looking prerendering should prefer to use uncredentialed requests to reduce the risk of identifying the user when doing so is undesirable, and should also deny access to unpartitioned storage under those circumstances.
+Although these explainers focus largely on prerendering, we expect some of the work they produce to be useful for _prefetching_ as well. Prefetching currently exists in [`<link rel="prefetch">`](https://w3c.github.io/resource-hints/#dfn-prefetch), but as with prerendering, it is underspecified, and its current implementations have potential privacy issues which will require some work to address.


The large missing part in prefetch is cross-origin navigation (which is what I'm hoping this one can just address/subsume), while same-origin subresource prefetches are in a relatively okay state (though there are still various underspecified parts there too, yes).

kinu · 2020-10-08T03:34:09Z

README.md

+  ],
+  "disallow": [
+    {
+      "action": "prefetch",


Can "action" have multiple actions? (Related to the resolved comment thread, but I'm a bit concerned about introducing ordering or inclusion relationships between multiple actions)

It could (though alternatively you could just repeat the whole rule).

I would really like to keep some form of ordering because I really think we should be able to prefetch but not prerender if we have network bandwidth available but don't want to waste CPU/memory, or if we're uncertain between various possible outgoing links. It's definitely possible to make this less footgun-y.

Sure, while ordering is a hard problem in general. In my ideal mental model prerendering can be decomposed into { connect, fetch-main, parse-html, fetch-sub, run-javascripts, render ... } or something, but it may be too detailed.

Yeah, @domenic and I are discussing some of the options here in the internal doc (which will probably be published to this repo shortly).
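To make the ordering question concrete, here is one possible sketch of a two-rung "ladder" in which a rule's `action` may be a single string or a list, and the UA falls back from prerender to prefetch when it does not want to spend CPU/memory. The ladder order, the `cpuAndMemoryAvailable` flag, and the function names are assumptions for illustration; the actual rule semantics are still under discussion.

```javascript
// Rungs from cheapest to most expensive. The ordering is an assumption.
const LADDER = ["prefetch", "prerender"];

// Accept either "prefetch" or ["prefetch", "prerender"].
function normalizeActions(action) {
  return Array.isArray(action) ? action : [action];
}

// Pick the most expensive action that the rule lists and the UA can afford.
function chooseAction(rule, { cpuAndMemoryAvailable }) {
  const allowed = normalizeActions(rule.action);
  const ceiling = LADDER.indexOf(cpuAndMemoryAvailable ? "prerender" : "prefetch");
  for (let i = ceiling; i >= 0; i--) {
    if (allowed.includes(LADDER[i])) return LADDER[i];
  }
  return null; // Nothing the rule lists is affordable right now.
}
```

Note that this sketch does not treat "prerender" as implying "prefetch": a rule listing only `"prerender"` yields no action when resources are constrained, which avoids an implicit inclusion relationship between actions.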

kinu · 2020-10-13T10:59:17Z

Added one more suggestion / possible alternative text for the opt-in requirement, but otherwise this LGTM. Thanks!

README.md

yoavweiss · 2020-10-13T10:10:29Z

README.md


-The largest drawback here is one of adoption. In order to make as much content as possible available for uncredentialed prerendering, we would like to make it as easy as possible for authors to mark eligible content. We have heard from developers that many of them find it much easier to deploy changes that only affect content than changes which also require server behavior changes, even relatively straightforward ones. For example, these may be managed by different teams or not be possible at all. One key example here is that GitHub Pages doesn't allow users to set response headers.
+As in the last example, based on its heuristics or informed by prerender triggers, the browser can attempt to prerender these linked-to news articles. Since they are cross-origin, however, the process is more restricted. The initial fetch, as well as any subresource fetches, are performed without credentials. If the response that comes back does not have the opt-in, then the result is discarded.


In practice, we expect browsers to prerender all links and drop the ones that don't opt-in? Or use some magical heuristics to guess opt-ins ahead of sending out requests?

That'd be up to the UA.

opt-in.md

mfalken

Really lucid explanations, thanks for writing this up.

mfalken · 2020-10-13T13:44:54Z

README.md


-__Credentialed requests__ allow fetching the right resource for the user, but the web is increasingly moving toward a model whereby an origin cannot trigger credentialed requests to another user except as part of a committed action (i.e., a navigation). Forward-looking prerendering should prefer to use uncredentialed requests to reduce the risk of identifying the user when doing so is undesirable, and should also deny access to unpartitioned storage under those circumstances.
+The tradeoff is that we now require opt-in from the referring page. The user agent cannot just heuristically prefetch or prerender any same-origin links that it sees; doing so would have bad consequences for links like `<a href="/logout">`.


The framing as an Aside looks good to me, thanks.

README.md

mfalken · 2020-10-13T14:06:24Z

README.md


-### Document policy
+As in the last example, based on its heuristics or informed by prerender triggers, the browser can attempt to prerender these linked-to news articles. Since they are cross-origin, however, the process is more restricted. The initial fetch, as well as any subresource fetches, are performed without credentials. If the response that comes back does not have the opt-in, then the result is discarded.


kinu's suggestion works for me.

browsing-context.md

jeremyroman · 2020-10-13T17:56:50Z

browsing-context.md

@@ -25,4 +25,4 @@ if (!document.loadingMode || document.loadingMode.type === 'default') {
 }
 ```

-Script can also observe this by using APIs particular to the behavior they are interested in. For instance, the [`document.hasStorageAccess()`][storage-access] can be used in supporting browsers to observe whether unpartitioned storage is available.
+Script can also observe this by using APIs particular to the behavior they are interested in. For instance, the [`document.hasStorageAccess()`][https://github.com/privacycg/storage-access] API can be used in supporting browsers to observe whether unpartitioned storage is available.


nit: This broke because it was previously written as a reference link instead of an inline link (the bottom of the .md linked out). If you'd rather write this inline, I think the correct Markdown syntax is [link text](url) with parentheses.

Oops, thanks, fixed.

jeremyroman · 2020-10-13T17:57:59Z

browsing-context.md

+* deny scripted access to unpartitioned storage, such as cookies and IndexedDB
+* deny permission to invoke `window.alert`, autoplay audio, and other APIs inappropriate at this time
+
+JS API probably belongs here (maybe it should use page visibility API instead):


Yeah, possibly. I think the utility of a dedicated API basically depends on how many different kinds of modes there are that authors need to distinguish.
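As a sketch of the two observation styles mentioned here, the helper below reads the proposed `document.loadingMode` (taken from the snippet in browsing-context.md, not a shipped API) and, separately, asks the Storage Access API whether unpartitioned storage is available. The helper names and the fallback values for browsers lacking either API are assumptions.

```javascript
// Reads the proposed loading mode; treats a missing API as "default".
function loadingModeType(doc) {
  return doc.loadingMode ? doc.loadingMode.type : "default";
}

// Combines the dedicated API with a behavior-specific check.
// `doc` is passed in so the helper can be exercised outside a browser.
async function describeLoadingState(doc) {
  // document.hasStorageAccess() returns a Promise<boolean> where supported.
  // Assumption: if the API is absent, storage is treated as available.
  const unpartitionedStorage =
    typeof doc.hasStorageAccess === "function"
      ? await doc.hasStorageAccess()
      : true;
  return { mode: loadingModeType(doc), unpartitionedStorage };
}
```

In a page this would be called as `describeLoadingState(document)`. If authors only ever need the storage check, the dedicated mode API adds little, which is the trade-off raised above.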

domenic · 2020-10-13T19:52:29Z

I don't have write access, so @jeremyroman, would you mind merging?

Initial draft of outlining the larger problem space

30798e4

mfalken reviewed Oct 6, 2020

jeremyroman reviewed Oct 6, 2020

domenic added 5 commits October 7, 2020 14:30

Fix typos

5a3acdd

Summarize cross- vs. same- better, and explain potential mixing

3bd0781

Fix disallow to be prefetch

9ab085c

Show use of storage access API instead

8e2e546

Expand on prefetching

ad2eccd

kinu reviewed Oct 8, 2020

domenic added 2 commits October 12, 2020 13:51

Some additions from kinu's review

de1c0ec

Example tweaks

c9902f1

yoavweiss reviewed Oct 13, 2020

mfalken approved these changes Oct 13, 2020

domenic added 3 commits October 13, 2020 10:48

Yoav's review

b659ded

Matt's review

dee2a29

Kinu's credentials suggestion

8257906

jeremyroman approved these changes Oct 13, 2020

Oops, fixed link syntax

ffdcade

jeremyroman merged commit 6a3a1bd into WICG:gh-pages Oct 13, 2020


		__Credentialed requests__ allow fetching the right resource for the user, but the web is increasingly moving toward a model whereby an origin cannot trigger credentialed requests to another user except as part of a committed action (i.e., a navigation). Forward-looking prerendering should prefer to use uncredentialed requests to reduce the risk of identifying the user when doing so is undesirable, and should also deny access to unpartitioned storage under those circumstances.
		The tradeoff is that we now require opt-in from the referring page. The user agent cannot just heuristically prefetch or prerender any same-origin links that it sees; doing so would have bad consequences for links like `<a href="/logout">`.


		### Document policy
		As in the last example, based on its heuristics or informed by prerender triggers, the browser can attempt to prerender these linked-to news articles. Since they are cross-origin, however, the process is more restricted. The initial fetch, as well as any subresource fetches, are performed without credentials. If the response that comes back does not have the opt-in, then the result is discarded.

		@@ -1,168 +1,156 @@
		# Alternate Loading Modes
		# Prerendering, revamped


		Since existing web pages are unlikely to behave well with these restrictions today, and it is impractical for user agents to distinguish such pages, we propose a lightweight way for a page to declare that it is prepared for this and will, if necessary, upgrade itself when it gains access to unpartitioned storage and other privileges.
		* [Prerendered content opt-in](./opt-in.md), which allows pages to opt in to being prerendered by other cross-origin pages.


		Some text from Jeremy's doc:

		First, consider the __fetch__ of the resource. User agents would ideally prefetch the content in a way that does not identify the user. For example, the user agent could:


		It might be possible for the user agent to augment author declarations with a list of origins or documents known to behave well, as a browser feature. This would depend on identifying such sites in a fairly reliable way and providing a mechanism for users to reload if they observe brokenness. However, such a mechanism would have limitations and would necessarily not behave the same in all browsers. The web platform should provide a way to more predictably get the desired behavior.
		and then decorate certain `<a>` elements with `class="high-likelihood-prerender"`.


		__Credentialed requests__ allow fetching the right resource for the user, but the web is increasingly moving toward a model whereby an origin cannot trigger credentialed requests to another user except as part of a committed action (i.e., a navigation). Forward-looking prerendering should prefer to use uncredentialed requests to reduce the risk of identifying the user when doing so is undesirable, and should also deny access to unpartitioned storage under those circumstances.
		Although these explainers focus largely on prerendering, we expect some of the work they produce to be useful for _prefetching_ as well. Prefetching currently exists in [`<link rel="prefetch">`](https://w3c.github.io/resource-hints/#dfn-prefetch), but as with prerendering, it is underspecified, and its current implementations have potential privacy issues which will require some work to address.


		The largest drawback here is one of adoption. In order to make as much content as possible available for uncredentialed prerendering, we would like to make it as easy as possible for authors to mark eligible content. We have heard from developers that many of them find it much easier to deploy changes that only affect content than changes which also require server behavior changes, even relatively straightforward ones. For example, these may be managed by different teams or not be possible at all. One key example here is that GitHub Pages doesn't allow users to set response headers.
		As in the last example, based on its heuristics or informed by prerender triggers, the browser can attempt to prerender these linked-to news articles. Since they are cross-origin, however, the process is more restricted. The initial fetch, as well as any subresource fetches, are performed without credentials. If the response that comes back does not have the opt-in, then the result is discarded.
