Require reload after granting requestStorageAccess to get at unpartitioned storage #62

Closed
jkarlin opened this issue Sep 11, 2020 · 14 comments

@jkarlin

jkarlin commented Sep 11, 2020

I don't see a clean path from partitioned storage to unpartitioned storage within a single navigation. I'd like to propose that we don't allow it, and instead force the frame to reload to get its unpartitioned storage post grant. This leads to a simpler and safer browser implementation, and a cleaner developer mental model.

While a reload may not be ideal from the third party's UX perspective, it should be a rare event since I believe all of the browsers that support requestStorageAccess preserve the access for some number of days. And there could be some ways to improve the UX (e.g., load the unpartitioned frame behind the first until it's ready).

@johnwilander @hober @annevk Thoughts?

@johnwilander
Collaborator

johnwilander commented Sep 11, 2020

We (WebKit) have actually talked about it in the context of site isolation. Given that granted storage access can move embedded content from unauthenticated state to authenticated state, it could serve as a trigger for process isolation in cases where process isolation for all embedded content is deemed too costly. Reload would be a must if switching process.

However, we need to acknowledge that storage access is granted per-page which means all matching iframes on that page would need to be reloaded on granted storage access.

@johnwilander added the agenda+F2F label (Request to add this issue or PR to the agenda for our upcoming F2F) Sep 11, 2020
@annevk
Collaborator

annevk commented Sep 14, 2020

We have this transition implemented without reloading in Firefox and it works. We don't have a corresponding theoretical model written down yet though (see Storage Standard issues for work in progress) and are pretty flexible on changing the details. Requiring reloads does not seem like it would give a good user experience, and it is not something we would want to be stuck with long term.

(As for site isolation, as sketched in https://bugzilla.mozilla.org/show_bug.cgi?id=1646047 I don't foresee issues there.)

cc @bakulf @artines1

@jkarlin
Author

jkarlin commented Sep 15, 2020

I like the simplicity of the reload model. And note that you can still have the smooth UX without a reload. For instance, you could create a new hidden frame that has unpartitioned storage access and get your data from that. I could even imagine someone making an async js library to make that relatively straight-forward:

e.g.,

storage = new UnpartitionedStorage();
data = await storage.localStorage.getItem('foo');

With this approach it's abundantly clear to the developer where the data is coming from, unlike in the case where we transition from partitioned to unpartitioned. And we don't have a weird transition from subresource request headers suddenly changing their cookies.
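A minimal sketch of what such a library might look like, assuming a helper frame that answers requests from its own storage. The `UnpartitionedStorage` class, the message shape, and the `transport` interface (`send` / `onMessage`) are all invented for illustration; none of this is an existing or proposed API. In a browser the transport could wrap a `MessagePort` connected to a hidden same-origin iframe.

```javascript
// Hypothetical sketch: an async facade over storage that lives in another
// frame. `transport` is an assumed abstraction with send(message) and
// onMessage(handler).
class UnpartitionedStorage {
  constructor(transport) {
    this.transport = transport;
    this.nextId = 1;
    this.pending = new Map(); // request id -> resolve callback
    transport.onMessage((msg) => {
      const resolve = this.pending.get(msg.id);
      if (resolve) {
        this.pending.delete(msg.id);
        resolve(msg.value);
      }
    });
    // Same call shape as storage.localStorage.getItem('foo').
    this.localStorage = {
      getItem: (key) => this.request({ op: 'getItem', key }),
      setItem: (key, value) => this.request({ op: 'setItem', key, value }),
    };
  }

  // Send one request and resolve when the matching reply arrives.
  request(body) {
    return new Promise((resolve) => {
      const id = this.nextId++;
      this.pending.set(id, resolve);
      this.transport.send({ id, ...body });
    });
  }
}

// Code that would run inside the helper frame: answer each request from a
// Storage-like object (window.localStorage in a real frame).
function serveStorage(transport, storage) {
  transport.onMessage((msg) => {
    let value = null;
    if (msg.op === 'getItem') value = storage.getItem(msg.key);
    if (msg.op === 'setItem') storage.setItem(msg.key, msg.value);
    transport.send({ id: msg.id, value });
  });
}
```

Because every read goes through an explicit object, the calling code always knows which bucket it is talking to, which is the property the comment argues for.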

@johnwilander
Collaborator

> I like the simplicity of the reload model. And note that you can still have the smooth UX without a reload. For instance, you could create a new hidden frame that has unpartitioned storage access and get your data from that.

With the per-page storage access Mozilla and Microsoft convinced us of, there will be no simultaneous access to partitioned and unpartitioned storage, unless you’re thinking of different windows talking to each other.

> I could even imagine someone making an async js library to make that relatively straight-forward:
>
> e.g.,
>
> storage = new UnpartitionedStorage();
> data = await storage.localStorage.getItem('foo');
>
> With this approach it's abundantly clear to the developer where the data is coming from, unlike in the case where we transition from partitioned to unpartitioned. And we don't have a weird transition from subresource request headers suddenly changing their cookies.

@jkarlin
Author

jkarlin commented Sep 15, 2020

Ah, that's interesting. I was unaware of that. Hmm. So the entire top frame would have to reload in order for the third party frame to get its unpartitioned storage? Because otherwise, if there are multiple frames for the same third party they'd each need to be reloaded simultaneously. That seems quite bad.

@annevk
Collaborator

annevk commented Sep 21, 2020

I think there are basically two questions here:

  1. What can we compatibly deploy?
  2. What's the long term story for web developers?

To 1, a concern was raised at the meeting about A embedding multiple Bs that are unaware of each other: if one of the Bs getting non-partitioned storage implies that the others get it too, that might be problematic, because an unaware B is suddenly doing storage operations against a different backend.

A different scenario might be A embedding a single B that itself embeds another B. In that case the Bs are likely aware of each other and might not realize what global is in charge when doing storage operations.

And again, we're trying this out on Firefox Nightly and haven't run into significant issues thus far as I understand it.

cc @englehardt @artines1

@annevk removed the agenda+F2F label (Request to add this issue or PR to the agenda for our upcoming F2F) Sep 21, 2020
@annevk
Collaborator

annevk commented Sep 24, 2020

We discussed this more internally and remain unconvinced that a change in approach is needed here:

  1. No reported problems against Firefox Nightly about this behavior.
  2. "Unrelated" same-origin documents mutating storage can already cause problems for each other, which is why they have to coordinate such actions, e.g., through the storage event.
  3. In the cases under discussion these documents are not really unrelated as they have direct script access to each other.
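For reference, point 2 relies on the existing `storage` event, which fires in other same-origin documents when `localStorage` changes. A minimal sketch of that kind of coordination follows; the helper name `watchKey` and the injected `target` parameter are illustrative only (in a page, `target` would be `window`).

```javascript
// Sketch: react to writes made by other same-origin documents.
function watchKey(target, key, onChange) {
  const listener = (event) => {
    // event.key is null when the other document called storage.clear().
    if (event.key === key || event.key === null) {
      onChange(event.newValue);
    }
  };
  target.addEventListener('storage', listener);
  // Return a function that stops watching.
  return () => target.removeEventListener('storage', listener);
}
```

Note that this only tells a document that a value changed; it says nothing about which storage backend the change was made against, which is the concern raised elsewhere in the thread.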

Note that

> With the per-page storage access Mozilla and Microsoft convinced us of, there will be no simultaneous access to partitioned and unpartitioned storage, unless you’re thinking of different windows talking to each other.

is not quite true with Firefox's approach. If you have an ongoing transaction against a partitioned database, it is allowed to continue. Once all those transactions are finished, the partitioned storage is no longer available and can be wiped. We could potentially build on that model and give developers an event before we do the swap, from which they could start transactions to migrate data as needed, as well as an event after the swap, as discussed in #55. (Those events seem reasonable as documents can already do as much through other means.)
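The model described here (let in-flight transactions finish, then drop the partitioned storage, with an event on either side of the swap) could be sketched roughly as follows. This is not Firefox's implementation: the `StorageSwitcher` class and the event names `beforestorageswap` and `afterstorageswap` are hypothetical, and the backends are assumed to be `Map`-like objects with a `clear()` method.

```javascript
// Hypothetical sketch: swap a storage backend only after in-flight
// transactions against the old backend have finished.
class StorageSwitcher extends EventTarget {
  constructor(partitioned) {
    super();
    this.current = partitioned;
    this.inFlight = new Set(); // promises for unfinished transactions
  }

  // Run a transaction against whichever backend is current when it starts.
  transact(fn) {
    const backend = this.current;
    const p = Promise.resolve().then(() => fn(backend));
    this.inFlight.add(p);
    const done = () => this.inFlight.delete(p);
    p.then(done, done);
    return p;
  }

  async swapTo(unpartitioned) {
    // Listeners may start migration transactions here; those are tracked too.
    this.dispatchEvent(new Event('beforestorageswap'));
    const old = this.current;
    const draining = [...this.inFlight];
    this.current = unpartitioned; // new transactions use the new backend
    await Promise.allSettled(draining); // old ones are allowed to finish
    old.clear(); // the partitioned data can now be wiped
    this.dispatchEvent(new Event('afterstorageswap'));
  }
}
```

A listener for the hypothetical `beforestorageswap` event could call `transact` to copy data out of the partitioned backend; because that transaction starts before the swap, it still sees the old backend and is drained before the wipe.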

@jkarlin
Author

jkarlin commented Sep 24, 2020

> No reported problems against Firefox Nightly about this behavior.

That's good news. I do worry about the sampling here. requestStorageAccess doesn't appear to be called often on Firefox Nightly (e.g., 1.33k requests in Nightly 79 out of 88 million page loads). That is likely not high enough for 3p developers to take notice of.

> "Unrelated" same-origin documents mutating storage can already cause problems for each other, which is why they have to coordinate such actions, e.g., through the storage event.

True, though this also affects same-site embeds, right? Not just same-origin.

> In the cases under discussion these documents are not really unrelated as they have direct script access to each other.

docs.google.com and music.google.com don't have script access to each other. And there are going to be cases where embeds have no notion of each other's existence. The first party can embed whatever it wants on a page.

I prefer a predictable web, where the developer is in control of where their data is coming from and writing to by default. It seems highly developer-unfriendly to me that one's frame could instantly be transitioned from one storage bucket to another, depending on what other frames of your site have been embedded on the given page (which is out of the developer's control). It opens up a threat vector in which an attacker can try to coordinate which bucket data gets written to. I don't know how problematic that is in practice, but it's something else to worry about.

Whereas we could instead allow per-document access to storage by calling requestStorageAccess. If it's granted, then create a new same-origin frame and get unpartitioned storage from that. If it had been granted before this load, then the underlying bucket is already correct.

I know another area of discussion is what to do with cookies once requestStorageAccess is granted. Today I believe they're granted per-page. e.g., main frame third-party requests will have cookies after requestStorageAccess is granted in a sub-frame for said site. I also find this strange. What is the rationale for this? So that sites can continue on as if they always had storage access in the first place? That seems vulnerable to breakage (e.g., some of my responses are customized to me and some are not). Though maybe it works well in practice. Unclear to me.

The approaches that I think make sense are:

  1. Make requestStorageAccess per frame. Perhaps it could resolve with document.cookie so the frame at least instantly learns who it represents. Require a reload (or a new frame) to get the rest of unpartitioned storage access. The top frame third-party requests won't have cookie access until next load.
  2. Similar to the above, but have requestStorageAccess return the unpartitioned storage bucket, so the developer has to explicitly read/write to it. Maybe BroadcastChannel could be accessible on this bucket but I don't think we'd want to mess with service workers here. For this reason I prefer the first approach. It's also simpler.
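From the embedded frame's side, the first approach might look something like the sketch below. `document.hasStorageAccess()` and `document.requestStorageAccess()` are existing methods; everything else is an assumption made for illustration: the function name, the injected `doc` and `reload` parameters (in a page these would be `document` and `() => location.reload()`), and the premise that a grant made before load means the frame already has its unpartitioned bucket.

```javascript
// Sketch of approach 1: request access per frame, then reload to pick up
// unpartitioned storage. Return values are illustrative labels only.
async function ensureUnpartitionedStorage(doc, reload) {
  if (await doc.hasStorageAccess()) {
    // Assumed: access granted before this load, so the bucket is correct.
    return 'already-granted';
  }
  try {
    // Real browsers require this call to happen in response to a user gesture.
    await doc.requestStorageAccess();
  } catch {
    return 'denied'; // keep using partitioned storage
  }
  reload(); // the next load would start with unpartitioned storage
  return 'reloading';
}
```

Instead of reloading, the same point in the flow could create a new same-origin frame and read from that, as suggested earlier in this comment.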

@jeremyroman

Another perspective here: we're working on building an uncredentialed form of prerendering in which we load eligible top-level content without access to its unpartitioned storage (in fact, with no preexisting storage at all); at navigation time it gains access to its unpartitioned storage (i.e., the access it would have in an ordinary navigation). This looks a lot like the case of 3P frames (a transition to unpartitioned storage upon some sort of user action).

This motivates some goals I think are worth considering:

  1. Transparency after transition: after suitable incantations, the page behaves the same as it would have had it had storage access since load (i.e., all undecorated APIs and newly initiated network fetches use unpartitioned storage). For prerendering, this makes eligibility a lot easier: content can defer the loading of any component or library unaware of storage partitioning/blocking until storage becomes unpartitioned, and then it will behave normally, without modification. This lets content do better and better at prerendering as its components become storage-access-aware.

  2. Avoiding single privileged responders: to a greater extent than embedded widgets (but I suspect it does happen there too), top-level pages often contain a number of components and libraries from different teams/vendors (even if they are not loaded in separate frames). APIs which broadcast information about changes allow independent registration by each component/library; APIs which require a single privileged responder to funnel these through a single point are harder to coordinate and may invite brittle monkey-patching.

  3. Don't reload: naturally, prerendering loses most of its appeal if the prerendered page must be discarded.
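The difference behind goal 2 can be illustrated with a small sketch. The event name `storageupgraded` and the `onupgrade` slot are hypothetical; no such event or property is specified. The point is only that a broadcast lets each component register independently, while a single handler slot is silently overwritten unless the components coordinate.

```javascript
// Hypothetical event name, used only for this illustration.
const STORAGE_UPGRADED = 'storageupgraded';

// Broadcast style: each component registers its own listener.
function registerBroadcast(target, components) {
  for (const c of components) {
    target.addEventListener(STORAGE_UPGRADED, () => c.onUpgrade());
  }
}

// Single-responder style: one slot, so a later assignment replaces the
// earlier one and only the last component is notified.
function registerSingle(target, components) {
  for (const c of components) {
    target.onupgrade = () => c.onUpgrade();
  }
}

// Invoke the single responder, if any.
function fireSingle(target) {
  if (target.onupgrade) target.onupgrade();
}
```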

To me these suggest that, ideally, when a page is eligible for storage access (because it has been granted through a prompt in a suitable frame, or because it is becoming the main frame, which is entitled to it), it should be upgraded (possibly with an option to delay that upgrade until it can finish pending activity and fetch any data it needs to migrate).

Making the upgrade optional (i.e., telling the page it is pre-approved for storage access but not changing its storage access) would be an option, too, but would require each page to have a privileged entity that makes the decision of when that happens, which may complicate multi-vendor documents.

Of course, this is balanced against the complexity of changing out the available storage in-flight, which others have eloquently explained.

@johnwilander
Collaborator

Hi @jeremyroman! There’s a lot going on in your comment. Just so I understand, are you suggesting that a partitioned third-party can cache/preload content that will subsequently be used when it is first party or third party under a different partition?

@jeremyroman

Not exactly. I'm suggesting that a page can cause another page to be loaded and prerendered (in a fashion similar to <link rel=prerender>), so that the user can navigate to it.

The straightforward version of this would imply that as it's top-level, the prerendered page should have unpartitioned storage. However, in the cross-site case this could provide capabilities similar to a third-party frame. So we want to restrict its access to unpartitioned storage until the user navigates to it (e.g., by clicking a link), at which point the prerendered page is presented to the user (which makes this navigation nearly instantaneous). At that point, unpartitioned storage may be used, just like after any other top-level navigation.

That is, I'm not suggesting a new capability for third-party frames, but a new capability for top-level pages which has certain similarities to third-party frames.

@johnwilander added the agenda+ label (Request to add this issue to the agenda of our next telcon or F2F) Oct 5, 2020
@TanviHacks removed the agenda+ label (Request to add this issue to the agenda of our next telcon or F2F) Oct 26, 2020
@hober
Member

hober commented Feb 22, 2022

Is the right solution here to never switch the storage mechanism out from under the bare storage API calls, and to instead offer access via some kind of explicit storage bucket API? @annevk @johnwilander

@annevk
Collaborator

annevk commented Mar 2, 2022

Yeah, given the security (and some implementation) issues with switching all of storage that seems like a better alternative.

@johannhof added and removed the future label (Will consider for a future revision) Mar 22, 2022
@johannhof
Member

It probably makes sense to talk about storage bucket integration in a separate issue. @johnwilander will file one.
