-
Notifications
You must be signed in to change notification settings - Fork 214
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implementation Plan: Proxying frontend API requests through the Nuxt server #5265
base: main
Are you sure you want to change the base?
Conversation
Full-stack documentation: https://docs.openverse.org/_preview/5265 Please note that GitHub pages takes a little time to deploy newly pushed code, if the links above don't work or you see old versions, wait 5 minutes and try again. You can check the GitHub pages deployment action list to see the current status of the deployments. New files ➕: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've left some comments suggesting clarifications or alternatives. Happy to chat more about this. Excited to see this project come to light, I think it will massively improve the user experience for a major category of Openverse users (in education) as well as make the project's ingress more sustainable and manageable for the maintainers. Kudos on the IP!
- From the | ||
[k6 implementation](https://github.com/WordPress/openverse/pull/4908), copy | ||
the helper function that signs the request with HMAC. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is HMAC signing of the requests required? What is the goal with it?
For context on my question, the HMAC approach was designed to bypass Cloudflare's rate limiting for the frontend, without the Nuxt server needing to know anything about the authorization/limits. It isn't necessary to use that method for the API because Cloudflare should be configured to bypass rate limiting for any API requests that contain an Authorization
header (as the API itself will handle those rate limits).
In other words, I would think the request flow could go like this, and not require any HMAC at all for regular browser users of the Openverse website:
- > User requests Openverse page and receives a response with a unique session ID in
set-cookie
- > User executes a client-side search -> Cloudflare checks rate limit for the IP + session ID: challenge if needed; otherwise ...
- > Request sent to the new Nuxt
/api/...
proxy routes - > The request is forwarded by Nuxt to the API with the API authentication method the Nuxt server already uses
- > Cloudflare does not rate limit because an
Authorization
header is present on the request - > The API does not rate limit because the
Authorization
header has a rate-limiting exempt token - > Response is sent back to Nuxt
- > Nuxt removes any response information that could expose Authorization information
- > Sends response body back to the client
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wrote a long reply first, and then realized that my thoughts were valid if only the /api/
routes are behind the rate limiting. If all frontend routes are behind the rate limit rules, we don't need to do anything special for protecting the /api/
routes (that can send unlimited requests to the API) from automatic abuse, as they would be behind the frontend rate limiting rules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sarayourfriend, your bullet points helped me picture this and made me realize that the cookie is probably an important part I didn't get. Thank you for summarizing it.
authentication, HMAC) | ||
- send the request using `ofetch` (or `ofetch.raw` to extract the response | ||
headers for `SEARCH_RESPONSE_TIME` Plausible event) | ||
- handle errors and return the appropriate response to the client |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the route also reject requests as Unauthorized (401) if they are missing the session ID cookie? The API proxy routes should only be used for client-side requests of search.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea. I had to write the draft of the implementation to see how it would work.
-
in the beginning of the server setup (e.g., in the
init-stores.server
plugin, if it's renamedinit.server
), set thesessionId
cookie that ishttpOnly
(if it does not exist). -
send the request from the store to the
/api
route.- On the client, the
httpOnly
will automatically be sent. - On the server, the request will be sent between the different parts of the Nuxt app, without the browser passing the cookies. We need to specifically pass the cookies using
getProxyRequestHeaders
fromh3
to extract them and pass them on with the$fetch
request. Since we are on the server, thehttpOnly
cookies are also available.
- On the client, the
envisage that we will need to adjust the limits many times because testing the | ||
Cloudflare rate limiting locally is not possible. | ||
|
||
### Use server routes in the frontend when the feature flag is on |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that leveraging the @openverse/api-client
package would allow some simplification here. If the Nuxt API proxy routes are designed to match the same path-specification as the API itself, then the same API client can be used by swapping out the base URL based on the feature flag and whether the app is running server side or client side. To the rest of the application code, there would be no difference in how the requests are made.
It's probably more work up-front, because existing code would need to be changed to use the @openverse/api-client
, so worth considering whether it's something to plan for in the future. But even without using the @openverse/api-client
package, the same strategy of swapping the base URL could be used with the api-service
and avoid needing multiple different functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think using the Nuxt-native methods would be better for logging and error handling of the requests on the server.
During the transition period, there would need to be separate branches for requesting the proxied/non-proxied urls. However, it would be more convenient to handle errors within the server route, where we can log all requests and errors on the server.
We could use the same naming for the server routes as the API has (so, instead of /api/search/image/?q=cat
have /v1/images/?q=cat
), which would make the conversion of the URLs simpler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough! For what it's worth, it's possible to swap the underlying fetch
implementation to pass options through to Nuxt's version, if it has better logging and other options. Check out the fetch
option to createClient
in the docs.
The main benefit of using @openverse/api-client
is unifying the API and types, and it provides a convenient way to swap out the target URL as needed. It would be possible to re-write the paths dynamically in the function passed to fetch
, so that the Nuxt routes don't need to reproduce exactly the same paths as the upstream API.
Anyway, just an alternative idea in case it is useful. The plan you've got doesn't look bad to me at all 🙂
- **Always Issuing a Challenge Without Thresholds**: Issuing a managed challenge | ||
to every new user, regardless of load or conditions, would inconvenience users | ||
behind NATs and degrade the initial user experience. While simpler to | ||
implement (we would not need to calculate the exact rate limit), it fails our | ||
design goal of minimizing friction for normal users. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Issuing a managed to every new user, regardless of load or conditions
Can you explain further why every new user would be issued a challenge? For what it's worth, the managed challenge "should" most of the time be a JavaScript challenge, which requires no user interaction.
To reduce friction without needing to introduce the turnstile module and involving the application code, you could have a separate (or no) rate limit for the home page, relying on that being the most common entry point for users where they would receive the original cookie to carry for the rest of their session.
To clarify the situation behind a NAT, it might be overly cautious to think that the rate limit will be tripped so quickly. Imagine a large school with maybe 3 computer labs having a simultaneous class, 50 students each. That is 150 fully unauthenticated requests from a single IP that need to happen for everyone to get a cookie. If the rate limit of a cookie-less request is set to 150 requests / 30 minute period, and it just takes a managed challenge to clear it for the IP, I don't think this would introduce very much friction at all.
For users on low spec Android phone, they would probably avoid ever seeing the managed challenge and get the cookie, and then just be subject to the regular rate limit.
of going directly to the API, requests will go to the Nuxt server routes | ||
(`/api/**`). The main difference is that, when rate-limited, the client will | ||
receive a Cloudflare challenge response that will be handled by | ||
[Nuxt Turnstile Module](https://github.com/nuxt-modules/turnstile). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like turnstile is mostly geared towards websites that want to use Cloudflare challenges without proxying requests through Cloudflare itself. At least, that's what the Turnstile marketing page indicates as the primary target use case. Openverse proxies requests through Cloudflare already. Can you clarify the goal behind using Turnstile rather than regular Cloudflare managed challenges, exactly?
- **Multi-language Support**: Challenges support multiple languages [^2], | ||
although the number of supported languages is lower than the number of | ||
languages supported by Openverse. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, it's possible to implement custom versions of Cloudflare's error and challenge pages: https://developers.cloudflare.com/support/more-dashboard-apps/cloudflare-custom-pages/configuring-custom-pages-error-and-challenge/
I'd recommend doing this for the rate-limit response, for example, so that users get an Openverse branded page explaining that they are making requests too quickly, and advising them how long they need to wait until they can make another request, or how to get in contact with the maintainers if they believe there is an error.
This should eliminate language-support issues for the error responses, ensure that the presentation language is consistent across the session (e.g., using Openverse's own methods for identifying which language to present the page in, which may differ from Cloudflare's methods), and ensure a cohesive presentation overall. I don't know if there is a way to configure the language of the challenge itself, but maybe Cloudflare will make a best-guess based on the lang-attribute on the custom page? In any case, it would be a better experience than the entire page being inconsistent with the rest of the website.
Originally, I thought of using Cloudflare managed challenges to protect the Nuxt | ||
server routes from abuse. However, this would have required issuing a challenge |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A clarification here: "to protect the Nuxt server routes from abuse".
For the solution to work, it's necessary for all routes to be behind the same Cloudflare rate limit rules, not just the server routes. That way, if someone gets rate limited on the server routes, the frontend can reload the page and the user will actually get the managed challenge page for clearing the rate limiting (or the explanation page if they are going too fast and need to actually be limited).
If only the server routes are limited by the new configuration, then there's no way to present the user with a challenge that will clear that rate limit, and the user would be stuck.
I mention in another comment that I don't think this would require issuing a challenge to every new user and would not meaningfully increase friction in the vast majority of even extreme cases. Regardless, it is necessary for all routes to have the same rate limit rules for this Cloudflare-rate-limit strategy to work with the safe guard of users being able to clear the rate limit as needed via the challenge.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After reading through your comments and thinking more about this, I agree that we should have all routes behind the same Cloudflare rate limit rules.
That way, if someone gets rate limited on the server routes, the frontend can reload the page and the user will actually get the managed challenge page for clearing the rate limiting (or the explanation page if they are going too fast and need to actually be limited).
In the original implementation, the turnstile module was supposed to handle the cases when the users get the managed challenge instead of the results from the server routes. The module would temporarily show the "turnstile" page (this is not for the Cloudflare turnstile product, but the page that handles the HTML response with the challenge from the proxy and then saves the cookie to clear the challenge), and then allow to show the target page.
Here are the docs on Cloudflare integration of Turnstile with WAF rules: https://blog.cloudflare.com/integrating-turnstile-with-the-cloudflare-waf-to-challenge-fetch-requests/ It gives the JS snippet to handle the challenge response from the API.
However, it would be easier to simply use the rate limiting for all frontend paths.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha! I was not connecting the dots on the turnstile handling the flow of presenting the challenge for the subset of rate limited routes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for writing this detailed plan, @obulat. It's a nice explanation of the steps to incorporate Cloudflare-provided rate-limiting for the frontend and better control the use of Openverse. It sounded quite challenging at first, but I now have a clearer idea of what it entails after reading it. I only have a couple of loose points, namely:
- The cookie that will help identify each user
- How will be the local usage of Openverse after these changes are applied
...s/proxy_frontend_api_requests/20241202-implementation_plan_proxy_frontend_api_requests_ip.md
Outdated
Show resolved
Hide resolved
...s/proxy_frontend_api_requests/20241202-implementation_plan_proxy_frontend_api_requests_ip.md
Outdated
Show resolved
Hide resolved
This implementation plan outlines how we will route all frontend traffic to the | ||
API through Nuxt server routes — effectively “authenticating” these requests at | ||
the API level — and configure Cloudflare rate-limiting to protect these routes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would requests from the same user also be identified at the API level? I understand that everything will be seen as "coming from the frontend." For example, is the cookie value passed to the API, too?
- From the | ||
[k6 implementation](https://github.com/WordPress/openverse/pull/4908), copy | ||
the helper function that signs the request with HMAC. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sarayourfriend, your bullet points helped me picture this and made me realize that the cookie is probably an important part I didn't get. Thank you for summarizing it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have much to add beyond what @sarayourfriend and @krysal have already covered. So 👍 from me to move into the next round of incorporating their feedback.
The turnstile module was a confusing aspect of the plan for me but it seems from existing feedback that a second version of the IP will not be needing it. (I will still check out the module, in case it still ends up being used for the plan).
- **Proxying thumbnail requests** The thumbnail requests from the frontend will | ||
remain anonymous on the API level. Since controlling search and related | ||
requests naturally throttles excessive thumbnail retrieval, we consider it | ||
unnecessary to further load the Nuxt server. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this also similarly applicable for audio waveforms?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I forgot about the waveforms 🤦
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder the same as Dhruv.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just realized that we request the waveforms when we add the peaks=true
param to the audio requests. They are added to each audio object in the peaks
array. So, there is no separate request.
c463217
to
985bfdc
Compare
...s/proxy_frontend_api_requests/20241202-implementation_plan_proxy_frontend_api_requests_ip.md
Outdated
Show resolved
Hide resolved
(http.request.headers["x-h3-session"][0] ne "") | ||
or | ||
(http.request.headers["Cookie"] contains "ovSession") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are or would there be two different session cookies? If so, why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There will be only one session cookie. Did I use different names throughout the IP? That was a mistake, if so. I'll fix it.
However, there should be both the header and the cookie, because h3
useSession
sets both, and the priority is given to the header.
2. User 1: In one browser, simulate a user with a valid session cookie, | ||
confirming they are treated as authenticated and bypass stricter rate limits. | ||
3. User 2: In the other browser, simulate an anonymous user without a session | ||
cookie. Verify that their requests are subject to stricter IP-based rate | ||
limits. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little lost here. How are we supposed to simulate each user? If I understand correctly, by simply using the frontend app the session cookie should be added after the first response, once the Cloudflare challenge is resolved correctly, so I guess the User 2 could be the tricky case, or should it simply one happen after another? If that is the case, then it is brilliant and fabulous that the test can be so simple!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, simply using the frontend app should add the session cookie. So, what we can test here is the fact that even if the same IP sent more requests than allowed by the rate limit, the next user with the same IP can see the challenge and use Openverse after the challenge is passed.
Here's how it can be done:
- Set the staging rate limit to something like 2 requests per 10 minutes
- Open several tabs in incognito mode, and visit the staging app. After the first two tabs, you should see the Cloudflare challenge. It should be easy to pass the challenge.
To test that it's not easy to pass the rate-limiting, we can send 3 requests to staging. The 3rd request should return the Cloudflare challenge, and not pass through the Cloudflare proxy.
##### Real rate limits | ||
|
||
Calculate the actual rate limits based on the expected traffic. The rate limits | ||
should be generous enough to accommodate normal user behavior. Later, they can | ||
be made stricter to deter automated abuse. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was exactly wondering about this. Sounds fair, it can be defined or refined during/after the implementation 👌
...s/proxy_frontend_api_requests/20241202-implementation_plan_proxy_frontend_api_requests_ip.md
Outdated
Show resolved
Hide resolved
|
||
- `/api/search/[type]` - for search requests | ||
- `/api/[type]/[id]` - for single result requests | ||
- `/api/[type]/related/[id]` - for related media requests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To keep consistency with API routes.
- `/api/[type]/related/[id]` - for related media requests | |
- `/api/[type]/[id]/related/` - for related media requests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might make it more difficult to handle. With the proposed structure, the single and related pages would be /api/[type]/[id].ts
and /api/[type]/related/[id].ts
. If we move the [id]
in the related route, the single result route might also catch the related requests because it could match on /api/[type]/[id].ts
. Can we keep this as is, and change it when implementing, if possible, @krysal ?
- **Proxying thumbnail requests** The thumbnail requests from the frontend will | ||
remain anonymous on the API level. Since controlling search and related | ||
requests naturally throttles excessive thumbnail retrieval, we consider it | ||
unnecessary to further load the Nuxt server. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder the same as Dhruv.
...s/proxy_frontend_api_requests/20241202-implementation_plan_proxy_frontend_api_requests_ip.md
Outdated
Show resolved
Hide resolved
...s/proxy_frontend_api_requests/20241202-implementation_plan_proxy_frontend_api_requests_ip.md
Outdated
Show resolved
Hide resolved
I'm glad the proposal has been simplified by removing the Nuxt Turnstile module requirement 😄 I have a small question about the number of sessions, other than that, there are minor comments/suggestions. Once clarified, I expect a quick approval. Thanks for writing this, @obulat. It looks very promising. |
Based on the high urgency of this PR, the following reviewers are being gently reminded to review this PR: @sarayourfriend Excluding weekend1 days, this PR was ready for review 21 day(s) ago. PRs labelled with high urgency are expected to be reviewed within 2 weekday(s)2. @obulat, if this PR is not ready for a review, please draft it to prevent reviewers from getting further unnecessary pings. Footnotes
|
…0241202-implementation_plan_proxy_frontend_api_requests_ip.md Co-authored-by: Krystle Salazar <krystle.salazar@automattic.com>
…0241202-implementation_plan_proxy_frontend_api_requests_ip.md Co-authored-by: Krystle Salazar <krystle.salazar@automattic.com>
…0241202-implementation_plan_proxy_frontend_api_requests_ip.md Co-authored-by: Krystle Salazar <krystle.salazar@automattic.com>
Due date:
2024-01-11
Assigned reviewers
Description
Related to #3473
Current round
This discussion is following the Openverse decision-making process. Information about this process can be found
on the Openverse documentation site.
Requested reviewers or participants will be following this process. If you are being asked to give input on a specific detail, you do not need to familiarise yourself with the process and follow it.
This discussion is currently in the Decision round. The deadline for review of this round is 2024-01-11
Revision details
[2024-01-08] I updated the plan based on the comments. The main changes are:
h3
session for identifying users: The plan now uses h3 sessions to create a verifiable session header/cookie. Cloudflare will validate that the header/cookie exists, allowing requests to pass even if the cookie is invalid (e.g., generated by automated requests). The Nuxt server will then decode the cookie using the secret and verify its validity. Invalid cookies will trigger a 401 Unauthorized response from the Nuxt server, preventing these requests from reaching the Django API. This ensures robust protection against automated abuse./_nuxt/
,.css
,.jpg
) are excluded from rate limiting to streamline traffic and reduce unnecessary challenges.Here's a branch with a draft of changes implementing this project (single commit for all changes): https://github.com/WordPress/openverse/tree/add/api-proxying