Update the Additional search views IP #3868
Full-stack documentation: https://docs.openverse.org/_preview/3868 Please note that GitHub Pages takes a little time to deploy newly pushed code. If the links above don't work or you see old versions, wait 5 minutes and try again. You can check the GitHub Pages deployment action list to see the current status of the deployments.
...ts/proposals/additional_search_views/20230719-implementation_plan_additional_search_views.md
The generic Openverse thumbnail will be used. We could also generate a thumbnail
for the collection pages in the future, but this is not in scope for this
project.
👍 to keeping the scope constrained.
I've left some feedback for places where some small changes for clarity would be useful. Thank you for doing this! It can be a chore to document changes as they're being made but this has clarified my own understanding of the work needed here.
I'm leaving a few initial comments even though it's in draft. I'll be digesting the API Search controller changes. Things have changed a lot since the last time I looked at that part of the code.
I have one blocking comment, which is to request clarification and a deeper explanation of the implementation details for the validation of the query parameters. I am fairly confident that the current approach, as described, will create problems similar to those we had with path parameters. I could be misunderstanding the intention, so I am not explicitly requesting a specific change, just either clarification of the proposed approach or a change to address the potential problems I've mentioned.
The existing `source` and `creator` parameters will be reused, but will be
parsed differently when `collection` parameter is present: they will only allow
a single value instead of being split by `,` as it is for the default search. If
the `source` contains invalid values, such as `source=flickr,europeana`, when
the `collection` parameter is present, a 400 error with `detail` of "Invalid
source parameter for source collection: `flickr,europeana`" is returned. If
possible, a list of valid sources should be returned in the error message.
Can you clarify the actual implementation details of this? Will it use new `validate_` methods on the serializer, and then for each one check if `collection` is present, and if so, what? Will it throw an error any time a comma is present?
I don't think we can reliably validate this for anything other than source. For all collection parameters, they are passed as-is to the ES query: https://github.com//WordPress/openverse/blob/f56f7310e81ac0266807192803e4a451a8983766/api/api/controllers/search_controller.py#L399-L410
They already do not support multiple values.
Again, we need to keep in mind that we cannot reliably split on any character for creator or tag names, neither of which are sanitised to remove commas. What would the validator do if a Flickr creator's name was "Photographer, Esq."? What if the tag is "Taipei, Taiwan"? Will these throw errors? How would a user resolve them? By URL encoding the commas? I'm not sure that will work; they may need to double encode them.
Normally, you can pass multiple values for a query parameter by using `param=1&param=2`:
> p = new URLSearchParams()
URLSearchParams {}
> p.append("param", "1")
undefined
> p.append("param", "2")
undefined
> p.toString()
'param=1&param=2'
Some frameworks require appending `[]` to the parameter name for explicit lists.
We, instead, have chosen to use a comma, which creates significant complication for things like creator. Luckily, we already do not treat creator on the search endpoint as a "listable" input. It is passed as-is in the normal search strategy:
I'm highly sceptical we can reliably validate these inputs, and should instead pass them directly to ES regardless of what is in them. We could validate `source` to check that it exists, so that `flickr,wikimedia_commons` would 404 if passed as `source` with `collection` because there is no `flickr,wikimedia_commons` source. But trying to validate and prevent multiple inputs on tag and creator looks to me like it will create similar issues that we had with the path parameters.
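To illustrate the concern about encoded commas, here is a minimal sketch using only Python's standard `urllib.parse`. It is not Openverse code, and the actual request parsing in the API may differ; the values are the examples from this comment.

```python
from urllib.parse import parse_qs, urlencode

# Repeated keys are the conventional way to send several values.
repeated = parse_qs("param=1&param=2")
print(repeated)  # {'param': ['1', '2']}

# Percent-encoding a comma does not protect it: decoding restores it.
query = urlencode({"creator": "Photographer, Esq."})
print(query)  # creator=Photographer%2C+Esq.
decoded = parse_qs(query)["creator"][0]
print(decoded)  # Photographer, Esq.

# Splitting the decoded value on "," would therefore corrupt the name.
pieces = decoded.split(",")
print(pieces)  # ['Photographer', ' Esq.']
```

Because the server only sees the decoded value, a validator that rejects or splits on commas cannot tell a single creator name apart from a list.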
I asked a similar question and Olga shared that the intention was to only validate the source, yes.
Good call for more specifics here though; I overlooked that this still wasn't absolutely clear in my re-review.
I'll add more clarification in the IP itself, but will also add a short explanation here.
For collection view, only two validation steps will be used:
- if the `collection` parameter is present, we check that the relevant parameters are also present:
  - if `collection=source`, there is the `source` value in the query
  - if `collection=creator`, there are `creator` and `source` values in the query
  - if `collection=tag`, there is the `tag` value in the query
- if collection is `source` or `creator`, we check that the value of the `source` parameter, as is, is present in the list of available sources for the media type. So, `flickr` or `stocksnap` will be valid, but `flick`, `Flickr`, `flickr,stocksnap` will throw an error. We will not split these values by comma or lower-case them. And since we control the source names, it should be okay.
As for the `tag` and `creator` parameters, they will be passed to the search controller "as is".
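The two validation steps could be sketched roughly as follows. This is a plain-Python illustration, not the actual serializer code: `VALID_SOURCES`, `REQUIRED_PARAMS`, `CollectionValidationError` and `validate_collection_params` are hypothetical names, and the source list is a stand-in for the real per-media-type list.

```python
from typing import Mapping

# Hypothetical stand-in for the list of available sources for a media type.
VALID_SOURCES = {"flickr", "stocksnap", "wikimedia_commons"}

# Parameters that must accompany each `collection` value (step 1).
REQUIRED_PARAMS = {
    "source": ("source",),
    "creator": ("creator", "source"),
    "tag": ("tag",),
}


class CollectionValidationError(ValueError):
    """Raised when collection query parameters fail validation."""


def validate_collection_params(params: Mapping[str, str]) -> None:
    collection = params.get("collection")
    if collection is None:
        return  # default search; outside the scope of this sketch
    if collection not in REQUIRED_PARAMS:
        raise CollectionValidationError(f"Invalid collection: {collection}")

    # Step 1: the parameters relevant to this collection must be present.
    missing = [name for name in REQUIRED_PARAMS[collection] if not params.get(name)]
    if missing:
        raise CollectionValidationError(
            f"Missing parameters for {collection} collection: {', '.join(missing)}"
        )

    # Step 2: `source` is compared as-is, with no splitting or lower-casing.
    if collection in ("source", "creator") and params["source"] not in VALID_SOURCES:
        raise CollectionValidationError(
            f"Invalid source parameter for {collection} collection: {params['source']}"
        )
    # `tag` and `creator` values are deliberately not inspected.
```

With this shape, `{"collection": "creator", "source": "flickr", "creator": "Photographer, Esq."}` passes, while `source=flickr,stocksnap` or `source=Flickr` is rejected.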
By the way, while trying this out, I realized that our current source validation is not very clear. We take the string value of `source`, split it by comma, keep all of the values that are names of sources and ignore invalid values. So, https://api.openverse.engineering/v1/images/?source=flick ignores the source parameter and returns all of the images. This might be very confusing if the user has made a spelling mistake.
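A rough approximation of that lenient behaviour, written only from the description in this comment. `filter_sources` and `VALID_SOURCES` are hypothetical names and the real validation code may differ in its details.

```python
# Hypothetical stand-in for the real list of source names.
VALID_SOURCES = {"flickr", "stocksnap", "wikimedia_commons"}


def filter_sources(raw_source: str) -> list[str]:
    """Split on commas and silently drop anything that is not a known source."""
    return [name for name in raw_source.split(",") if name in VALID_SOURCES]


print(filter_sources("flickr,stocksnap"))  # ['flickr', 'stocksnap']
print(filter_sources("flickr,flick"))      # ['flickr']
print(filter_sources("flick"))             # []
```

An empty result means no source filter is applied at all, which is why a misspelled source returns every image instead of an error.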
Sounds good, Olga. Thanks for clarifying. It would resolve the issue for me if you update this paragraph to clarify that only the source parameter would ever be validated in any capacity. As it reads now, the creator value would also be validated, but I think that's an artefact of trying to clarify the difference between the normal search route's handling of these two, and the creator parameter getting sidelined a bit.
By the way, while trying this out, I realized that our current source validation is not very clear. We take a string of the source value, split it by comma, leave all of the values that are names of sources and ignore invalid values. So, https://api.openverse.engineering/v1/images/?source=flick ignores the source parameter and returns all of the images. This might be very confusing if the user has made a spelling mistake.
That sounds like a good new issue, maybe even a "help wanted" one!
@sarayourfriend, I opened a discussion instead of an issue for this because I'm not sure what the best solution is. #3895
I updated the serializer and validator descriptions, so this PR is ready for re-review, @sarayourfriend.
Changes addressed, not an official reviewer so deferring to @krysal
I see that there is a certain consensus. Now that I have read the API part more carefully, I have a couple of doubts that I would like to clarify before approving.
Signed-off-by: Olga Bulat <obulat@gmail.com>
…719-implementation_plan_additional_search_views.md Co-authored-by: zack <6351754+zackkrida@users.noreply.github.com>
…719-implementation_plan_additional_search_views.md Co-authored-by: Krystle Salazar <krystle.salazar@automattic.com>
The corrections make it a lot clearer. Thanks for re-writing this @obulat! LGTM.
Fixes
Updates the Implementation Plan in line with the discussion conclusions
Description
This PR updates the IP to move from path parameters to query parameters. SEO-related changes were also added.