Add science purpose #1
mbtaylor
left a comment
Looks basically good, I made a couple of suggestions.
> IVOA-science. The purpose of this rule is to help operators to throttle
> indiscriminate downloads by ``stupid'' crawlers (like the harvesters
> employed to gather training material for AI models around 2025) without
> impacting common clients; for instance, rate limits could be tight
> without a conforming user agent header.
I wouldn't say simply "the purpose of this rule is [throttling]", since there are other use cases, for instance managing usage statistics. Possible alternative wording:
> Presence of this header provides a means to identify requests by known VO-aware clients as distinct from those by potentially indiscriminate crawlers like the harvesters employed to gather training material for AI models around 2025. This information may be used, for instance, to throttle indiscriminate downloads by applying tighter rate limits for requests without a conforming user-agent header, or for better understanding of usage statistics by distinguishing known science queries.
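To make the two use cases in this wording concrete, here is a minimal server-side sketch. It assumes the marker appears somewhere in the User-Agent value as the literal string `IVOA-science` (the term is still under discussion in this thread), and the function names and limits are hypothetical, picked only for illustration.

```python
# Hypothetical marker string; the thread has not settled on the final term.
SCIENCE_MARKER = "IVOA-science"

# Hypothetical limits in requests per minute, chosen only for illustration.
LIMIT_MARKED = 600
LIMIT_UNMARKED = 30


def is_marked(user_agent):
    """Return True if the User-Agent value contains the marker (case-insensitive)."""
    return user_agent is not None and SCIENCE_MARKER.lower() in user_agent.lower()


def rate_limit_for(user_agent):
    """Use case 1: apply a tighter rate limit to requests without the marker."""
    return LIMIT_MARKED if is_marked(user_agent) else LIMIT_UNMARKED


def count_by_marker(user_agents):
    """Use case 2: split usage statistics into marked and unmarked requests."""
    counts = {"marked": 0, "unmarked": 0}
    for ua in user_agents:
        counts["marked" if is_marked(ua) else "unmarked"] += 1
    return counts
```

A plain substring match is the simplest possible check; an operator would probably want to parse the header properly before relying on it.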
> The access was done to directly support a science case. This explicitly
> includes education and training, in particular because we do not want to
> suggest that software used in such settings -- which plausibly is going
> to be the same as software used in pure research -- should be
> reconfigured for them.
I'm not sure about "to directly support a science case"; I'd suggest something a bit more woolly like "in support of science usage" or "in the context of science usage". I think the main target here is to differentiate clients that understand the VO/astronomy services they are engaging with from those that are just hitting anything they can find. From a practical point of view, at least for clients like topcat and stilts, it's not likely to be feasible to get them to present different user-agent headers on the basis of the user intention for particular requests, only on the basis of the tools in use.
Given that, I'm wondering if there's a different term than "science" that should be used here, but I don't have great suggestions. IVOA-voclient or just IVOA-client maybe?
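As a rough illustration of "on the basis of the tools in use": a client tool would typically fix its User-Agent value once and send it with every request, whatever the user intends. The sketch below uses Python's standard `urllib.request`; the product token `exampletool/1.0` and the parenthesised marker syntax are assumptions for illustration, not something this PR defines.

```python
import urllib.request

# Hypothetical product token and marker syntax; one fixed value per tool.
TOOL_USER_AGENT = "exampletool/1.0 (IVOA-science)"


def build_request(url):
    """Build a request that carries the tool's fixed User-Agent value."""
    return urllib.request.Request(url, headers={"User-Agent": TOOL_USER_AGENT})
```

Every request built this way carries the same header, so a server can only learn which tool is calling, not why.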
On Tue, Jan 06, 2026 at 08:04:57AM -0800, Mark Taylor wrote:
@mbtaylor commented on this pull request.
> Presence of this header provides a means to identify requests by
> known VO-aware clients as distinct from those by potentially
I like that better, too, and thus I've (by and large) adopted it in
commit 8d666479.
> Given that I'm wondering if there's a different term than "science"
> that should be used here, but I don't have great suggestions.
> IVOA-voclient or just IVOA-client maybe?
Hm... I don't like "client" here because crawlers and validators
arguably are clients, too. I'll ask on the IVOA mailing lists for more
precise terminology. I'll grant you, anyway, that it's a bit lame to say
"science" and then say "but it's education, too".
This PR mainly reverts the previous stance that "normal" requests carry no special IVOA tag, in the hope of developing a marker for a "well-behaved client".