Add private/auth awareness to datasource index #21652
Replies: 5 comments
Current use of hostRules in datasources:
Probably the biggest uncertainty is around I can think of four scenarios for in-repo
Things to note:
First step is #8470, which returns
Some datasources like GitHub, npm and Docker Hub might have authorization set for every request, but that doesn't mean every result returned is private. Instead, there needs to be secondary logic added to confirm private/public through other means if the registry itself doesn't directly indicate it in metadata. Potentially such datasources could implement a new function
Our approach should be either:
I think maybe keeping it as
I like this. Additionally, I would not base the decision on whether a result should be cached on the
Instead, the datasource should be able to return a
In that case the datasource doesn't need to care about whether a datasource is public or private.
E.g. let's simplify and say that the docker resource should always be cached; in that case the datasource could return
For another datasource it could incorporate the auth tokens into the cache key, like
The github release datasource could incorporate whether the repository is private/public. This would move the logic on what to cache to the datasource, but the actual caching would happen in the layer above.
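The idea in this comment, where the datasource chooses the cache key and a layer above does the caching, could be sketched roughly as follows. Every name here (`CacheKeyProvider`, `getCacheKey`, `cachedLookup`, `tokenFingerprint`) is hypothetical and not part of Renovate's actual API. Returning `undefined` is assumed to mean "do not cache", and the lookup is synchronous only for brevity.

```typescript
// Hypothetical sketch: names are illustrative, not Renovate's real API.
interface LookupConfig {
  packageName: string;
  registryUrl: string;
  // Assumed to be an opaque, non-reversible fingerprint of the token (e.g. a hash).
  tokenFingerprint?: string;
}

interface ReleaseResult {
  releases: string[];
}

interface CacheKeyProvider {
  id: string;
  // Returns a cache key, or undefined when the result must not be cached.
  getCacheKey(config: LookupConfig): string | undefined;
  // Synchronous for brevity; a real lookup would be asynchronous.
  getReleases(config: LookupConfig): ReleaseResult;
}

const lookupCache = new Map<string, ReleaseResult>();

// The layer above the datasource: it caches, but the datasource picks the key.
function cachedLookup(ds: CacheKeyProvider, config: LookupConfig): ReleaseResult {
  const key = ds.getCacheKey(config);
  if (key !== undefined) {
    const hit = lookupCache.get(`${ds.id}:${key}`);
    if (hit) {
      return hit;
    }
  }
  const result = ds.getReleases(config);
  if (key !== undefined) {
    lookupCache.set(`${ds.id}:${key}`, result);
  }
  return result;
}

// Simplified case: always cache, regardless of credentials.
let alwaysCachedCalls = 0;
const alwaysCached: CacheKeyProvider = {
  id: 'always-cached',
  getCacheKey: (c) => `${c.registryUrl}:${c.packageName}`,
  getReleases: () => {
    alwaysCachedCalls += 1;
    return { releases: ['1.0.0'] };
  },
};

// Credentials become part of the key, so results are never shared across tokens.
let tokenScopedCalls = 0;
const tokenScoped: CacheKeyProvider = {
  id: 'token-scoped',
  getCacheKey: (c) =>
    `${c.registryUrl}:${c.packageName}:${c.tokenFingerprint ?? 'anonymous'}`,
  getReleases: () => {
    tokenScopedCalls += 1;
    return { releases: ['2.0.0'] };
  },
};
```

With this split, the caching layer never needs to know why a key differs between two requests; it only compares keys.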
Let's test the idea of the datasource index using
Shortcomings:
Both the above can likely be addressed if credentials have the possibility to be host + package-aware, e.g.
If we did the latter, would that replace hostRules with credentials, or even hostRules altogether?
Another challenge: hostRules supports domainName, hostName, and baseUrl. Would we need to support all for
To simplify datasource logic, and caching in particular, we'd like the centralized datasource dispatcher (index file) to be responsible for caching, i.e.
Potentially, private results could be cached too if we can assure we only return them when the same credentials are provided as the original request.
Therefore the dispatcher needs the ability to know if a result is private. I think we can simplify part of that by defining private as "if custom authentication was included with the request", i.e. Bearer or Basic authentication.
The dispatcher can only know this if either (a) responsibility for reading hostRules moves from the http module into the dispatcher, or (b) the http module has a way to "pass back" the information that a custom authentication header was applied.
Point (a) does not seem possible because currently authentication headers are sometimes inserted by certain datasources too, not just by the http module.
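Option (b) could look roughly like the sketch below, in which a stand-in for the http module reports whether it attached a Bearer or Basic header. `HostRule`, `urlPrefix`, `buildRequest`, and `authApplied` are hypothetical names, and the prefix match on the URL is a simplification, not Renovate's actual hostRules matching. For Basic auth the `token` is assumed to be already base64-encoded.

```typescript
// Hypothetical sketch: not Renovate's real http module or hostRules types.
interface HostRule {
  urlPrefix: string; // simplified matching: plain URL prefix
  authType?: 'Bearer' | 'Basic'; // defaults to Bearer in this sketch
  token: string; // for Basic, assumed to be pre-encoded "user:password"
}

interface PreparedRequest {
  url: string;
  headers: Record<string, string>;
  // Passed back so the caller knows custom authentication was applied.
  authApplied: boolean;
}

function buildRequest(url: string, rules: HostRule[]): PreparedRequest {
  const headers: Record<string, string> = {};
  const rule = rules.find((r) => url.startsWith(r.urlPrefix));
  if (rule) {
    headers.authorization = `${rule.authType ?? 'Bearer'} ${rule.token}`;
  }
  return { url, headers, authApplied: rule !== undefined };
}
```

A dispatcher receiving `authApplied: true` could then treat the result as private without reading any host rules itself. This does not cover datasources that insert their own authentication headers, which is the gap noted for point (a).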
If using point (b), then the dispatcher can't know all possible auth for all datasources, so it can simply implement "only cache public packages". Datasources could return an optional new boolean field `private`.
An improved future solution could look like:
`.npmrc` parsing in npm. Instead it should be converted to packageRules and hostRules during the extract phase
This approach could also solve our desire for path-based hostRules (#5825).
Note: self-hosted users could also choose via a new config option to cache all results, private or not - if there is no trust boundary between repositories. Doing so could result in a performance improvement compared to today.
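Putting the pieces of this proposal together, a dispatcher that caches only public results unless told otherwise might look like the sketch below. The `private` field comes from the proposal above; `cachePrivateResults` is a made-up name for the suggested self-hosted option, and the in-memory `Map` stands in for whatever cache the dispatcher would really use. The fetcher is synchronous only to keep the sketch short.

```typescript
// Hypothetical sketch: names and shapes are illustrative, not Renovate's real API.
interface DatasourceResult {
  releases: string[];
  // Optional flag set by the datasource when the result is private.
  private?: boolean;
}

type Fetcher = (packageName: string) => DatasourceResult;

interface DispatcherOptions {
  // Made-up option name: cache everything when there is no trust boundary.
  cachePrivateResults?: boolean;
}

function createDispatcher(fetcher: Fetcher, options: DispatcherOptions = {}) {
  const cache = new Map<string, DatasourceResult>();

  return function lookup(packageName: string): DatasourceResult {
    const cached = cache.get(packageName);
    if (cached) {
      return cached;
    }
    const result = fetcher(packageName);
    // Only cache public results, unless the operator opted in to caching all.
    if (!result.private || options.cachePrivateResults) {
      cache.set(packageName, result);
    }
    return result;
  };
}
```

By default a private package is fetched on every lookup, while a public one is fetched once; with the option enabled, both are fetched once.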