-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Run transferbot
in a Kubernetes Job
#830
Conversation
2bc1ccc
to
5442ba3
Compare
bb9285e
to
b1711ef
Compare
5442ba3
to
600e9c2
Compare
007400b
to
8cbfe9d
Compare
app/Jobs/WikiEntityImportJob.php
Outdated
$import = WikiEntityImport::findOrFail($this->importId); | ||
$creds = $this->acquireCredentials($wiki->domain); | ||
|
||
$this->targetWikiUrl = str_contains($wiki->domain, "localhost") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is ugly. I'd be happy to learn about a cleaner solution for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
only suggestion I have is we revisit the behaviour of the TransferBot image again and do something like the sidecar proxy thing and always talk to that over http.
I agree this is a nasty leaking of knowing the details about the network environment of the job container into the api.
We could at least tighten it by parsing the url and checking it's the TLD of the FQDN. That would save us having some mystery errors in the future for localhostingofdinnerparties.wikibase.cloud
which this would inadvertently catch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think parsing the domain and then checking its TLD is a good compromise.
'consumerName' => 'WikiEntityImportJob', | ||
'ownerOnly' => '1', | ||
'consumerVersion' => '1', | ||
'grants' => 'basic|highvolume|import|editpage|editprotected|createeditmovepage|uploadfile|uploadeditmovefile|rollback|delete|mergehistory', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty sure these could be reduced even further, but I am totally lost when debugging and was happy I found a working combo that is not "every permission ever".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm surprised most of these are needed. I'm trying to think if there is a quick way for us to test these
app/Jobs/WikiEntityImportJob.php
Outdated
'containers' => [ | ||
0 => [ | ||
'hostNetwork' => true, | ||
'name' => 'run-qs-updater', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd probably change this from the copy paste.
Indeed inf we are looking for a name I'd be keen on EntityImport
and moving away from transferbot (some random chat thread we had one month ago: https://mattermost.wikimedia.de/swe/pl/4kao4tr49inobd4i45iuz8cqqo)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apart from the typo, my reasoning for using transferbot wording here was that this is the name of the image we're running. But maybe ditching this in favor of entity-import is better, yes.
5497376
to
f6328a1
Compare
* feat: scaffold entity import job * feat: fetch credentials from mediawiki * feat: scaffold control flow for supervising job * feat: add parameterized spec * chore: connect proper job to http layer * fix: adjust kubernetes job spec to pass * feat: add callback handling * refactor: move enum to dedicated file * fix: adjust oauth grants so they work with platform user * fix: tighten validation for domain parameter * test: add test case for import job * chore: change job configuration values * chore: lock transferbot image version * refactor: address review feedback * chore: bump transferbot image version
* Add API endpoints for handling entity imports (#823) * feat: scaffold model for storing data about transferbot runs * feat: scaffold controller logic * fix: wiki access needs to be scoped down to users * fix: use consistent plural form for table name * feat: dispatch mock job on creation * test: scaffold test setup * test: cover further functionality * fix: use proper controller reference * fix: store entire data * fix: only update pending imports * chore: import is not used * chore: rename dummy job * feat: add input validation * refactor: handle enum validation through laravel features * feat: add job for marking stalled entity imports as stalled (#833) * feat: tighten validation on entity parameters (#838) * Run `transferbot` in a Kubernetes Job (#830) * feat: scaffold entity import job * feat: fetch credentials from mediawiki * feat: scaffold control flow for supervising job * feat: add parameterized spec * chore: connect proper job to http layer * fix: adjust kubernetes job spec to pass * feat: add callback handling * refactor: move enum to dedicated file * fix: adjust oauth grants so they work with platform user * fix: tighten validation for domain parameter * test: add test case for import job * chore: change job configuration values * chore: lock transferbot image version * refactor: address review feedback * chore: bump transferbot image version * docs: add CHANGELOG
Ticket https://phabricator.wikimedia.org/T368024
Depends on wbstack/mediawiki#447, wbstack/transferbot#3