-
Notifications
You must be signed in to change notification settings - Fork 202
Move the capacity reporting into db_discovery #3778
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #3778 +/- ##
===================================================
+ Coverage 31.68993% 31.74015% +0.05022%
===================================================
Files 158 158
Lines 47564 47536 -28
===================================================
+ Hits 15073 15088 +15
+ Misses 31603 31560 -43
Partials 888 888
... and 3 files with indirect coverage changes Continue to review full report in Codecov by Sentry.
🚀 New features to boost your workflow:
|
// Only update network capabilities every 25 minutes | ||
if time.Since(dbo.lastNetworkCapabilitiesReported) >= networkCapabilitiesReportingInterval { | ||
// Save network capabilities in LivepeerNode | ||
dbo.node.UpdateNetworkCapabilities(orchNetworkCapabilities) | ||
|
||
dbo.lastNetworkCapabilitiesReported = time.Now() | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why don't we always send network capabilities?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to leave the existing behaviour alone, so only sending to kafka every 25 mins, wdyt @ad-astra-video ? Going from 25 mins to 10 seconds seems wrong :)
cfg.LiveAIAuthWebhookURL = fs.String("liveAIAuthWebhookUrl", "", "Live AI RTMP authentication webhook URL") | ||
cfg.LivePaymentInterval = fs.Duration("livePaymentInterval", *cfg.LivePaymentInterval, "Interval to pay process Gateway <> Orchestrator Payments for Live AI Video") | ||
cfg.LiveOutSegmentTimeout = fs.Duration("liveOutSegmentTimeout", *cfg.LiveOutSegmentTimeout, "Timeout duration to wait the output segment to be available in the Live AI pipeline; defaults to no timeout") | ||
cfg.LiveAICapRefreshModels = fs.String("liveAICapRefreshModels", "", "Comma separated list of models to periodically fetch capacity for. Leave unset to switch off periodic refresh.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a note that we need to be careful with deploying this change because if we have this flag configured in infra the gateway will fail to start because the flag does not exist anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah it's a pain, i guess maybe i could leave it in but not use it, then come along after the prod deploy and remove it
Since these two pieces of code were doing similar things, move capacity reporting into
db_discovery
, more importantly this also addresses the issue we have currently where we only track idle containers in the metrics as we were getting an OrchestratorCapped error if the O was busy.