-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Periodically Check For Migration Opportunities #214
Conversation
…nt of time, and a simple test.
fc4c9d0
to
754ee92
Compare
754ee92
to
0a971c8
Compare
…moving from in-flight map
src/scheduler/Scheduler.cpp
Outdated
// during the migration? msg.set_groupid(); | ||
|
||
if (migrationStrategy == | ||
faabric::util::MigrationStrategy::BIN_PACK) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that this is quite similar to the makeSchedulingDecision
code but with some differences that made it preferrable to be implemented as a separate bit of code:
- First, we are not actually scheduling new functions, we are just checking if a particular kind of re-arrangement is possible.
- Second, we don't enforce the changes, therefore we actually don't have to persist them (in the
HostResources
).
2130201
to
89d47f8
Compare
89d47f8
to
aa9fcc8
Compare
} | ||
|
||
std::vector<std::shared_ptr<faabric::PendingMigrations>> | ||
Scheduler::doCheckForMigrationOpportunities( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The need for a new method other than re-using makeSchedulingDecisions
arises from the fact that:
- In this method we just check for the ability to migrate, but do not actually change the underlying host resources.
- The scheduling and migration policies can differ. The former being more complex and the latter simpler.
aed0533
to
0b02aba
Compare
faabric::PendingMigrations msg; | ||
msg.set_appid(originalDecision.appId); | ||
|
||
if (migrationStrategy == faabric::util::MigrationStrategy::BIN_PACK) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't implemented the EMPTY_HOSTS
migration strategy, so I haven't implemented support an env. variable to set the migration strategy.
7f81c7c
to
446e90a
Compare
68f33b8
to
51dcc95
Compare
@@ -1554,12 +1568,10 @@ std::shared_ptr<InMemoryMpiQueue> MpiWorld::getLocalQueue(int sendRank, | |||
// Note - the queues themselves perform concurrency control | |||
void MpiWorld::initLocalQueues() | |||
{ | |||
// Assert we only allocate queues once | |||
assert(localQueues.size() == 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I remove this assertion as now this method is also called at the end of prepareMigration
to update the localQueues
with the new ranks to hosts mapping.
if (msg.recordexecgraph()) { | ||
world.setMsgForRank(msg); | ||
} | ||
world.setMsgForRank(msg); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also need the message when migrating.
@@ -303,6 +309,9 @@ faabric::util::SchedulingDecision Scheduler::makeSchedulingDecision( | |||
|
|||
// Work out how many we can handle locally | |||
int slots = thisHostResources.slots(); | |||
if (topologyHint == faabric::util::SchedulingTopologyHint::UNDERFULL) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is all the logic required to implement the UNDERFULL
topology hint. I could move it to a separate PR but it is literally just this and the tests.
// Remove the app from in-flight map if still there, and this host is the | ||
// master host for the message | ||
if (msg.masterhost() == thisHost) { | ||
removePendingMigration(msg.appid()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As mentioned in the PR description, this is the only time we remove a pending migration.
In particular, we don't remove it after we do a migration. It is not clear when it is safe to do that.
The main reason for the complexity to check is that, at the moment, we don't hit a faabric-based barrier after migrating. The PoC for a migration app is designed in a way that functions hit an application-based barrier after migrating, and that is how we know we are good to continue executing.
faabric::scheduler::FunctionCallClient& getFunctionCallClient( | ||
const std::string& otherHost); | ||
|
||
faabric::snapshot::SnapshotClient& getSnapshotClient( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need the snapshot client to be a public member in faasm/faasm#565 before migrating a function. In there, we are doing a function chain from a function in a non-master host, to a function in the master host. Faabric won't register, take, and push snapshots in this case; we have to do it ourselves manually. Thus, we need access to the snapshot client from the scheduler instance.
We don't need the function call client to be a public member, but I was reluctant to split the declaration of the clients (happy to revert).
src/scheduler/Executor.cpp
Outdated
|
||
// MPI migration | ||
if (msg.ismpi()) { | ||
// TODO - delete the pending migration |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we do this TODO
now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At the moment, faabric does not know when the migration (app-wise) has finished. In fact, faabric does not even know when a migration is taking place. Faabric only is aware that some functions throw exceptions because they have been migrated.
The workaround (hack) I have implemented is the following:
- When we call
MpiWorld::prepareMigration
the local leader sets a boolean flag in the MPI world (hasBeenMigrated
). - Everytime we hit a barrier, after all ranks have hit the barrier (and before they have left it). The local leader checks for the
hasBeenMigrated
flag. - If
true
remove the pending migration from the map.
This uses the fact that faasm will call prepareMigration
before migrating, and will call MPI_Barrier
after.
Unfortunately, this can't really be tested from within faabric (I have tested it using the distributed test in faasm).
I can't think of a not ad-hoc way of doing this due to the lack of information in faabric.
…heck for flag in barriers to remove used pendingMigrations in the scheduler
…tion, remove unnecessary field to callFunctions, and re-factor necessary calls to callFunctions
In this PR I introduce the ability to migrate detect migration opportunities, and migrate MPI functions.
Function migration in faabric is divided in two independent processes:
Detection of migration opportunities:
MigrationStrategy
but currently onlyMigrationStrategy::BIN_PACK
is supported: we try to pack functions from hosts with fewer to hosts with more. In-flight apps willing to be checked for migration opportunities, and the opportunities found are stored in shared data structures.Performing the actual migration:
MpiWorld::prepareMigration
.Known issues:
pendingMigration
when execution finishes. Even if we migrate. It is not clear when is it safe to remove a migration after migrating.