
Preemption pipeline failure can apply eviction side effects #5022

@Aman-Cool

Description

Summary

Preemption pipeline failures can cause victim task evictions to be applied even when preemption does not successfully complete, breaking the intended atomicity of scheduler Statement operations.


Impact

  • Victim tasks may be evicted without the preemptor being scheduled
  • Gang scheduling guarantees (e.g. MinAvailable) can be violated
  • Failed preemption attempts can cause unintended workload disruption

Background

In the current preemption flow:

  • Evictions are recorded on the scheduler Statement
  • The preemptor is then pipelined
  • If the pipeline fails, only the preemptor's state is rolled back; the eviction operations already recorded remain on the Statement and may still be committed

This allows eviction side effects to escape failed preemption attempts.
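
To make the flow above concrete, here is a minimal toy model in Go. It is not the Volcano code: the `statement` type, its `evict`, `pipeline`, and `commit` methods, and `preemptLeaky` are hypothetical stand-ins that only record operations in a slice. It assumes that commit applies whatever is recorded, which is the behavior described above.

```go
package main

import (
	"errors"
	"fmt"
)

// op is one recorded operation on the toy statement.
type op struct {
	kind string // "evict" or "pipeline"
	task string
}

// statement is a hypothetical stand-in for the scheduler Statement: it
// records operations and applies them only on commit.
type statement struct {
	ops []op
}

// evict records an eviction for the given victim task.
func (s *statement) evict(task string) {
	s.ops = append(s.ops, op{kind: "evict", task: task})
}

// pipeline records the preemptor placement, or fails without recording it.
func (s *statement) pipeline(task string, fail bool) error {
	if fail {
		return errors.New("pipeline failed for " + task)
	}
	s.ops = append(s.ops, op{kind: "pipeline", task: task})
	return nil
}

// commit applies every recorded operation and returns the evicted tasks.
func (s *statement) commit() []string {
	var evicted []string
	for _, o := range s.ops {
		if o.kind == "evict" {
			evicted = append(evicted, o.task)
		}
	}
	s.ops = nil
	return evicted
}

// preemptLeaky mirrors the flow described above: evictions are recorded
// first, and a pipeline failure returns without removing them.
func preemptLeaky(s *statement, preemptor string, victims []string, failPipeline bool) error {
	for _, v := range victims {
		s.evict(v)
	}
	return s.pipeline(preemptor, failPipeline)
}

func main() {
	s := &statement{}
	err := preemptLeaky(s, "preemptor-0", []string{"victim-0", "victim-1"}, true)
	fmt.Println("pipeline error:", err)
	fmt.Println("evicted on commit:", s.commit())
}
```

In this model the pipeline returns an error, yet both victims are still returned by `commit`, which is the escape of side effects this issue describes.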


Affected Code

  • pkg/scheduler/actions/preempt/preempt.go

    • Preemption pipeline execution and failure handling
  • pkg/scheduler/framework/statement.go

    • Operation recording and commit semantics
  • pkg/scheduler/framework/statement_test.go

    • Boundary behavior around rollback / discard scenarios

Design Question

What is the correct architectural approach to ensure eviction side effects are only committed when preemption succeeds?

Option A: Full Statement Save / Discard (Preferred Pattern)

  • Save the statement before eviction
  • Discard or recover it on pipeline failure
  • Only merge the statement when preemption completes successfully
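
A rough sketch of Option A, using a toy model rather than the real `Statement` API. The `statement` type and the `discard`, `merge`, and `preemptAtomic` names are hypothetical and chosen for illustration only. The assumption is that each preemption attempt records into its own scratch statement, which is merged into the parent only after the pipeline step succeeds.

```go
package main

import (
	"errors"
	"fmt"
)

// statement is a hypothetical stand-in that only records operation names.
type statement struct {
	ops []string
}

func (s *statement) evict(task string) { s.ops = append(s.ops, "evict:"+task) }

// pipeline records the preemptor placement, or fails without recording it.
func (s *statement) pipeline(task string, fail bool) error {
	if fail {
		return errors.New("pipeline failed for " + task)
	}
	s.ops = append(s.ops, "pipeline:"+task)
	return nil
}

// discard drops everything recorded on this statement.
func (s *statement) discard() { s.ops = nil }

// merge moves another statement's operations onto this one.
func (s *statement) merge(other *statement) {
	s.ops = append(s.ops, other.ops...)
	other.ops = nil
}

// preemptAtomic records one attempt on a scratch statement and merges it
// into parent only when the pipeline step succeeds.
func preemptAtomic(parent *statement, preemptor string, victims []string, failPipeline bool) error {
	attempt := &statement{} // scratch statement for this attempt only
	for _, v := range victims {
		attempt.evict(v)
	}
	if err := attempt.pipeline(preemptor, failPipeline); err != nil {
		attempt.discard() // failed attempt leaves parent untouched
		return err
	}
	parent.merge(attempt)
	return nil
}

func main() {
	parent := &statement{}
	err := preemptAtomic(parent, "preemptor-0", []string{"victim-0"}, true)
	fmt.Println("failed attempt:", err, "parent ops:", len(parent.ops))

	err = preemptAtomic(parent, "preemptor-1", []string{"victim-1"}, false)
	fmt.Println("successful attempt:", err, "parent ops:", parent.ops)
}
```

The toy statement only records strings. A real Statement also updates session state when an eviction is recorded, so its discard path would have to undo that bookkeeping as well.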

Option B: Operation-Level Rollback

  • Track and explicitly roll back eviction operations on failure

This fixes the issue locally but may introduce undo semantics on Statement that were not originally intended.
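
For comparison, a toy sketch of what Option B could look like. Again, the `statement` type, `rollbackTo`, and `preemptWithRollback` are hypothetical names, not existing Statement methods. The sketch assumes that recording an eviction updates an in-memory view immediately, so a rollback has to reverse each operation explicitly.

```go
package main

import (
	"errors"
	"fmt"
)

// statement is a hypothetical model: evict updates the in-memory view
// immediately and records the operation so it can be undone later.
type statement struct {
	running map[string]bool // session view of which tasks are running
	ops     []string        // evicted task names, in record order
}

func (s *statement) evict(task string) {
	s.running[task] = false
	s.ops = append(s.ops, task)
}

// rollbackTo undoes every eviction recorded after mark, newest first.
func (s *statement) rollbackTo(mark int) {
	for i := len(s.ops) - 1; i >= mark; i-- {
		s.running[s.ops[i]] = true
	}
	s.ops = s.ops[:mark]
}

// pipeline stands in for placing the preemptor; fail forces the error path.
func pipeline(preemptor string, fail bool) error {
	if fail {
		return errors.New("pipeline failed for " + preemptor)
	}
	return nil
}

// preemptWithRollback remembers where this attempt's operations start and
// explicitly reverses them if the pipeline step fails.
func preemptWithRollback(s *statement, preemptor string, victims []string, fail bool) error {
	mark := len(s.ops)
	for _, v := range victims {
		s.evict(v)
	}
	if err := pipeline(preemptor, fail); err != nil {
		s.rollbackTo(mark)
		return err
	}
	return nil
}

func main() {
	s := &statement{running: map[string]bool{"victim-0": true, "victim-1": true}}
	err := preemptWithRollback(s, "preemptor-0", []string{"victim-0", "victim-1"}, true)
	fmt.Println("error:", err)
	fmt.Println("still running:", s.running["victim-0"], s.running["victim-1"])
	fmt.Println("recorded ops:", len(s.ops))
}
```

The `rollbackTo` method is exactly the kind of undo semantics the concern above refers to: the statement has to know how to reverse each operation type, not just drop it.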


Goal

Ensure failed preemption attempts never apply eviction side effects, while staying consistent with existing Statement design patterns.


Context

Identified while implementing a fix for eviction rollback on preemption pipeline failure. The initial implementation raised concerns about exposing rollback behavior on Statement, prompting this issue to align on the intended design before proceeding.


Steps to reproduce the issue

  1. Configure a workload where preemption is required to schedule a pending task
    (e.g. a gang-scheduled job that cannot be placed without evicting victim tasks).

  2. Trigger the preemption workflow so that victim task evictions are recorded on the scheduler Statement.

  3. Force the preemption pipeline to fail after eviction operations are recorded but before the preemptor is successfully scheduled
    (e.g. pipeline error, scheduling failure, or preemptor pod creation failure).

  4. Allow the scheduler Statement to be committed.

  5. Observe that victim tasks are evicted even though the preemptor was never scheduled.
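
The steps above can be walked through without a cluster. The following is a toy reproduction, not a test against the real scheduler: `job`, `statement`, `errPipeline`, and `reproduce` are hypothetical, and the victim job's `minAvailable` of 2 is an assumed value chosen so that a single leaked eviction breaks the gang constraint.

```go
package main

import (
	"errors"
	"fmt"
)

// job is a toy gang job: it is satisfied only while at least minAvailable
// of its tasks are running.
type job struct {
	minAvailable int
	running      map[string]bool
}

func (j *job) runningCount() int {
	n := 0
	for _, ok := range j.running {
		if ok {
			n++
		}
	}
	return n
}

func (j *job) gangSatisfied() bool { return j.runningCount() >= j.minAvailable }

// statement records pending evictions against a job; commit applies them.
type statement struct {
	victimJob *job
	evictions []string
}

func (s *statement) evict(task string) { s.evictions = append(s.evictions, task) }

func (s *statement) commit() {
	for _, t := range s.evictions {
		s.victimJob.running[t] = false
	}
	s.evictions = nil
}

// errPipeline simulates step 3: the preemptor cannot be pipelined.
var errPipeline = errors.New("simulated pipeline failure")

// reproduce walks steps 2 to 5 against the toy model and returns the
// victim job together with the pipeline error.
func reproduce() (*job, error) {
	victims := &job{minAvailable: 2, running: map[string]bool{"v0": true, "v1": true}}
	stmt := &statement{victimJob: victims}

	stmt.evict("v0")           // step 2: eviction recorded on the statement
	pipelineErr := errPipeline // step 3: pipeline fails after recording
	stmt.commit()              // step 4: statement is committed anyway

	return victims, pipelineErr // step 5: caller inspects the victim job
}

func main() {
	victims, err := reproduce()
	fmt.Println("pipeline error:", err)
	fmt.Println("victims running:", victims.runningCount())
	fmt.Println("gang satisfied:", victims.gangSatisfied())
}
```

In this model the victim job ends with one running task against a `minAvailable` of 2, even though the preemptor was never placed.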

Describe the results you received and expected

Expected Behavior

  • If the preemption pipeline fails, no eviction side effects should be committed
  • Victim tasks should remain running
  • Scheduler state should remain unchanged after a failed preemption attempt

Actual Behavior

  • Eviction operations recorded prior to pipeline failure are committed
  • Victim tasks are evicted despite preemption not succeeding

What version of Volcano are you using?

main branch @ current HEAD (pre-merge)

Any other relevant information

This issue is independent of Kubernetes version, OS, or kernel configuration.

The behavior is triggered by the scheduler preemption control flow and statement commit semantics, and can be reproduced in unit tests without a running cluster.

No additional logs or manifests are attached, as the issue is reproducible via scheduler logic and unit tests.
