Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(backingimage): backingimage ha eviction enhancement #2742

Merged
merged 3 commits into from
Jul 2, 2024

Conversation

ChanYiLin
Copy link
Contributor

@ChanYiLin ChanYiLin commented Apr 25, 2024

ref: longhorn/longhorn#2856

  1. Add minNumberOfCopies
  2. Add nodeSelector, diskSelector
  3. Change the Spec.Disks to Spec.DiskFileSpecMap and Add evictionRequested
  4. Periodically replenish the copies if the number of healthy copies less than the minNumberOfCopies or eviction situation
    • replenishBackingImageCopies()
      • if nonFailedCopies >= MinNumberOfCopies, we check if we need to replenish one copy for eviction
        • replenish one if NonEvictingCount < MinNumberOfCopies
      • if nonFailedCopies < MinNumberOfCopies
        • replenish one copy to meet the MinNumberOfCopies requirement.
  5. Periodically delete evicted copies
    • cleanupEvictionRequestedBackingImageCopies()
      • If there is no non evicted healthy copy, don't delete the copy.
      • Otherwise, delete the evicted copy.
  6. Add concurrent-backing-image-replenish-per-node-limit to limit the number of BackingImage being synced on the same node at the same time
    • if the number = 0, longhorn won't replenish the BackingImage to new node/disk
  7. Adjust replica scheduler, replica won't be scheduled to the node where BackingImage can not be stored.

@ChanYiLin ChanYiLin self-assigned this Apr 25, 2024
Copy link

mergify bot commented Apr 25, 2024

This pull request is now in conflict. Could you fix it @ChanYiLin? 🙏

@ChanYiLin ChanYiLin force-pushed the LH2856_BackingImage_HA branch 3 times, most recently from f6f2738 to cb486bd Compare April 26, 2024 07:21
@ChanYiLin
Copy link
Contributor Author

This pull request is now in conflict. Could you fix it @ChanYiLin? 🙏

fixed.

@ChanYiLin ChanYiLin force-pushed the LH2856_BackingImage_HA branch 2 times, most recently from 93b1320 to d1e592d Compare April 26, 2024 09:27
@ChanYiLin
Copy link
Contributor Author

ChanYiLin commented Apr 26, 2024

Hi @shuo-wu
I don't know the context why we always keep the first prepared file when cleanup backing image file on the disk in node controller here
I think maybe there is a case when a user tries to evict the file on that node.
Besides, the first prepared file has nothing special
If all the files are failed, we still cleanup the bids.Spec.NodeID and bids.Spec.DiskUUID to restart the prepare process.

So I changed the logic to

  • only starts to cleanup backing image file on the disk when the file is transferred
  • it is okay to clean up the file on the first prepared file

WDYT?
Thansk

@ChanYiLin ChanYiLin marked this pull request as ready for review April 26, 2024 09:33
@ChanYiLin ChanYiLin force-pushed the LH2856_BackingImage_HA branch 2 times, most recently from a598b5f to 260a658 Compare April 26, 2024 15:46
Copy link

mergify bot commented Apr 27, 2024

This pull request is now in conflict. Could you fix it @ChanYiLin? 🙏

@ChanYiLin
Copy link
Contributor Author

This pull request is now in conflict. Could you fix it @ChanYiLin? 🙏

fixed.

@ChanYiLin ChanYiLin force-pushed the LH2856_BackingImage_HA branch 3 times, most recently from 5bcae3b to 72fc205 Compare May 2, 2024 07:52
Copy link

mergify bot commented May 2, 2024

This pull request is now in conflict. Could you fix it @ChanYiLin? 🙏

@ChanYiLin ChanYiLin force-pushed the LH2856_BackingImage_HA branch 2 times, most recently from 2226465 to 5d5160f Compare May 23, 2024 07:06
@ChanYiLin ChanYiLin force-pushed the LH2856_BackingImage_HA branch 2 times, most recently from 5eed3ff to b108be3 Compare June 7, 2024 04:41
@ChanYiLin
Copy link
Contributor Author

@ChanYiLin ChanYiLin requested a review from shuo-wu June 11, 2024 09:41
controller/backing_image_controller.go Outdated Show resolved Hide resolved
controller/backing_image_manager_controller.go Outdated Show resolved Hide resolved
controller/backing_image_controller.go Show resolved Hide resolved
@ChanYiLin ChanYiLin force-pushed the LH2856_BackingImage_HA branch 5 times, most recently from ede6a74 to 8973d98 Compare June 19, 2024 04:36
@ChanYiLin ChanYiLin requested a review from shuo-wu June 19, 2024 04:44
Copy link

mergify bot commented Jul 1, 2024

This pull request is now in conflict. Could you fix it @ChanYiLin? 🙏

Copy link
Member

@derekbit derekbit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will continue reviewing the controllers and scheduler tomorrow morning.

k8s/pkg/apis/longhorn/v1beta2/backingimage.go Show resolved Hide resolved
types/setting.go Outdated Show resolved Hide resolved
types/types.go Outdated Show resolved Hide resolved
types/types.go Outdated Show resolved Hide resolved
webhook/resources/backingimage/mutator.go Outdated Show resolved Hide resolved
webhook/resources/backingimage/mutator.go Outdated Show resolved Hide resolved
webhook/resources/backingimage/mutator.go Outdated Show resolved Hide resolved
webhook/resources/backingimage/mutator.go Outdated Show resolved Hide resolved
Copy link
Contributor

@shuo-wu shuo-wu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general LGTM except for the some comments from Derek

controller/backing_image_controller.go Show resolved Hide resolved
controller/node_controller.go Show resolved Hide resolved
controller/node_controller.go Show resolved Hide resolved
datastore/longhorn.go Outdated Show resolved Hide resolved
controller/node_controller.go Outdated Show resolved Hide resolved
controller/node_controller.go Outdated Show resolved Hide resolved
controller/replica_controller.go Outdated Show resolved Hide resolved
controller/backing_image_manager_controller.go Outdated Show resolved Hide resolved
controller/backing_image_manager_controller.go Outdated Show resolved Hide resolved
controller/backing_image_manager_controller.go Outdated Show resolved Hide resolved
@ChanYiLin ChanYiLin force-pushed the LH2856_BackingImage_HA branch 2 times, most recently from b7ae78e to 5a8e760 Compare July 2, 2024 07:12
ref: longhorn/longhorn 2856

Signed-off-by: Jack Lin <jack.lin@suse.com>
ref: longhorn/longhorn 2856

Signed-off-by: Jack Lin <jack.lin@suse.com>
… ready node and disk for BackingImage

ref: longhorn/longhorn 2856

Signed-off-by: Jack Lin <jack.lin@suse.com>
Copy link
Member

@derekbit derekbit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants