Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Engine identity validation #1897

Merged
merged 6 commits into from
Aug 4, 2023
Merged

Conversation

ejweber
Copy link
Contributor

@ejweber ejweber commented May 11, 2023

longhorn/longhorn#5845

This PR makes use of the changes in longhorn-engine (longhorn/longhorn-engine#902) and longhorn-instance-manager (longhorn/longhorn-instance-manager#229) to ensure identity validation during normal Longhorn operation.

@mergify
Copy link

mergify bot commented May 22, 2023

This pull request is now in conflicts. Could you fix it @ejweber? 🙏

@mergify
Copy link

mergify bot commented Jun 21, 2023

This pull request is now in conflicts. Could you fix it @ejweber? 🙏

@ejweber ejweber force-pushed the 5845-engine-identity branch 2 times, most recently from e899de5 to c860e0b Compare June 26, 2023 22:04
@mergify
Copy link

mergify bot commented Jun 28, 2023

This pull request is now in conflicts. Could you fix it @ejweber? 🙏

@ejweber ejweber force-pushed the 5845-engine-identity branch 3 times, most recently from 04ec768 to 4eccc05 Compare July 3, 2023 17:58
@ejweber ejweber marked this pull request as ready for review July 7, 2023 18:03
@ejweber ejweber requested a review from a team as a code owner July 7, 2023 18:03
@ejweber
Copy link
Contributor Author

ejweber commented Jul 7, 2023

Core tests all pass except for live engine upgrade tests, which are failing because of longhorn/longhorn-engine#902 (comment).

https://ci.longhorn.io/job/private/job/longhorn-tests-regression/4521/

Upgrade tests are running.

https://ci.longhorn.io/job/private/job/longhorn-tests-regression/4524/

@mergify
Copy link

mergify bot commented Jul 16, 2023

This pull request is now in conflicts. Could you fix it @ejweber? 🙏

go.mod Outdated Show resolved Hide resolved
Copy link
Contributor

@PhanLe1010 PhanLe1010 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT: This PR is not a draft anymore, can we edit the PR desc?

engineapi/snapshot.go Outdated Show resolved Hide resolved
@ejweber
Copy link
Contributor Author

ejweber commented Jul 24, 2023

Realized I missed the engine upgrade case while testing. This prevents us from using engine identity validation for old engines that are live upgraded.

func (c *InstanceManagerClient) engineInstanceUpgrade(req *EngineInstanceUpgradeRequest) (*longhorn.InstanceProcess, error) {
if err := CheckInstanceManagerCompatibility(c.apiMinVersion, c.apiVersion); err != nil {
return nil, err
}
frontend, err := GetEngineInstanceFrontend(req.Engine.Spec.BackendStoreDriver, req.VolumeFrontend)
if err != nil {
return nil, err
}
args := []string{"controller", req.Engine.Spec.VolumeName, "--frontend", frontend, "--upgrade"}
for _, addr := range req.Engine.Spec.UpgradedReplicaAddressMap {
args = append(args, "--replica", GetBackendReplicaURL(addr))
}
if req.EngineCLIAPIVersion >= 6 {
args = append(args,
"--size", strconv.FormatInt(req.Engine.Spec.VolumeSize, 10),
"--current-size", strconv.FormatInt(req.Engine.Status.CurrentSize, 10))
}
if req.EngineCLIAPIVersion >= 7 {
args = append(args,
"--engine-replica-timeout", strconv.FormatInt(req.EngineReplicaTimeout, 10),
"--file-sync-http-client-timeout", strconv.FormatInt(req.ReplicaFileSyncHTTPClientTimeout, 10))
if req.DataLocality == longhorn.DataLocalityStrictLocal {
args = append(args,
"--data-server-protocol", "unix")
}
if req.Engine.Spec.UnmapMarkSnapChainRemovedEnabled {
args = append(args, "--unmap-mark-snap-chain-removed")
}
}
binary := filepath.Join(types.GetEngineBinaryDirectoryForEngineManagerContainer(req.Engine.Spec.EngineImage), types.EngineBinaryName)
if c.GetAPIVersion() < 4 {
process, err := c.processManagerGrpcClient.ProcessReplace(
req.Engine.Name, binary, DefaultEnginePortCount, args, []string{DefaultPortArg}, DefaultTerminateSignal)
if err != nil {
return nil, err
}
return parseProcess(imapi.RPCToProcess(process)), nil
}
instance, err := c.instanceServiceGrpcClient.InstanceReplace(string(req.Engine.Spec.BackendStoreDriver), req.Engine.Name,
string(longhorn.InstanceManagerTypeEngine), binary, DefaultEnginePortCount, args, []string{DefaultPortArg}, DefaultTerminateSignal)
if err != nil {
return nil, err
}
return parseInstance(instance), nil
}

  • Ensure validation flags are passed to live upgraded engines.

@ejweber
Copy link
Contributor Author

ejweber commented Jul 25, 2023

There is a new commit that fixes cloning, which is not included in the core tests. My code mistakenly assumed that the volume name we cloned from and to would be the same. This caused us to try to communicate with a from replica using the to volume name (triggering an invalid gRPC error). Please rereview when you are able.

cc @PhanLe1010 @james-munson

@ejweber ejweber closed this Jul 25, 2023
@ejweber ejweber reopened this Jul 25, 2023
shuo-wu
shuo-wu previously approved these changes Jul 31, 2023
Copy link
Contributor

@shuo-wu shuo-wu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

engineapi/engine.go Show resolved Hide resolved
Copy link
Contributor

@PhanLe1010 PhanLe1010 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to pump this version to 5 too

CurrentInstanceManagerAPIVersion = 4

PhanLe1010
PhanLe1010 previously approved these changes Aug 1, 2023
Copy link
Contributor

@PhanLe1010 PhanLe1010 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

…gine identity flags and fields

Signed-off-by: Eric Weber <eric.weber@suse.com>
Signed-off-by: Eric Weber <eric.weber@suse.com>
Signed-off-by: Eric Weber <eric.weber@suse.com>
Signed-off-by: Eric Weber <eric.weber@suse.com>
Signed-off-by: Eric Weber <eric.weber@suse.com>
Signed-off-by: Eric Weber <eric.weber@suse.com>
@PhanLe1010 PhanLe1010 merged commit d36fc04 into longhorn:master Aug 4, 2023
3 checks passed
@ejweber
Copy link
Contributor Author

ejweber commented Aug 29, 2023

@mergify backport v1.5.x

@mergify
Copy link

mergify bot commented Aug 29, 2023

backport v1.5.x

✅ Backports have been created

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants