[flytepropeller][flyteadmin] Streaming Decks V2 #6053

Open
wants to merge 28 commits into
base: master
from

Conversation

Future-Outlier
Member

@Future-Outlier Future-Outlier commented Nov 27, 2024

Tracking issue

#5574

Why are the changes needed?

To enhance user visibility into Flyte Decks at different stages of workflow execution (running, failing, and succeeding), enabling better debugging and analysis.

Summary

Condition Description            Has Deck
Enabled and Running              Yes
Unknown State with Deck          Yes
Unknown State without Deck       No
Enabled and Succeeded            Yes
Enabled but Memory Exceeded      No
Disabled                         No

What changes were proposed in this pull request?

2025/01/14 update

  1. Add a BoolValue field. It can take three values, which represent three deck statuses:
    nil -> DeckUnknown
    true -> DeckEnabled
    false -> DeckDisabled
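
As a rough illustration of the tri-state mapping above, here is a minimal Python sketch. It is not the propeller code: the enum and function names are hypothetical, and `Optional[bool]` stands in for the protobuf BoolValue wrapper (None plays the role of nil).

```python
from enum import Enum
from typing import Optional


class DeckStatus(Enum):
    # Names mirror the three statuses listed in the PR description.
    UNKNOWN = "DeckUnknown"
    ENABLED = "DeckEnabled"
    DISABLED = "DeckDisabled"


def deck_status_from_bool_value(generates_deck: Optional[bool]) -> DeckStatus:
    """Map an optional boolean (stand-in for BoolValue) to a deck status."""
    if generates_deck is None:
        # Field not set at all, e.g. a task registered by an older flytekit.
        return DeckStatus.UNKNOWN
    return DeckStatus.ENABLED if generates_deck else DeckStatus.DISABLED


for value in (None, True, False):
    print(value, "->", deck_status_from_bool_value(value).value)
```

The point of the wrapper is the extra "not set" case: a plain boolean can only say true or false, so it cannot separate "explicitly disabled" from "never told".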

Concept:

  1. Propeller converts the node info into a NodeExecutionEvent and sends it to admin.

nev, err := ToNodeExecutionEvent(
	nCtx.NodeExecutionMetadata().GetNodeExecutionID(),
	p,
	nCtx.InputReader().GetInputPath().String(),
	nCtx.NodeStatus(),
	nCtx.ExecutionContext().GetEventVersion(),
	nCtx.ExecutionContext().GetParentInfo(),
	nCtx.Node(),
	c.clusterID,
	nCtx.NodeStateReader().GetDynamicNodeState().Phase,
	c.eventConfig,
	targetEntity)

Life Cycle:

With new flytekit (> 1.14.0)

summary:

  1. No HEAD request is made, which saves resources.
  2. The task template config determines whether the deck is enabled.

details:

  1. While the task is running, propeller keeps adding the DeckURI if FLYTE_ENABLE_DECK=true is set in the task template.
  2. Propeller puts the DeckURI into the node info and converts it into a NodeExecutionEvent that is sent to flyte admin.
  3. Flyte admin adds the DeckURI to the Closure.
  4. Flyte console gets the DeckURI by sending a request to admin.
    nativeURL = node.GetClosure().GetDeckUri()
    }
    } else {
    return nil, errors.NewFlyteAdminErrorf(codes.InvalidArgument, "unsupported source [%v]", reflect.TypeOf(req.GetSource()))
    }
    if len(nativeURL) == 0 {
    return nil, errors.NewFlyteAdminErrorf(codes.Internal, "no deckUrl found for request [%+v]", req)
    }
    ref := storage.DataReference(nativeURL)
    meta, err := s.dataStore.Head(ctx, ref)
    if err != nil {
    return nil, errors.NewFlyteAdminErrorf(codes.Internal, "failed to head object before signing url. Error: %v", err)
    }
  5. If flyte console can't get the DeckURI from the node Closure, it does not show the Flyte Deck button.

With old flytekit (<= 1.14.0)

summary:

  1. Backward compatibility is kept: the deck is shown once the task succeeds.

details:

  1. In a terminal state, propeller uses a HEAD request to check whether the deck exists.
    If it exists, the Deck URI is added to the node info.
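
The two life cycles above can be sketched as one decision function. This is a simplified Python sketch under stated assumptions, not the propeller implementation: `resolve_deck_uri`, `fake_head`, and the bucket path are hypothetical, statuses are plain strings, and the "Enabled but Memory Exceeded" row of the summary table is not modeled.

```python
from typing import Callable, Optional


def resolve_deck_uri(
    status: str,
    is_terminal: bool,
    deck_uri: str,
    head_exists: Callable[[str], bool],
) -> Optional[str]:
    """Return the deck URI to attach to the node event, or None."""
    if status == "DeckEnabled":
        # New flytekit: trust the task template, no HEAD request needed.
        return deck_uri
    if status == "DeckUnknown" and is_terminal and head_exists(deck_uri):
        # Old flytekit: a single HEAD request, only in a terminal state.
        return deck_uri
    # Disabled, or unknown with no deck found (or not terminal yet).
    return None


head_calls = []


def fake_head(uri: str) -> bool:
    # Stand-in for a blob-store HEAD request; records every call it receives.
    head_calls.append(uri)
    return True


uri = "s3://bucket/deck.html"  # hypothetical location
print(resolve_deck_uri("DeckEnabled", False, uri, fake_head))
print(resolve_deck_uri("DeckUnknown", False, uri, fake_head))
print(resolve_deck_uri("DeckUnknown", True, uri, fake_head))
print(resolve_deck_uri("DeckDisabled", True, uri, fake_head))
print("HEAD requests made:", len(head_calls))
```

Because the checks short-circuit, the blob store is only contacted for the unknown status in a terminal state, which is the resource saving the summary refers to.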

How was this patch tested?

  1. Unit tests and remote execution.

python code:

from flytekit import ImageSpec, task, workflow
from flytekit.deck import Deck

flytekit_hash = "6b55930d0a77efc3594ebaac056f2c75024e61b5"
flytekit = f"git+https://github.com/flyteorg/flytekit.git@{flytekit_hash}"

# Define custom image for the task
custom_image = ImageSpec(packages=[flytekit],
                            apt_packages=["git"],
                            registry="localhost:30000",
                            env={"FLYTE_SDK_LOGGING_LEVEL": 10},
                         )

@task(enable_deck=False, container_image=custom_image)
def t_no_deck():
    # Deck.publish()
    print("No Deck")

@task(enable_deck=True, container_image=custom_image)
def t_deck():
    """
    1st deck only shows the timeline deck
    2nd will show
    """
    import time
    for i in range(3):
        Deck.publish()
        time.sleep(1)

@task(enable_deck=True, container_image=custom_image)
def t_fail_deck():
    import time

    for i in range(3):
        Deck.publish()
        time.sleep(3)
    time.sleep(10)
    raise ValueError("Failed Deck")

@workflow
def wf():
    t_no_deck()
    t_deck()
    t_fail_deck()

if __name__ == "__main__":
    from flytekit.clis.sdk_in_container import pyflyte
    from click.testing import CliRunner
    import os

    runner = CliRunner()
    path = os.path.realpath(__file__)

    result = runner.invoke(pyflyte.main,
                           ["run", path, "t_no_deck"])
    print("Local Execution: ", result.output)

    result = runner.invoke(pyflyte.main,
                           ["run", "--remote", path,"wf"])
    print("Remote Execution: ", result.output)

Setup process

single binary.

flyte: this branch
flytekit: flyteorg/flytekit#2779
flyteconsole: flyteorg/flyteconsole#890

Screenshots

flytekit branch:
flyteorg/flytekit#2779

NEW FLYTEKIT, NO DECK, RUNNING With Deck, SUCCEED, and FAILED

OSS-STREAMING-DECK-small.mov

OLD FLYTEKIT, NO DECK, RUNNING With Deck, SUCCEED, and FAILED

OSS-STREAMING-DECK-OLD-FLYTEKIT-small.mov

Check all the applicable boxes

  • I updated the documentation accordingly.
  • All new and existing tests passed.
  • All commits are signed-off.

Related PRs

follow up questions

  1. Should we support the Abort phase for the streaming deck?

Should we support EPhaseAbort in this file?

https://github.com/flyteorg/flyte/blob/b3330ba4430538f91ae9fc7d868a29a2e96db8bd/flytepropeller/pkg/controller/nodes/handler/transition_info.go

  2. How can we support the auto-refresh UX?

Summary by Bito

This PR enhances Flyte Decks V2 streaming functionality with tri-state support through BoolValue and improved protocol buffer definitions. The changes implement refined deck URI handling with task condition-based status determination and enhanced terminal state management. Updates include renaming environment variable fields and improving secret handling mechanisms across TypeScript, Go, JavaScript, Python, and Rust implementations. The implementation maintains backward compatibility while optimizing workflow visibility and cluster assignment capabilities.

Unit tests added: True

Estimated effort to review (1-5, lower is better): 5

Future-Outlier and others added 2 commits November 27, 2024 23:36
Signed-off-by: Future-Outlier <eric901201@gmail.com>
Signed-off-by: Future-Outlier <eric901201@gmail.com>
Co-authored-by: Yi Cheng <luyc58576@gmail.com>
Co-authored-by: pingsutw  <pingsutw@apache.org>

codecov bot commented Nov 27, 2024

Codecov Report

Attention: Patch coverage is 61.72840% with 31 lines in your changes missing coverage. Please review.

Project coverage is 37.07%. Comparing base (729e71a) to head (1d18265).

Files with missing lines Patch % Lines
...lytepropeller/pkg/controller/nodes/task/handler.go 61.25% 23 Missing and 8 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6053      +/-   ##
==========================================
+ Coverage   37.05%   37.07%   +0.01%     
==========================================
  Files        1318     1318              
  Lines      132638   132693      +55     
==========================================
+ Hits        49151    49191      +40     
- Misses      79237    79246       +9     
- Partials     4250     4256       +6     
Flag Coverage Δ
unittests-datacatalog 51.58% <ø> (ø)
unittests-flyteadmin 54.34% <100.00%> (+0.02%) ⬆️
unittests-flytecopilot 30.99% <ø> (ø)
unittests-flytectl 62.29% <ø> (ø)
unittests-flyteidl 7.23% <ø> (ø)
unittests-flyteplugins 53.85% <ø> (ø)
unittests-flytepropeller 42.72% <61.25%> (+0.02%) ⬆️
unittests-flytestdlib 55.29% <ø> (+0.01%) ⬆️


Signed-off-by: Future-Outlier <eric901201@gmail.com>
switch pluginTrns.pInfo.Phase() {
case pluginCore.PhaseSuccess:
// This is to prevent the console from potentially checking the deck URI that does not exist if in final phase(PhaseSuccess).
err = pluginTrns.RemoveNonexistentDeckURI(ctx, tCtx)
Contributor

Does this do a HEAD call on the deck URI for every task that succeeds? Two thoughts here:
(1) Does the flyteadmin merge algorithm then remove the deckURI from the execution metadata?
(2) This incurs a 20-30ms performance degradation on every task execution.

Member Author

will take a look tmr, thank you!!!

Member Author

@Future-Outlier Future-Outlier Nov 28, 2024

Does this do a head call on the deck URI for every task that succeeds?

yes it will do a head call by RemoteFileOutputReader

func (r RemoteFileOutputReader) DeckExists(ctx context.Context) (bool, error) {
	md, err := r.store.Head(ctx, r.outPath.GetDeckPath())
	if err != nil {
		return false, err
	}
	return md.Exists(), nil
}

Member Author

How did you measure the performance degradation?
Did you use Grafana or other performance tools?

Member Author

does the flyteadmin merge algorithm then remove the deckURI from the execution metadata?

flyteadmin will set the deckURI in the execution metadata to nil if the propeller removes it.

@Future-Outlier
Member Author

Future-Outlier commented Nov 27, 2024

How to test it?

  1. Start a new sandbox:
flytectl demo start --image futureoutlier/sandbox:deck-1205-1138 --force
  2. Check out the streaming deck flytekit branch:
cd flytekit
gh pr checkout 2779
  3. Run a failing task (the deck is shown after it fails):
from flytekit import ImageSpec, task, workflow
from flytekit.deck import Deck

flytekit_hash = "473ae1119af6f86c26c0790dee0affa3eb29be64"
flytekit = f"git+https://github.com/flyteorg/flytekit.git@{flytekit_hash}"

# Define custom image for the task
custom_image = ImageSpec(packages=[flytekit],
                            apt_packages=["git"],
                            registry="localhost:30000",
                            env={"FLYTE_SDK_LOGGING_LEVEL": 10},
                         )

@task(enable_deck=True, container_image=custom_image)
def t_deck():
    """
    1st deck only shows the timeline deck
    2nd will show
    """
    import time
    for i in range(5):
        Deck.publish()
        # # raise Exception("This is an exception")
        time.sleep(3)

@workflow
def wf():
    t_deck()

if __name__ == "__main__":
    from flytekit.clis.sdk_in_container import pyflyte
    from click.testing import CliRunner
    import os

    runner = CliRunner()
    path = os.path.realpath(__file__)

    # result = runner.invoke(pyflyte.main,
    #                        ["run", path, "wf"])
    # print("Local Execution: ", result.output)

    result = runner.invoke(pyflyte.main,
                           ["run", "--remote", path,"wf"])
    # "--remote"
    print("Remote Execution: ", result.output)

@EngHabu
Contributor

EngHabu commented Nov 28, 2024

Mind adding screenshots for the rendered deck and refresh to the PR description?

@Future-Outlier
Member Author

Mind adding screenshots for the rendered deck and refresh to the PR description?

Yes no problem

@Future-Outlier
Member Author

Mind adding screenshots for the rendered deck and refresh to the PR description?

It's provided!
#6053 (comment)

Signed-off-by: Future-Outlier <eric901201@gmail.com>
Signed-off-by: Future-Outlier <eric901201@gmail.com>
Signed-off-by: Future-Outlier <eric901201@gmail.com>
Signed-off-by: Future-Outlier <eric901201@gmail.com>
Signed-off-by: Future-Outlier <eric901201@gmail.com>
Signed-off-by: Future-Outlier <eric901201@gmail.com>
Signed-off-by: Future-Outlier <eric901201@gmail.com>
Signed-off-by: Future-Outlier <eric901201@gmail.com>
@flyte-bot
Collaborator

flyte-bot commented Jan 2, 2025

Code Review Agent Run #f3ef5e

Actionable Suggestions - 1
  • flytepropeller/pkg/controller/nodes/task/handler.go - 1
    • Consider impact of removing deckPath parameter · Line 120-120
Additional Suggestions - 1
  • flytepropeller/pkg/controller/nodes/task/handler.go - 1
    • Consider removing unnecessary string cast · Line 43-43
Review Details
  • Files reviewed - 3 · Commit Range: 54aa165..65b6efe
    • flyteadmin/pkg/repositories/transformers/node_execution.go
    • flyteadmin/pkg/repositories/transformers/node_execution_test.go
    • flytepropeller/pkg/controller/nodes/task/handler.go
  • Files skipped - 0
  • Tools
    • Golangci-lint (Linter) - ✖︎ Failed
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • OWASP (Security Vulnerability) - ✔︎ Successful
    • GOVULNCHECK (Security Vulnerability) - ✖︎ Failed
    • SNYK (Security Vulnerability) - ✔︎ Successful


@flyte-bot
Collaborator

flyte-bot commented Jan 2, 2025

Changelist by Bito

This pull request implements the following key changes.

Key Change Files Impacted
Feature Improvement - Enhanced Deck URI Handling and Status Management

node_execution.go - Added deck URI support in node execution closure

node_execution_test.go - Added test cases for deck URI handling in node execution

handler.go - Implemented deck status management and streaming functionality

// - We relied on a HEAD request to check if the deck file exists, then added the URI to the event.
//
// After (new behavior):
// - If `FLYTE_ENABLE_DECK = true` is set in the task template config (requires Flytekit > 1.14.0),
Contributor

this comment is no longer correct right?

Member Author

yes super nice catch

@@ -380,6 +430,27 @@ func (t Handler) fetchPluginTaskMetrics(pluginID, taskType string) (*taskMetrics
return t.taskMetricsMap[metricNameKey], nil
}

func GetDeckStatus(ctx context.Context, tCtx *taskExecutionContext) (DeckStatus, error) {
// FLYTE_ENABLE_DECK is used when flytekit > 1.14.0
Contributor

update this comment


metadata := template.GetMetadata()
if metadata == nil {
return DeckUnknown, nil
Contributor

is this correct in older versions of flytekit? didn't tasks in the past also have this field? this means that this function will always return Disabled right for older versions of flytekit. meaning the condition on line 567 won't get triggered cuz it'll be disabled instead of unknown.

Member Author

oh yes, you are right, thinking solution.

Contributor

older versions of flytekit will return DeckUnknown, which ends up calling AddDeckURIIfDeckExists (which is the old code path where we would hit the blob store to check for the deck file before adding it to the events).

Signed-off-by: Future-Outlier <eric901201@gmail.com>
@flyte-bot
Collaborator

flyte-bot commented Jan 9, 2025

Code Review Agent Run Status

  • Limitations and other issues: ❌ Failure - Bito Code Review Agent didn't review this pull request automatically because it exceeded the size limit. No action is needed if you didn't intend for the agent to review it. Otherwise, you can initiate the review by typing /review in a comment below.

Comment on lines 432 to 445
// GetDeckStatus determines whether a task generates a deck based on its execution context.
//
// This function ensures backward compatibility with older Flytekit versions using the following logic:
// 1. For Flytekit > 1.14.3, the task template's metadata includes the `generates_deck` flag:
// - If `generates_deck` is set to true, it indicates that the task generates a deck, and DeckEnabled is returned.
// 2. If `generates_deck` is set to false or is not set (likely from older Flytekit versions):
// - DeckUnknown is returned as a placeholder status.
// - In terminal states, a HEAD request can be made to check if the deck file exists.
//
// In future implementations, a `DeckDisabled` status could be introduced for better performance optimization:
// - This would eliminate the need for a HEAD request in the final phase.
// - However, the tradeoff is that a new field would need to be added to FlyteIDL to support this behavior.

template, err := tCtx.tr.Read(ctx)
Member Author

Better comments!
cc @wild-endeavor
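
The logic described in the code comment above can be sketched in Python as follows. This is an illustrative sketch only, assuming that at this revision `generates_deck` is a plain boolean that reads as false when it was never set; the function names are hypothetical and statuses are plain strings.

```python
def deck_status_from_plain_bool(generates_deck: bool) -> str:
    """Two-outcome mapping described in the GetDeckStatus comment."""
    if generates_deck:
        # True can only come from a flytekit that sets the flag.
        return "DeckEnabled"
    # False is ambiguous: explicitly disabled, or never set by an older flytekit.
    return "DeckUnknown"


def needs_head_request(status: str, is_terminal: bool) -> bool:
    """A HEAD request is only considered for unknown status in a terminal state."""
    return status == "DeckUnknown" and is_terminal


for flag in (True, False):
    status = deck_status_from_plain_bool(flag)
    print(flag, "->", status, "| HEAD in terminal state:", needs_head_request(status, True))
```

With only two inputs there is no room for a DeckDisabled outcome, which is the tradeoff the comment mentions: avoiding the final HEAD request for disabled tasks needs a field that can also express "not set".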

@Future-Outlier
Member Author

Streaming Decks

#!/usr/bin/env bash

set -ex

ARCH="$(uname -m)"
case ${ARCH} in
x86_64|amd64)
  IMAGE_ARCH=amd64
  ;;
aarch64|arm64)
  IMAGE_ARCH=arm64
  ;;
*)
  >&2 echo "ERROR: Unsupported architecture: ${ARCH}"
  exit 1
  ;;
esac

FLYTECONSOLE_IMAGE="localhost:30000/flyteconsole:1216-2134"
IMAGE_DIGEST="$(docker manifest inspect --verbose "${FLYTECONSOLE_IMAGE}" | \
    jq --arg IMAGE_ARCH "${IMAGE_ARCH}" --raw-output \
      '.[] | select(.Descriptor.platform.architecture == $IMAGE_ARCH) | .Descriptor.digest')"

# Short circuit if we already have the correct distribution
[ -f cmd/single/dist/.digest ] && grep -Fxq "${IMAGE_DIGEST}" cmd/single/dist/.digest && exit 0

# Create container from desired image
CONTAINER_ID="$(docker create "${FLYTECONSOLE_IMAGE}")"
trap 'docker rm -f "${CONTAINER_ID}"' EXIT

# Copy distribution
rm -rf cmd/single/dist
docker cp "${CONTAINER_ID}:/app" cmd/single/dist
printf '%q' "${IMAGE_DIGEST}" > cmd/single/dist/.digest

Signed-off-by: Future-Outlier <eric901201@gmail.com>
@flyte-bot
Collaborator

flyte-bot commented Jan 14, 2025

Code Review Agent Run Status

  • Limitations and other issues: ❌ Failure - Bito Code Review Agent didn't review this pull request automatically because it exceeded the size limit. No action is needed if you didn't intend for the agent to review it. Otherwise, you can initiate the review by typing /review in a comment below.

Signed-off-by: Future-Outlier <eric901201@gmail.com>
@flyte-bot
Collaborator

flyte-bot commented Jan 14, 2025

Code Review Agent Run Status

  • Limitations and other issues: ❌ Failure - Bito Code Review Agent didn't review this pull request automatically because it exceeded the size limit. No action is needed if you didn't intend for the agent to review it. Otherwise, you can initiate the review by typing /review in a comment below.

@Future-Outlier
Member Author

/review

@flyte-bot
Collaborator

flyte-bot commented Jan 14, 2025

Code Review Agent Run #ecb1b4

Actionable Suggestions - 2
  • flytepropeller/pkg/controller/nodes/task/handler.go - 2
Review Details
  • Files reviewed - 15 · Commit Range: 54aa165..db4b19e
    • flyteadmin/pkg/repositories/transformers/node_execution.go
    • flyteadmin/pkg/repositories/transformers/node_execution_test.go
    • flyteidl/clients/go/assets/admin.swagger.json
    • flyteidl/gen/pb-es/flyteidl/core/tasks_pb.ts
    • flyteidl/gen/pb-go/flyteidl/core/tasks.pb.go
    • flyteidl/gen/pb-go/gateway/flyteidl/service/admin.swagger.json
    • flyteidl/gen/pb-go/gateway/flyteidl/service/agent.swagger.json
    • flyteidl/gen/pb-go/gateway/flyteidl/service/external_plugin_service.swagger.json
    • flyteidl/gen/pb-js/flyteidl.d.ts
    • flyteidl/gen/pb-js/flyteidl.js
    • flyteidl/gen/pb_python/flyteidl/core/tasks_pb2.py
    • flyteidl/gen/pb_python/flyteidl/core/tasks_pb2.pyi
    • flyteidl/gen/pb_rust/flyteidl.core.rs
    • flyteidl/protos/flyteidl/core/tasks.proto
    • flytepropeller/pkg/controller/nodes/task/handler.go
  • Files skipped - 0
  • Tools
    • Golangci-lint (Linter) - ✖︎ Failed
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful


Comment on lines +192 to +198
func (p *pluginRequestedTransition) ObserveSuccess(outputPath storage.DataReference, taskMetadata *event.TaskNodeMetadata) {
if p.execInfo.OutputInfo == nil {
p.execInfo.OutputInfo = &handler.OutputInfo{
OutputURI: outputPath,
}
} else {
p.execInfo.OutputInfo.OutputURI = outputPath
Collaborator

Consider validating OutputURI before update

Consider checking if the OutputURI is empty before updating it. The current implementation may overwrite an existing valid output URI with an empty one.

Code suggestion
Check the AI-generated fix before applying
 -	if p.execInfo.OutputInfo == nil {
 +	if p.execInfo.OutputInfo == nil && len(outputPath) > 0 {
  		p.execInfo.OutputInfo = &handler.OutputInfo{
  			OutputURI: outputPath,
  		}
  	} else {
 -		p.execInfo.OutputInfo.OutputURI = outputPath
 +		if len(outputPath) > 0 {
 +			p.execInfo.OutputInfo.OutputURI = outputPath
 +		}
  	}

Code Review Run #ecb1b4


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged

}
}
 -	pluginTrns.ObserveSuccess(tCtx.ow.GetOutputPath(), deckURI,
 +	pluginTrns.ObserveSuccess(tCtx.ow.GetOutputPath(),
Collaborator

Consider impact of method signature change

The ObserveSuccess method signature has been modified to remove the deckURI parameter, but the deck URI is now being added through separate methods. Consider if this change maintains backward compatibility with existing code.

Code suggestion
Check the AI-generated fix before applying
Suggested change
 -	pluginTrns.ObserveSuccess(tCtx.ow.GetOutputPath(),
 +	// TODO: Deprecated - deckURI parameter will be removed in future versions
 +	pluginTrns.ObserveSuccess(tCtx.ow.GetOutputPath(), nil,

Code Review Run #ecb1b4


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged

Contributor

@eapolinario eapolinario left a comment

This is looking good. Just a few minor comments.


@@ -88,6 +89,8 @@ message RuntimeMetadata {

// Task Metadata
message TaskMetadata {
// Remove generates_deck from here.
Contributor

Can you leave a comment explaining that we're reusing the name generates_deck for the field number 15, but since we're changing its type we're reserving the old field number?

Member Author

yes no problem

@@ -51,6 +51,7 @@ var childExecutionID = &core.WorkflowExecutionIdentifier{
const dynamicWorkflowClosureRef = "s3://bucket/admin/metadata/workflow"

const testInputURI = "fake://bucket/inputs.pb"
const DeckURI = "fake://bucket/deck.html"
Contributor

Can you add a test here showing that the deck uri also shows up in the case of failed executions?

Member Author

updated!

// This is for backward compatibility with older Flytekit versions.
// Older Flytekit versions did not set the `generates_deck` flag in the task template's metadata.
// So, we need to add deck URI to the event if it exists.
err = pluginTrns.AddDeckURIIfDeckExists(ctx, tCtx)
Member

Shouldn't we always check whether the file exists in the terminal state? If flytekit fails to generate a deck for some reason, we should not add deck_uri to the output info, right?

Contributor

What are the concerns with having a deck_uri set in the event? flyteconsole will still make the call to ensure that the file exists before showing the final deck, no?

Member Author

Hi, @eapolinario

Yes, but FlyteConsole currently needs to make an additional call to check if the task is in a terminal phase.

I think it's better to handle all the logic in Propeller, as this would make maintenance easier. It would also simplify FlyteConsole's implementation.

In summary:
We should keep as much backend logic in the backend as possible. This approach reduces the maintenance burden on FlyteConsole and improves the readability of the backend code.

Future-Outlier and others added 2 commits January 17, 2025 00:28
Co-authored-by: Kevin Su <pingsutw@apache.org>
Signed-off-by: Han-Ru Chen (Future-Outlier) <eric901201@gmail.com>
Co-authored-by: Kevin Su <pingsutw@apache.org>
Signed-off-by: Han-Ru Chen (Future-Outlier) <eric901201@gmail.com>
@eapolinario eapolinario mentioned this pull request Jan 16, 2025
3 tasks
@flyte-bot
Collaborator

flyte-bot commented Jan 16, 2025

Code Review Agent Run #9d753e

Actionable Suggestions - 2
  • flytepropeller/pkg/controller/nodes/task/handler.go - 1
  • flyteidl/gen/pb_python/flyteidl/core/tasks_pb2.pyi - 1
    • Inconsistent type hints for generates_deck · Line 111-111
Review Details
  • Files reviewed - 11 · Commit Range: 54aa165..564dc5f
    • flyteadmin/pkg/repositories/transformers/node_execution.go
    • flyteadmin/pkg/repositories/transformers/node_execution_test.go
    • flyteidl/gen/pb-es/flyteidl/core/tasks_pb.ts
    • flyteidl/gen/pb-go/flyteidl/core/tasks.pb.go
    • flyteidl/gen/pb-js/flyteidl.d.ts
    • flyteidl/gen/pb-js/flyteidl.js
    • flyteidl/gen/pb_python/flyteidl/core/tasks_pb2.py
    • flyteidl/gen/pb_python/flyteidl/core/tasks_pb2.pyi
    • flyteidl/gen/pb_rust/flyteidl.core.rs
    • flyteidl/protos/flyteidl/core/tasks.proto
    • flytepropeller/pkg/controller/nodes/task/handler.go
  • Files skipped - 4
    • flyteidl/clients/go/assets/admin.swagger.json - Reason: Filter setting
    • flyteidl/gen/pb-go/gateway/flyteidl/service/admin.swagger.json - Reason: Filter setting
    • flyteidl/gen/pb-go/gateway/flyteidl/service/agent.swagger.json - Reason: Filter setting
    • flyteidl/gen/pb-go/gateway/flyteidl/service/external_plugin_service.swagger.json - Reason: Filter setting
  • Tools
    • Golangci-lint (Linter) - ✖︎ Failed
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful


return DeckUnknown, regErrors.Wrapf(err, "failed to read task template")
}

deckValue := template.GetMetadata().GetGeneratesDeck()
Collaborator

Add nil check for metadata access

Consider adding error handling for the case when GetMetadata() returns nil. Currently, if template.GetMetadata() returns nil, the code will panic when calling GetGeneratesDeck() on a nil pointer.

Code suggestion
Check the AI-generated fix before applying
Suggested change
 -	deckValue := template.GetMetadata().GetGeneratesDeck()
 +	metadata := template.GetMetadata()
 +	if metadata == nil {
 +		return DeckUnknown, nil
 +	}
 +	deckValue := metadata.GetGeneratesDeck()

Code Review Run #9d753e


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged

tags: _containers.ScalarMap[str, str]
pod_template_name: str
cache_ignore_input_vars: _containers.RepeatedScalarFieldContainer[str]
is_eager: bool
def __init__(self, discoverable: bool = ..., runtime: _Optional[_Union[RuntimeMetadata, _Mapping]] = ..., timeout: _Optional[_Union[_duration_pb2.Duration, _Mapping]] = ..., retries: _Optional[_Union[_literals_pb2.RetryStrategy, _Mapping]] = ..., discovery_version: _Optional[str] = ..., deprecated_error_message: _Optional[str] = ..., interruptible: bool = ..., cache_serializable: bool = ..., generates_deck: bool = ..., tags: _Optional[_Mapping[str, str]] = ..., pod_template_name: _Optional[str] = ..., cache_ignore_input_vars: _Optional[_Iterable[str]] = ..., is_eager: bool = ...) -> None: ...
generates_deck: _wrappers_pb2.BoolValue
Collaborator

Inconsistent type hints for generates_deck

Consider updating the type hint for generates_deck to be consistent between the class attribute and constructor parameter. The class attribute uses _wrappers_pb2.BoolValue while the constructor parameter uses _Optional[_Union[_wrappers_pb2.BoolValue, _Mapping]].

Code suggestion
Check the AI-generated fix before applying
Suggested change
 -	generates_deck: _wrappers_pb2.BoolValue
 +	generates_deck: _Optional[_Union[_wrappers_pb2.BoolValue, _Mapping]]

Code Review Run #9d753e


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged

Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
@flyte-bot
Collaborator

flyte-bot commented Jan 16, 2025

Code Review Agent Run #4a1c3e

Actionable Suggestions - 0
Additional Suggestions - 1
  • flyteadmin/pkg/repositories/gormimpl/signal_repo_test.go - 1
Review Details
  • Files reviewed - 38 · Commit Range: 564dc5f..69ba94e
    • charts/flyte-binary/templates/deployment.yaml
    • charts/flyte-binary/values.yaml
    • docker/sandbox-bundled/manifests/complete-agent.yaml
    • docker/sandbox-bundled/manifests/complete.yaml
    • docker/sandbox-bundled/manifests/dev.yaml
    • flyteadmin/pkg/common/filters.go
    • flyteadmin/pkg/repositories/gormimpl/signal_repo_test.go
    • flyteidl/gen/pb-es/flyteidl/admin/launch_plan_pb.ts
    • flyteidl/gen/pb-es/flyteidl/core/security_pb.ts
    • flyteidl/gen/pb-es/flyteidl/core/tasks_pb.ts
    • flyteidl/gen/pb-es/flyteidl/core/workflow_pb.ts
    • flyteidl/gen/pb-go/flyteidl/admin/execution.pb.go
    • flyteidl/gen/pb-go/flyteidl/admin/launch_plan.pb.go
    • flyteidl/gen/pb-go/flyteidl/core/security.pb.go
    • flyteidl/gen/pb-go/flyteidl/core/tasks.pb.go
    • flyteidl/gen/pb-go/flyteidl/core/workflow.pb.go
    • flyteidl/gen/pb-js/flyteidl.d.ts
    • flyteidl/gen/pb-js/flyteidl.js
    • flyteidl/gen/pb_python/flyteidl/admin/execution_pb2.py
    • flyteidl/gen/pb_python/flyteidl/admin/launch_plan_pb2.py
    • flyteidl/gen/pb_python/flyteidl/admin/launch_plan_pb2.pyi
    • flyteidl/gen/pb_python/flyteidl/core/security_pb2.py
    • flyteidl/gen/pb_python/flyteidl/core/security_pb2.pyi
    • flyteidl/gen/pb_python/flyteidl/core/workflow_pb2.py
    • flyteidl/gen/pb_python/flyteidl/core/workflow_pb2.pyi
    • flyteidl/gen/pb_rust/flyteidl.admin.rs
    • flyteidl/gen/pb_rust/flyteidl.core.rs
    • flyteidl/protos/flyteidl/admin/execution.proto
    • flyteidl/protos/flyteidl/admin/launch_plan.proto
    • flyteidl/protos/flyteidl/core/security.proto
    • flyteidl/protos/flyteidl/core/tasks.proto
    • flyteidl/protos/flyteidl/core/workflow.proto
    • flytepropeller/pkg/webhook/k8s_secrets.go
    • flytepropeller/pkg/webhook/k8s_secrets_test.go
    • flytepropeller/pkg/webhook/utils.go
    • flytestdlib/cache/auto_refresh.go
    • flytestdlib/cache/auto_refresh_example_test.go
    • flytestdlib/cache/in_memory_auto_refresh.go
  • Files skipped - 5
    • charts/flyte-binary/README.md - Reason: Filter setting
    • flyteidl/clients/go/assets/admin.swagger.json - Reason: Filter setting
    • flyteidl/gen/pb-go/gateway/flyteidl/service/admin.swagger.json - Reason: Filter setting
    • flyteidl/gen/pb-go/gateway/flyteidl/service/agent.swagger.json - Reason: Filter setting
    • flyteidl/gen/pb-go/gateway/flyteidl/service/external_plugin_service.swagger.json - Reason: Filter setting
  • Tools
    • Golangci-lint (Linter) - ✖︎ Failed
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful

AI Code Review powered by Bito

@flyte-bot
Collaborator

flyte-bot commented Jan 17, 2025

Code Review Agent Run #daa4f1

Actionable Suggestions - 0
Review Details
  • Files reviewed - 11 · Commit Range: 69ba94e..c992eae
    • flyteidl/gen/pb-es/flyteidl/core/security_pb.ts
    • flyteidl/gen/pb-go/flyteidl/core/security.pb.go
    • flyteidl/gen/pb-js/flyteidl.d.ts
    • flyteidl/gen/pb-js/flyteidl.js
    • flyteidl/gen/pb_python/flyteidl/core/security_pb2.py
    • flyteidl/gen/pb_python/flyteidl/core/security_pb2.pyi
    • flyteidl/gen/pb_rust/flyteidl.core.rs
    • flyteidl/protos/flyteidl/core/security.proto
    • flytepropeller/pkg/webhook/k8s_secrets.go
    • flytepropeller/pkg/webhook/k8s_secrets_test.go
    • flytepropeller/pkg/webhook/utils.go
  • Files skipped - 4
    • flyteidl/clients/go/assets/admin.swagger.json - Reason: Filter setting
    • flyteidl/gen/pb-go/gateway/flyteidl/service/admin.swagger.json - Reason: Filter setting
    • flyteidl/gen/pb-go/gateway/flyteidl/service/agent.swagger.json - Reason: Filter setting
    • flyteidl/gen/pb-go/gateway/flyteidl/service/external_plugin_service.swagger.json - Reason: Filter setting
  • Tools
    • Golangci-lint (Linter) - ✖︎ Failed
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful


@Future-Outlier
Member Author

Future-Outlier commented Jan 17, 2025

Hi, @eapolinario @pingsutw @wild-endeavor

All of the following cases are now handled.
Let's push this PR forward!

| Condition | Has Deck |
| --- | --- |
| Enabled and Running | Yes |
| Unknown State with Deck | Yes |
| Unknown State without Deck | No |
| Enabled and Succeeded | Yes |
| Enabled but Memory Exceeded | No |
| Disabled | No |

Signed-off-by: Future-Outlier <eric901201@gmail.com>
@flyte-bot
Collaborator

flyte-bot commented Jan 17, 2025

Code Review Agent Run #dc455e

Actionable Suggestions - 5
  • flytepropeller/pkg/controller/nodes/task/handler.go - 4
  • flyteadmin/pkg/repositories/transformers/node_execution_test.go - 1
Review Details
  • Files reviewed - 2 · Commit Range: c992eae..1d18265
    • flyteadmin/pkg/repositories/transformers/node_execution_test.go
    • flytepropeller/pkg/controller/nodes/task/handler.go
  • Files skipped - 0
  • Tools
    • Golangci-lint (Linter) - ✖︎ Failed
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful


Comment on lines +557 to +562
	if (deckStatus == DeckUnknown || deckStatus == DeckEnabled) && pluginTrns.pInfo.Phase().IsTerminal() {
		if err := pluginTrns.RemoveDeckURIIfDeckNotExists(ctx, tCtx); err != nil {
			logger.Errorf(ctx, "Failed to remove deck URI if deck does not exist. Error: %v", err)
		}
	}
}()

Consider proper error handling for RemoveDeckURIIfDeckNotExists

Consider checking for errors from RemoveDeckURIIfDeckNotExists before proceeding with task completion. The current implementation only logs the error and continues execution, which could lead to an inconsistent state.

Code suggestion
Check the AI-generated fix before applying
Suggested change
- if (deckStatus == DeckUnknown || deckStatus == DeckEnabled) && pluginTrns.pInfo.Phase().IsTerminal() {
- 	if err := pluginTrns.RemoveDeckURIIfDeckNotExists(ctx, tCtx); err != nil {
- 		logger.Errorf(ctx, "Failed to remove deck URI if deck does not exist. Error: %v", err)
- 	}
- }
- }()
+ if (deckStatus == DeckUnknown || deckStatus == DeckEnabled) && pluginTrns.pInfo.Phase().IsTerminal() {
+ 	if err := pluginTrns.RemoveDeckURIIfDeckNotExists(ctx, tCtx); err != nil {
+ 		logger.Errorf(ctx, "Failed to remove deck URI if deck does not exist. Error: %v", err)
+ 		// Return error to allow proper handling at higher levels
+ 		return pluginTrns, err
+ 	}
+ }
+ }
+ }()

Code Review Run #dc455e


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged
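
A note on the suggestion above: the cleanup runs inside a deferred `func() { ... }()` closure, which has no result values, so a `return pluginTrns, err` placed there would not compile as written. If a cleanup error ever did need to reach the caller, the usual Go pattern is a named result that the deferred closure assigns to. The sketch below shows only that general pattern; `cleanup` and `run` are hypothetical names and this is not what the PR does (the PR logs the error and continues).

```go
package main

import (
	"errors"
	"fmt"
)

// cleanup stands in for a fallible cleanup step (hypothetical).
func cleanup(fail bool) error {
	if fail {
		return errors.New("cleanup failed")
	}
	return nil
}

// run declares named results so the deferred closure can overwrite err
// after the function body has already executed its return statement.
func run(failCleanup bool) (result string, err error) {
	defer func() {
		// The closure itself returns nothing; it reports failure by
		// assigning to the enclosing function's named result.
		if cerr := cleanup(failCleanup); cerr != nil && err == nil {
			err = cerr
		}
	}()
	return "done", nil
}

func main() {
	res, err := run(false)
	fmt.Println(res, err)

	res, err = run(true)
	fmt.Println(res, err)
}
```

Whether propagating the error is desirable here is a separate design question: failing a task that otherwise succeeded because a deck lookup failed may be worse than logging and moving on.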

Comment on lines +556 to +562
defer func() {
	if (deckStatus == DeckUnknown || deckStatus == DeckEnabled) && pluginTrns.pInfo.Phase().IsTerminal() {
		if err := pluginTrns.RemoveDeckURIIfDeckNotExists(ctx, tCtx); err != nil {
			logger.Errorf(ctx, "Failed to remove deck URI if deck does not exist. Error: %v", err)
		}
	}
}()

Consider extracting deck cleanup logic

Consider moving the deck URI cleanup logic to a separate function for better code organization and reusability. The deferred function could be simplified by extracting the logic into a named function.

Code suggestion
Check the AI-generated fix before applying
Suggested change
- defer func() {
- 	if (deckStatus == DeckUnknown || deckStatus == DeckEnabled) && pluginTrns.pInfo.Phase().IsTerminal() {
- 		if err := pluginTrns.RemoveDeckURIIfDeckNotExists(ctx, tCtx); err != nil {
- 			logger.Errorf(ctx, "Failed to remove deck URI if deck does not exist. Error: %v", err)
- 		}
- 	}
- }()
+ defer cleanupDeckURI(ctx, tCtx, deckStatus, pluginTrns)
+ func cleanupDeckURI(ctx context.Context, tCtx *taskExecutionContext, deckStatus DeckStatus, pluginTrns *pluginRequestedTransition) {
+ 	if (deckStatus == DeckUnknown || deckStatus == DeckEnabled) && pluginTrns.pInfo.Phase().IsTerminal() {
+ 		if err := pluginTrns.RemoveDeckURIIfDeckNotExists(ctx, tCtx); err != nil {
+ 			logger.Errorf(ctx, "Failed to remove deck URI if deck does not exist. Error: %v", err)
+ 		}
+ 	}
+ }

Code Review Run #dc455e


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged
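
One behavioural difference between the two forms is worth keeping in mind: `defer f(args...)` evaluates its arguments when the `defer` statement runs, whereas a deferred closure reads the variables when the surrounding function returns. The sketch below demonstrates that general Go rule with hypothetical names (`transition`, `report`, `demo`); whether it matters in handler.go depends on whether `deckStatus` or `pluginTrns` are reassigned after the `defer` statement, which the excerpt does not show.

```go
package main

import "fmt"

// transition is a stand-in type for illustration only.
type transition struct{ phase string }

// report appends what it observes to the log.
func report(log *[]string, label string, t *transition) {
	*log = append(*log, label+":"+t.phase)
}

// demo defers the same call in both styles, then reassigns t.
func demo() (log []string) {
	t := &transition{phase: "running"}

	// Arguments are evaluated here, so this call keeps the pointer to
	// the "running" value even after t is reassigned below.
	defer report(&log, "direct", t)

	// The closure reads t only when demo returns, so it sees the new value.
	defer func() { report(&log, "closure", t) }()

	t = &transition{phase: "succeeded"}
	return log
}

func main() {
	// Deferred calls run in last-in, first-out order.
	for _, entry := range demo() {
		fmt.Println(entry)
	}
}
```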

p.execInfo.OutputInfo.DeckURI = deckURI
}

func (p *pluginRequestedTransition) RemoveDeckURIIfDeckNotExists(ctx context.Context, tCtx *taskExecutionContext) error {

Consider clearer function name

The function name RemoveDeckURIIfDeckNotExists seems to have a double negative and may be confusing. Consider renaming to ClearDeckURIIfMissing or similar for better clarity.

Code suggestion
Check the AI-generated fix before applying
Suggested change
- func (p *pluginRequestedTransition) RemoveDeckURIIfDeckNotExists(ctx context.Context, tCtx *taskExecutionContext) error {
+ func (p *pluginRequestedTransition) ClearDeckURIIfMissing(ctx context.Context, tCtx *taskExecutionContext) error {

Code Review Run #dc455e


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged

exists, err := reader.DeckExists(ctx)
if err != nil {
	logger.Errorf(ctx, "Failed to check deck file existence. Error: %v", err)
	p.execInfo.OutputInfo.DeckURI = nil

Consider nil check for OutputInfo

Consider initializing OutputInfo if it is nil before setting DeckURI, to avoid a potential nil pointer dereference in RemoveDeckURIIfDeckNotExists.

Code suggestion
Check the AI-generated fix before applying
 @@ -102,2 +102,5 @@
 		logger.Errorf(ctx, "Failed to check deck file existence. Error: %v", err)
 +		if p.execInfo.OutputInfo == nil {
 +			p.execInfo.OutputInfo = &handler.OutputInfo{}
 +		}
 		p.execInfo.OutputInfo.DeckURI = nil

Code Review Run #dc455e


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged
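
For reference, a nil guard of the kind discussed above can also be written as an early return instead of allocating an empty struct, since there is nothing to clear when the parent is nil. This is a minimal sketch with stand-in types (`outputInfo`, `execInfo`, `clearDeckURI` are hypothetical and only mirror the field names in the snippet); it makes no claim about whether `OutputInfo` can actually be nil at that point in the PR.

```go
package main

import "fmt"

// Stand-in types that mirror only the fields used in the snippet above.
type outputInfo struct{ DeckURI *string }
type execInfo struct{ OutputInfo *outputInfo }

// clearDeckURI drops the deck URI and is safe to call when either
// the info or its OutputInfo is nil.
func clearDeckURI(info *execInfo) {
	if info == nil || info.OutputInfo == nil {
		return // nothing to clear
	}
	info.OutputInfo.DeckURI = nil
}

func main() {
	uri := "placeholder-deck-uri" // illustrative value only
	info := &execInfo{OutputInfo: &outputInfo{DeckURI: &uri}}

	clearDeckURI(info)
	fmt.Println(info.OutputInfo.DeckURI == nil)

	clearDeckURI(&execInfo{}) // no OutputInfo: returns without panicking
}
```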

Comment on lines +202 to +209
error := &core.ExecutionError{
	Code: "foo",
}
request := admin.NodeExecutionEventRequest{
	Event: &event.NodeExecutionEvent{
		Phase: core.NodeExecution_FAILED,
		OutputResult: &event.NodeExecutionEvent_Error{
			Error: error,

Variable name shadows built-in error type

Using error as a variable name shadows the built-in error type within that scope. Consider renaming it to execError or similar.

Code suggestion
Check the AI-generated fix before applying
 -	error := &core.ExecutionError{
 +	execError := &core.ExecutionError{
 		Code: "foo",
 	}
 -			Error: error,
 +			Error: execError,

Code Review Run #dc455e


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged
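
To make the shadowing concern concrete: `error` is a predeclared identifier in Go, not a keyword, so `error := ...` compiles, but any later use of `error` as a type in the same scope then refers to the variable and fails to compile. The sketch below uses the renamed form; `executionError` and `describe` are hypothetical stand-ins, not the flyteidl types. In a short test where the type is never needed again the shadowing is harmless, which is consistent with the author's response above.

```go
package main

import "fmt"

// executionError is a stand-in for illustration only.
type executionError struct{ Code string }

// describe builds an error value from a struct literal.
func describe() string {
	// If this variable were named `error`, the declaration `var err error`
	// below would no longer compile, because `error` would refer to the
	// local variable rather than the built-in interface type.
	execError := &executionError{Code: "foo"}

	var err error = fmt.Errorf("code %s", execError.Code)
	return err.Error()
}

func main() {
	fmt.Println(describe())
}
```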

Member Author

@Future-Outlier Future-Outlier left a comment

Screenshot taken with the old flytekit:

image
