add managed environment POC #3021

Open
wants to merge 8 commits into
base: master


granthamtaylor
Contributor

@granthamtaylor granthamtaylor commented Dec 26, 2024

Why are the changes needed?

Task environments can quickly become unwieldy in large, complex codebases. There are well over a dozen commonly used configurations, from container_image to secret_requests, that will be similar across many tasks, albeit arbitrarily different for some edge cases.

It is challenging to manage environments that contain many such configurations while still being able to override them for individual tasks and to extend a set of configurations to define entirely new environments.

This PR supports both per-task overrides and environment extension in an intuitive manner.

What changes were proposed in this pull request?

This PR contributes an Environment class. An Environment contains a dictionary of configurations. These configurations are applied to flytekit.task at registration time.

The configurations of an Environment may be defined either when the Environment is created or when a task is authored with that Environment.

As a non-limiting example:

lite = Environment(
    container_image=fl.ImageSpec(builder="union", requirements="requirements.txt"),
    requests=fl.Resources(mem="2Gi"),
    retries=3,
    cache=True,
    cache_version="v0.0.3",
    secret_requests=[fl.Secret(key="WANDB_API_KEY")],
    environment={"PYTHONUNBUFFERED": "1"},
)

@lite.task
def my_task(...):
    # this will include all of the configurations defined in `lite`
    ...

@lite.task(retries=0)
def my_other_task(...):
    # this will include all of the configurations defined in `lite`, but with `retries` overwritten to 0
    ...

Additionally, one may extend an Environment, which creates a deep copy with the given configurations overridden:

processor = lite.extend(
    requests=fl.Resources(cpu="8", mem="16Gi"),
    cache_serialize=True,
)

@processor.task
def my_big_task(...):
    # this will include all of the configurations defined in `processor`
    ...

@processor.task(retries=0)
def my_other_big_task(...):
    # this will include all of the configurations defined in `processor`, but with `retries` overwritten to 0
    ...

This allows for an organization to modularly define reusable environments once for an entire project, or, perhaps even define reusable environments for the entire organization.
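As a rough illustration of the merge-and-override behavior described above, here is a minimal sketch. It is not the implementation in this PR: `fake_task` is a hypothetical stand-in for flytekit.task that only records the merged settings, and the attribute names `overrides` and `task_config` are assumptions made for the example.

```python
import copy
from typing import Any, Callable, Optional


def fake_task(fn: Callable, **config: Any) -> Callable:
    """Hypothetical stand-in for flytekit.task: records the config on the function."""
    fn.task_config = config
    return fn


class Environment:
    """Illustrative sketch only; not the class added in this PR."""

    def __init__(self, **overrides: Any) -> None:
        self.overrides = overrides

    def extend(self, **overrides: Any) -> "Environment":
        # Deep-copy the parent's settings so the child never shares mutable values.
        merged = {**copy.deepcopy(self.overrides), **overrides}
        return Environment(**merged)

    def task(self, fn: Optional[Callable] = None, **overrides: Any):
        # Per-task overrides win over the environment's own settings.
        merged = {**copy.deepcopy(self.overrides), **overrides}

        def decorate(f: Callable) -> Callable:
            return fake_task(f, **merged)

        # Support both `@env.task` and `@env.task(retries=0)`.
        return decorate(fn) if fn is not None else decorate


lite = Environment(retries=3, cache=True, environment={"PYTHONUNBUFFERED": "1"})
processor = lite.extend(cache_serialize=True)


@lite.task
def my_task() -> None: ...


@processor.task(retries=0)
def my_other_big_task() -> None: ...
```

In this sketch the parent environment is left untouched by `extend`, and a keyword passed to `task` replaces the environment's value for that one task only.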

How was this patch tested?

Tests to be added. I have personally been using this pattern for the last year with Flyte, and with KFP for over three years.

Check all the applicable boxes

  • I updated the documentation accordingly.
  • All new and existing tests passed.
  • All commits are signed-off.

Summary by Bito

Introduction of a new Environment class in Flytekit for managing task configurations, providing modular and reusable functionality. The system enables creating, extending, and updating environment configurations with inheritance capabilities. This implementation focuses on improving configuration management flexibility and maintainability through a structured approach, allowing for complex configuration management through a reusable framework.

Unit tests added: True

Estimated effort to review (1-5, lower is better): 1


codecov bot commented Dec 26, 2024

Codecov Report

Attention: Patch coverage is 77.19298% with 13 lines in your changes missing coverage. Please review.

Project coverage is 76.08%. Comparing base (bc0e8c0) to head (543ea4f).
Report is 9 commits behind head on master.

Files with missing lines | Patch % | Lines
flytekit/core/environment.py | 76.78% | 8 Missing and 5 partials ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           master    #3021       +/-   ##
===========================================
+ Coverage   51.35%   76.08%   +24.72%     
===========================================
  Files         204      211        +7     
  Lines       21446    21868      +422     
  Branches     2729     2740       +11     
===========================================
+ Hits        11014    16638     +5624     
+ Misses       9834     4423     -5411     
- Partials      598      807      +209     


@flyte-bot
Contributor

flyte-bot commented Dec 30, 2024

Code Review Agent Run #763fc1

Actionable Suggestions - 0
Review Details
  • Files reviewed - 2 · Commit Range: 481576f..6761d6b
    • flytekit/core/environments.py
    • tests/flytekit/unit/experimental/environments.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful


@flyte-bot
Contributor

flyte-bot commented Dec 30, 2024

Changelist by Bito

This pull request implements the following key changes.

Key Change: New Feature - Task Environment Management System

Files Impacted:
  • __init__.py - Added Environment class import to make it accessible from the main package
  • environment.py - Implemented new Environment class for managing task configurations
  • test_environment.py - Added comprehensive test suite for Environment class functionality

@flyte-bot
Contributor

flyte-bot commented Dec 30, 2024

Code Review Agent Run #089365

Actionable Suggestions - 0
Review Details
  • Files reviewed - 1 · Commit Range: 6761d6b..543ea4f
    • flytekit/__init__.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful


Contributor

@wild-endeavor wild-endeavor left a comment


thanks very much for putting this in. quick question though - would you mind adding a couple of tests that cover using Environment with a config? Like could I make a SparkEnvironment with this? And flipping it, could I override an Environment with a Spark config, for instance, back to a regular python task? (not saying we should, but could we document the behavior with a unit test?)


def inherit(old: dict[str, Any], new: dict[str, Any]) -> dict[str, Any]:
    old = copy.deepcopy(old)
    new = copy.deepcopy(new)
Contributor


we probably don't need to copy new right? it doesn't get mutated.
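To make the suggestion concrete, here is a minimal sketch of a merge that copies only `old`. The function body is an assumption, since only the first lines of the real `inherit` are quoted above; the real function may do more than a plain dictionary update.

```python
import copy
from typing import Any


def inherit(old: dict[str, Any], new: dict[str, Any]) -> dict[str, Any]:
    # Hypothetical body: only the first lines of the real function are quoted in the review.
    merged = copy.deepcopy(old)  # copy so the caller's `old` is never mutated
    merged.update(new)           # reads `new` without modifying it
    return merged


base = {"retries": 3, "environment": {"PYTHONUNBUFFERED": "1"}}
override = {"retries": 0}
result = inherit(base, override)
```

One trade-off of this variant: values taken from `new` are shared by reference with the result, so if callers later mutate a nested value in the merged dictionary, the change would also be visible through `new`. Whether that matters depends on how the merged dictionary is used.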
