add managed environment POC #3021
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #3021 +/- ##
===========================================
+ Coverage 51.35% 91.71% +40.35%
===========================================
Files 204 121 -83
Lines 21446 5368 -16078
Branches 2729 0 -2729
===========================================
- Hits 11014 4923 -6091
+ Misses 9834 445 -9389
+ Partials 598 0 -598
Thanks very much for putting this in. Quick question though: would you mind adding a couple of tests that cover using Environment with a config? For example, could I make a SparkEnvironment with this? And flipping it, could I override an Environment with a Spark config, for instance, back to a regular Python task? (Not saying we should, but could we document the behavior with a unit test?)
It won't support For now, it just supports ordinary |
from flytekit.core.environment import Environment


def test_spark_task():
The test may need cleanup of Spark session resources. Consider adding session cleanup using pyspark.sql.SparkSession.builder.getOrCreate().stop() before and after test execution to prevent resource leaks.
Code suggestion
Check the AI-generated fix before applying
@@ -6,6 +6,9 @@
+ # Clean up any existing sessions
+ pyspark.sql.SparkSession.builder.getOrCreate().stop()
+
env = Environment(
task_config=Spark(
spark_conf={"spark": "1"},
@@ -20,4 +23,7 @@
return 10
assert my_spark.task_config is not None
- assert my_spark.task_config.spark_conf == {"spark": "1"}
+ assert my_spark.task_config.spark_conf == {"spark": "1"}
+
+ # Clean up after test
+ pyspark.sql.SparkSession.builder.getOrCreate().stop()
Code Review Run #5d6d02
Is this a valid issue, or was it incorrectly flagged by the Agent?
- it was incorrectly flagged
def test_basic_environment():

    env = Environment(retries=2)
Do we support env.dynamic?
Creating a dynamic workflow from an Environment? That is a great question. Not currently. I can get that added though!
Yeah super easy to add.
The unit test failure is not related to your PR. You can fix it by rebasing the PR.
* Store protos in local cache (#3022)
  * Store proto obj instead of model Literal in local cache
  * Remove unused file
* Bump aiohttp from 3.9.5 to 3.10.11 (#3018)
  Bumps [aiohttp](https://github.com/aio-libs/aiohttp) from 3.9.5 to 3.10.11.
  - [Release notes](https://github.com/aio-libs/aiohttp/releases)
  - [Changelog](https://github.com/aio-libs/aiohttp/blob/master/CHANGES.rst)
  - [Commits](aio-libs/aiohttp@v3.9.5...v3.10.11)
  ---
  updated-dependencies:
  - dependency-name: aiohttp
    dependency-type: indirect
  ...
* Fix bug in FlyteDirectory.listdir on local files (#2926)
  * Fix issue in FlyteDirectory.listdir
    Fixes flyteorg/flyte#6005
  * Added test
  * Run make lint
* Fix unit tests in airflow plugin (#3024)
* fix: Fix resource meta typos for async agent (#3023)
* fix: format commands output (#3026)
* Fix pydantic basemodel default input (#3013)
  * Fix pydantic default input
  * add pydantic integration test
  * Use duck typing by Thomas's advice
  * lint
* [BUG] Open FlyteFile from remote path (#2991)
  * fix: Open FlyteFile from remote path
  * Add integration test
  * refactor: Use ctx as param instead of recreation
  * refactor: Clean test logic
    1. Remove redundant prints
    2. Use `mock.patch.dict` to setup `os.environ` for the current test fn
       * Avoid contaminating other tests running in the same process
  * refactor: Setup local path and downloader in constructor
  * refactor: Move SimpleFileTransfer to an utility file
  * Remove redundant env var setup
    Please refer to #3001
  * test: Add another ff use case
    Create ff in one task pod and read it in another task pod.
* vllm inference plugin (#2967)
  * vllm inference plugin
  * fixed default value
* Add poetry to image spec (#3025)
  * Add poetry to image spec
  * Add stricter check
* [test] Add integration test for accessing sd sttr in dc (#2969)
  * test: Add integration test for attr access of sd
  * Correct file path
  * test: Support interaction with minio s3 bucket
    1. Upload a local parquet file to minio s3 bucket
    2. Access StructuredDataset attr from a dataclass
    3. Open StructuredDataset from a remote path
  * Delete an unmerged integration test
  * Try imagespec with commit sha of corresponding fix
  * Remove redundant test
  * Remove default_factory and create sd dc from input uri
  * refactor: Clean test logic
    1. Remove redundant prints
    2. Use `mock.patch.dict` to setup `os.environ` for the current test fn
       * Avoid contaminating other tests running in the same process
  * Remove redundant minio env var setup and add test comments
  * Support uploading tmp pqt file
  * Udpate deprecated module
  * Remove redundant and unused imports

Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Pim de Haan <pim@cusp.ai>
Signed-off-by: Kevin Su <pingsutw@apache.org>
Signed-off-by: JiaWei Jiang <waynechuang97@gmail.com>
Signed-off-by: Future-Outlier <eric901201@gmail.com>
Signed-off-by: Daniel Sola <daniel.sola@union.ai>
Signed-off-by: Thomas J. Fan <thomasjpfan@gmail.com>
Co-authored-by: Eduardo Apolinario <653394+eapolinario@users.noreply.github.com>
Co-authored-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Pim de Haan <pimdehaan@gmail.com>
Co-authored-by: Kevin Su <pingsutw@apache.org>
Co-authored-by: 江家瑋 <36886416+JiangJiaWei1103@users.noreply.github.com>
Co-authored-by: V <0426vincent@gmail.com>
Co-authored-by: Han-Ru Chen (Future-Outlier) <eric901201@gmail.com>
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
Co-authored-by: Daniel Sola <40698988+dansola@users.noreply.github.com>
LGTM, thank you. Love this feature! It’s going to eliminate a lot of boilerplate code.
Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>
Signed-off-by: Grantham Taylor <granthamtaylor@icloud.com> Signed-off-by: Shuying Liang <shuying.liang@gmail.com>
Why are the changes needed?
Task environments can quickly become unwieldy for large, complex codebases. There are well over a dozen commonly used configurations, from `container_image` to `secret_requests`, that will be similar among many tasks, albeit arbitrarily different for some edge cases.
It is challenging to manage environments that contain a large number of such configurations while still being able to override those configurations for individual tasks and to extend a set of configurations to define entirely new environments.
This PR accomplishes both of these critical features in an intuitive manner.
What changes were proposed in this pull request?
This PR contributes an `Environment` class. An `Environment` contains a dictionary of configurations. These configurations are applied to `flytekit.task` at registration time.
The configurations of an `Environment` may be defined either during the creation of the `Environment` or during the creation of a task authored with an `Environment`.
As a non-limiting example:
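A minimal, self-contained sketch of that pattern follows. It does not import flytekit: `fake_task` is a hypothetical stand-in for `flytekit.task`, the `Environment` class here is a toy reimplementation, and `applied_config` and the image name are made up for illustration. The real class in this PR may differ in names and behavior.

```python
import functools
from typing import Any, Callable, Dict, Optional


def fake_task(fn: Optional[Callable] = None, **config: Any):
    """Hypothetical stand-in for flytekit.task: records the config on the function."""

    def decorate(f: Callable) -> Callable:
        @functools.wraps(f)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            return f(*args, **kwargs)

        # Illustrative attribute, not a flytekit name.
        wrapper.applied_config = dict(config)
        return wrapper

    # Support both @fake_task and @fake_task(retries=...)
    return decorate(fn) if fn is not None else decorate


class Environment:
    """Toy version: holds a dictionary of task configurations."""

    def __init__(self, **overrides: Any) -> None:
        self.overrides: Dict[str, Any] = dict(overrides)

    def task(self, fn: Optional[Callable] = None, **overrides: Any):
        # Task-level settings win over environment-level settings.
        merged = {**self.overrides, **overrides}
        return fake_task(fn, **merged)


# Configurations defined when the Environment is created.
env = Environment(retries=2, container_image="example.com/base:latest")


@env.task
def load() -> int:
    return 1


# Configuration overridden when the task is created.
@env.task(retries=5)
def train() -> int:
    return 2
```

In this sketch `load` inherits both settings from `env`, while `train` keeps the shared image and replaces only `retries`; `env` itself is left unchanged.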
Additionally, one may create deep copies of an `Environment`.
This allows an organization to define reusable environments modularly, once for an entire project or even once for the entire organization.
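The deep-copy behavior can be illustrated with a small standalone sketch. The `Environment` class below is a toy stand-in rather than the class from this PR, and the `extend` method name and the configuration keys are assumptions chosen for illustration.

```python
import copy
from typing import Any, Dict


class Environment:
    """Toy stand-in that only stores configuration overrides."""

    def __init__(self, **overrides: Any) -> None:
        self.overrides: Dict[str, Any] = overrides

    def extend(self, **overrides: Any) -> "Environment":
        # Deep-copy the parent's settings so nested values are not shared,
        # then layer the new settings on top.
        merged = {**copy.deepcopy(self.overrides), **overrides}
        return Environment(**merged)


# An organization-wide base environment (hypothetical values).
org_env = Environment(retries=2, environment={"LOG_LEVEL": "INFO"})

# A project-specific environment derived from it.
project_env = org_env.extend(retries=5)

# Mutating the copy's nested dictionary leaves the original untouched.
project_env.overrides["environment"]["LOG_LEVEL"] = "DEBUG"
```

Because the copy is deep, a project can adjust nested settings such as environment variables without affecting the shared base that other projects depend on.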
How was this patch tested?
Tests to be added. I have personally been using this pattern for the last year with Flyte, and with KFP for over three years.
Summary by Bito
This PR enhances Flytekit with multiple features: (1) introduces a new Environment class for managing task configurations with inheritance capabilities and deep copying, (2) adds VLLM model serving support and improved file handling capabilities, (3) implements dynamic workflow task support and remote path handling, while (4) improving test coverage across StructuredDataset attributes, FlyteFile handling, and configuration inheritance.
Unit tests added: True
Estimated effort to review (1-5, lower is better): 5