-
Notifications
You must be signed in to change notification settings - Fork 182
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add cleanup functions to fix recursive loop and no such file raised by broken symlinks #1115
Conversation
✅ Deploy Preview for sunny-pastelito-5ecb04 canceled.
|
5fdda73
to
f282a6b
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1115 +/- ##
==========================================
- Coverage 96.47% 95.56% -0.91%
==========================================
Files 64 65 +1
Lines 3287 3338 +51
==========================================
+ Hits 3171 3190 +19
- Misses 116 148 +32 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM mostly. Minor questions inline
broken_symlinks_count = 0 | ||
deleted_symlinks_count = 0 | ||
for root_dir, dirs, files in os.walk(dir_path): | ||
paths = [os.path.join(root_dir, filepath) for filepath in files] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we check here if the filepath is a symlink?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was being checked after, but it would be a good improvement!
) | ||
|
||
|
||
# Airflow DAG parsing fails if recursive loops are found, so this method cannot be used from within an Airflow task |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we also add an example here on how to call this and where to call this like mentioned in the PR description.
Or should we also create a public docs page listing the steps that we can share with users?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a short docs will be great and also we could render example in the docs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add some tests for this module or exclude from codecov for the time being?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
) | ||
|
||
|
||
# Airflow DAG parsing fails if recursive loops are found, so this method cannot be used from within an Airflow task |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a short docs will be great and also we could render example in the docs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It sounds like an operator or decorator. I'm ok with keeping it in the project's root path, but what are your thoughts on placing it in the operator directory instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The challenge is that while identify_broken_symbolic_links
could be exposed as an operator, ideally, users would check this before it is deployed to a remote instance of Airflow. Additionally, identify_recursive_loops
cannot be exposed as an operator since it would be useless (the exception is raised at Airflow DAG parsing). Therefore, we'd need a non-operator place to expose the second use case, so it felt we could benefit from having both methods together - and it'd be up to the user on how to use them.
Assuming the cause of the issue no longer exists, this DAG can be run only once. | ||
""" | ||
|
||
# [START dirty_dir_example] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be great if we could use tag dirty_dir_example
to render this example in docs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
Thanks a lot for the reviews, @pankajastro and @pankajkoti! I started addressing them, but we had a call with the team who initially raised this error, and we found a more straightforward workaround for the problem, as described in #1076 (comment) I believe we can close this PR for now, and if this feature becomes useful in the future, we can reopen/recreate it. |
Introduce tools that help users facing two issues raised by problematic Airflow or dbt project directories:
No such file or directory "/usr/local/aiflow/dags/dbt/dbt_venv/bin/python"
#1096)Disclaimer
Cosmos is not the cause of the two issues previously mentioned. The reason for these problems is a misconfiguration of the Airflow and/or dbt directories created by users and set (by users) to be used in Cosmos DAGs.
This PR proposes methods to mitigate these problems, but users need to introduce this or similar tooling in their deployment processes to avoid the issue from happening again.
Details on the problems addressed
The first of these issues relates to the dbt project folder having a symbolic link to a file that no longer exists:
The second of these issues relates to the dbt project folder having a symbolic link to a parent folder, which leads Airflow not to be able to parse dbt DAGs at all:
The second issue, in particular, can be very destructive since it can block the scheduler from parsing any DAGs in the Airflow project, leading to critical DAGs not running in the expected schedules.
How to use this PR
For problem (1), this PR has an example DAG,
example_cosmos_cleanup_dir_dag,
which can be scheduled as any other DAG in Airflow.It was tested by using the following steps:
(a) Create symlinks and make them invalid
(b) Triggering this DAG
In its logs, we can see:
When trying to run it again, it's possible to see no more broken symlinks were found:
For problem (2), since the problem happens at Airflow DAG parsing, this issue has to be solved outside of Airflow DAGs/tasks.
We are exposing a command line that allows users to visualize and delete recursive loops. This is an example of how to do both things:
The expectation is that end-users can run this command before deploying to Airflow, ensuring the deployed dbt project folder is valid and does not contain recursive links. This same command also checks for broken symbolic links.
To validate this feature, we did the same steps to reproduce the broken symlinks and also:
Example of logs created by running this command line:
From an end-user perspective, the script that solves the problem (2) can be automatically run in a few places, including
Closes: #1096
Closes: #1076