Description
Apache Airflow version:
1.10.6
Kubernetes version (if you are using kubernetes) (use kubectl version):
1.15.12-gke.2
Environment:
Google Cloud Platform
What happened:
I have a DAG with ~500 tasks, all of which can run concurrently.
The graph view looks fine, but every task has state "None". When I check the task instance details, it says:
Task's trigger rule 'all_done' requires all upstream tasks to have completed, but found False task(s) that weren't done. upstream_tasks_state={'total': 1, 'successes': Decimal('0'), 'skipped': Decimal('0'), 'failed': Decimal('0'), 'upstream_failed': Decimal('0'), 'done': 0}, upstream_task_ids={'01_bq_list_datasets_bhagavand-sandbox-sbx-b19f_1'}
for task 01_bq_list_datasets_bhagavand-sandbox-sbx-b19f_1.
So task A is waiting for task A itself to be done before it can be scheduled, which ends in a deadlock.
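To make the reported state concrete, here is a minimal sketch in plain Python. It does not import Airflow and is not Airflow's implementation; the helper names `all_done_satisfied` and `self_referencing_tasks` are hypothetical, and the check is written only from what the message above describes (every upstream task must be done). The data is copied from the message.

```python
from decimal import Decimal

# Task id and upstream state copied from the message in this report.
TASK_ID = "01_bq_list_datasets_bhagavand-sandbox-sbx-b19f_1"

reported_state = {
    "total": 1,
    "successes": Decimal("0"),
    "skipped": Decimal("0"),
    "failed": Decimal("0"),
    "upstream_failed": Decimal("0"),
    "done": 0,
}

# Mapping of task id -> set of upstream task ids, as shown in the message.
reported_upstream = {TASK_ID: {TASK_ID}}


def all_done_satisfied(state):
    """Hypothetical helper: True when every counted upstream task is done."""
    return state["done"] >= state["total"]


def self_referencing_tasks(upstream_map):
    """Hypothetical helper: task ids that appear in their own upstream set."""
    return sorted(t for t, ups in upstream_map.items() if t in ups)


print(all_done_satisfied(reported_state))
print(self_referencing_tasks(reported_upstream))
```

With the reported values, the first check is not satisfied (0 of 1 upstream tasks done), and the only upstream id listed is the task's own id.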
What you expected to happen:
The task should not wait for itself to be done.
Maybe nothing went wrong - when I check my other DAGs, tasks in them also have themselves as upstream tasks.
Still, the tasks are not being scheduled.
Anything else we need to know:
This is my biggest DAG, with 500 tasks. But that is not very many, right?
How often does this problem occur? Once? Every time etc?
Every time.