Task has itself as upstream task and does not get scheduled #11063

@Ruruthia

Apache Airflow version:
1.10.6

Kubernetes version (if you are using kubernetes) (use kubectl version):
1.15.12-gke.2

Environment:
Google Cloud Platform

What happened:
I have a DAG with ~500 tasks. They can be run concurrently.
The graph looks ok, but all the tasks have state "None". When I check the instance details it says:
Task's trigger rule 'all_done' requires all upstream tasks to have completed, but found False task(s) that weren't done. upstream_tasks_state={'total': 1, 'successes': Decimal('0'), 'skipped': Decimal('0'), 'failed': Decimal('0'), 'upstream_failed': Decimal('0'), 'done': 0}, upstream_task_ids={'01_bq_list_datasets_bhagavand-sandbox-sbx-b19f_1'}
for task 01_bq_list_datasets_bhagavand-sandbox-sbx-b19f_1.
So task A is waiting for task A itself to be done before it can be scheduled, which ends in a deadlock.
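
To check whether any task really lists itself as upstream, a small helper along these lines might be useful. It is only a sketch: `find_self_upstream` and `upstream_map` are hypothetical names, `02_example_task` is a made-up task for contrast, and the mapping is written by hand rather than read from a real DAG.

```python
from typing import Dict, List, Set


def find_self_upstream(upstream_map: Dict[str, Set[str]]) -> List[str]:
    """Return the IDs of tasks that list themselves among their upstream tasks."""
    return sorted(
        task_id
        for task_id, upstream_ids in upstream_map.items()
        if task_id in upstream_ids
    )


# Hypothetical data mirroring the message above: one task that names itself
# as upstream, plus one ordinary (made-up) task for contrast.
upstream_map = {
    "01_bq_list_datasets_bhagavand-sandbox-sbx-b19f_1": {
        "01_bq_list_datasets_bhagavand-sandbox-sbx-b19f_1"
    },
    "02_example_task": {"01_bq_list_datasets_bhagavand-sandbox-sbx-b19f_1"},
}

self_loops = find_self_upstream(upstream_map)
print(self_loops)
```

With a loaded DAG object, the mapping could probably be built as `{t.task_id: set(t.upstream_task_ids) for t in dag.tasks}`, assuming `dag.tasks` and `upstream_task_ids` behave in this version as I expect; I have not verified that against 1.10.6.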

What you expected to happen:
The task should not wait for itself to be done.

Maybe nothing went wrong - when I check my other DAGs, tasks in them also have themselves as upstream tasks.
Still, the tasks are not being scheduled.

Anything else we need to know:
This is my biggest DAG; it has ~500 tasks. But that is not very many, right?

How often does this problem occur? Once? Every time etc?
Every time.

Labels: area:Scheduler, kind:bug
