Description
Apache Airflow version: airflow:1.10.10.1-alpha2-python3.6
Kubernetes version (if you are using kubernetes) (use kubectl version): 1.16.8
Environment:
Cloud provider or hardware configuration: AWS EKS
OS (e.g. from /etc/os-release): Redhat
Install tools: Official Helm Chart
What happened:
For some tasks, the webserver tries to fetch logs from the worker pod instead of reading them from the S3 bucket, and fails:
Log file does not exist: /opt/airflow/logs/mydag/mytask/2020-07-21T11:58:55.019748+00:00/2.log
*** Fetching from: http://taskpod-49ccd964791a4740b199:8793/log/mydag/mytask/2020-07-21T11:58:55.019748+00:00/2.log
*** Failed to fetch log file from worker. HTTPConnectionPool(host='taskpod-49ccd964791a4740b199', port=8793): Max retries exceeded with url: /log/mydag/mytask/2020-07-21T11:58:55.019748+00:00/2.log (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f486c9b8be0>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution',))
What you expected to happen:
Tasks should not fetch logs from the pod. Logs should always be read from the S3 bucket, without errors.
Some tasks are working fine:
*** Reading remote log from s3://mybucket/airflow/logs/mydag/mytask1/2020-07-21T11:58:55.019748+00:00/2.log.
Anything else we need to know:
Please note that this happens intermittently. When running parallel tasks, I am able to read logs from S3 for some tasks, while others give the above error. Any help is appreciated.
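For context, remote S3 logging in Airflow 1.10 is controlled by the `[core]` options below. This is a sketch of the configuration assumed here, expressed as environment variables as typically set via the Helm chart; the bucket path and connection ID are placeholders, not values taken from this report:

```shell
# Standard Airflow 1.10 [core] remote-logging options, as env vars.
# Bucket path and connection ID are illustrative placeholders.
export AIRFLOW__CORE__REMOTE_LOGGING=True
export AIRFLOW__CORE__REMOTE_BASE_LOG_FOLDER=s3://mybucket/airflow/logs
export AIRFLOW__CORE__REMOTE_LOG_CONN_ID=aws_s3_conn
```

When `remote_logging` is enabled, the task handler should read the finished task's log from `remote_base_log_folder` first, falling back to the worker's log-serving endpoint (port 8793) only when the remote read fails.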