Description
Apache Airflow version:
1.10.12, using SQLite as the backend
Kubernetes version (if you are using kubernetes) (use kubectl version):
N/A. Using Docker Swarm 19.03.8
Environment:
- Cloud provider or hardware configuration:
No cloud, bare-metal server:
HP ProLiant DL560 Gen8, BIOS P77 12/20/2013, 64 cpus
- OS (e.g. from /etc/os-release):
Fedora release 29 (Twenty Nine)
- Kernel (e.g. uname -a):
Linux server.company.com 4.19.82-1300.fc29.x86_64 #1 SMP Fri Nov 8 10:49:58 EST 2019 x86_64 x86_64 x86_64 GNU/Linux
- Install tools:
pip
- Others:
Python 3.7.2 (default, Jan 16 2019, 19:49:22)
[GCC 8.2.1 20181215 (Red Hat 8.2.1-6)] on linux
Docker info:
Client:
Debug Mode: false
Server:
Containers: 21
Running: 0
Paused: 0
Stopped: 21
Images: 12
Server Version: 19.03.8
Storage Driver: overlay2
Backing Filesystem: <unknown>
Supports d_type: true
Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: systemd
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: active
NodeID: j0pl320hoxuqcaaa14z2znvgo
Is Manager: true
ClusterID: kpgz783mpw8aapdxchtwdu2ff
Managers: 1
Nodes: 4
Default Address Pool: 10.0.0.0/8
SubnetSize: 24
Data Path Port: 4789
Orchestration:
Task History Retention Limit: 5
Raft:
Snapshot Interval: 10000
Number of Old Snapshots to Retain: 0
Heartbeat Tick: 1
Election Tick: 10
Dispatcher:
Heartbeat Period: 5 seconds
CA Configuration:
Expiry Duration: 3 months
Force Rotate: 0
Autolock Managers: false
Root Rotation In Progress: false
Node Address: 172.29.248.55
Manager Addresses:
172.29.248.55:2377
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 7ad184331fa3e55e52b890ea95e65ba581ae3429
runc version: dc9208a3303feef5b3839f4323d9beb36df0a9dd
init version: fec3683
Security Options:
seccomp
Profile: default
Kernel Version: 4.19.82-1300.fc29.x86_64
Operating System: Fedora 29 (Twenty Nine)
OSType: linux
Architecture: x86_64
CPUs: 64
Total Memory: 125.9GiB
Name: server.company.com
ID: 7ESU:O253:JGNS:YJIY:XXX:CYTI:WFQC:6L5C:XXXX:62IO:VH23:XXXX
Docker Root Dir: /opt/docker
Debug Mode: false
HTTP Proxy: http://proxy.company.com:8080/
HTTPS Proxy: http://proxy.company:8080/
No Proxy: localhost,127.0.0.1,server.company.com,.company.com
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
privatereg.company.com:5000
localhost:5000
server.company.com:5000
127.0.0.0/8
Live Restore Enabled: false
What happened:
Created the following DAG to schedule a one-shot job:
from datetime import datetime
from datetime import timedelta

from airflow import DAG
from airflow.contrib.operators.docker_swarm_operator import DockerSwarmOperator

DEFAULT_ARGS = {
    'retry_delay': timedelta(minutes=5),
    'retries': 1,
    'email_on_failure': True,
    'email_on_retry': False,
    'email': ['myemail@company.com']
}

with DAG('24_7_box', description='24 x 7. With retries', default_args=DEFAULT_ARGS,
         schedule_interval='0 * * * Mon-Sun', start_date=datetime(2019, 7, 23),
         max_active_runs=1, catchup=False) as twenty_four_by_seven_dag:
    # See:
    # https://airflow.apache.org/docs/stable/_api/airflow/contrib/operators/docker_swarm_operator/index.html
    # https://airflow.apache.org/docs/stable/_modules/airflow/contrib/operators/docker_swarm_operator.html
    SLEEP_TASK = DockerSwarmOperator(
        task_id="SLEEP_TASK",
        image="fedora:29",
        api_version="auto",
        command="/bin/sleep 60",
        docker_url="unix://var/run/docker-sysavtbuild.sock",
        force_pull=False,
        mem_limit="500m",
        auto_remove=True,
    )
    SLEEP_TASK
What you expected to happen:
I was expecting the container to be created, stay alive for 60 seconds, and then exit with code 0. No output.
Others have reported success using the Docker Swarm Operator in the past.
Instead, the Airflow log shows the following:
[2020-10-17 09:46:58,475] {taskinstance.py:1150} ERROR - 400 Client Error: Bad Request ("json: cannot unmarshal string into Go struct field Resources.MemoryBytes of type int64")
I can run this command from docker CLI as follows:
[user@server dags]$ docker run --rm --detach fedora:29 /bin/sleep 45
29912c34f43e2dfa20d417cb80113059a183518b99215609c0aa7b37874c27db
[user@server dags]$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
29912c34f43e fedora:29 "/bin/sleep 45" 7 seconds ago Up 6 seconds gifted_pare
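My guess at the cause (an assumption, not confirmed in the code): the Swarm service spec's Resources.MemoryBytes field is an int64, so the human-readable string "500m" that mem_limit passes through cannot be unmarshalled by the Docker daemon. A minimal sketch of a possible workaround is to hand the operator an integer byte count instead of a string; the helper below is my own illustration (not an Airflow or docker-py API) of the 1024-based suffix conversion:

```python
# Hypothetical helper: convert a Docker-style size string ("500m", "2g")
# into the integer byte count that Swarm's Resources.MemoryBytes expects.
# Uses binary (1024-based) multipliers; illustration only, not Airflow code.
UNITS = {'b': 1, 'k': 1024, 'm': 1024 ** 2, 'g': 1024 ** 3}

def mem_limit_to_bytes(limit):
    """Return `limit` as an int number of bytes."""
    if isinstance(limit, int):
        return limit  # already a byte count, pass through unchanged
    if limit.isdigit():  # plain number with no unit suffix, e.g. "1048576"
        return int(limit)
    value, suffix = limit[:-1], limit[-1].lower()
    return int(value) * UNITS[suffix]

print(mem_limit_to_bytes("500m"))   # 524288000
print(mem_limit_to_bytes(1048576))  # 1048576
```

In the DAG above that would mean passing mem_limit=524288000 (or the result of a conversion like this) rather than the string "500m".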
How to reproduce it:
- Copy the DAG provided into ~/airflow/dags
- Turn ON the DAG
- Trigger the DAG or let the scheduler run it. Error will show up eventually
Anything else we need to know:
Airflow.log
*** Reading local file: /home/user/airflow/logs/avt_24_7_box/SLEEP_TASK/2020-10-17T13:24:26.101897+00:00/2.log
[2020-10-17 09:46:58,312] {taskinstance.py:670} INFO - Dependencies all met for
[2020-10-17 09:46:58,321] {taskinstance.py:670} INFO - Dependencies all met for
[2020-10-17 09:46:58,321] {taskinstance.py:880} INFO - --------------------------------------------------------------------------------
[2020-10-17 09:46:58,321] {taskinstance.py:881} INFO - Starting attempt 2 of 2
[2020-10-17 09:46:58,321] {taskinstance.py:882} INFO - --------------------------------------------------------------------------------
[2020-10-17 09:46:58,328] {taskinstance.py:901} INFO - Executing on 2020-10-17T13:24:26.101897+00:00
[2020-10-17 09:46:58,335] {standard_task_runner.py:54} INFO - Started process 35637 to run task
[2020-10-17 09:46:58,371] {standard_task_runner.py:77} INFO - Running: ['airflow', 'run', '24_7_box', 'SLEEP_TASK', '2020-10-17T13:24:26.101897+00:00', '--job_id', '55', '--pool', 'default_pool', '--raw', '-sd', 'DAGS_FOLDER/avt_test3.py', '--cfg_path', '/tmp/tmpaivrdhuu']
[2020-10-17 09:46:58,372] {standard_task_runner.py:78} INFO - Job 55: Subtask SLEEP_TASK
[2020-10-17 09:46:58,398] {logging_mixin.py:112} INFO - Running %s on host %s server.company.com
[2020-10-17 09:46:58,467] {docker_swarm_operator.py:105} INFO - Starting docker service from image fedora:29
[2020-10-17 09:46:58,475] {taskinstance.py:1150} ERROR - 400 Client Error: Bad Request ("json: cannot unmarshal string into Go struct field Resources.MemoryBytes of type int64")
Traceback (most recent call last):
  File "/home/user/virtualenv/airflow/lib64/python3.7/site-packages/docker/api/client.py", line 259, in _raise_for_status
    response.raise_for_status()
  File "/home/user/virtualenv/airflow/lib64/python3.7/site-packages/requests/models.py", line 941, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: http+docker://localhost/v1.40/services/create

During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/user/virtualenv/airflow/lib64/python3.7/site-packages/airflow/models/taskinstance.py", line 984, in run_raw_task
    result = task_copy.execute(context=context)
  File "/home/user/virtualenv/airflow/lib64/python3.7/site-packages/airflow/operators/docker_operator.py", line 277, in execute
    return self.run_image()
  File "/home/user/virtualenv/airflow/lib64/python3.7/site-packages/airflow/contrib/operators/docker_swarm_operator.py", line 119, in run_image
    labels={'name': 'airflow%s_%s' % (self.dag_id, self.task_id)}
  File "/home/user/virtualenv/airflow/lib64/python3.7/site-packages/docker/utils/decorators.py", line 34, in wrapper
    return f(self, *args, **kwargs)
  File "/home/user/virtualenv/airflow/lib64/python3.7/site-packages/docker/api/service.py", line 190, in create_service
    self._post_json(url, data=data, headers=headers), True
  File "/home/user/virtualenv/airflow/lib64/python3.7/site-packages/docker/api/client.py", line 265, in _result
    self._raise_for_status(response)
  File "/home/user/virtualenv/airflow/lib64/python3.7/site-packages/docker/api/client.py", line 261, in _raise_for_status
    raise create_api_error_from_http_exception(e)
  File "/home/user/virtualenv/airflow/lib64/python3.7/site-packages/docker/errors.py", line 31, in create_api_error_from_http_exception
    raise cls(e, response=response, explanation=explanation)
docker.errors.APIError: 400 Client Error: Bad Request ("json: cannot unmarshal string into Go struct field Resources.MemoryBytes of type int64")
[2020-10-17 09:46:58,481] {taskinstance.py:1194} INFO - Marking task as FAILED. dag_id=24_7_box, task_id=SLEEP_TASK, execution_date=20201017T132426, start_date=20201017T134658, end_date=20201017T134658
[2020-10-17 09:46:58,509] {configuration.py:373} WARNING - section/key [smtp/smtp_user] not found in config
[2020-10-17 09:46:58,583] {email.py:132} INFO - Sent an alert email to ['user@company.com']
[2020-10-17 09:47:03,312] {local_task_job.py:102} INFO - Task exited with return code 1