segmentation fault when trying to fetch dataset #17

Open
mika-data opened this issue Feb 4, 2023 · 17 comments

@mika-data

mika-data commented Feb 4, 2023

root@server:~/Downloads# python3 -c "import sling; sling.which()"
SLING API version 3.0.0 (Sat Feb  4 11:59:11 2023) in /usr/local/lib/python3.9/dist-packages/sling


root@server:~/Downloads# sling fetch --dataset caspar
[2023-02-04 11:59:34.807193: I sling/pyapi/pytask.cc:525] Start HTTP server on port 6767
[2023-02-04 11:59:34.813720: I run.py:341] Execute command fetch
*** Signal 11 (Segmentation fault) at 0x0000020b26f0 for 0x0000020b26f0
  @ 0x0000020b26f0 (unknown)

**Segmentation fault**

root@server:~/Downloads# cat /etc/*release
PRETTY_NAME="**Debian GNU/Linux 11** (bullseye)"
NAME="Debian GNU/Linux"
VERSION_ID="11"
VERSION="11 (bullseye)"
VERSION_CODENAME=bullseye
ID=debian
@mika-data
Author

My fault: so far I have only downloaded the Python API via pip.

The command line interpreter will probably work only for a small subset of commands.

@ringgaard
Owner

@mika-data: Did you try to build the Python API yourself on your Debian machine? I normally build on Ubuntu, but I would think the differences are minor.

@mika-data
Author

No, I downloaded the Python API as a .whl file, as recommended in your installation documentation.

Previously I only had Python 3.9 on my machine. After I built Python 3.6 and then built SLING from source, everything seems to work for me.

root@server:/usr# python3 -c "import sling; sling.which()"
SLING API version 3.0.0 (Sat Feb  4 13:50:30 2023) in /usr/local/python-3.6.15/lib/python3.6/site-packages/sling
root@cgnvision:/usr# sling fetch --dataset caspar
[2023-02-04 13:51:48.997469: I sling/pyapi/pytask.cc:525] Start HTTP server on port 6767
[2023-02-04 13:51:49.007196: I run.py:341] Execute command fetch
[2023-02-04 13:51:49.008710: I sling/task/job.cc:349] All systems GO
[2023-02-04 13:51:49.008867: I sling/task/job.cc:62] Starting stage #0
[2023-02-04 13:51:49.008945: I sling/task/job.cc:66] Start url-download
[2023-02-04 13:51:49.009773: I download.py:51] Download caspar from https://ringgaard.com/data/caspar/caspar.flow
[2023-02-04 13:51:49.009979: I download.py:78] Start download of ./data/e/caspar/caspar.flow
[2023-02-04 13:51:49.741937: I download.py:94] caspar downloaded
[2023-02-04 13:51:49.742027: I sling/task/job.cc:402] Task url-download completed
[2023-02-04 13:51:49.742188: I sling/task/job.cc:407] Task url-download done
[2023-02-04 13:51:49.742255: I sling/task/job.cc:419] Stage #0 done
[2023-02-04 13:51:49.743633: I workflow.py:821] sending final status to monitor
[2023-02-04 13:51:49.743902: I run.py:351] Done

@ringgaard
Owner

Hmm, maybe I should test this on Python 3.9. My Ubuntu only has 3.8.

@mika-data
Author

I have tested it on another Debian machine, and there it worked fine:

(wikidata) mika@server:~/Programming/wikidata$ pip3 install https://ringgaard.com/data/dist/sling-3.0.0-py3-none-linux_x86_64.whl
Collecting sling==3.0.0
  Downloading https://ringgaard.com/data/dist/sling-3.0.0-py3-none-linux_x86_64.whl (7.4 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 7.4/7.4 MB 3.9 MB/s eta 0:00:00
Installing collected packages: sling
Successfully installed sling-3.0.0
(wikidata) mika@server:~/Programming/wikidata$ python3 -c "import sling; sling.which()"
SLING API version 3.0.0 (Sat Feb  4 20:58:48 2023) in /home/mika/anaconda3/envs/wikidata/lib/python3.8/site-packages/sling
(wikidata) mika@server:~/Programming/wikidata$ sling fetch --dataset caspar
[2023-02-04 20:59:03.350186: I sling/pyapi/pytask.cc:525] Start HTTP server on port 6767
[2023-02-04 20:59:03.354802: I run.py:341] Execute command fetch
[2023-02-04 20:59:03.355687: I sling/task/job.cc:349] All systems GO
[2023-02-04 20:59:03.355815: I sling/task/job.cc:62] Starting stage #0
[2023-02-04 20:59:03.355821: I sling/task/job.cc:66] Start url-download
[2023-02-04 20:59:03.356144: I download.py:51] Download caspar from https://ringgaard.com/data/caspar/caspar.flow
[2023-02-04 20:59:03.356218: I download.py:78] Start download of ./data/e/caspar/caspar.flow
[2023-02-04 20:59:05.813746: I download.py:94] caspar downloaded
[2023-02-04 20:59:05.813851: I sling/task/job.cc:402] Task url-download completed
[2023-02-04 20:59:05.814257: I sling/task/job.cc:407] Task url-download done
[2023-02-04 20:59:05.814305: I sling/task/job.cc:419] Stage #0 done
[2023-02-04 20:59:05.816389: I workflow.py:821] sending final status to monitor
[2023-02-04 20:59:05.816896: I run.py:351] Done
(wikidata) mika@blackbrain:~/Programming/wikidata$ cat /etc/*release
PRETTY_NAME="Debian GNU/Linux 11 (bullseye)"
NAME="Debian GNU/Linux"
VERSION_ID="11"
VERSION="11 (bullseye)"
VERSION_CODENAME=bullseye
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"
(wikidata) mika@server:~/Programming/wikidata$ python -V
Python 3.8.16

@ringgaard
Owner

So it seems to work on Python 3.8 but fails on Python 3.9, right?

@mika-data
Author

Yes, I can confirm the bug on a second machine:

mika@server:Downloads$ pip3 install https://ringgaard.com/data/dist/sling-3.0.0-py3-none-linux_x86_64.whl
Collecting sling==3.0.0
  Using cached https://ringgaard.com/data/dist/sling-3.0.0-py3-none-linux_x86_64.whl (7.4 MB)
Installing collected packages: sling
Successfully installed sling-3.0.0
mika@server:Downloads$ python3 -c "import sling; sling.which()"
SLING API version 3.0.0 (Sat Feb  4 21:48:31 2023) in /home/mika/.local/lib/python3.9/site-packages/sling
mika@server:Downloads$ sling fetch --dataset caspar
[2023-02-04 21:48:43.254199: I sling/pyapi/pytask.cc:525] Start HTTP server on port 6767
[2023-02-04 21:48:43.256684: I run.py:341] Execute command fetch
*** Signal 11 (Segmentation fault) at 0x0000019d0660 for 0x0000019d0660
  @ 0x0000019d0660 (unknown)
**Segmentation fault** (printed in German as "Speicherzugriffsfehler")
mika@server:Downloads$ py -V
Python 3.9.2
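The reproduction above can be scripted so that the crash is reported as a signal name instead of a bare shell message. This is only a minimal sketch and not part of SLING: `describe_exit` and `run_and_report` are hypothetical helper names, and it assumes a POSIX system where `subprocess.run` reports a child killed by signal N as return code -N.

```python
import signal
import subprocess


def describe_exit(returncode):
    """Turn a subprocess return code into a readable status string."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, a negative code -N means the child was killed by signal N.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = "signal %d" % -returncode
        return "killed by " + name
    return "exited with status %d" % returncode


def run_and_report(command):
    """Run a command (given as a list of arguments) and describe how it ended."""
    result = subprocess.run(command)
    return describe_exit(result.returncode)


# Example, assuming the sling command is installed and on PATH:
#   print(run_and_report(["sling", "fetch", "--dataset", "caspar"]))
```

Running the same script on each machine, together with the output of `python3 -V`, makes it easier to compare results across Python versions.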

@ringgaard
Owner

Let me try to see if I can reproduce this on one of my own machines.

@ringgaard
Owner

I can now reproduce the crash. It seems to have something to do with Python type registration in the pysling C extension when running in Python 3.9.

@ringgaard
Owner

It seems that pysling.so must be built using python3.9-dev for it to work with Python 3.9, so I have added support for building pysling.so for Python 3.9. Change DPYVER=36 to DPYVER=39 and rebuild using tools/buildall.sh. The 3.9 version seems to be usable with earlier versions of Python as well, but I haven't changed the default yet because I don't have Python 3.9 on all the machines that build the code.

@meerfrau

When I compile from source against Python 3.10 without Docker (using pip/venv/...), I get:

Compiling sling/pyapi/pyapi.cc failed: (Exit 1): gcc failed: error executing command (from target //sling/pyapi:pyapi) /usr/bin/gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer -g0 -O2 '-D_FORTIFY_SOURCE=1' -DNDEBUG -ffunction-sections ... (remaining 25 arguments skipped)
In file included from ./sling/pyapi/pyarray.h:19,
                 from sling/pyapi/pyapi.cc:17:
./sling/pyapi/pybase.h: In static member function 'static sling::Text sling::PyBase::GetText(PyObject*)':
./sling/pyapi/pybase.h:130:37: error: invalid conversion from 'const char*' to 'char*' [-fpermissive]
  130 |       data = PyUnicode_AsUTF8AndSize(obj, &length);
      |              ~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~
      |                                     |
      |                                     const char*

Sadly I don't know much C++, but isn't this just a matter of permitting the type conversion?

@ringgaard
Owner

Are you using the newest version of the code? Line 130 of pybase.h does not match your error message.

Is there any reason that you cannot use the pre-built wheel?

@meerfrau

meerfrau commented Apr 20, 2023

I've changed pybase to:

#include <python3.10/Python.h>
#include <python3.10/structmember.h>

> Is there any reason that you cannot use the pre-built wheel?

To see the error ;)

@meerfrau

meerfrau commented Apr 20, 2023

I'm sorry, the current sources work perfectly against Python 3.10!

PS: I installed it via sudo ln -s ./sling/python /usr/lib/python3.10/site-packages/sling. Could you please add a setup.py for people like me?

@ringgaard
Owner

@meerfrau: I use wheels instead of setuptools, so you can install SLING with the following command:

sudo pip3 install https://ringgaard.com/data/dist/sling-3.0.0-py3-none-linux_x86_64.whl

I have updated the code to support Python 3.10 by changing DPYVER=36 to DPYVER=310.

@zzysjtuiwct

zzysjtuiwct commented Oct 9, 2024

Dear author, it seems that the package installed from the wheel (pip3 install https://ringgaard.com/data/dist/sling-3.0.0-py3-none-linux_x86_64.whl) still does not work correctly with Python 3.9 and Python 3.10. I get the following error in conda environments using either of these two versions:

(test) root@2f6226b7531f:~# python3 -c "import sling; sling.which()"
SLING API version 3.0.0 (Wed Oct 9 03:03:26 2024) in /root/anaconda3/envs/test/lib/python3.9/site-packages/sling

(test) root@2f6226b7531f:~# sling fetch --dataset caspar --overwrite
[2024-10-09 03:04:31.383944: I sling/pyapi/pytask.cc:528] Start HTTP server on port 6767
[2024-10-09 03:04:31.393895: I run.py:341] Execute command fetch
*** Signal 11 (Segmentation fault) at 0x00000198d1f0 for 0x00000198d1f0
@ 0x00000198d1f0 (unknown)
Segmentation fault (core dumped)

It works fine with Python 3.8, though. Could you please update the support for Python 3.10?

@ringgaard
Owner

ringgaard commented Oct 9, 2024

It seems like there are some incompatibilities in the C extension API between python 3.8 and 3.9, so there is little hope that you can make one version which works with both.

I have now added a check to tools/buildall.sh which detects the python3 version and sets the PYVER variable. This will compile a pysling.so which is compatible with your Python version.
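The check in tools/buildall.sh is not shown in this thread. Judging only from the values quoted here (36, 39 and 310), the version tag looks like the major and minor version numbers concatenated. A minimal sketch of that mapping, where `pyver_tag` is a hypothetical helper name and not the code in the build script:

```python
import sys


def pyver_tag(version_info=None):
    """Return major and minor version concatenated, e.g. (3, 9) -> "39"."""
    if version_info is None:
        # Default to the interpreter running this script.
        version_info = sys.version_info
    major, minor = version_info[0], version_info[1]
    return "%d%d" % (major, minor)


# The Python versions discussed in this thread.
for version in [(3, 6), (3, 9), (3, 10)]:
    print(version, "->", pyver_tag(version))
```

Calling `pyver_tag()` without arguments gives the tag for the interpreter that runs the script, which may help when checking what a build on a given machine would target.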

I'm in the process of upgrading my systems to Ubuntu 22, which uses Python 3.10 by default. When this has been done, I will deprecate support for Python versions before 3.9.

In the meantime, I have made a version available that can be used with Python >= 3.9 here:

https://ringgaard.com/data/dist/sling-3.0.1-py3-none-linux_x86_64.whl

NB: Please note that this version is not updated daily.
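Until a single wheel covers all versions, the choice between the two wheels mentioned in this thread can be made from the running interpreter. The following is a rough sketch under stated assumptions: `wheel_for` is a hypothetical helper, the 3.0.1 wheel is used for Python 3.9 and later as described above, and the 3.0.0 wheel is used for everything older, although this thread only reports it working on Python 3.8.

```python
import sys

# Wheel URLs quoted in this thread.
WHEEL_BEFORE_39 = "https://ringgaard.com/data/dist/sling-3.0.0-py3-none-linux_x86_64.whl"
WHEEL_39_AND_LATER = "https://ringgaard.com/data/dist/sling-3.0.1-py3-none-linux_x86_64.whl"


def wheel_for(version_info=None):
    """Pick a wheel URL based on the (major, minor) Python version."""
    if version_info is None:
        # Default to the interpreter running this script.
        version_info = sys.version_info
    if tuple(version_info[:2]) >= (3, 9):
        return WHEEL_39_AND_LATER
    # Assumption: older interpreters use the 3.0.0 wheel (only 3.8 is confirmed here).
    return WHEEL_BEFORE_39


# Print the install command for the current interpreter.
print("pip3 install", wheel_for())
```

Run it with the same interpreter that will import sling, since the crash depends on the Python version the extension is loaded into.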
