BUG: g++: fatal error: Killed signal terminated program cc1plus #6833
Installation with …
I reinstalled pymc now under conda, but the problem remains :-(

Operating System: Ubuntu 20.04.6 LTS subsystem under Windows 10, WSL 2
Last updated: Tue Jul 25 2023
Python implementation: CPython
arviz: 0.15.1
Watermark: 2.3.0

CompileError: Compilation failed (return status=1): g++: fatal error: Killed signal terminated program cc1plus
Hm, it seems it's still using the system compiler (…)
I am definitely sure that the environment was activated correctly. This Python version is only used for pymc. Here is the module list and the output of g++ -v:
That's not the output of …
This is what it shows for me:
You can see that a compiler is installed in my env which yours lacks; not sure why. But you can try to install it manually.
I installed clang outside the environment, but I still got:

/home/thomas/.local/lib/python3.8/site-packages/pytensor/tensor/rewriting/elemwise.py:1019: UserWarning: Loop fusion failed because the resulting node would exceed the kernel argument limit.
CompileError: Compilation failed (return status=1):

So /usr/bin/g++ is still called. Is there some additional configuration needed to switch to clang?
What I meant is that you need to install …
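As an aside (not from the thread): PyTensor selects its C++ compiler via the `cxx` configuration flag, so once a compiler exists inside the environment you can point PyTensor at it explicitly. A minimal sketch, assuming a conda-provided g++ under `$CONDA_PREFIX` (the exact binary name is an assumption and varies by platform):

```shell
# Assumption: the conda compilers package placed a g++ under $CONDA_PREFIX/bin.
# Point PyTensor at it for this shell session via PYTENSOR_FLAGS:
export PYTENSOR_FLAGS="cxx=$CONDA_PREFIX/bin/x86_64-conda-linux-gnu-g++"

# Verify which compiler PyTensor will actually use:
python -c "import pytensor; print(pytensor.config.cxx)"
```

The same flag can be made permanent in `~/.pytensorrc` under the `[global]` section.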
Well, I made a clean install.
Watermark now gives:

Python implementation: CPython
arviz: 0.16.1
Watermark: 2.4.3

Again, running the compiler-bug notebook gives, after

/home/thomas/mambaforge/envs/pymc/lib/python3.11/site-packages/pytensor/tensor/rewriting/elemwise.py:1028: UserWarning: Loop fusion failed because the resulting node would exceed the kernel argument limit.

the well-known compiler bug, but now with gcc from the env:

CompileError: Compilation failed (return status=1):

Since this used Python 3.11 and PyMC 5.7, I made a second attempt by downgrading Python to 3.8 and PyMC 3.6.1. The paths to gcc and g++ are the same as above, as is the error. So I think it is not an issue with my installations. Did you run the compiler-bug.ipynb yourself? Could you reproduce the behaviour? Since the warning …
Did you try the conda-forge channel specifically?
Can you try with a very simple model?

    import pymc as pm

    with pm.Model() as m:
        x = pm.Normal("x")
        pm.sample()

It is not clear to me whether you see a problem with specific models or in general.
Looking for: ['pymc']
conda-forge/noarch 13.5MB @ 4.0MB/s 3.7s
Pinned packages:
Transaction
Prefix: /home/thomas/mambaforge/envs/pymc
All requested packages already installed
You should install from a fresh environment.
It is the specific model of the notebook. As I explained at the beginning, a colleague of mine who authored this model has no problem at all. All of my other models worked under PyMC 5 (after some adaptations) without problems.

The simple model with pm.Model() as m: runs as expected:

Auto-assigning NUTS sampler...
100.00% [4000/4000 00:02<00:00 Sampling 2 chains, 0 divergences]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 2 seconds.
So back to your case. After you install with conda-forge, can you try running a single chain? Just trying to narrow down the issue space.
Well, installed …

Still got the same behavior:

CompileError: Compilation failed (return status=1):

Did you run the supplied notebook? How did it behave in your environment?
I think you might not have enough resources (RAM), so g++ is getting killed. E.g. soedinglab/hh-suite#280
I increased the limit for the main storage to 10 GB and still the same error occurred. Actually, I can't believe that a compilation of roughly 8 MB of C code (compare the attached generated code file) cannot be done within 10 GB.
@maresb could this be an arch issue?
No, this should be pure linux-64. This feels to me like a memory issue; maybe the 10 GB is not being made available somehow. I would check the output of …
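Not from the thread, but one way to check what WSL 2 actually grants the Linux VM (the `.wslconfig` keys below are standard WSL 2 settings; the 10GB value mirrors the limit discussed here):

```shell
# Inside the WSL 2 shell: show RAM and swap actually visible to the VM.
free -h

# The cap itself is set on the Windows side in %UserProfile%\.wslconfig:
#   [wsl2]
#   memory=10GB
#   swap=4GB
#
# After editing, restart the VM from PowerShell so the change takes effect:
#   wsl --shutdown
```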
@ThomasHoppe Not disk space but RAM.
If I say main memory, I do not mean disk space. I'm talking about 10 GB of RAM! I also enclose a video showing the last 6 minutes (of 31 minutes) of the call to pymc.sample, where you can see from …
@ThomasHoppe I misunderstood. Then it's definitely not the RAM. I'm a bit stumped, because it's not a compiler error but the compiler getting killed.
State of the bug isolation:
@ThomasHoppe I didn't have time to look at your model before. I believe the source of the problem is that you have a very inefficient model. You are doing a series of operations per row of data, which builds a very large latent graph. You can probably vectorize your operations using advanced indexing, which will make the computational graph of the model much simpler and shorter to compile.
Here is how I would write your last model (probably has bugs!!!):

    #import sklearn.preprocessing

    model_toto = pm.Model()
    with model_toto:
        score = pm.Normal("score", tau=1., mu=0., shape=nb_clubs)
        advantage_defence_diff = pm.Normal("offence_defence_diff",
                                           tau=1., mu=1.5, shape=nb_clubs)
        # number of goals scored more at home than away
        home_advantage = pm.Normal("home_advantage", tau=10., mu=.0)
        # softmax regression weights for winner prediction:
        weights = pm.Normal("weights", mu=(0., .25, -0.25), tau=100., shape=3)

        heim = np.array([hg[0] for hg in home_goals_])
        gast = np.array([hg[1] for hg in home_goals_])
        h_goals = np.array([hg[2] for hg in home_goals_])
        heim_ = np.array([ag[0] for ag in away_goals_])
        gast_ = np.array([ag[1] for ag in away_goals_])
        a_goals = np.array([ag[2] for ag in away_goals_])

        s_h_, add_h = score[heim], advantage_defence_diff[heim]
        s_g, add_g = score[gast], advantage_defence_diff[gast]
        s_h = s_h_ + home_advantage
        offence_heim = s_h + add_h
        defence_heim = s_h - add_h
        offence_gast = s_g + add_g
        defence_gast = s_g - add_g
        home_value = offence_heim - defence_gast
        away_value = offence_gast - defence_heim
        score_diff = s_h - s_g  # can be negative!

        ### no negative values (`low` is assumed defined earlier in the notebook)
        home_value = pm.math.switch(pm.math.lt(home_value, 0.), low, home_value)
        away_value = pm.math.switch(pm.math.lt(away_value, 0.), low, away_value)

        # for prediction of the winner: 0 = draw, 1 = home win, 2 = away win
        toto = np.where(
            h_goals == a_goals,
            0,
            np.where(h_goals > a_goals, 1, 2),
        )

        mu_home = pm.Deterministic("home_rate", home_value)
        pm.Poisson("home_goals", observed=h_goals, mu=mu_home)
        mu_away = pm.Deterministic("away_rate", away_value)
        pm.Poisson("away_goals", observed=a_goals, mu=mu_away)

        ha_diff = score_diff
        ha_diff = ha_diff.reshape((-1, 1))
        ha_diff = ha_diff.repeat(3, axis=1)
        pred = pm.math.exp(ha_diff * weights)
        pred = (pred.T / pm.math.sum(pred, axis=1)).T
        pm.Categorical("toto", p=pred, observed=toto)

Those index and numerical operations are vectorized just like numpy, and your model won't grow exponentially in complexity with your data size.
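To make the vectorization point concrete, here is a standalone NumPy sketch (the scores and indices are made up, not data from the thread): a single advanced-indexing gather like `score[heim]` replaces a Python loop over matches.

```python
import numpy as np

# Hypothetical per-club latent scores and per-match home-club indices.
score = np.array([0.5, -0.2, 1.1])
heim = np.array([0, 2, 1, 0])  # club index of the home team in each match

# One vectorized gather builds a constant-size graph node...
vectorized = score[heim]

# ...instead of one node per match, as a Python loop would:
looped = np.array([score[i] for i in heim])

print(vectorized)  # [ 0.5  1.1 -0.2  0.5]
print(np.array_equal(vectorized, looped))  # True
```

The same indexing syntax works on PyTensor tensors inside a PyMC model, which is why the vectorized model stays compact regardless of the number of matches.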
@ricardoV94: Thanks for the suggestion. Actually, the model was designed by a colleague, who has no problems running it; he does not encounter the compiler problem. I also suspected that the iterative solution wasn't ideal, but hadn't the time to dive deeper into it without a running reference solution. Yours seems quite plausible to me and we will give it a try ...
Let us know if it works. If not, the right place to continue this discussion would be on discourse: https://discourse.pymc.io/

Regarding your colleague: even if he could manage to compile, I am certain the model will be considerably slower the way he wrote it down. I'll close this issue in the meantime, as it's not clear it would be worth the trouble to try to make the compiler more robust to very large graphs.
Describe the issue:
During compilation of models compiler receives a kill signal (reason unknown).
Can be reproduced with two different models.
Reproducible code example:
Error message:
PyMC version information:
Occurred in 5.5.0 and 5.6.1
Detailed watermark:
Last updated: Tue Jul 18 2023
Python implementation: CPython
Python version : 3.8.10
IPython version : 8.0.1
arviz : 0.15.1
pandas : 2.0.2
daft : 0.1.2
pymc : 5.6.1
matplotlib: 3.7.1
numpy : 1.22.1
scipy : 1.7.3
pytensor: 2.12.3
Watermark: 2.3.0
Operating System: Ubuntu 20.04.6 LTS Subsystem under Windows 10 WSL-2
PyMC installation via pip
Context for the issue:
Stops further evaluation of the model with sample_posterior_prediction
D1.csv
compiler-bug.zip