Description
Hi,
I trained a regression model with xgboost.XGBRegressor and I'm converting it to C code using m2cgen. I noticed that once the model grows large enough (more estimators / deeper trees), the output of the generated C code no longer matches the output from Python.
Versions:
XGBoost: 1.6.2
m2cgen: 0.10.0
```python
import sys

import numpy as np
import xgboost as xgb
import m2cgen as m2c
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
from sklearn.utils import shuffle

# Raise the recursion limit so m2cgen can traverse a large tree ensemble
sys.setrecursionlimit(2147430647)

model = xgb.XGBRegressor(
    objective='reg:squarederror',
    n_estimators=236,
    learning_rate=0.299280106671234,
    max_depth=2,
    # subsample=0.8,
    # colsample_bytree=0.8,
    tree_method='gpu_hist'
)

# Train the model (X_train / X_test / y_train / y_test preparation omitted here)
model.fit(X_train, y_train)

# Predict in Python for comparison against the C output
y_pred = model.predict(X_test)

# Export the trained model to C source
c_code = m2c.export_to_c(model)
with open('xgboost.c', 'w') as f:
    f.write(c_code)
print("C code generated successfully!")
```
With the following model parameters, I was able to successfully convert the model to C and verify that the outputs are consistent between Python and C.
Below are comparison images of the outputs:
However, when I increased n_estimators to 660 and max_depth to 4 (while keeping all other parameters the same), the output from the generated C code no longer matches the Python result. I'm not sure what's causing this issue.
All tests were conducted using the exact same input data for prediction.
Have you encountered this issue before, or do you know how to fix it?
The image below shows the output mismatch:
Thanks in advance for your help!

