Describe the bug
To disable RMM pooling, the docs say to set `spark.rapids.memory.gpu.pool=NONE` instead of the deprecated config `spark.rapids.memory.gpu.pooling.enabled=false`. However, the former is not respected when Python GPU scheduling is enabled, i.e. `spark.rapids.sql.python.gpu.enabled=true`.

Steps/Code to reproduce bug
Minimal repro with a Python UDF (a sketch follows below). Running with `spark.python.daemon.module=rapids.daemon` and the new config `spark.rapids.memory.gpu.pool=NONE`, pooling does not occur on the python worker, as expected.

Then, if we additionally enable Python GPU scheduling with `spark.rapids.sql.python.gpu.enabled=true`, pooling does occur on the python worker:

> DEBUG: Pooled memory, pool size: 2965.8984375 MiB, max size: 8796093022208.0 MiB

when it should be honoring the JVM pooling conf `spark.rapids.memory.gpu.pool=NONE`.
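The repro script itself isn't included in the issue; a minimal sketch of what it might look like (the UDF, data, and builder-style config are assumptions, and the `spark.jars`/`PYTHONPATH` settings from the config block below are omitted for brevity):

```python
# Hypothetical minimal repro; names and data are illustrative assumptions.
from pyspark.sql import SparkSession
from pyspark.sql.functions import udf
from pyspark.sql.types import LongType

spark = (
    SparkSession.builder
    .config("spark.plugins", "com.nvidia.spark.SQLPlugin")
    .config("spark.python.daemon.module", "rapids.daemon")
    .config("spark.rapids.sql.python.gpu.enabled", "true")
    .config("spark.rapids.memory.gpu.pool", "NONE")  # new config under test
    .getOrCreate()
)

@udf(returnType=LongType())
def plus_one(x):
    return x + 1

# Any Python UDF sends rows through the rapids.daemon python worker, which
# initializes RMM and emits the pool debug line quoted above.
spark.range(1000).select(plus_one("id")).collect()
```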
Then if we revert to the old pooling config `spark.rapids.memory.gpu.pooling.enabled=false`:

```
spark.jars=$RAPIDS_JAR
spark.executorEnv.PYTHONPATH=$RAPIDS_JAR
spark.plugins=com.nvidia.spark.SQLPlugin
spark.rapids.sql.explain=ALL
spark.rapids.memory.gpu.allocFraction=0.5
spark.rapids.memory.pinnedPool.size=0
spark.python.daemon.module=rapids.daemon
spark.rapids.sql.python.gpu.enabled=true
spark.rapids.memory.gpu.pooling.enabled=false  # switch to the old config
```
Pooling does not occur on the python worker. So with `spark.rapids.sql.python.gpu.enabled=true` we should be respecting the new config as well.
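For context, "pooling" on the python worker refers to how it initializes RMM. A minimal sketch of the two behaviors using the `rmm` package's `reinitialize` API (illustrative only, not the plugin's actual worker code; the pool size is an assumption):

```python
import rmm

# What spark.rapids.memory.gpu.pool=NONE should produce on the worker:
# no pre-reserved pool; every allocation goes directly to the CUDA driver.
rmm.reinitialize(pool_allocator=False)

# What the worker currently does when spark.rapids.sql.python.gpu.enabled=true,
# regardless of the new config: reserve a pool up front and sub-allocate from it
# (the debug line above reported a ~2.9 GiB pool).
rmm.reinitialize(
    pool_allocator=True,
    initial_pool_size=2 * 1024**3,  # 2 GiB; an arbitrary example size
)
```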
Reverts #842.
`rapids.daemon` python workers do not respect the new pooling configs when `spark.rapids.sql.python.gpu.enabled=true` (see [this issue](NVIDIA/spark-rapids#12228)); we need to use the old config to disable pooling.
---------
Signed-off-by: Rishi Chandra <rishic@nvidia.com>