Display proper error message when a kernel gets killed #1198
Replies: 1 comment
Hi @AmitJuneja25 - thanks for opening this discussion. By "killed", do you also include a kernel's user-initiated shutdown, or is this purely about unexpected termination such as OOM or package loading issues? I suspect the latter, but just want to be on the same page. In either case, no, we don't have a means of getting this kind of failure information. This is true throughout the Jupyter ecosystem as well. That is, kernels launched in standard Jupyter Server/Lab configurations (without a Gateway server) do not report these kinds of failures either. It sounds like you're building an administrative application, which is something we've wanted to do in EG, since managing remote kernels is a bit like herding cats. The problem is that this kind of change touches the entire stack. This, coupled with the fact that EG will be moving to Kernel Provisioners and away from Process Proxies, means that more repositories would need to be involved, since only the provisioner (i.e., process proxy) knows how to interact with the resource manager. The Gateways will remain "resource manager agnostic".
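Since the termination reason lives with the resource manager rather than with EG, a dashboard could in principle ask the resource manager directly. The following is only a rough sketch of that idea for Kubernetes, not something EG provides: it assumes the kernel runs as a pod and that the dashboard has already obtained the pod's JSON (for example from `kubectl get pod <name> -o json`). The function name `kernel_termination_reason` and the sample pod data are hypothetical; the `containerStatuses`, `state`, `lastState`, and `terminated` fields are part of the standard Kubernetes pod status.

```python
from typing import Optional


def kernel_termination_reason(pod: dict) -> Optional[str]:
    """Hypothetical helper: summarize why a kernel pod's container terminated.

    `pod` is assumed to be the pod object as a plain dict (parsed JSON).
    Returns None if no container reports a terminated state.
    """
    statuses = (pod.get("status") or {}).get("containerStatuses") or []
    for cs in statuses:
        # A stopped container reports `state.terminated`; a restarted one
        # reports its previous run under `lastState.terminated`.
        for key in ("state", "lastState"):
            terminated = (cs.get(key) or {}).get("terminated")
            if terminated:
                reason = terminated.get("reason", "Unknown")
                exit_code = terminated.get("exitCode")
                name = cs.get("name", "container")
                return f"{name}: {reason} (exit code {exit_code})"
    return None


# Made-up sample data shaped like a pod whose container was OOM-killed.
example_pod = {
    "status": {
        "containerStatuses": [
            {
                "name": "kernel",
                "state": {"terminated": {"reason": "OOMKilled", "exitCode": 137}},
            }
        ]
    }
}

print(kernel_termination_reason(example_pod))
```

This only helps while the pod object still exists; if the pod is deleted when the kernel is cleaned up, the status has to be captured before that happens.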
When a kernel gets killed, we are unable to find a proper error message or reason, for example whether it was OOM or some other cause that led to the kernel being killed or terminated. We wanted to know if there is anything EG currently offers for this. We are using a dashboard application, and we would like to show that message when the kernel dies due to such unknown errors.
We are using EG as a service which is deployed in our GKE cluster.