SEGFAULT and/or SEGABRT with GtkCanvas #87
Comments
My solution for our UI package for now will be to have something akin to:

mutable struct CairoMakieCanvas
    canvas::GtkCanvas
    figure::CairoMakie.Figure
end

function draw(fig::Figure, canvas::CairoMakieCanvas)
    canvas.figure = fig
    @guarded draw(canvas.canvas) do widget
        # ... CairoMakie.cairo_draw and so on
    end
end

and update the figure and draw
When the canvas is resized, the
Alternatively, could you call
I'm not sure if I understand the situation correctly. The replacement due to resize would happen in either case, right? And also with the workaround I mentioned in my comment? In the MWE the stacktrace seemed to only be printed in the case without the
I played around a bit more with the workaround I posted and so far I wasn't able to crash the GUI again. I'll only be able to do a proper stress test on Monday with live image reconstructions and multiple sensor readings in parallel.
Ah okay, I see the issue with having to redraw the surface now. What is still unclear to me is why we get an issue when we construct the figure locally and don't have an outside reference to it. Maybe that issue is not being reproduced by the MWE correctly.

For the reference issue we could try an approach with a package extension on CairoMakie and using ScopedValues. In the extension we could define a new draw method as follows:

const figure_cache = ScopedValue{CairoMakie.Figure}()

function Gtk4.draw(fig::CairoMakie.Figure, canvas::GtkCanvas)
    @with figure_cache => fig begin
        @guarded draw(canvas) do widget
            f = figure_cache[]
            # `config` is a CairoMakie.ScreenConfig defined elsewhere
            screen = CairoMakie.Screen(f.scene, config, Gtk4.cairo_surface(canvas))
            CairoMakie.resize!(f.scene, Gtk4.width(widget), Gtk4.height(widget))
            CairoMakie.cairo_draw(screen, f.scene)
        end
    end
end

And then I think each instance of this function call should have its own figure_cache. I haven't tried it out yet (because Julia is broken on my PC atm), but once I do I can make a PR if I see that helping us with our problem.
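One point that may be worth checking before relying on this approach: scoped values are dynamically scoped, so a closure that is stored inside the `@with` block and invoked later sees whatever value is in scope at call time, not the value from when it was created. The pure-Julia sketch below illustrates that behaviour, assuming Julia 1.11 or later. The names `current_figure`, `stored_callback` and `register_and_draw` are hypothetical stand-ins, a `String` replaces the figure, and whether Gtk4's `draw` ever calls the callback outside the block (for example on a resize) is not verified here.

```julia
using Base.ScopedValues  # part of Base from Julia 1.11 (assumption about the version in use)

# Hypothetical stand-in for the figure cache; a String replaces the Figure.
const current_figure = ScopedValue("none")

# Hypothetical stand-in for a toolkit that stores a draw callback and calls it again later.
const stored_callback = Ref{Function}(() -> "no callback")

function register_and_draw(f)
    stored_callback[] = f   # keep the callback for later redraws
    return f()              # immediate draw, still inside the caller's dynamic scope
end

# The first draw happens inside the scope, so it sees "figure A".
first_draw = with(current_figure => "figure A") do
    register_and_draw(() -> current_figure[])
end

# A later redraw runs outside the scope and falls back to the default value.
later_draw = stored_callback[]()

println(first_draw)  # figure A
println(later_draw)  # none
```

If redraws can happen outside the block, capturing the figure directly in the closure (or storing it next to the canvas, as in the wrapper struct suggested earlier in the thread) would avoid depending on the dynamic scope.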
For me the stacktraces appear whether or not the
Do the crashes occur more when you use multiple threads?
Yes, so with the CairoMakie figures we would get a few assertion errors in Makie where the surface was a C_NULL. If we had a lot of drawing updates, either because we had many plots being updated "at the same time" or updates in quick succession, we had our segfaults. Those two cases I can test tomorrow more thoroughly with the
In those cases we also always have more than two threads, but I think most updates to the drawing function happen within
The issue where I tested my workaround I could reproduce by opening two windows with, I think, 5 GtkCanvases each. That setup has recently crashed reliably with SEGFAULT/SIGABRT, and it happened with just one thread too. All the data and code one would need for that is open-source, so I can give you a reproducer for that, but it's not a small MWE.
Oh, and we just added a reference to the figure, not to the surface. Now I'm unsure if that actually fixed it for us or if we just don't encounter our race condition anymore.
Okay so we are adding/updating the
I think I'm still missing something, because it looks like when we throw away the surface and resize it, that should happen "atomically" from a GTK loop perspective and the draw calls shouldn't be invoked when a surface is currently empty. It might be that our GUI is keeping some reference to the wrong CairoContext and tries to draw to it, and now that I switched over to making most things in Makie I accidentally fixed it.
I did notice that we currently have multiple draw calls per resize in Gtk4. One in
So far we also don't have any more crashes when doing "online" reconstructions, which we used to have immediately last week. In that setting we have a thread receiving data and another thread periodically reconstructing a 3D image, which we then display on such a grid of canvases:
Now each canvas is updated with its own figure, which it caches until the next image is reconstructed. Once I'm back at the system I can also try resizing everything during this process. I also didn't find any function in our GUI that keeps a reference to a CairoContext outside of a draw function call.
@jwahlstrand I was trying to get a better MWE and I can reproduce a similar crash with it:

using Gtk4, CairoMakie, Cairo

config = CairoMakie.ScreenConfig(1.0, 1.0, :good, true, false, nothing)
CairoMakie.activate!()

grid = GtkGrid()
grid[1,1] = GtkCanvas()
grid[1,2] = GtkCanvas()
grid[2,1] = GtkCanvas()
grid[2,2] = GtkCanvas()
grid.row_homogeneous = true
grid.column_homogeneous = true
w = GtkWindow(grid,"CairoMakie example")

function showData(canvas, data::AbstractMatrix)
    @guarded draw(canvas) do widget
        f, ax, p = heatmap(data)
        CairoMakie.autolimits!(ax)
        screen = CairoMakie.Screen(f.scene, config, Gtk4.cairo_surface(canvas))
        CairoMakie.resize!(f.scene, Gtk4.width(widget), Gtk4.height(widget))
        CairoMakie.cairo_draw(screen, f.scene)
    end
end

function showData(canvas, data::AbstractVector)
    @guarded draw(canvas) do widget
        f, ax, p = lines(data)
        CairoMakie.autolimits!(ax)
        screen = CairoMakie.Screen(f.scene, config, Gtk4.cairo_surface(canvas))
        CairoMakie.resize!(f.scene, Gtk4.width(widget), Gtk4.height(widget))
        CairoMakie.cairo_draw(screen, f.scene)
    end
end

function updateData(g, i)
    data = zeros(32, 32)
    val = mod1(div(i[], 10), size(data, 1))
    data[val, :] .= 1
    showData(g[1,1], data)
    showData(g[1,2], data')
    showData(g[2,1], data[1, :])
    showData(g[2,2], data)
    i[] = i[] + 1
end

index = Ref(1)
timer = Timer(timer -> updateData(grid, index), 0, interval = 0.05)

I also tried a variant of this that keeps a vector of 4 figures which I update, but there I get the same error. So it behaves a bit differently from our setup, because there the caching seems to have helped.
I can reproduce here on Mac. The stack trace is:

┌ Warning: Error in @guarded callback
│   exception =
│    AssertionError: surface.ptr != C_NULL
│    Stacktrace:
│      [1] get_render_type
│        @ ~/.julia/packages/CairoMakie/JchUZ/src/cairo-extension.jl:79 [inlined]
│      [2] device_scaling_factor
│        @ ~/.julia/packages/CairoMakie/JchUZ/src/screen.jl:125 [inlined]
│      [3] CairoMakie.Screen(scene::Scene, config::CairoMakie.ScreenConfig, surface::Cairo.CairoSurfaceBase{UInt32})
│        @ CairoMakie ~/.julia/packages/CairoMakie/JchUZ/src/screen.jl:341

So it seems that the surface returned by
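While the root cause is being tracked down, a draw callback could check for the null surface itself and skip the frame instead of hitting the assertion. The following is only a defensive sketch under assumptions: `show_data_guarded` is a hypothetical helper name, `config` is assumed to be a `CairoMakie.ScreenConfig` built as in the MWE above, and the `ptr` field is taken from the assertion message in the stack trace. It hides the symptom and does not fix the underlying lifetime problem.

```julia
using Gtk4, CairoMakie

# Hypothetical helper: draws `fig` onto `canvas`, skipping frames with a null surface.
function show_data_guarded(canvas, fig, config)
    @guarded draw(canvas) do widget
        surface = Gtk4.cairo_surface(canvas)
        if surface.ptr == C_NULL
            # Same condition the CairoMakie assertion checks; skip instead of throwing.
            @warn "Skipping draw: canvas surface is not available"
            return
        end
        screen = CairoMakie.Screen(fig.scene, config, surface)
        CairoMakie.resize!(fig.scene, Gtk4.width(widget), Gtk4.height(widget))
        CairoMakie.cairo_draw(screen, fig.scene)
    end
end
```

Because the closure captures `fig`, the figure also stays referenced for as long as the callback is installed on the canvas.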
I can also reproduce on Linux. When the lines that create the canvases are replaced with:
it works. So if the canvases (the Julia objects) are kept around, the surfaces survive. This suggests maybe there is a bug in how Gtk4 handles the case when the Julia object goes out of scope but the corresponding GObject isn't destroyed. When that happens, the Julia object is supposed to be saved in
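The replacement code from this comment is not preserved in this copy of the thread. One way to keep the Julia objects around, using only calls that already appear in the MWE, might look like the sketch below; the `canvases` name and the array layout are assumptions for illustration, not the code that was actually posted.

```julia
using Gtk4

grid = GtkGrid()

# Hold the Julia wrappers in a global array so they stay referenced
# for the whole session instead of existing only as grid children.
const canvases = [GtkCanvas() for _ in 1:2, _ in 1:2]

for i in 1:2, j in 1:2
    grid[i, j] = canvases[i, j]
end

grid.row_homogeneous = true
grid.column_homogeneous = true
w = GtkWindow(grid, "CairoMakie example")
```

The update function from the MWE could then draw to `canvases[i, j]` directly rather than looking the widgets up again through the grid.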
Interesting. I would have assumed that the grid should somehow hold the (Julia) reference to all its children.
I understand what the code in gtype.jl does, but I don't understand in many cases why it's written that way. There are many cases where objects are kept around that should be garbage collected, leading to effective memory leaks. All my attempts to fix that have resulted in segfaults. It definitely also has issues when there is more than one thread. Interestingly, I went back to the original code (where the canvases are just put in the grid) and tried running just the first iteration of the timer, then starting the timer in the REPL, and the code works. When I did this, the canvases all appeared in
…n `gc_preserve_glib` (#89) This prevents these objects, which are "subclasses" such as GtkCanvas, from being garbage collected at all. Previously, the code was supposed to intercept the garbage collection and do this "weak to strong" conversion if the object survives being unreferenced by Julia, but #87 seems to indicate a problem. This is a band-aid fix for that issue.
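The idea behind this band-aid can be illustrated in isolation with plain Julia: an object that is stored in a global collection is strongly referenced and therefore survives garbage collection, even when the code that created it keeps no reference of its own. `FakeWidget`, `preserved` and `create_preserved` are hypothetical names for this sketch and are not Gtk4.jl's actual implementation.

```julia
# Hypothetical stand-in for a Julia wrapper around a GObject.
mutable struct FakeWidget
    name::String
end

# Global store of strong references, loosely analogous to a preserve list.
const preserved = IdDict{Any,Nothing}()

function create_preserved(name)
    w = FakeWidget(name)
    preserved[w] = nothing   # strong reference: the GC has to keep `w` alive
    return WeakRef(w)        # the caller only gets a weak reference
end

weak = create_preserved("canvas")
GC.gc()

# The object is still reachable through `preserved`, so the weak reference is intact.
println(weak.value isa FakeWidget)  # true
```

The cost matches what the commit message describes: anything placed in such a store is never collected until it is removed explicitly.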
Hello,
for a few years now, I think, we've had sporadic segmentation faults in our Gtk4.jl GUI. With recent versions we've been able to somewhat reliably produce the error, and it looked like the CairoSurfaces of the GtkCanvas were sometimes garbage collected.
Usually we didn't see any Julia-level errors or assertions about that. Today we tried to change our GtkCanvas printing from Cairo.jl to CairoMakie.jl, and there we got some more Julia stacktraces and log messages.
I've been able to produce this MWE:
which prints a stacktrace if the surface access is illegal. So on the one hand the "bug" is partially in the user code, because the figure isn't referenced and thus can be garbage collected.
But I'm not sure why that garbage collection triggers the CairoSurface to be destroyed as well. I think that part might be an issue in Gtk4.jl. With this MWE I've also not managed to create a segfault, so something "regenerates" the cairo surface?
I saw here that the canvas gets reinitialized when resized, so maybe it happens there.