Use Autoreleasepools with Metal #103

Closed
habemus-papadum opened this issue Feb 23, 2023 · 3 comments · Fixed by #294
Labels
libraries Things about libraries and how we use them.

Comments

@habemus-papadum
Contributor

habemus-papadum commented Feb 23, 2023

Objective-C methods in Metal.framework should be called inside an autorelease pool. The same holds for most other Cocoa frameworks: without an enclosing pool, autoreleased objects are never released, so memory leaks.
todo: Make sure there is consensus on this conclusion / provide a compelling and concise argument if there are doubts.

We can easily expose autoreleasepools in libcmt with a single new function e.g.:
MtAutoreleasePool* mtNewAutoreleasePool() {return [[NSAutoreleasePool alloc] init];}

Weeds:

  • NSAutoreleasePool can only be created explicitly when Automatic Reference Counting (ARC) is off (otherwise the compiler forces the use of @autoreleasepool blocks instead)
  • libcmt is currently compiled with ARC off (the default)
  • Still, we should state that explicitly in CMake, e.g. target_compile_options(cmt PRIVATE -fno-objc-arc)

Autorelease pools are useful in the macOS/iOS world, and care should be taken not to hide them from the end user. That said, they probably need to be inserted automatically by Metal.jl in the array programming interface, and maybe also in @Metal.sync.

Autorelease Pools do have some complications:

  • We need to consider how they will interact with Julia's GC
  • They create a qualitatively different experience than, say, using CUDA.jl, and probably make adding Metal.jl support to KernelAbstractions.jl more challenging
  • The underlying implementation uses thread-local storage, so we will need to think about how best to use it with Julia tasks
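
As a rough illustration of the thread-local point above, here is a toy Python model. It is not the Objective-C runtime and not Metal.jl code: `AutoreleasePool`, `autorelease`, and `leaked` are made-up names, and the only assumption carried over from the real thing is that each thread keeps its own stack of pools. It shows why work that ends up on a thread without a pool leaks, even while another thread has one open.

```python
import threading

_tls = threading.local()  # each thread sees its own, independent pool stack
leaked = []               # objects autoreleased while no pool was in place


def _stack():
    # Lazily create the pool stack for the calling thread.
    if not hasattr(_tls, "pools"):
        _tls.pools = []
    return _tls.pools


class AutoreleasePool:
    """Toy stand-in for an autorelease pool (hypothetical, for illustration)."""

    def __enter__(self):
        self.objects = []
        _stack().append(self)  # push onto this thread's stack
        return self

    def __exit__(self, *exc):
        _stack().pop()         # pools are popped in LIFO order
        self.released = list(self.objects)  # "drain": record what was released
        self.objects.clear()
        return False


def autorelease(obj):
    # Add obj to the innermost pool of the *calling* thread, if there is one.
    stack = _stack()
    if stack:
        stack[-1].objects.append(obj)
    else:
        leaked.append(obj)
    return obj


def worker():
    # Runs on a thread that never pushed a pool.
    autorelease("compiler log string")


with AutoreleasePool() as outer:
    autorelease("command buffer")
    with AutoreleasePool() as inner:
        autorelease("encoder")
    t = threading.Thread(target=worker)
    t.start()
    t.join()

print(inner.released, outer.released, leaked)
```

The worker's object lands in `leaked` even though the main thread still has `outer` open. A Julia task that yields and resumes on a different thread would be in the same position in this model, which is the concern with tasks.
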
@habemus-papadum
Contributor Author

Some cryptic notes for my future self:

There are two places where Metal.jl is currently leaking due to no autorelease pools:

  • Slow path: compilation; many string objects from internal method calls
  • Fast path: kernel launch; MtlCommandBuffer and encoder. For the fast path we might be able to ensure no yielding, in which case we can wrap it in an autorelease pool

More broadly:
I think the best Julia solution is to modify libuv on Darwin to drain the autorelease pool on each event loop cycle. This is what GLFW and the macOS run loop do anyway, and most Objective-C programs never need more than a single level of nesting of run loops. If we want, we could allow one additional autorelease pool within tasks at any given time, guarded by a thread-local lock, so at most two levels of nesting.
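
A minimal sketch of the drain-per-cycle idea, again as a toy Python model rather than libuv or Objective-C code. `Pool`, `run_loop`, and the two callbacks are hypothetical names, and the object lists only stand in for the temporaries mentioned above.

```python
from collections import deque


class Pool:
    """Toy pool: collects objects and reports how many it released on drain."""

    def __init__(self):
        self.objects = []

    def drain(self):
        count = len(self.objects)
        self.objects.clear()
        return count


def run_loop(callbacks):
    """Run one callback per cycle, with a fresh pool pushed and drained per cycle."""
    queue = deque(callbacks)
    drained_per_cycle = []
    while queue:
        pool = Pool()                  # pushed at the start of the cycle
        callback = queue.popleft()
        callback(pool)                 # callback autoreleases into the cycle's pool
        drained_per_cycle.append(pool.drain())  # drained before the next cycle
    return drained_per_cycle


def compile_kernel(pool):
    # Slow path: a few temporary strings.
    pool.objects.extend(["source string", "error string", "function name"])


def launch_kernel(pool):
    # Fast path: command buffer and encoder.
    pool.objects.extend(["command buffer", "encoder"])


counts = run_loop([compile_kernel, launch_kernel, launch_kernel])
print(counts)  # [3, 2, 2]
```

In this model nothing outlives the cycle that created it, so temporaries cannot pile up across cycles. The trade-off is that anything that must survive a cycle has to be retained explicitly before the drain.
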

@habemus-papadum
Contributor Author

Having libuv manage the first level of autorelease pool feels like the right solution.

We could wrap the following https://github.com/JuliaLang/libuv/blob/fa7058b865e3c4a5a9c9ff511ed3e589ce817a85/src/unix/core.c#L386-L441 in an @autoreleasepool {} scope.

GTK's solution:
https://github.com/GNOME/gtk/blob/36037a2ee89fd922b3ea4860c0e9fe94fb6138e5/gdk/macos/gdkmacoseventsource.c#L680-L703

GLFW's trick to get menus etc. to show properly: call [NSApp run] and then immediately stop the run loop in the delegate: https://github.com/glfw/glfw/blob/9a87635686c7fcb63ca63149c5b179b85a53a725/src/cocoa_init.m#L444

On my todo list: understand a little better the current state of multithreading and libuv

@maleadt maleadt added the libraries Things about libraries and how we use them. label May 22, 2023
@maleadt
Member

maleadt commented Feb 28, 2024

From the metal-cpp docs:

AutoreleasePools and Objects

Several methods that create temporary objects in metal-cpp add them to an AutoreleasePool to help manage their lifetimes. In these situations, after metal-cpp creates the object, it adds it to an AutoreleasePool, which will release its objects when you release (or drain) it.

By adding temporary objects to an AutoreleasePool, you do not need to explicitly call release() to deallocate them. Instead, you can rely on the AutoreleasePool to implicitly manage those lifetimes.

If you create an object with a method that does not begin with alloc, new, copy, mutableCopy, or Create, the creating method adds the object to an autorelease pool.

The typical scope of an AutoreleasePool is one frame of rendering for the main thread of the program. When the thread returns control to the RunLoop (an object responsible for receiving input and events from the windowing system), the pool is drained, releasing its objects.

You can create and manage additional AutoreleasePools at smaller scopes to reduce your program's working set, and you are required to do so for any additional threads your program creates.

If an object's lifecycle needs to be extended beyond the scope of an AutoreleasePool instance, you can claim ownership of it by calling its retain() method before the pool is drained. In these cases, you are responsible for making the appropriate release() call on the object after you no longer need it.

You can find a more-detailed introduction to the memory management rules here: https://developer.apple.com/library/archive/documentation/Cocoa/Conceptual/MemoryMgmt/Articles/mmRules.html, and here: https://developer.apple.com/library/archive/documentation/CoreFoundation/Conceptual/CFMemoryMgmt/Concepts/Ownership.html

For more details about the application's RunLoop, please find its documentation here: https://developer.apple.com/documentation/foundation/nsrunloop

Use and debug AutoreleasePools

When you create an autoreleased object and there is no enclosing AutoreleasePool, the object is leaked.

To prevent this, you normally create an AutoreleasePool in your program's main function, and in the entry function for every thread you create. You may also create additional AutoreleasePools to avoid growing your program's high memory watermark when you create several autoreleased objects, such as when rendering.

Use the Environment Variable OBJC_DEBUG_MISSING_POOLS=YES to print a runtime warning when an autoreleased object is leaked because no enclosing AutoreleasePool is available for its thread.

You can also run leaks --autoreleasePools on a memgraph file or a process ID (macOS only) to view a listing of your program's AutoreleasePools and all objects they contain.
