Skip to content
This repository was archived by the owner on Jun 24, 2024. It is now read-only.

10102.md #460

Merged
merged 5 commits into from
Dec 5, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
142 changes: 142 additions & 0 deletions content/notes/wwdc22/10102.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,142 @@
---
contributors: parjohns
Speakers: Galo Avila, engineering manager in GPU Software, and Eylon Caspi.
---

# Offline Compilation
Offline compilation can help reduce app stutters, first launch, and new level load times. This is accomplished by moving GPU binary generation to project build time.

## Previous Way to Generate GPU Binaries
- Metal library is instantiated from source during runtime using AIR (Apple's Intermediate Representation) this operation is CPU intensive.
- Library generation can be moved to build time by precompiling the source file and instantiating it.
- When the Metal library is in memory, a Pipeline State Descriptor and Pipeline State Object (PSO) are created.
- Creating a PSO is CPU intensive.
- After the PSO is created, just-in-time GPU binary generation takes place.
- When PSOs are created, Metal stores the GPU binaries in its file system cache.
- Binary archives let users control when and where GPU binaries are cached.
- PSO creation can become a lightweight operation by using PSO descriptors to cache GPU binaries in an archive.
![Runtime][runtime]
## What's New
Offline binary generation allows for a Metal pipeline script to be specified at project build time. This new artifact is equivalent to a collection of Pipeline State Descriptors in the API. The output provides a binary archive that can be loaded to accelerate PSO creation.
![psodescriptor][psodescriptor]

### Creating Metal Pipeline Script
A Metal pipeline script is a JSON formatted description of one or more API Pipeline State Descriptors and can be created in a JSON editor or harvested from the binary archives.


**Using a JSON Editor**
1. Specify API Metal library file path.
2. Add API render descriptor function names as render pipeline properties.
3. Add pipeline state information (such as raster_sample_count or pixel formats).

Metal code for generating a render pipeline script
```
// An existing Obj-C render pipeline descriptor
NSError *error = nil;
id<MTLDevice> device = MTLCreateSystemDefaultDevice();

id<MTLLibrary> library = [device newLibraryWithFile:@"default.metallib" error:&error];

MTLRenderPipelineDescriptor *desc = [MTLRenderPipelineDescriptor new];
desc.vertexFunction = [library newFunctionWithName:@"vert_main"];
desc.fragmentFunction = [library newFunctionWithName:@"frag_main"];
desc.rasterSampleCount = 2;
desc.colorAttachments[0].pixelFormat = MTLPixelFormatBGRA8Unorm;
desc.depthAttachmentPixelFormat = MTLPixelFormatDepth32Float;
```

JSON equivalent
```{
"//comment": "Its equivalent new JSON script",
"libraries": {
"paths": [
{
"path": "default.metallib"
}
]
},
"pipelines": {
"render_pipelines": [
{
"vertex_function": "vert_main",
"fragment_function": "frag_main",
"raster_sample_count": 2,
"color_attachments": [
{
"pixel_format": "BGRA8Unorm"
},
],
"depth_attachment_pixel_format": "Depth32Float"
}
]
}
}
```
Further schema details can be found in Metal's developer documentation
https://developer.apple.com/documentation/metal

**Using Metal Runtime**

This is done during runtime
1. Create Pipeline Descriptor with state and functions
2. Add descriptor to binary archive
3. Serialize binary archive to be imported by app

Harvesting sample
```// Create pipeline descriptor
MTLRenderPipelineDescriptor *pipeline_desc = [MTLRenderPipelineDescriptor new];
pipeline_desc.vertexFunction = [library newFunctionWithName:@"vert_main"];
pipeline_desc.fragmentFunction = [library newFunctionWithName:@"frag_main"];
pipeline_desc.rasterSampleCount = 2;
pipeline_desc.colorAttachments[0].pixelFormat = MTLPixelFormatBGRA8Unorm;
pipeline_desc.depthAttachmentPixelFormat = MTLPixelFormatDepth32Float;

// Add pipeline descriptor to new archive
MTLBinaryArchiveDescriptor* archive_desc = [MTLBinaryArchiveDescriptor new];
id<MTLBinaryArchive> archive = [device newBinaryArchiveWithDescriptor:archive_desc error:&error];
bool success = [archive addRenderPipelineFunctionsWithDescriptor:pipeline_desc error:&error];

// Serialize archive to file system
NSURL *url = [NSURL fileURLWithPath:@"harvested-binaryArchive.metallib"];
success = [archive serializeToURL:url error:&error];
```

Extracting JSON pipelines script from binary archive can be done to move generation from runtime to build time.

This can be done by using `metal-source` while specifying buffers and output directory options

```metal-source -flatbuffers=json harvested-binaryArchive.metallib -o /tmp/descriptors.mtlp-json```

### Generating Offline GPU Binaries

Generating GPU binary from source can be done by invoking `metal` with source, pipeline script, and output

```metal shaders.metal -N descriptors.mtlp-json -o archive.metallib```

Generating GPU Binary from Metal Library can be done by invoking `metal-tt` with source, pipeline script, and output file

```metal-tt shaders.metallib descriptors.mtlp-json -o archive.metallib```

### Loading Offline Binaries
1. Provide binary archive URL when creating archive descriptor
2. Use URL to instantiate archive

For more information regarding API, see last year's talk.

https://developer.apple.com/videos/play/wwdc2021/10229

# Optimize for Size
The Metal Compiler optimizes aggressively for runtime performance. These optimizations may expand the GPU program size which may have unexpected costs. Xcode 14 provides a new optimization mode for metal: optimize for size.
This setting prevents optimizations such as inlining and loop unrolling which should lower application size and compile time. This setting is useful for situations where the user encounters long compilation time. Optimize for size may hurt runtime performance, however it may improve runtime performance if the program was
incurring runtime penalties associated with large size.

## Enabling Optimize for Size
This feature can be enabled in 3 ways:
1. In Xcode build settings under Metal Compiler - Build Options by selecting Size [-Os] under Optimization Level
2. In Terminal with option `-Os`
3. In Metal Framework setting `MTLLibraryOptimizationLevelSize` in an `MTLCompileOptions` object



[runtime]: ../../../images/notes/wwdc22/10102/runtime.JPG
[psodescriptor]: ../../../images/notes/wwdc22/10102/psodescriptor.JPG
Binary file added images/notes/wwdc22/10102/psodescriptor.JPG
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added images/notes/wwdc22/10102/runtime.JPG
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.