
Implementation of tiled attention with bf16 and circular buffers, reducing memory requirements by 4x for longer contexts on Gemma models #6481
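For context on the technique named in the PR title: tiled (streaming-softmax) attention processes K/V in fixed-size tiles with running softmax statistics, so the full [n, n] score matrix is never materialized. The sketch below is a minimal float32 illustration under that assumption; the PR's bf16 storage and circular-buffer reuse are not shown, and all names here (`tiled_attention`, `tile`) are hypothetical, not taken from the PR's code.

```python
import numpy as np

def tiled_attention(q, k, v, tile=64):
    """Streaming (tiled) attention over K/V tiles.

    Keeps per-query running statistics (max, denominator, weighted sum)
    so memory scales with the tile size, not the full context length.
    Hypothetical float32 sketch; the PR additionally uses bf16 tiles
    and a circular buffer, which are omitted here.
    """
    n, d = k.shape
    scale = 1.0 / np.sqrt(d)
    m = np.full(q.shape[0], -np.inf)   # running max of scores per query
    l = np.zeros(q.shape[0])           # running softmax denominator
    acc = np.zeros_like(q)             # running weighted sum of V rows
    for start in range(0, n, tile):
        ks = k[start:start + tile]
        vs = v[start:start + tile]
        s = (q @ ks.T) * scale                 # [n_q, tile] partial scores
        m_new = np.maximum(m, s.max(axis=1))   # updated running max
        correction = np.exp(m - m_new)         # rescale old accumulators
        p = np.exp(s - m_new[:, None])         # tile-local exp weights
        l = l * correction + p.sum(axis=1)
        acc = acc * correction[:, None] + p @ vs
        m = m_new
    return acc / l[:, None]
```

Because each tile is rescaled by the running max before accumulation, the result is numerically identical to full softmax attention while only a tile-sized slice of scores exists at any time.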

Triggered via pull request February 23, 2026 16:58
Status Failure
Total duration 8m 41s
Artifacts

build.yml

on: pull_request
Matrix: build

Annotations

4 errors and 2 warnings
Errors:
- ubuntu-latest (make) Release: Process completed with exit code 2.
- macos-latest (make) Release: Process completed with exit code 2.
- windows-latest (windows) Release: Canceling since a higher priority waiting request for build-refs/pull/839/merge-windows-latest-windows-Release exists
- windows-latest (windows) Release: The operation was canceled.

Warnings:
- macos-latest (make) Release: Failed to save: Failed to FinalizeCacheEntryUpload: Received non-retryable error: Failed request: (404) Not Found: cache entry not found
- bazel: Failed to restore: Cache service responded with 400