Implementation of tiled attention with bf16 and circular buffers, which reduces memory requirements by 4x for longer contexts on Gemma models. #6468
Annotations
4 errors and 1 warning
bazel
Process completed with exit code 1.
macos-latest (make) Release
Process completed with exit code 2.
ubuntu-latest (make) Release
Process completed with exit code 2.
windows-latest (windows) Release
Process completed with exit code 1.
bazel
Failed to restore: Cache service responded with 400