
Rope optimization part 2 #331

Merged
merged 3 commits into main from rwitten_blake_wizard
Feb 9, 2024
Conversation

@rwitten (Collaborator) commented Jan 12, 2024

Single v4-8 step time drops from 1.209 to 1.199 secs.

(No convergence data since it is actually bit-wise identical locally)

Optimization From Blake!
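
For reference, rotary position embedding (RoPE) applies a position-dependent rotation to each query/key feature pair. Below is a minimal, illustrative JAX sketch of the technique being optimized; it is not the MaxText `embeddings.py` code, and the names (`apply_rope`, `max_timescale`) are assumptions.

```python
import jax.numpy as jnp

def apply_rope(inputs, positions, max_timescale=10000.0):
    """Illustrative RoPE: rotate each feature pair by a position-dependent angle.

    inputs:    [batch, seq, heads, head_dim] queries or keys
    positions: [batch, seq] integer token positions
    """
    head_dim = inputs.shape[-1]
    half = head_dim // 2
    # One rotation frequency per feature pair.
    timescale = max_timescale ** (2.0 * jnp.arange(half) / head_dim)   # [half]
    angles = positions[..., jnp.newaxis, jnp.newaxis] / timescale      # [batch, seq, 1, half]
    sin, cos = jnp.sin(angles), jnp.cos(angles)
    first, second = inputs[..., :half], inputs[..., half:]
    rotated = jnp.concatenate(
        [first * cos - second * sin, second * cos + first * sin], axis=-1
    )
    return rotated.astype(inputs.dtype)
```

Since the PR reports bit-wise identical outputs, the speedup presumably comes from reordering or fusing this arithmetic rather than changing the math.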

@rwitten rwitten force-pushed the rwitten_blake_wizard branch from 0228dc2 to c7700b0 Compare January 12, 2024 18:49
@khatwanimohit khatwanimohit self-requested a review January 12, 2024 19:08
@rwitten rwitten force-pushed the rwitten_blake_wizard branch 2 times, most recently from 0428d7d to c7700b0 Compare January 13, 2024 00:27
@rwitten (Collaborator, Author) commented Jan 13, 2024

The GPU failure is:

jaxlib.xla_extension.XlaRuntimeError: INTERNAL: All algorithms tried for %bitcast.220 = bf16[12,2048,8,256]{1,3,2,0} bitcast(bf16[12,8,2048,256]{2,3,1,0} %copy.130), metadata={op_name="jit(train_step)/jit(main)/transpose(jvp(Transformer))/decoder/while/body/remat/decoder/self_attention/query_rotary/convert_element_type[new_dtype=float32 weak_type=False]" source_file="/app/MaxText/layers/embeddings.py" source_line=172} failed. Falling back to default algorithm.  Per-algorithm errors:

It occurs when downcasting to bfloat16 on GPU.
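
A hedged sketch of the kind of operation the trace points at: a bf16 query tensor is relaid/transposed (the bf16 bitcast/copy in the error) and up-cast for the rotary math (the `convert_element_type` in the error), with the comment above attributing the failure to the bf16 handling on GPU. The shape comes from the error message; the function name and the idea that this snippet alone reproduces the failure are assumptions.

```python
import jax
import jax.numpy as jnp

@jax.jit
def query_rotary_cast(q):
    # q: bf16[12, 8, 2048, 256], the shape from the error message.
    # Reorder to [batch, seq, heads, head_dim] (the bf16 bitcast/copy in the trace),
    # then up-cast to f32 for the rotary math (the convert_element_type in the trace).
    return jnp.transpose(q, (0, 2, 1, 3)).astype(jnp.float32)

q = jnp.ones((12, 8, 2048, 256), dtype=jnp.bfloat16)
out = query_rotary_cast(q)  # on GPU, this bf16 handling is where the autotuner fell back
```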

@copybara-service copybara-service bot merged commit 71ae4a4 into main Feb 9, 2024
8 checks passed
@copybara-service copybara-service bot deleted the rwitten_blake_wizard branch February 9, 2024 01:04