1.15x 7b sfp prefill speedup: Matmul in attention #256

Merged
copybara-service[bot] merged 1 commit into dev from test_644049023 on Jun 18, 2024
@copybara-service
1.15x 7b sfp prefill speedup: Matmul in attention
2b bf16:
prefill 114.456 -> 115.222
decode 16.8847 -> 16.9987

7b sfp:
prefill 18.8575 -> 21.7325
decode 5.68428 -> 5.79791
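
As a quick check on the headline figure, the before/after numbers above can be turned into ratios. This is only a sketch: the description does not state units, so the code assumes the figures are throughputs where higher is better (consistent with the title calling the change a speedup), and the names `results` and `speedup` are illustrative, not part of the change.

```python
# Before/after figures copied from the PR description.
# Assumption: these are throughputs (higher is better); units are not stated.
results = {
    ("2b bf16", "prefill"): (114.456, 115.222),
    ("2b bf16", "decode"): (16.8847, 16.9987),
    ("7b sfp", "prefill"): (18.8575, 21.7325),
    ("7b sfp", "decode"): (5.68428, 5.79791),
}


def speedup(before: float, after: float) -> float:
    """Ratio of the new figure to the old one (hypothetical helper)."""
    return after / before


for (model, phase), (before, after) in results.items():
    print(f"{model} {phase}: {speedup(before, after):.3f}x")
```

Under that assumption, only the 7b sfp prefill ratio rounds to 1.15; the other three cases change by roughly 2% or less.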

PiperOrigin-RevId: 644283676
copybara-service[bot] merged commit a07f60c into dev on Jun 18, 2024
copybara-service[bot] deleted the test_644049023 branch on June 18, 2024, 08:00