Commit 68238b7 (1 parent: 1981782)

docs: setting n_gqa is no longer required

File tree

1 file changed (+0 −7 lines)


README.md

Lines changed: 0 additions & 7 deletions
````diff
@@ -143,13 +143,6 @@ For instance, if you want to work with larger contexts, you can expand the conte
 llm = Llama(model_path="./models/7B/llama-model.gguf", n_ctx=2048)
 ```
 
-### Loading llama-2 70b
-
-Llama2 70b must set the `n_gqa` parameter (grouped-query attention factor) to 8 when loading:
-
-```python
-llm = Llama(model_path="./models/70B/llama-model.gguf", n_gqa=8)
-```
 
 ## Web Server
 
````