
Update examples from ggml to gguf and add hw-accel note for Web Server#688

Merged
abetlen merged 3 commits into abetlen:main from jasonacox:patch-1 on Sep 14, 2023
Conversation

@jasonacox
Contributor

Minor README documentation updates:

  • Convert references in examples from ggml models to gguf
  • Add a note to the Web Server section that hardware acceleration can be applied to the server installation as well:

Similar to the Hardware Acceleration section above, you can also install with GPU (cuBLAS) support like this:

CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python[server]
python3 -m llama_cpp.server --model models/7B/gguf-model.bin --n_gpu_layers 35
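Once the server is running, it exposes an OpenAI-compatible REST API (by default on `http://localhost:8000`). As a minimal sketch of talking to it with only the Python standard library, assuming the default host/port and the `/v1/completions` endpoint, something like this works; the prompt and sampling parameters here are illustrative:

```python
import json
from urllib import request

def build_completion_request(prompt, host="http://localhost:8000"):
    # Field names follow the OpenAI completions schema that the
    # llama_cpp.server API mirrors.
    payload = {
        "prompt": prompt,
        "max_tokens": 64,       # illustrative values
        "temperature": 0.7,
    }
    return request.Request(
        f"{host}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Sending the request requires the server to be running:
# with request.urlopen(build_completion_request("Q: Name the planets. A:")) as resp:
#     print(json.load(resp)["choices"][0]["text"])
```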

@abetlen
Owner

abetlen commented Sep 12, 2023

@jasonacox thank you! I think for the filenames the file extension actually changed from .bin to .gguf so something like model.gguf or similar would be a little better.

Update examples to use filenames with gguf extension (e.g. llama-model.gguf).
@jasonacox
Contributor Author

jasonacox commented Sep 13, 2023

Great idea, @abetlen! I added a commit to change these examples to llama-model.gguf.

By the way, THANK YOU for the library! 🙇 Brilliant work. ❤️

@abetlen abetlen merged commit 40b2290 into abetlen:main Sep 14, 2023
2 participants