Merged
Found the second server confusing as well. And code duplication.
6baee85 to
eae21ac
Compare
Owner
@Stonelinks I've updated the app server to use a more standard
This commit "deprecates" the example fastapi server by remaining runnable but pointing folks at the module if they want to learn more.

Rationale:

Currently there exist two server implementations in this repo:

- `llama_cpp/server/__main__.py`, the module that's runnable by consumers of the library with `python3 -m llama_cpp.server`
- `examples/high_level_api/fastapi_server.py`, which is probably a copy-pasted example by folks hacking around

IMO this is confusing. As a new user of the library I see they've both been updated relatively recently but looking side-by-side there's a diff.

The one in the module seems better:

- supports logits_all
- supports use_mmap
- has experimental cache support (with some mutex thing going on)
- some stuff with streaming support was moved around more recently than fastapi_server.py
eae21ac to
0fcc25c
Compare
Contributor
Author
cool, yeah create_app is better than whatever i did! fixed up the last commit on this branch
xaptronic pushed a commit to xaptronic/llama-cpp-python that referenced this pull request Jun 13, 2023
* Update README.md * Update README.md remove facebook
Split this in case these two implementations exist for a reason, but this depends on #125. Really it's just the one commit at the end.
I found it confusing that two fastapi servers exist in this repo (more detail in Stonelinks#2)
So this PR "deprecates" the example by pointing folks at the module one, while keeping it runnable.
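For illustration only, here is a rough sketch of the "deprecate but keep runnable" idea using just the standard library. This is not the code from this PR: the helper name `run_deprecated_example` and its `runner` parameter are hypothetical, and the only thing taken from the discussion is the module name `llama_cpp.server`. The sketch warns the user, then hands off to the module the same way `python3 -m` would.

```python
import runpy
import warnings

# Module name taken from the PR discussion; everything else is a hypothetical sketch.
SERVER_MODULE = "llama_cpp.server"


def run_deprecated_example(module_name=SERVER_MODULE, runner=runpy.run_module):
    """Emit a deprecation warning, then delegate to the module's entry point."""
    warnings.warn(
        f"This example is deprecated; run `python3 -m {module_name}` instead.",
        DeprecationWarning,
        stacklevel=2,
    )
    # run_name="__main__" makes the target run as if started with `python3 -m`.
    return runner(module_name, run_name="__main__")


# In an example script this would be called under an
# `if __name__ == "__main__":` guard, e.g. run_deprecated_example().
```

The `runner` argument exists only so the delegation can be exercised without the real server installed; an actual example script could just as well import the app factory from the module and start it directly.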