Conversation

Force-pushed from 262b6b1 to 36fb0d7
This PR correctly adds the `json_schema` response type. The change also highlights an existing issue with the current handling. As this PR is non-breaking and only adds the correct functionality, it may make sense to follow up with further changes addressing the issues related to enabling it.
Thanks for adding the `json_schema` response type. While this PR does implement it, I'm wondering if it might miss an opportunity to better align with OpenAI's specification, which was a key point in issue #3058. The key difference between OpenAI and TGI isn't just the type name. Instead of making another alias, I'd propose changing the required response format to:

```json
{
  "type": "json_schema",
  "json_schema": {
    "name": "some_name",
    "strict": true,
    "schema": ... // the actual json schema
  }
}
```
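To make the proposed difference concrete, here is a minimal sketch (the helper name is hypothetical, not from this PR) that maps the `{"type", "value"}` shape TGI currently uses onto the `{"type", "json_schema"}` shape OpenAI specifies:

```python
def to_openai_response_format(tgi_format: dict) -> dict:
    """Hypothetical helper: map TGI's {"type", "value"} response_format
    onto OpenAI's {"type", "json_schema"} shape."""
    return {"type": "json_schema", "json_schema": tgi_format["value"]}


# TGI-style payload as used in the example below (toy schema for illustration)
tgi_format = {
    "type": "json_schema",
    "value": {"name": "person", "strict": True, "schema": {"type": "object"}},
}

openai_format = to_openai_response_format(tgi_format)
print(openai_format["json_schema"]["name"])  # prints "person"
```

The payload contents are identical; only the key nesting the schema details differs.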
This PR has been updated to handle the new `json_schema` format, e.g.:

```python
# model id: meta-llama/Meta-Llama-3.1-8B-Instruct
import requests
import json

# simple person JSON schema
person_schema = {
    "type": "object",
    "properties": {
        "firstName": {
            "type": "string",
            "description": "The person's first name.",
            "minLength": 3,
        },
        "lastName": {
            "type": "string",
            "description": "The person's last name.",
            "minLength": 3,
        },
        "age": {"type": "integer", "minimum": 0},
    },
    "required": ["firstName", "lastName", "age"],
}

response = requests.post(
    "http://localhost:3000/v1/chat/completions",
    json={
        "model": "model-name",
        "messages": [
            {
                "role": "user",
                "content": "John Smith is a 32-year-old software engineer.",
            }
        ],
        "temperature": 0.0,
        "response_format": {
            "type": "json_schema",
            "value": {"name": "person", "strict": True, "schema": person_schema},
        },
    },
    headers={"Content-Type": "application/json"},
)
print(json.dumps(response.json(), indent=2))
```

with the response:

```json
{
  "object": "chat.completion",
  "id": "",
  "created": 1742224552,
  "model": "model-name",
  "system_fingerprint": "3.1.1-dev0-native",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "{ \"firstName\": \"John\", \"lastName\": \"Smith\", \"age\": 32 }"
      },
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 46,
    "completion_tokens": 23,
    "total_tokens": 69
  }
}
```
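As a quick sanity check without a running server, the generated `content` string from the response above can be checked against the schema's required fields and basic constraints with only the standard library (a sketch; a full validator such as `jsonschema` would cover the whole schema):

```python
import json

# content copied verbatim from the response above
content = '{ "firstName": "John", "lastName": "Smith", "age": 32 }'
person = json.loads(content)

# hand-check the schema's "required" list and simple constraints
required = ["firstName", "lastName", "age"]
missing = [key for key in required if key not in person]
print("missing fields:", missing)  # prints "missing fields: []"

assert not missing
assert isinstance(person["age"], int) and person["age"] >= 0
assert len(person["firstName"]) >= 3 and len(person["lastName"]) >= 3
```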
jorado left a comment:

Other than this fix, everything looks good to me! Let me know if there's anything else I can help with.
Context from the diff:

```python
assert called == '{ "unit": "fahrenheit", "temperature": [ 72, 79, 88 ] }'
assert chat_completion == response_snapshot
```

Comment on the line:

```python
json_payload["response_format"]["type"] = "json_schema"
```

Suggested change:

```python
json_payload["response_format"] = {
    "type": "json_schema",
    "value": {"name": "weather", "strict": True, "schema": Weather.schema()},
}
```
This PR reopens #2982 to run CI