Question about chat history #2896
Hey @GabrielBianconi @virajmehta,

I previously asked this question in Slack, but after some experiments it still doesn't work, and I think there is some misunderstanding, so I'll write the question down here for better communication.

We are building an automation tool based on LLMs. The idea:

Step 1: The user submits a URL; we crawl the content and ask the LLM to build a detailed understanding of the website. In this step we define a function `step1` with `user_prompt1`, `user_schema1`, and `output_schema1`. The request looks like:

```go
llm.InferenceInput{
	Messages: []llm.Message{
		{
			Role: "user",
			Content: []llm.ContentBlock{
				{
					Arguments: map[string]string{
						"step1": "step1 args",
					},
				},
			},
		},
	},
}
```
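For reference, `user_schema1` here would presumably be a JSON Schema that accepts a single `step1` string argument. A rough, hypothetical sketch (the real schema lives in our TensorZero config, not in our service code):

```go
// Hypothetical sketch of user_schema1 (not copied from our actual config):
// the step1 user template takes a single "step1" argument, so the schema
// accepts exactly that key. user_schema2 analogously requires a "step2" key,
// which is where the mismatch described in Step 2 below comes from.
const userSchema1 = `{
	"type": "object",
	"properties": {
		"step1": { "type": "string" }
	},
	"required": ["step1"],
	"additionalProperties": false
}`
```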
Everything works well. Then we move on to Step 2.

Step 2: After the user confirms that the website understanding is correct, we ask the LLM to generate more information. In this step we define another function `step2` with `user_prompt2`, `user_schema2`, and `output_schema2`. `user_schema2` is different from `user_schema1`, and in Step 2 we need the Step 1 user/assistant messages as context, so the TensorZero request should look similar to:

```go
llm.InferenceInput{
	Messages: []llm.Message{
		{
			Role: "user",
			Content: []llm.ContentBlock{
				{
					Arguments: map[string]string{
						"step1": "step1 args",
					},
				},
			},
		},
		{
			Role: "assistant",
			Content: []llm.ContentBlock{
				{
					Content: "step1 llm response",
				},
			},
		},
		{
			Role: "user",
			Content: []llm.ContentBlock{
				{
					Arguments: map[string]string{
						"step2": "step2 args",
					},
				},
			},
		},
	},
}
```

But the Step 2 request will NOT work: when the request is evaluated, the first message has the key "step1" in its arguments while `step2` requires the key "step2", so TensorZero returns a 400. From my understanding, in the Step 2 request the first two messages should just be static text (no argument evaluation; they don't even belong to this function), and then everything would probably work. But the issue is that we manage all templates in TensorZero, so in our service code we can't get the fully rendered request text unless we duplicate the templates in our code, which defeats the purpose of using TensorZero. Any help is appreciated!
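For illustration, this is the shape I have in mind, using the same wrapper types as above (a hypothetical sketch, not the actual TensorZero API): the Step 1 turns are carried as plain text instead of template arguments, which only works if our service already has the rendered text, and that is exactly what we can't produce without duplicating the templates.

```go
// Hypothetical sketch of the Step 2 request with the history passed as plain
// text. "rendered step1 user prompt" would need to be the text produced by
// the step1 user template, which our service cannot reconstruct without
// duplicating that template outside TensorZero.
llm.InferenceInput{
	Messages: []llm.Message{
		{
			Role: "user",
			Content: []llm.ContentBlock{
				// Static text: no validation against step2's user_schema2.
				{Content: "rendered step1 user prompt"},
			},
		},
		{
			Role: "assistant",
			Content: []llm.ContentBlock{
				{Content: "step1 llm response"},
			},
		},
		{
			Role: "user",
			Content: []llm.ContentBlock{
				// Only the latest turn uses step2's template arguments.
				{
					Arguments: map[string]string{
						"step2": "step2 args",
					},
				},
			},
		},
	},
}
```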
Replies: 1 comment 1 reply
Hi @wangfenjin - we came up with a more general solution here: #2972
Upgrading this feature request into an issue. We'll implement it soon, thanks!