Replies: 2 comments 2 replies
-
You're correct that the way conversation history is formatted in the prompt can encourage the LLM to continue generating additional, fictional dialogue turns. This is a common issue when the prompt ends with something like:
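As an illustration (an assumed shape, not the exact `ContextChatEngine` template), the tail of such a prompt typically renders each turn as `Role: text` and ends on a bare role cue:

```python
# Hypothetical tail of a flattened history prompt. The role labels and the
# trailing bare "Assistant:" cue are assumptions about the general pattern,
# not the exact template used by the engine.
prompt_tail = (
    "User: What is the capital of France?\n"
    "Assistant: The capital of France is Paris.\n"
    "User: And its population?\n"
    "Assistant:"
)
```

Ending on an open `Assistant:` cue is what invites the model to keep writing turns for both roles.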
The model often interprets this as an invitation to continue the conversation pattern. Here's what I found regarding your questions:

1. **Modifying the history prompt template:** You can modify the history prompt template in
2. **Marking the last user query differently:** There is no built-in feature in
   This can help signal to the model where the conversation history ends and where it should stop generating.
3. **End-of-sequence (EOS) token:** There is no direct support in
4. **Alternative approaches**
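A minimal sketch of the boundary-marker idea from point 2 (the marker text and helper name are my own choices, not a LlamaIndex API; any distinctive, consistent marker works):

```python
# Hypothetical sketch: insert an explicit boundary between past turns and the
# live query so the model can tell where the history ends. The marker string
# is an assumption, not a required value.
END_OF_HISTORY = "### END OF HISTORY ###"

def mark_last_query(history_block: str, last_query: str) -> str:
    """Append an explicit boundary, then the current query and an answer cue."""
    return (
        f"{history_block}\n"
        f"{END_OF_HISTORY}\n"
        f"Current question: {last_query}\n"
        f"Answer only this question, then stop.\n"
        f"Answer:"
    )

prompt = mark_last_query("User: Hi\nAssistant: Hello!", "What is 2 + 2?")
```

The same marker can then double as a stop sequence, so any attempt by the model to start a new "turn" is cut off.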
**Summary:** If you need an example of a custom prompt template or further guidance on implementation, let me know!

To continue talking to Dosu, mention @dosu-bot.
-
Issue resolved by explicitly setting stop sequences for the model to prevent it from generating additional rounds of conversation.
This forces the model to stop generating output as soon as it encounters one of those sequences.
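To illustrate the cutoff behavior, here is a pure-Python sketch of what a stop sequence does (llama.cpp applies this during generation; with llama-cpp-python you would pass `stop=[...]` to the model call instead, and the sequences below are example choices, not required values):

```python
# Sketch of stop-sequence semantics: output is truncated at the earliest
# occurrence of any stop string, so fabricated follow-up turns are dropped.
def apply_stop_sequences(text: str, stops: list[str]) -> str:
    """Truncate text at the earliest stop sequence, if any occurs."""
    cut = len(text)
    for s in stops:
        i = text.find(s)
        if i != -1:
            cut = min(cut, i)
    return text[:cut]

raw = "Paris is the capital.\nUser: And Germany?\nAssistant: Berlin."
trimmed = apply_stop_sequences(raw, ["User:", "###"])  # only the real answer survives
```

Choosing the same role labels that appear in the history format (e.g. `"User:"`) is what makes this effective: the first fabricated turn trips the stop.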
-
I'm using the `ContextChatEngine` in my implementation to support conversation history. Since switching to this engine, I've noticed that the LLM (using llama.cpp with Mistral/Gemma models) sometimes generates multiple rounds of user/assistant dialogue by itself in the response. It creates fictional back-and-forth conversations instead of just answering the current prompt.

I suspect this behavior might be caused by how the conversation history is structured in the final prompt. From what I see, it's formatted like:
This may be encouraging the model to continue the pattern and generate additional, made-up turns.
Is there any way to prevent this behavior?
Any advice or alternative approaches would be helpful.