Description
Prerequisites
- I am running the latest code. Mention the version if possible as well.
- I carefully followed the README.md.
- I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
- I reviewed the Discussions, and have a new and useful enhancement to share.
Feature Description
When the reasoning format is `deepseek`, the reasoning part (the text between `<think>` and `</think>`) is placed in `message.reasoning_content`. Is it possible to apply the grammar / JSON schema enforcement only after the `</think>`?
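To make the intended split concrete, here is a minimal sketch of how a raw completion could be divided into a reasoning part and an answer part. The helper name `split_reasoning` is hypothetical and this is only an illustration of the idea, not the server's actual parser.

```python
from typing import Tuple


def split_reasoning(
    text: str, open_tag: str = "<think>", close_tag: str = "</think>"
) -> Tuple[str, str]:
    """Split raw model output into (reasoning_content, content).

    Hypothetical helper for illustration only.
    """
    head, sep, tail = text.partition(close_tag)
    if not sep:
        # No closing tag found: treat the whole output as the answer.
        return "", text.strip()
    # Drop the opening tag if present; some templates pre-fill it in the
    # prompt, in which case the output starts directly with the reasoning.
    reasoning = head.split(open_tag, 1)[-1]
    return reasoning.strip(), tail.strip()


reasoning, content = split_reasoning('<think>2 + 2 = 4</think>{"answer": 4}')
```

In this request, only the second element (`content`) would need to satisfy the grammar or JSON schema.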
Motivation
The model should be free to reason but strict about the answer format. Users who choose the `deepseek` reasoning format don't care much about the reasoning itself; they just want the answer delivered separately.
Say I need the model to return the answer in JSON format. If the model is free to reason for a while before answering, instead of having to put the answer straight into the JSON, the performance might be better.
Possible Implementation
A. Update the grammar root to allow a thinking section wrapped in `<think>` and `</think>` if the reasoning format is `deepseek`
or
B. An ugly way: let the model generate until it hits `</think>`, then apply the grammar. (This is the workaround I'm currently using.)
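A rough sketch of workaround B, for reference. The `complete` callback is a hypothetical stand-in for whatever completion call is being used, taking `(prompt, stop, grammar)` and returning the generated text. The sketch assumes the stop string itself is not included in the returned text; that may differ between backends.

```python
from typing import Callable, Dict, Optional

# Hypothetical completion callback: (prompt, stop string, grammar) -> text.
# It is NOT a real library API, just a placeholder for the backend call.
Complete = Callable[[str, Optional[str], Optional[str]], str]


def generate_two_phase(
    complete: Complete,
    prompt: str,
    grammar: str,
    open_tag: str = "<think>",
    close_tag: str = "</think>",
) -> Dict[str, str]:
    """Generate free-form reasoning first, then a grammar-constrained answer."""
    # Phase 1: no grammar, stop as soon as the closing tag would be emitted.
    raw_reasoning = complete(prompt, close_tag, None)

    # Phase 2: continue from the reasoning plus the closing tag,
    # this time with the grammar applied to the answer only.
    answer = complete(prompt + raw_reasoning + close_tag, None, grammar)

    reasoning = raw_reasoning.strip()
    if reasoning.startswith(open_tag):
        reasoning = reasoning[len(open_tag):].strip()
    return {"reasoning_content": reasoning, "content": answer.strip()}
```

The drawback is the extra round trip: the second call has to resume from the full prompt plus the reasoning text, which is why doing it inside the sampler (option A) would be cleaner.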