Name and Version
version: 5523 (aa6dff0)
built with cc (GCC) 15.1.1 20250425 for x86_64-pc-linux-gnu
Which llama.cpp modules do you know to be affected?
llama-server
Command line
llama-server -m models/Qwen3-30B-A3B-IQ4_XS.gguf --jinja
Problem description & steps to reproduce
Thinking content should be separated out in streaming responses too, not only in non-streaming ones.
Note: Ideally, we'd stream the thoughts as a reasoning_content delta (now trivial to implement), but for now we are just aiming for compatibility w/ DeepSeek's API (if --reasoning-format deepseek, which is the default).
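For illustration only (this is not llama.cpp code, and all names here are hypothetical): a minimal sketch of the kind of incremental splitting the note above refers to, assuming the model wraps its thoughts in `<think>...</think>` and that a tag can be split across two streamed chunks, so a possible partial tag at the end of a chunk has to be held back.

```python
class ThinkSplitter:
    """Route streamed text into 'reasoning_content' (inside <think>...</think>)
    or 'content' (outside), buffering a possible partial tag across chunks."""

    OPEN, CLOSE = "<think>", "</think>"

    def __init__(self):
        self.in_think = False
        self.pending = ""

    def feed(self, chunk):
        """Return a list of delta dicts for one incoming chunk of text."""
        self.pending += chunk
        out = []
        while True:
            tag = self.CLOSE if self.in_think else self.OPEN
            idx = self.pending.find(tag)
            if idx != -1:
                if idx:
                    out.append(self._delta(self.pending[:idx]))
                self.pending = self.pending[idx + len(tag):]
                self.in_think = not self.in_think
                continue
            # No complete tag: emit everything except a possible tag prefix
            # dangling at the end of the buffer.
            keep = self._dangling_prefix_len(tag)
            emit = self.pending[:len(self.pending) - keep]
            if emit:
                out.append(self._delta(emit))
            self.pending = self.pending[len(self.pending) - keep:]
            return out

    def _dangling_prefix_len(self, tag):
        # Length of the longest suffix of the buffer that is a proper prefix of `tag`.
        for n in range(min(len(tag) - 1, len(self.pending)), 0, -1):
            if self.pending.endswith(tag[:n]):
                return n
        return 0

    def _delta(self, text):
        field = "reasoning_content" if self.in_think else "content"
        return {field: text}


# Example: feeding "<think>Okay" then " done</think>Hi" yields
# [{'reasoning_content': 'Okay'}] and then
# [{'reasoning_content': ' done'}, {'content': 'Hi'}].
```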
I just tested against the official DeepSeek API, and the thoughts are separated there.
Official DeepSeek API (streamed delta):
"choices":[{"index":0,"delta":{"content":null,"reasoning_content":"Okay"},"logprobs":null,"finish_reason":null}]}
llama.cpp server API (streamed delta):
"choices":[{"finish_reason":null,"index":0,"delta":{"content":"<think>Okay"}}]