
[Inference Providers] Fix structured output schema in chat completion #3082


Merged
merged 12 commits into from
May 22, 2025

Conversation

hanouticelina
Contributor

This PR fixes compatibility issues with structured outputs across providers by ensuring the InferenceClient follows the OpenAI API specification for structured output.

Originally raised by @akseljoonas on Slack:

I have been trying out structured outputs through the InferenceClient in the Hub's Python package, and I saw that each inference provider (Nebius, Novita, Together, etc.) expects a slightly different format for a call with structured output. So I wanted to ask whether any mapping is done under the hood to match the provider's format. I'm currently passing in the schema below and getting 500 errors with Novita, while it works with Nebius:

```python
response_format = {
    "type": "json_object",
    "value": response_format["json_schema"]["schema"],
}
```

It turns out some providers don’t fully follow OpenAI’s spec for the response_format field. This PR ensures the client always uses the OpenAI-compliant format and adds internal mappings for each provider when needed.
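To make the discrepancy concrete, here is a minimal sketch (not the actual huggingface_hub implementation) contrasting the OpenAI-spec `response_format` with the TGI-style format some providers expect, and a hypothetical mapping function of the kind this PR adds internally. The schema contents and the function name `to_tgi_style` are illustrative assumptions.

```python
# OpenAI-spec structured output format, as accepted by the client after this PR.
# The schema itself ("weather" with a "city" field) is just an example.
openai_response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "weather",
        "schema": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
        "strict": True,
    },
}


def to_tgi_style(response_format: dict) -> dict:
    """Hypothetical mapping from the OpenAI-spec format to the TGI-style
    {"type": "json_object", "value": <schema>} format some providers expect."""
    if response_format.get("type") == "json_schema":
        return {
            "type": "json_object",
            "value": response_format["json_schema"]["schema"],
        }
    # Anything else is passed through unchanged.
    return response_format
```

With a mapping like this applied per provider, users can always pass the OpenAI-compliant format and the client handles the translation.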

Note: When integrating a new provider into our clients, we should ensure that structured output and function calling are compatible with the OpenAI specs. If that’s not the case, a custom mapping should be added as part of the integration.

@hanouticelina hanouticelina marked this pull request as draft May 14, 2025 10:31
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@Wauplin
Contributor

Wauplin commented May 14, 2025

Is this PR a follow-up to huggingface/huggingface.js#1380, or were the types added manually? (just curious)

@hanouticelina
Contributor Author

I added the types manually for now (huggingface/huggingface.js#1380 is more of an experiment and not totally ready to be used). Also, I'm not sure it's worth prioritizing the switch to OpenAI's OpenAPI specs for now.

That said, since we don't do any input validation, adding the new types here is not "necessary", it was mainly to show the discrepancy between TGI and OpenAI specs for structured output.

@hanouticelina hanouticelina marked this pull request as ready for review May 21, 2025 10:41
Contributor
@Wauplin Wauplin left a comment


Thanks for working on this! Now I understand better why we are always testing MCP servers on Nebius ^^

@hanouticelina hanouticelina requested a review from Wauplin May 21, 2025 15:14
Contributor

@Wauplin Wauplin left a comment


I haven't tested it myself for all providers, but it looks good to me! Just a nit about removing parameters from the input mapping (not sure it's necessary).

Pre-approving so that I'm not a blocker on this :)

@hanouticelina hanouticelina merged commit 417ad89 into main May 22, 2025
25 checks passed
@hanouticelina hanouticelina deleted the fix-response-format-providers branch May 22, 2025 10:16