[Inference Providers] Fix structured output schema in chat completion #3082
Conversation
Is this PR a follow-up of huggingface/huggingface.js#1380, or are the types added manually? (just curious)
I added the types manually for now (huggingface/huggingface.js#1380 is more of an experiment and not totally ready to be used). Also, I'm not sure it's worth prioritizing the switch to OpenAI's OpenAPI specs for now. That said, since we don't do any input validation, adding the new types here is not strictly necessary; it was mainly to show the discrepancy between the TGI and OpenAI specs for structured output (see the sketch below).
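For context, here is a minimal illustration of that discrepancy. The TGI-style grammar format and the OpenAI-style `json_schema` format shown below follow the two public specs as I understand them; treat this as a sketch rather than an exhaustive mapping, and the `schema` contents are just a placeholder example:

```python
# Illustrative only: the same JSON schema expressed in the two formats.
schema = {
    "type": "object",
    "properties": {"capital": {"type": "string"}},
    "required": ["capital"],
}

# TGI-style grammar format (what some providers expect):
# the schema goes directly under `value`.
tgi_response_format = {"type": "json", "value": schema}

# OpenAI-style structured output format (what this PR standardizes on):
# the schema is nested under `json_schema`, with a required `name`.
openai_response_format = {
    "type": "json_schema",
    "json_schema": {"name": "capital_answer", "schema": schema, "strict": True},
}
```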
Thanks for working on this! Now I understand better why we are always testing MCP servers on Nebius ^^
Haven't tested it myself for all providers, but looks good to me! Just a nit regarding removing parameters from the input mapping (not sure it's necessary).
Pre-approving so that I'm not a blocker on this :)
This PR fixes compatibility issues with structured outputs across providers by ensuring the `InferenceClient` follows the OpenAI API spec for structured output.

Originally raised by @akseljoonas on Slack: it turns out some providers don't fully follow OpenAI's spec for the `response_format` field. This PR ensures the client always uses the OpenAI-compliant format and adds internal mappings for each provider when needed.

Note: when integrating a new provider into our clients, we should ensure that structured output and function calling are compatible with the OpenAI specs. If that's not the case, a custom mapping should be added as part of the integration.
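As a usage sketch, the OpenAI-compliant format would be passed to `InferenceClient.chat_completion` roughly as below. The provider and model names are placeholders, not part of this PR; check the released client for the exact keyword surface:

```python
from huggingface_hub import InferenceClient

# Placeholder provider/model: substitute any provider that supports structured output.
client = InferenceClient(provider="nebius")

response = client.chat_completion(
    model="meta-llama/Llama-3.1-8B-Instruct",
    messages=[
        {"role": "user", "content": "What is the capital of France? Answer in JSON."}
    ],
    # OpenAI-style structured output: the schema is nested under `json_schema`.
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "capital_answer",
            "schema": {
                "type": "object",
                "properties": {"capital": {"type": "string"}},
                "required": ["capital"],
            },
            "strict": True,
        },
    },
)

print(response.choices[0].message.content)  # e.g. {"capital": "Paris"}
```

With the per-provider mappings added in this PR, the client can accept this single format and translate it internally for providers that still expect a non-OpenAI shape.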