
models_usage shows prompt_tokens and completion_tokens as 0 #6548


Closed
quantexperts opened this issue May 16, 2025 · 5 comments · Fixed by #6578


What happened?

Describe the bug
I was previously on v0.5.1 and upgraded to v0.5.6 sometime over the last few weeks. While checking the history data, I noticed that prompt_tokens and completion_tokens under models_usage are reported as 0. They used to show actual token counts, but now I only see zeros.

To Reproduce

```
from autogen_core import CancellationToken

# `agent` and `request` are defined elsewhere in my app
response = await agent.on_messages(
    messages=[request],
    cancellation_token=CancellationToken(),
)
print('>>>>', response.chat_message.json())
```
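
For reference, the usage object can also be read directly off the message rather than parsed out of the serialized JSON (a small sketch; `response` is the object returned above):

```
usage = response.chat_message.models_usage  # a RequestUsage, or None
if usage is not None:
    print(usage.prompt_tokens, usage.completion_tokens)
```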

Expected behavior
Token usage counts should be reported in models_usage.

Screenshots
I see output like:

```
{"source":"assistant","models_usage":{"prompt_tokens":0,"completion_tokens":0},"metadata":{},"content":"Hello! How can I assist you today?","type":"TextMessage"}
```

Which package was the bug in?

Python Core (autogen-core)

AutoGen library version.

Python 0.5.6

Other library version.

No response

Model used

gpt-4o-mini, gemini-2.0-flash

Model provider

OpenAI

Other model provider

Gemini

Python version

3.12

.NET version

None

Operating system

None

ekzhu (Collaborator) commented May 17, 2025

On v0.5.7, I cannot reproduce this:

```
from autogen_agentchat.agents import AssistantAgent
from autogen_ext.models.openai import OpenAIChatCompletionClient

async def main():
    client = OpenAIChatCompletionClient(model="gemini-2.0-flash")
    assistant = AssistantAgent(
        "assistant",
        model_client=client,
        system_message="You are a helpful assistant.",
    )
    result = await assistant.run(task="What is the capital of France?")
    print(result)

if __name__ == "__main__":
    import asyncio
    asyncio.run(main())
```

Output:

```
messages=[TextMessage(source='user', models_usage=None, metadata={}, content='What is the capital of France?', type='TextMessage'), TextMessage(source='assistant', models_usage=RequestUsage(prompt_tokens=13, completion_tokens=8), metadata={}, content='The capital of France is Paris.\n', type='TextMessage')] stop_reason=None
```

ekzhu added the awaiting-op-response label May 17, 2025
quantexperts (Author) commented May 18, 2025

Hi, below is sample code using Azure OpenAI, but I noticed the same behaviour with Gemini as well. Please let me know what I'm missing:

```
from autogen_agentchat.agents import AssistantAgent
from autogen_core.models import ChatCompletionClient, UserMessage
from autogen_agentchat.messages import TextMessage

import importlib.metadata
import os  # needed for os.getenv below

# Print loaded versions
def print_versions():
    packages = ["autogen-agentchat", "autogen-core"]
    for pkg in packages:
        try:
            version = importlib.metadata.version(pkg)
            print(f"{pkg} version: {version}")
        except importlib.metadata.PackageNotFoundError:
            print(f"{pkg} is not installed or version not found")

print_versions()

model_config = {
    'provider': 'autogen_ext.models.openai.AzureOpenAIChatCompletionClient',
    'config': {
        'model': 'gpt-4o-mini',
        'azure_endpoint': os.getenv('AZURE_ENDPOINT_URL'),
        'api_version': '2025-02-01-preview',
        'max_tokens': 500,
        'temperature': 0.1
    }
}
model_client = ChatCompletionClient.load_component(model_config)

async def main():
    agent = AssistantAgent(
        name="assistant",
        model_client=model_client,
        model_client_stream=True,
        system_message="You are a helpful assistant",
        reflect_on_tool_use=True
    )
    request = TextMessage(content="What is the capital of France?", source="user")
    stream = agent.run_stream(task=request)
    async for message in stream:
        if isinstance(message, TextMessage):
            print(message.model_dump())

await main()  # top-level await: running in a notebook
```

Output:

```
autogen-agentchat version: 0.5.7
autogen-core version: 0.5.7
{'source': 'user', 'models_usage': None, 'metadata': {}, 'content': 'What is the capital of France?', 'type': 'TextMessage'}
{'source': 'assistant', 'models_usage': {'prompt_tokens': 0, 'completion_tokens': 0}, 'metadata': {}, 'content': 'The capital of France is Paris.', 'type': 'TextMessage'}
```

github-actions bot removed the awaiting-op-response label May 18, 2025
peterychang (Collaborator) commented

The problem appears to be limited to streaming responses. Investigating.
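
To see the split, here is a minimal A/B sketch (reusing `model_client` and the imports from the snippet above, inside an async context): the non-streaming path reports real counts, while the streaming path reports zeros.

```
# Toggle streaming on an otherwise identical agent and compare the usage
# reported on the assistant's final TextMessage.
for stream in (False, True):
    agent = AssistantAgent(
        "assistant",
        model_client=model_client,
        model_client_stream=stream,
    )
    result = await agent.run(task="What is the capital of France?")
    print(stream, result.messages[-1].models_usage)
```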

peterychang (Collaborator) commented May 22, 2025

The problem here is a missing parameter. As a temporary workaround while I fix the issue, you can add the following parameter on your side:

OpenAI/Azure OpenAI:

```
model_client = OpenAIChatCompletionClient(
    ...,
    stream_options={"include_usage": True},
)
```
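
A fuller end-to-end sketch of the workaround (assumes `OPENAI_API_KEY` is set in the environment; the model name and task are placeholders): with `stream_options` passed through, the streamed response's final message should carry non-zero token counts.

```
import asyncio

from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.messages import TextMessage
from autogen_ext.models.openai import OpenAIChatCompletionClient

async def main():
    model_client = OpenAIChatCompletionClient(
        model="gpt-4o-mini",
        stream_options={"include_usage": True},  # the temporary workaround
    )
    agent = AssistantAgent(
        "assistant",
        model_client=model_client,
        model_client_stream=True,  # streaming path, where the zeros appeared
    )
    async for message in agent.run_stream(task="What is the capital of France?"):
        if isinstance(message, TextMessage) and message.models_usage is not None:
            print(message.models_usage)  # expect non-zero prompt/completion tokens

asyncio.run(main())
```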

Baukebrenninkmeijer commented

I was encountering this problem with both streaming and non-streaming calls from agents when using Azure OpenAI. Even if I now disable the stream_options parameter for batch (non-streaming) calls, usage is still returned as 0 tokens. In any case, this workaround solved the problem I was having, so thank you, @peterychang!

peterychang added a commit that referenced this issue May 28, 2025
## Why are these changes needed?

Enables usage statistics for streaming responses by default.

There is a similar bug in the AzureAI client. Theoretically, adding the parameter

```
model_extras={"stream_options": {"include_usage": True}}
```

should fix the problem, but I'm currently unable to test that workflow.
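
For illustration, an untested sketch of that AzureAI workaround (the endpoint, key, model name, and model_info capabilities are placeholders; whether `model_extras` is passed through correctly is exactly what remains unverified):

```
# Hypothetical wiring of the untested AzureAI workaround described above.
from autogen_ext.models.azure import AzureAIChatCompletionClient
from azure.core.credentials import AzureKeyCredential

client = AzureAIChatCompletionClient(
    model="gpt-4o-mini",  # placeholder model name
    endpoint="https://<your-endpoint>.services.ai.azure.com/models",  # placeholder
    credential=AzureKeyCredential("<your-api-key>"),  # placeholder
    model_info={  # assumed capabilities; adjust for your deployment
        "json_output": True,
        "function_calling": True,
        "vision": True,
        "family": "gpt-4o",
        "structured_output": True,
    },
    model_extras={"stream_options": {"include_usage": True}},  # per the PR note
)
```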

## Related issue number

closes #6548

## Checks

- [ ] I've included any doc changes needed for
<https://microsoft.github.io/autogen/>. See
<https://github.com/microsoft/autogen/blob/main/CONTRIBUTING.md> to
build and test documentation locally.
- [ ] I've added tests (if relevant) corresponding to the changes
introduced in this PR.
- [ ] I've made sure all auto checks have passed.