Please read this first

Have you read the custom model provider docs, including the 'Common issues' section? (Model provider docs)
Have you searched for related issues? Others may have faced similar issues.
Describe the question
I don't believe prompt caching is supported when Bedrock is used as the model provider. I tested the same prompt two ways: calling the Bedrock Converse API directly, which reported prompt cache tokens used, and going through the Agents SDK, which never reported more than 0 cached tokens even when called again immediately afterward.
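For comparison, the direct Converse API test looked roughly like this (a minimal sketch, not my exact script; the region, model ID, and prompt text are placeholders, and the cache point is marked with a cachePoint block per the Bedrock prompt-caching docs):

import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

system = [
    {"text": "some prompt that needs prompt caching > token requirement"},
    # Mark everything above this point as cacheable.
    {"cachePoint": {"type": "default"}},
]
messages = [{"role": "user", "content": [{"text": "hello"}]}]

for attempt in ("first", "second"):
    response = client.converse(
        modelId="us.anthropic.claude-3-5-haiku-20241022-v1:0",  # placeholder model ID
        system=system,
        messages=messages,
    )
    # On the second call, usage should include cacheReadInputTokens > 0.
    print(attempt, response["usage"])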
Debug information

Agents SDK version: v0.0.3

Repro steps
Ideally provide a minimal python script that can be run to reproduce the issue.
from agents import Agent, Runner
from agents.extensions.models.litellm_model import LitellmModel

# BedrockModelIdentifier is a local enum of Bedrock model IDs, and prompt
# is the user message (both defined elsewhere); the instructions must
# exceed the model's minimum cacheable token count.
agent = Agent(
    name="big prompt agent",
    instructions="some prompt that needs prompt caching > token requirement",
    model=LitellmModel(
        model=f"bedrock/{BedrockModelIdentifier.CLAUDE35_HAIKU}",
    ),
)

result = Runner.run_sync(agent, prompt)

# Method 1: Get total usage from context wrapper
total_usage = result.context_wrapper.usage
print("First request usage:")
print(
    total_usage.input_tokens_details,
    total_usage.output_tokens_details,
    total_usage.input_tokens,
    total_usage.output_tokens,
)

result2 = Runner.run_sync(agent, prompt)
print("\nSecond request usage (should show cached tokens):")
total_usage2 = result2.context_wrapper.usage
print(
    total_usage2.input_tokens_details,
    total_usage2.output_tokens_details,
    total_usage2.input_tokens,
    total_usage2.output_tokens,
)
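To narrow down whether the cache counters are dropped by the Agents SDK or by LiteLLM itself, a sketch calling LiteLLM directly with the same Bedrock model may help. This assumes LiteLLM translates Anthropic-style cache_control markers for bedrock/ models, and the exact field names on the usage object (e.g. prompt_tokens_details.cached_tokens) may vary by LiteLLM version:

import litellm

messages = [
    {
        "role": "system",
        "content": [
            {
                "type": "text",
                "text": "some prompt that needs prompt caching > token requirement",
                # Assumed: LiteLLM maps this to a Bedrock cache point.
                "cache_control": {"type": "ephemeral"},
            }
        ],
    },
    {"role": "user", "content": "hello"},
]

for attempt in ("first", "second"):
    response = litellm.completion(
        model=f"bedrock/{BedrockModelIdentifier.CLAUDE35_HAIKU}",  # same local enum as above
        messages=messages,
    )
    # If caching works at this layer, the second call's usage should
    # report cached tokens in prompt_tokens_details.
    print(attempt, response.usage)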
Expected behavior
A clear and concise description of what you expected to happen.
total_usage2.input_tokens_details should report cached_tokens > 0 on the second run.
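Expressed as a check against the repro script above (assuming the SDK surfaces Bedrock's cache counters in input_tokens_details):

# Expected on the second, immediately repeated run:
assert total_usage2.input_tokens_details.cached_tokens > 0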