[Bug]: max_input_tokens has no effect — input length still exceeds the limit on SWE-bench #8606

Open · 1 task done
Hambaobao opened this issue May 21, 2025 · 2 comments
Labels: bug (Something isn't working), configuration (Related to configuring OpenHands)

Is there an existing issue for the same bug? (If one exists, thumbs up or comment on the issue instead).

  • I have checked the existing issues.

Describe the bug and reproduction steps

Hi OpenHands team,

I’m running evaluations on the SWE-bench dataset using OpenHands and noticed that setting max_input_tokens doesn’t seem to have any effect. Even when I set this parameter to a low value, the actual input length still exceeds the specified limit.

I’m using a custom_tokenizer, but I couldn’t find anywhere in the code where max_input_tokens is actually enforced to truncate or filter the input — either before or after tokenization. It seems like this parameter is currently ignored during input construction.
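
For what it's worth, the oversized prompt is easy to confirm with litellm (which OpenHands uses under the hood); `messages` below is a stand-in for the conversation actually being sent:

    import litellm

    # messages: the chat history about to be sent to the model (stand-in here)
    messages = [{"role": "user", "content": "..."}]

    # The model string only selects which tokenizer litellm uses for counting.
    n = litellm.token_counter(model="gpt-4o", messages=messages)
    print(n)  # comes back well above the configured max_input_tokens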

Could you clarify if this is expected behavior? If it’s an oversight, I’d be happy to help contribute a fix.

Thanks!

OpenHands Installation

Development workflow

OpenHands Version

0.35.0

Operating System

Linux

Logs, Errors, Screenshots, and Additional Context

I have set max_input_tokens to 30720.
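
For reference, the setting lives in the [llm] section of config.toml (the model string is a placeholder here):

    [llm]
    model = "..."             # placeholder
    max_input_tokens = 30720  # expected to cap the prompt size
    max_output_tokens = 4096  # completion budget; matches the 4096 in the error below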

    raise self._make_status_error_from_response(err.response) from None
    openai.BadRequestError: Error code: 400 - {'object': 'error', 'message': "This model's maximum context length is 38912 tokens. However, you requested 40654 tokens (36558 in the messages, 4096 in the completion). Please reduce the length of the messages or completion.", 'type': 'BadRequestError', 'param': None, 'code': 400}
Hambaobao added the bug label May 21, 2025
enyst (Collaborator) commented May 21, 2025

Hi @Hambaobao, yes, you are correct: compressing the agent history by token count is not implemented.

Instead, we have added a mechanism to condense the events, which lives in openhands/memory/condenser. It works per event, not per token. This PR shows how you could use it, if you wish:

We definitely welcome contributions to make it work per token, or to implement per-token truncation from scratch. There may be a fix in this PR, but to be honest I didn't like it, and I haven't had the time to reconsider it or test it well enough to propose it:
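
For anyone picking this up, a minimal sketch of what per-token truncation could look like (the function and the message shape are illustrative, not OpenHands' actual internals; `count_tokens` could be built from the configured custom_tokenizer):

    # Sketch: drop the oldest non-system messages until the prompt
    # fits the token budget.
    def truncate_to_budget(messages, count_tokens, max_input_tokens):
        """Return a copy of `messages` that fits within `max_input_tokens`.

        `count_tokens` maps a string to a token count. Keeps the system
        message (assumed to be first) and the most recent turns.
        """
        def total(msgs):
            return sum(count_tokens(m["content"]) for m in msgs)

        system, history = messages[:1], list(messages[1:])
        while history and total(system + history) > max_input_tokens:
            history.pop(0)  # drop the oldest turn first
        return system + history

    # Example with a whitespace "tokenizer" as a stand-in:
    msgs = [{"role": "system", "content": "be brief"},
            {"role": "user", "content": "x " * 40},
            {"role": "user", "content": "y " * 40}]
    print(truncate_to_budget(msgs, lambda s: len(s.split()), 50))

A real implementation would hook in where the prompt is assembled, and would need to keep tool calls paired with their observations rather than dropping messages blindly, but the shape of the check is the same.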

Hambaobao (Author) commented

Hi @enyst, thank you very much for your response; I'll see what I can do.

mamoodi added the configuration label May 21, 2025