
Explain the relation of invoker.inactivity-timeout and .concurrent-invocations-limit for different workloads #548

Open
@tillrohrmann

Description

Different workloads may need different tuning to obtain the best performance. Adding a performance tuning guide under "operate" could help users. To motivate this issue, here are a few scenarios that need different configurations to perform well:

High-load, short-lived invocations calling other invocations

With the default settings, a high-load workload in which invocations call other invocations can end up in a situation where the callers occupy the slots that the callees need in order to run, until the inactivity-timeout kicks in. Under these circumstances, one should either reduce the inactivity-timeout to free slots faster or, if the service endpoints can handle more load, increase the concurrent-invocations-limit. As part of this, we should also document that the invoker has the concept of slots, which are occupied by in-flight invocations. More details on the described problem can be found in restatedev/restate#2758.
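To make the slot problem concrete, here is a minimal toy model in Python. It is not Restate's actual scheduling logic: the function name and parameters are hypothetical, and it assumes that every waiting caller holds exactly one slot and releases it only when the inactivity-timeout fires.

```python
def callee_start_delay(slot_limit: int, waiting_callers: int,
                       inactivity_timeout_s: float) -> float:
    """Toy estimate of how long a callee waits for a free invoker slot.

    Assumption (not taken from Restate's code): each caller that is waiting
    on a callee holds one slot until the inactivity timeout releases it.
    """
    if slot_limit <= 0:
        raise ValueError("slot_limit must be positive")
    if waiting_callers < slot_limit:
        # At least one slot is free, so the callee can start right away.
        return 0.0
    # All slots are held by idle callers; the callee has to wait until
    # the inactivity timeout frees one of them.
    return inactivity_timeout_s


if __name__ == "__main__":
    # Illustrative numbers only, not Restate's defaults.
    print(callee_start_delay(slot_limit=100, waiting_callers=100,
                             inactivity_timeout_s=60.0))
    # Remedy 1: a shorter inactivity timeout frees slots sooner.
    print(callee_start_delay(slot_limit=100, waiting_callers=100,
                             inactivity_timeout_s=5.0))
    # Remedy 2: a higher limit leaves room for the callees.
    print(callee_start_delay(slot_limit=200, waiting_callers=100,
                             inactivity_timeout_s=60.0))
```

In this model, both remedies from the paragraph above reduce the wait: lowering the timeout shortens the blocked period, while raising the limit avoids it as long as the endpoints can absorb the extra concurrency.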

Invocations with long side effects/steps

For invocations with long-lasting side effects/steps (e.g. when querying an LLM), the current inactivity-timeout might be too aggressive. In this case, the system might suspend an invocation that could still have made progress, just because an individual step took too long. Here it would be beneficial to increase the inactivity-timeout.
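A rough way to reason about this scenario is to compare the expected step durations against the timeout. The sketch below is a simplified model with hypothetical names; it assumes an invocation is suspended whenever a single step runs longer than the inactivity-timeout, which may not match the invoker's exact behavior.

```python
from typing import Iterable


def count_suspensions(step_durations_s: Iterable[float],
                      inactivity_timeout_s: float) -> int:
    """Count the steps that would outlast the inactivity timeout.

    Assumption: a step longer than the timeout causes one suspension.
    """
    return sum(1 for d in step_durations_s if d > inactivity_timeout_s)


if __name__ == "__main__":
    # Illustrative step durations in seconds, e.g. one slow LLM call.
    steps = [5.0, 90.0, 30.0]
    print(count_suspensions(steps, inactivity_timeout_s=60.0))
    print(count_suspensions(steps, inactivity_timeout_s=120.0))
```

Under this assumption, choosing a timeout above the longest expected step avoids suspensions in the middle of a step, at the cost of holding slots longer when an invocation is truly idle.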
