Cannot :ack task after implicit net.box reconnect #85

Closed
@RunsFor

Description

Prerequisite

  1. A server that stores tasks using tarantool/queue with a custom driver based on utubettl
  2. A client that takes tasks from the server through a net.box connection

Scenario

  1. The client takes a task through the net.box connection and starts processing it.
  2. A network glitch (or another issue) occurs and net.box decides to reconnect.
  3. The client finishes processing the task and tries to ack it.
  4. The server responds to the ack command with an error: Task was not taken in the session
  5. After the ttr delay, the task returns to the READY state.

Problem description

This "ack error" happens because tarantool/queue locks a "taken" task to box.session.id(). That means not only must the same client take and ack the task, it must also do both inside the same net.box connection, since box.session.id() changes after an implicit reconnect. As a result, the client has no way to handle this error from the server and retry the ack.
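To make the failure mode concrete, here is a toy in-memory model of it in Python. It is not tarantool/queue code: `ToyQueue`, `ToyConnection`, and `demo_reconnect_then_ack` are hypothetical names, and the session counter only imitates the fact that a reconnect yields a new box.session.id(). It also leaves out ttr and any release-on-disconnect behavior.

```python
import itertools


class ToyQueue:
    """Toy model of session-bound task locking (an assumption-laden sketch)."""

    def __init__(self):
        self.states = {}   # task_id -> "READY" or "TAKEN"
        self.owners = {}   # task_id -> session id that took the task

    def put(self, task_id):
        self.states[task_id] = "READY"

    def take(self, session_id):
        # Hand out the first READY task and lock it to the caller's session.
        for task_id, state in self.states.items():
            if state == "READY":
                self.states[task_id] = "TAKEN"
                self.owners[task_id] = session_id
                return task_id
        return None

    def ack(self, session_id, task_id):
        # The lock is checked against the session id, not the client identity.
        if self.owners.get(task_id) != session_id:
            raise RuntimeError("Task was not taken in the session")
        del self.states[task_id]
        del self.owners[task_id]


class ToyConnection:
    """Stands in for a net.box connection; each (re)connect gets a new session id."""

    _session_ids = itertools.count(1)

    def __init__(self, queue):
        self.queue = queue
        self.session_id = next(self._session_ids)

    def reconnect(self):
        # Imitates an implicit reconnect: same client object, new session id.
        self.session_id = next(self._session_ids)

    def take(self):
        return self.queue.take(self.session_id)

    def ack(self, task_id):
        self.queue.ack(self.session_id, task_id)


def demo_reconnect_then_ack():
    queue = ToyQueue()
    queue.put(1)
    conn = ToyConnection(queue)
    task_id = conn.take()
    conn.reconnect()          # network glitch between take and ack
    try:
        conn.ack(task_id)
    except RuntimeError as exc:
        return str(exc)
    return "acked"
```

In this model the ack fails even though the same `ToyConnection` object issued both calls, which mirrors the scenario above.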

Possible solutions

One possible workaround is to implement a simple buffer (fiber.channel) on the server that collects client ack commands, and let another fiber call the actual queue:ack. But this is not very handy if you want to handle the ack response on the client.
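A rough Python model of that buffering idea is sketched below, mainly to show why the client loses the ack response. Everything in it is hypothetical (`BufferedAckServer`, `submit_ack`, `drain`), a `deque` stands in for fiber.channel, and it simply assumes the server-side worker is allowed to ack the task, which is not spelled out here.

```python
from collections import deque


class BufferedAckServer:
    """Toy model: clients enqueue ack requests, a worker applies them later."""

    def __init__(self, taken_task_ids):
        self.taken = set(taken_task_ids)  # tasks currently in the TAKEN state
        self.ack_buffer = deque()         # stands in for fiber.channel
        self.log = []                     # outcomes, visible only on the server

    def submit_ack(self, task_id):
        # Called by the client. It only enqueues the request, so the client
        # gets no indication of whether the ack will succeed.
        self.ack_buffer.append(task_id)

    def drain(self):
        # Run by the server-side worker (a fiber in the real setup).
        while self.ack_buffer:
            task_id = self.ack_buffer.popleft()
            if task_id in self.taken:
                self.taken.remove(task_id)
                self.log.append((task_id, "acked"))
            else:
                self.log.append((task_id, "not taken"))


def demo_buffered_ack():
    server = BufferedAckServer(taken_task_ids=[101, 102])
    reply_ok = server.submit_ack(101)    # valid task
    reply_bad = server.submit_ack(999)   # unknown task
    server.drain()
    return reply_ok, reply_bad, server.log
```

Both `submit_ack` calls return the same thing to the client regardless of the outcome; the success or failure ends up only in the server-side log.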

Another approach is to use a different id to lock a task. This id should identify the same client even across implicit net.box reconnects. It may be set explicitly through some registration process (an API-breaking change) or implicitly using some Tarantool client information (if any exists).
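The explicit-registration variant could look roughly like the following toy Python model. The names (`ClientLockedQueue`, `RegisteredConnection`, `client_id`) are hypothetical and do not correspond to any existing tarantool/queue API; the point is only that the lock key survives a change of session id.

```python
import itertools
import uuid


class ClientLockedQueue:
    """Toy model: a taken task is locked to a client id, not a session id."""

    def __init__(self):
        self.states = {}   # task_id -> "READY" or "TAKEN"
        self.owners = {}   # task_id -> client id that took the task

    def put(self, task_id):
        self.states[task_id] = "READY"

    def take(self, client_id):
        for task_id, state in self.states.items():
            if state == "READY":
                self.states[task_id] = "TAKEN"
                self.owners[task_id] = client_id
                return task_id
        return None

    def ack(self, client_id, task_id):
        if self.owners.get(task_id) != client_id:
            raise RuntimeError("Task was not taken by this client")
        del self.states[task_id]
        del self.owners[task_id]


class RegisteredConnection:
    """Keeps one client id for its lifetime; the session id changes on reconnect."""

    _session_ids = itertools.count(1)

    def __init__(self, queue):
        self.queue = queue
        self.client_id = str(uuid.uuid4())          # chosen once, at registration
        self.session_id = next(self._session_ids)

    def reconnect(self):
        # New session id, same client id.
        self.session_id = next(self._session_ids)

    def take(self):
        return self.queue.take(self.client_id)

    def ack(self, task_id):
        self.queue.ack(self.client_id, task_id)


def demo_ack_after_reconnect():
    queue = ClientLockedQueue()
    queue.put(1)
    conn = RegisteredConnection(queue)
    task_id = conn.take()
    old_session = conn.session_id
    conn.reconnect()
    conn.ack(task_id)   # succeeds: ownership is keyed by client_id
    return old_session != conn.session_id, task_id in queue.states
```

A real implementation would also have to decide how a client id is authenticated and when a lock held by a client that never comes back is released; the sketch ignores both.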

Is this reasoning correct? Are there any other approaches/workarounds?
