Description
Prerequisite
- Server that stores tasks using tarantool/queue with a custom driver based on utubettl
- Client that takes tasks from the server through a net.box connection
Scenario
- Client takes a task through a net.box connection and starts processing it.
- A network glitch (or another issue) occurs and net.box decides to reconnect.
- Client finishes processing the task and decides to ack it.
- Server responds to the ack command with the error "Task was not taken in the session".
- After the ttr delay, the task is returned to the READY state.
Problem description
This ack error happens because tarantool/queue locks a taken task to box.session.id(). That means that not only must the same client take and ack the same task, but both calls have to be made inside the same net.box connection, since box.session.id() changes after an implicit reconnect. As a result, the client has no way to handle this error from the server and retry the ack.
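To make the failure mode concrete, here is a minimal in-memory sketch of session-bound locking. It is a toy model written in Python, not tarantool/queue code: the class SessionLockedQueue and its methods are hypothetical, and connect() only imitates the fact that every new connection gets a fresh box.session.id().

```python
import itertools


class SessionLockedQueue:
    """Toy model: a taken task is locked to the session id that took it."""

    def __init__(self):
        self._session_ids = itertools.count(1)
        self.state = {}      # task_id -> "READY" or "TAKEN"
        self.taken_by = {}   # task_id -> session id holding the lock

    def connect(self):
        # Every (re)connect yields a fresh session id (assumed behaviour,
        # imitating box.session.id() after a net.box reconnect).
        return next(self._session_ids)

    def put(self, task_id):
        self.state[task_id] = "READY"

    def take(self, session_id):
        # Hand out the first READY task and lock it to the caller's session.
        for task_id, state in self.state.items():
            if state == "READY":
                self.state[task_id] = "TAKEN"
                self.taken_by[task_id] = session_id
                return task_id
        return None

    def ack(self, session_id, task_id):
        # The ack is rejected unless it comes from the locking session.
        if self.taken_by.get(task_id) != session_id:
            raise RuntimeError("Task was not taken in the session")
        del self.state[task_id]
        del self.taken_by[task_id]


queue = SessionLockedQueue()
queue.put("task-1")

session = queue.connect()
task = queue.take(session)

session = queue.connect()  # implicit reconnect: the session id changes
try:
    queue.ack(session, task)
    ack_error = None
except RuntimeError as exc:
    ack_error = str(exc)
```

In this model the task stays in the TAKEN state after the failed ack, and retrying from the new session cannot succeed, which mirrors the situation described above until ttr expires.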
Possible solutions
A possible workaround to eliminate this kind of error is to implement a simple buffer (fiber.channel) on the server for client ack commands and let another fiber call the actual queue:ack. But this is not very handy if you want to handle the ack response on the client.
Another approach is to use a different id to lock a task. This id should identify the same client even across implicit net.box reconnects. It may be set explicitly through some registration process (an API-breaking change) or implicitly using some tarantool client information (if any exists).
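The explicit-registration variant could look roughly like the sketch below. Again, this is a toy Python model under stated assumptions and not a proposal for the actual tarantool/queue API: the class ClientLockedQueue, the client_id argument to connect(), and the use of a UUID as the client id are all hypothetical.

```python
import itertools
import uuid


class ClientLockedQueue:
    """Toy model: a taken task is locked to a stable client id."""

    def __init__(self):
        self._session_ids = itertools.count(1)
        self.session_client = {}  # session id -> registered client id
        self.state = {}           # task_id -> "READY" or "TAKEN"
        self.taken_by = {}        # task_id -> client id holding the lock

    def connect(self, client_id):
        # Registration step (hypothetical): each new session announces
        # the client id it belongs to.
        session_id = next(self._session_ids)
        self.session_client[session_id] = client_id
        return session_id

    def put(self, task_id):
        self.state[task_id] = "READY"

    def take(self, session_id):
        client_id = self.session_client[session_id]
        for task_id, state in self.state.items():
            if state == "READY":
                self.state[task_id] = "TAKEN"
                self.taken_by[task_id] = client_id
                return task_id
        return None

    def ack(self, session_id, task_id):
        # The lock is checked against the client id, not the session id.
        client_id = self.session_client[session_id]
        if self.taken_by.get(task_id) != client_id:
            raise RuntimeError("Task was not taken by this client")
        del self.state[task_id]
        del self.taken_by[task_id]


queue = ClientLockedQueue()
queue.put("task-1")

client_id = str(uuid.uuid4())          # generated once per client process
first_session = queue.connect(client_id)
task = queue.take(first_session)

second_session = queue.connect(client_id)  # reconnect, same client id
queue.ack(second_session, task)            # succeeds despite the new session
```

The trade-off in this sketch is that the server has to trust the id the client announces, so a real design would need to decide how ids are issued and when a lock held by a vanished client is released.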
Is this reasoning correct? Are there any different approaches/workarounds?