luatest doesn't detect crash at exit #252

Closed
@locker

test.luatest_helpers.server.Server.stop() doesn't check the exit code of the stopped Tarantool instance, so a crash at exit goes undetected.
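To make the missing check concrete, here is a minimal sketch in Python (the language test-run is written in), not the luatest helper itself. The names `describe_abnormal_exit` and `stop_and_check` are hypothetical, and the sketch assumes the server handles SIGTERM and exits with code 0 on a clean shutdown. It relies on `subprocess.Popen` reporting death by signal N as return code -N on POSIX.

```python
import signal
import subprocess
import sys
from typing import Optional


def describe_abnormal_exit(returncode: int) -> Optional[str]:
    """Return a short description if the exit status is abnormal, else None."""
    if returncode == 0:
        return None
    if returncode < 0:
        # On POSIX, subprocess reports "killed by signal N" as -N.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        return f"killed by {name}"
    # Any other non-zero code, e.g. the one a sanitizer sets on a leak report.
    return f"exited with code {returncode}"


def stop_and_check(proc: subprocess.Popen, timeout: float = 10.0) -> None:
    """Hypothetical stop(): terminate, wait, and fail on an abnormal exit.

    Assumes the process handles SIGTERM and exits with 0 when it shuts
    down cleanly; a process without such a handler would be reported as
    killed by SIGTERM.
    """
    proc.terminate()
    problem = describe_abnormal_exit(proc.wait(timeout=timeout))
    if problem is not None:
        raise RuntimeError(f"server {problem}")
```

With a check like this, the segfault from the patch below would surface as a signal death rather than a silent pass, and a non-zero exit code would be reported the same way.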

How to reproduce:

  1. Apply the patch:
    diff --git a/src/box/iproto.cc b/src/box/iproto.cc
    index c35dc73590e1..09a9bac0ce81 100644
    --- a/src/box/iproto.cc
    +++ b/src/box/iproto.cc
    @@ -2786,6 +2786,7 @@ iproto_send_stop_msg(void);
     static int
     iproto_on_shutdown_f(void *arg)
     {
    +	*((volatile int *)0) = 0;
     	(void)arg;
     	fiber_set_name(fiber_self(), "iproto.shutdown");
     	iproto_send_stop_msg();
  2. Run a luatest that starts a Tarantool server:
    ./test-run.py box-luatest/net_box_test.lua

The test passes although Tarantool crashes at exit.

$ ./test-run.py box-luatest/net_box_test.lua
<snip>
======================================================================================
WORKR TEST                                            PARAMS          RESULT
---------------------------------------------------------------------------------
[001] box-luatest/net_box_test.lua                                    [ pass ]
---------------------------------------------------------------------------------
Top 10 tests by occupied memory (RSS, Mb):
*   24.7 box-luatest/net_box_test.lua

(Tests quicker than 0.1 seconds may be missed.)
---------------------------------------------------------------------------------
Top 10 longest tests (seconds):
*   0.33 box-luatest/net_box_test.lua
---------------------------------------------------------------------------------
Statistics:
* pass: 1

A diff test, by contrast, does fail in this case:

$ ./test-run.py box/net.box_watcher.test.lua
<snip>
======================================================================================
WORKR TEST                                            PARAMS          RESULT
---------------------------------------------------------------------------------
Traceback (most recent call last):
  File "src/gevent/greenlet.py", line 766, in gevent._greenlet.Greenlet.run
  File "/home/vlad/src/tarantool/tarantool/test-run/lib/test.py", line 36, in _run
    self.callable(*self.callable_args, **self.callable_kwargs)
  File "/home/vlad/src/tarantool/tarantool/test-run/lib/tarantool_server.py", line 930, in crash_detect
    self.kill_current_test()
  File "/home/vlad/src/tarantool/tarantool/test-run/lib/tarantool_server.py", line 995, in kill_current_test
    if self.current_test.current_test_greenlet:
AttributeError: 'NoneType' object has no attribute 'current_test_greenlet'
2022-02-04T10:19:39Z <TestRunGreenlet at 0x7fd2b1358d00 info='Crash detector: <subprocess.Popen object at 0x7fd2b122f4f0>'> failed with AttributeError

[001] Exception: an integer is required (got type bytes)
---------------------------------------------------------------------------------
Top 10 tests by occupied memory (RSS, Mb):

(Tests quicker than 0.1 seconds may be missed.)
---------------------------------------------------------------------------------
Top 10 longest tests (seconds):
---------------------------------------------------------------------------------
[Internal test-run error] The following tasks were dispatched to some worker task queue, but were not reported as done (does not matters success or fail):
- [box/net.box_watcher.test.lua, null]

Besides detecting crashes, this is important to fix because ASAN sets the exit code to 1 if there are memory leaks.

Labels: bug