Skip to content

test: flaky vinyl/bloom.test.lua fail because of previous vinyl/gh.test.lua on Snapshot is already in progress #126

Closed
tarantool/test-run
#309
@avtikhon

Description

@avtikhon

Tarantool 2.9.0-57-gea0b126ff
Target: Linux-x86_64-Debug
Build options: cmake . -DCMAKE_INSTALL_PREFIX=/usr/local -DENABLE_BACKTRACE=ON
Compiler: /usr/bin/cc /usr/bin/c++
C_FLAGS: -fexceptions -funwind-tables -fno-omit-frame-pointer -fno-stack-protector -fno-common -fopenmp -msse2 -std=c11 -Wall -Wextra -Wno-strict-aliasing -Wno-char-subscripts -Wno-format-truncation -Wno-gnu-alignof-expression -fno-gnu89-inline -Wno-cast-function-type -Werror
CXX_FLAGS: -fexceptions -funwind-tables -fno-omit-frame-pointer -fno-stack-protector -fno-common -fopenmp -msse2 -std=c++11 -Wall -Wextra -Wno-strict-aliasing -Wno-char-subscripts -Wno-format-truncation -Wno-invalid-offsetof -Wno-gnu-alignof-expression -Wno-cast-function-type -Werror

Fail:

[001] Test failed! Result content mismatch:
[001] --- vinyl/bloom.result	Wed Apr  7 07:45:20 2021
[001] +++ var/rejects/vinyl/bloom.reject	Mon Jun  7 08:23:17 2021
[001] @@ -15,7 +15,7 @@
[001]  ...
[001]  box.snapshot()
[001]  ---
[001] -- ok
[001] +- error: Snapshot is already in progress
[001]  ...
[001]  for i = 1, 10 do s:get{i} end
[001]  ---
[001] 
[001] Last 15 lines of Tarantool Log file [Instance "vinyl"][/home/ubuntu/tarantool/test/var/001_vinyl/vinyl.log]:
[001] 2021-06-07 08:23:17.225 [1424727] vinyl.dump.0/102/task I> writing `/home/ubuntu/tarantool/test/var/001_vinyl/vinyl/539/0/00000000000000000118.index'
[001] 2021-06-07 08:23:17.225 [1424727] snapshot/101/main I> saving snapshot `/home/ubuntu/tarantool/test/var/001_vinyl/vinyl/00000000000000021501.snap.inprogress'
[001] 2021-06-07 08:23:17.230 [1424727] main/108/vinyl.scheduler I> 539/0: dump completed
[001] 2021-06-07 08:23:17.230 [1424727] main/108/vinyl.scheduler I> dumped 49338 bytes in 0.0 s, rate 3.6 MB/s
[001] 2021-06-07 08:23:17.231 [1424727] snapshot/101/main I> done
[001] 2021-06-07 08:23:17.231 [1424727] main/123/console/unix/: I> vinyl checkpoint completed
[001] 2021-06-07 08:23:17.240 [1424727] main I> removed /home/ubuntu/tarantool/test/var/001_vinyl/vinyl/00000000000000020342.snap
[001] 2021-06-07 08:23:17.240 [1424727] main I> removed /home/ubuntu/tarantool/test/var/001_vinyl/vinyl/00000000000000019325.vylog
[001] 2021-06-07 08:23:17.240 [1424727] main/105/gc I> removed /home/ubuntu/tarantool/test/var/001_vinyl/vinyl/535/0/00000000000000000096.index
[001] 2021-06-07 08:23:17.241 [1424727] main/105/gc I> removed /home/ubuntu/tarantool/test/var/001_vinyl/vinyl/535/0/00000000000000000096.run
[001] 2021-06-07 08:23:17.241 [1424727] main/105/gc I> removed /home/ubuntu/tarantool/test/var/001_vinyl/vinyl/535/0/00000000000000000102.index
[001] 2021-06-07 08:23:17.241 [1424727] main/105/gc I> removed /home/ubuntu/tarantool/test/var/001_vinyl/vinyl/535/0/00000000000000000102.run
[001] 2021-06-07 08:23:17.241 [1424727] main/105/gc I> removed /home/ubuntu/tarantool/test/var/001_vinyl/vinyl/535/0/00000000000000000104.index
[001] 2021-06-07 08:23:17.241 [1424727] main/105/gc I> removed /home/ubuntu/tarantool/test/var/001_vinyl/vinyl/535/0/00000000000000000104.run
[001] 2021-06-07 08:23:17.243 [1424727] wal I> removed /home/ubuntu/tarantool/test/var/001_vinyl/vinyl/00000000000000020342.xlog

Reproducer:

echo >a.rep <<EOF
- [vinyl/gh.test.lua, null]
- [vinyl/bloom.test.lua, null]
EOF

c=0 ; while ./test-run.py --reproduce a.rep --force ; do rm -rf var/ ; c=$(($c+1)) ; echo "================ $c" ; done

Found that running tests in the same Tarantool worker process next tests after 'vinyl/gh.test.lua' test may get it in non consistence state when some snapshot internal processes not yet completed. To avoid of it 'vinyl/gh.test.lua' test must be fixed either each test must be run in standalone new Tarantool worker process.

To be sure that there is no big impact on time of the testings:

Parallel Host No restarts With restarts Diff
-j1 MCS Github 14m40s 17m20s +15%
-j4 (default) MCS Github 9m32s 10m26s +9%
-j8 MCS Github 9m06s 10m28s +13%
-j16 Laptop 5m59s 7m02s +14%

More parallel values not interesting due to Github Actions hosts and MCS local hosts use 2 cores hosts with 4 test threads by default and bigger parallel values doesn't change the timings in general.

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions