plugin: SIGSEGV with concurrent GC during plugin load #17455

Closed
@aclements

What version of Go are you using (go version)?

Current master (86b2f29)

What operating system and processor architecture are you using (go env)?

GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/austin/r/go"
GORACE=""
GOROOT="/home/austin/go.dev"
GOTOOLDIR="/home/austin/go.dev/pkg/tool/linux_amd64"
CC="gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build566905877=/tmp/go-build -gno-record-gcc-switches"
CXX="g++"
CGO_ENABLED="1"

What did you do?

cd misc/cgo/testplugin
GOGC=1 ./test.bash

What did you expect to see?

PASS

What did you see instead?

fatal error: unexpected signal during runtime execution
fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x479ebb]

goroutine 19 [running]:
runtime.throw(0x53483e, 0x2a)
    /home/austin/go.dev/src/runtime/panic.go:587 +0x95 fp=0xc42002cd98 sp=0xc42002cd78
runtime.sigpanic()
    /home/austin/go.dev/src/runtime/signal_unix.go:253 +0x2db fp=0xc42002cde8 sp=0xc42002cd98
runtime.scanblock(0x7fa095ab3020, 0x11b0, 0x0, 0xc420027228)
    /home/austin/go.dev/src/runtime/mgcmark.go:1108 +0x5b fp=0xc42002ce60 sp=0xc42002cde8
runtime.markrootBlock(0x7fa095ab3020, 0x11b0, 0x0, 0xc420027228, 0x0)
    /home/austin/go.dev/src/runtime/mgcmark.go:280 +0x7c fp=0xc42002ce90 sp=0xc42002ce60
runtime.markroot(0xc420027228, 0x3)
    /home/austin/go.dev/src/runtime/mgcmark.go:164 +0xc8 fp=0xc42002cf18 sp=0xc42002ce90
runtime.gcDrain(0xc420027228, 0x5)
    /home/austin/go.dev/src/runtime/mgcmark.go:977 +0x8f fp=0xc42002cf48 sp=0xc42002cf18
runtime.gcBgMarkWorker(0xc420026000)
    /home/austin/go.dev/src/runtime/mgc.go:1468 +0x1ca fp=0xc42002cfb8 sp=0xc42002cf48
runtime.goexit()
    /home/austin/go.dev/src/runtime/asm_amd64.s:2158 +0x1 fp=0xc42002cfc0 sp=0xc42002cfb8
created by runtime.gcBgMarkStartWorkers
    /home/austin/go.dev/src/runtime/mgc.go:1357 +0x98

goroutine 1 [runnable]:
plugin.lastmoduleinit(0x1587400)
    /home/austin/go.dev/src/runtime/plugin.go:39 +0x16a
plugin.open(0x52f37c, 0x7, 0x0, 0x0, 0x0)
    /home/austin/go.dev/src/plugin/plugin_dlopen.go:73 +0x367
plugin.Open(0x52f37c, 0xa, 0x3, 0x2, 0xc420040e08)
    /home/austin/go.dev/src/plugin/plugin.go:28 +0x35
main.main()
    /home/austin/go.dev/misc/cgo/testplugin/src/host/host.go:24 +0x5c

goroutine 17 [syscall, locked to thread]:
runtime.goexit()
    /home/austin/go.dev/src/runtime/asm_amd64.s:2158 +0x1
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x479ebb]

goroutine 21 [running]:
runtime.throw(0x53483e, 0x2a)
    /home/austin/go.dev/src/runtime/panic.go:587 +0x95 fp=0xc42002dd98 sp=0xc42002dd78
runtime.sigpanic()
    /home/austin/go.dev/src/runtime/signal_unix.go:253 +0x2db fp=0xc42002dde8 sp=0xc42002dd98
runtime.scanblock(0x7fa095ab4920, 0x1a590, 0x0, 0xc420029828)
    /home/austin/go.dev/src/runtime/mgcmark.go:1108 +0x5b fp=0xc42002de60 sp=0xc42002dde8
runtime.markrootBlock(0x7fa095ab4920, 0x1a590, 0x0, 0xc420029828, 0x0)
    /home/austin/go.dev/src/runtime/mgcmark.go:280 +0x7c fp=0xc42002de90 sp=0xc42002de60
runtime.markroot(0xc420029828, 0x4)
    /home/austin/go.dev/src/runtime/mgcmark.go:169 +0x153 fp=0xc42002df18 sp=0xc42002de90
runtime.gcDrain(0xc420029828, 0x5)
    /home/austin/go.dev/src/runtime/mgcmark.go:977 +0x8f fp=0xc42002df48 sp=0xc42002df18
runtime.gcBgMarkWorker(0xc420028600)
    /home/austin/go.dev/src/runtime/mgc.go:1468 +0x1ca fp=0xc42002dfb8 sp=0xc42002df48
runtime.goexit()
    /home/austin/go.dev/src/runtime/asm_amd64.s:2158 +0x1 fp=0xc42002dfc0 sp=0xc42002dfb8
created by runtime.gcBgMarkStartWorkers
    /home/austin/go.dev/src/runtime/mgc.go:1357 +0x98

This is crashing in the GC's data segment scan. The problem is almost certainly that the plugin's moduledata has been linked into the module data list before md.gcdatamask has been initialized, so a concurrent GC that tries to scan the module's data segment at just the wrong time will crash. I wasn't able to figure out where it gets linked into the list, but I can think of two potential solutions:

  1. Don't link the module data into the list until everything that could be concurrently accessed is fully initialized.
  2. Stop the world around the plugin loading process.

The first is preferable if possible, since attempting to stop the world will delay plugin loading during a concurrent GC cycle until the cycle ends.

Labels: FrozenDueToAge, NeedsFix (The path to resolution is known, but the work has not been done.)