runtime: clearpools causes excessive STW1 time #22331

Closed
@bryanpkc

Our application uses a large number of goroutines, mutexes, and defers. Using Go 1.8.3 on linux/amd64, we observed that the STW1 time is dominated by the runtime.clearpools function in mgc.go. Here is an excerpt from the GC trace:

gc 12 @27.641s 4%: 16+865+0.82 ms clock, 522+6169/9551/1514+26 ms cpu, 8375->8694->5362 MB, 8943 MB goal, 48 P

By instrumenting the runtime, we determined that, out of the 16 ms STW1 time, 9.3 ms was spent in the loop that disconnects the sched.sudogcache linked list, and 6.7 ms was spent in the loop that disconnects the linked lists in sched.deferpool. Both loops are O(n) in the number of cached objects and run while the world is stopped, so they end up causing a very long pause.
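For readers unfamiliar with the trace format: the 16 ms figure is the first term of the `16+865+0.82 ms clock` triple, which in the gctrace output gives the wall-clock time of the first stop-the-world phase, the concurrent phase, and the final stop-the-world phase. The following is a minimal sketch of pulling those three numbers out of a trace line; `parseClock` and `clockRe` are hypothetical helper names for illustration, not part of the runtime or any standard tool.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// clockRe matches the wall-clock phase triple of a gctrace line,
// for example "16+865+0.82 ms clock". (Hypothetical helper.)
var clockRe = regexp.MustCompile(`([0-9.]+)\+([0-9.]+)\+([0-9.]+) ms clock`)

// parseClock returns the three wall-clock phase durations in milliseconds:
// first STW phase (STW1), concurrent phase, and final STW phase.
func parseClock(line string) (stw1, concurrent, stw2 float64, err error) {
	m := clockRe.FindStringSubmatch(line)
	if m == nil {
		return 0, 0, 0, fmt.Errorf("no clock triple found in %q", line)
	}
	var vals [3]float64
	for i := range vals {
		vals[i], err = strconv.ParseFloat(m[i+1], 64)
		if err != nil {
			return 0, 0, 0, err
		}
	}
	return vals[0], vals[1], vals[2], nil
}

func main() {
	// The trace line quoted in this issue.
	line := "gc 12 @27.641s 4%: 16+865+0.82 ms clock, 522+6169/9551/1514+26 ms cpu, 8375->8694->5362 MB, 8943 MB goal, 48 P"
	stw1, concurrent, stw2, err := parseClock(line)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("STW1=%gms concurrent=%gms STW2=%gms\n", stw1, concurrent, stw2)
}
```

The sketch only reads the clock triple; the cpu breakdown and heap sizes on the same line are ignored.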

I understand the reason for originally introducing the loops in #9110. However, looking at tip around the call sites of releaseSudog and freedefer, I couldn't see a case where a released object would still be referenced by a live pointer somewhere else in the system. Are these loops still really necessary?

FWIW, I replaced the loops with a simple zeroing of the heads of the linked lists, and test/fixedbugs/issue9110.go still passed.
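To make the trade-off concrete, here is a minimal sketch of the two clearing strategies on a toy free list. The `node` and `pool` types are hypothetical stand-ins for the runtime's cached objects and central lists, not the actual runtime code, and locking is omitted. It also shows what the walk appears to protect against: if some stale pointer to one entry survives, zeroing only the head leaves every later entry reachable through it, whereas unlinking each entry leaves only that single entry reachable.

```go
package main

import "fmt"

// node is a hypothetical stand-in for a cached runtime object such as a
// sudog or _defer; only the link field matters for this sketch.
type node struct {
	next *node
}

// pool models a central free list with a single head pointer.
type pool struct {
	head *node
}

// newPool builds a list of n linked nodes.
func newPool(n int) *pool {
	p := &pool{}
	for i := 0; i < n; i++ {
		p.head = &node{next: p.head}
	}
	return p
}

// clearWalk unlinks every node before dropping the list: O(n).
// It returns the number of nodes visited.
func (p *pool) clearWalk() int {
	visited := 0
	for cur := p.head; cur != nil; {
		next := cur.next
		cur.next = nil // disconnect so a stale ref cannot reach the rest
		cur = next
		visited++
	}
	p.head = nil
	return visited
}

// clearHead drops only the head pointer: O(1).
func (p *pool) clearHead() {
	p.head = nil
}

// reachable counts the nodes reachable from n by following next links.
func reachable(n *node) int {
	count := 0
	for ; n != nil; n = n.next {
		count++
	}
	return count
}

func main() {
	p := newPool(1000)
	stale := p.head // simulate a dangling reference to one entry
	p.clearHead()
	fmt.Println("after clearHead, reachable from stale ref:", reachable(stale))

	q := newPool(1000)
	stale = q.head
	q.clearWalk()
	fmt.Println("after clearWalk, reachable from stale ref:", reachable(stale))
}
```

If no such stale pointer can exist after releaseSudog and freedefer, as the reading of tip above suggests, the two strategies free the same memory and only the O(1) version avoids the stop-the-world cost.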

I propose making these loops optional to avoid the excessive STW1 time, and enabling them only with a GODEBUG option for debugging purposes.

Labels: FrozenDueToAge, NeedsInvestigation, Performance
