runtime: clearpools causes excessive STW1 time

Our application uses a large number of goroutines, mutexes, and defers. Using Go 1.8.3 on linux/amd64, we observed that the STW1 time is dominated by the `runtime.clearpools` function in mgc.go. Here is an excerpt from the GC trace:
```
gc 12 @27.641s 4%: 16+865+0.82 ms clock, 522+6169/9551/1514+26 ms cpu, 8375->8694->5362 MB, 8943 MB goal, 48 P
```
By instrumenting the runtime, we determined that, of the 16ms STW1 time, 9.3ms was spent in the loop that disconnects the `sched.sudogcache` linked list, and 6.7ms was spent in the loop that disconnects the linked lists in `sched.deferpool`. Both loops are O(n) in the number of cached objects, so with large caches they cause a very long pause.

I understand the reason for originally introducing the loops in #9110. However, looking at tip around the call sites of `releaseSudog` and `freedefer`, I couldn't see a case where a released object would still be referenced by a live pointer somewhere else in the system. Are these loops still really necessary?

FWIW, I replaced the loops with a simple zeroing of the heads of the linked lists, and test/fixedbugs/issue9110.go still passed.
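
To make the comparison concrete, here is a minimal sketch of the two strategies on an ordinary linked list. It is a simplified model, not the runtime code: `node`, `buildList`, `unlinkAll`, and `dropHead` are hypothetical names standing in for the cached objects and for the loops in `clearpools`.

```go
package main

import "fmt"

// node is a hypothetical stand-in for a cached runtime object
// (for example a sudog or a defer record) linked through next.
type node struct {
	next *node
}

// buildList returns the head of a singly linked list with n nodes.
func buildList(n int) *node {
	var head *node
	for i := 0; i < n; i++ {
		head = &node{next: head}
	}
	return head
}

// unlinkAll models the O(n) approach: it walks the list, clears every
// next pointer, then clears the head. It returns the number of nodes visited.
func unlinkAll(head **node) int {
	visited := 0
	var next *node
	for n := *head; n != nil; n = next {
		next = n.next
		n.next = nil
		visited++
	}
	*head = nil
	return visited
}

// dropHead models the O(1) alternative: it only clears the head and
// leaves the next pointers inside the list untouched.
func dropHead(head **node) {
	*head = nil
}

func main() {
	a := buildList(1000)
	second := a.next
	// Every node is visited, and an interior node no longer points anywhere.
	fmt.Println(unlinkAll(&a), a == nil, second.next == nil)

	b := buildList(1000)
	kept := b
	dropHead(&b)
	// The head is cleared, but a node held elsewhere still reaches the rest.
	fmt.Println(b == nil, kept.next != nil)
}
```

The second case shows the trade-off: zeroing the head is constant time, but if anything else still held a pointer to a node, the remainder of the chain would stay reachable through it. The question above is whether such a stray pointer can still exist.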

I propose making these loops optional to avoid the excessive STW1 time, and enabling them only with a GODEBUG option for debugging purposes.

runtime: clearpools causes excessive STW1 time #22331

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

runtime: clearpools causes excessive STW1 time #22331

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions