runtime/pprof: incorrect mutex contention time #73760
Comments
I'm not at all convinced that this is a bug, and the order in which goroutines run and/or acquire a mutex is not generally specified.
I logged the name to confirm the order. I've also tried several other acquire orders, and the issue is consistently reproducible.
Initially I thought it might be a limitation, but given how simple the case is, I'd like to file a report and ask about it.
gowrap1 has 4s, or 1s * 4 goroutines. Without approximations, I'd expect gowrap2 to take the blame for 2 seconds of delay to 1 goroutine, for a total of 2 seconds. But semrelease1 does an approximation to avoid taking O(N) time to visit every waiting goroutine. That may be to blame here.
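For readers following along, here is a minimal sketch of the averaging scheme being described, reconstructed from the formula quoted later in this thread (dt += (dtail + dt) / 2 * waiters). The function name chargeContention and its parameter names are invented for illustration; they are not the runtime's actual identifiers.

// chargeContention is an illustrative stand-in for the accounting done when a
// contended semaphore is released: the releaser takes the woken goroutine's
// full wait (dt0) plus, for the goroutines still queued, the average of the
// longest (head) and shortest (tail) waits multiplied by their count, rather
// than summing every waiter's delay in O(N).
func chargeContention(t0, headWaitStart, tailWaitStart, remainingWaiters int64) int64 {
	dt0 := t0 - headWaitStart // wait of the goroutine being woken (longest wait)
	dt := dt0
	if remainingWaiters != 0 {
		dtail := t0 - tailWaitStart // shortest wait still in the queue
		dt += (dtail + dt0) / 2 * remainingWaiters
	}
	return dt // total contention blamed on this unlock's call stack
}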
@rhysh Hm, it's not the order I mentioned? Btw, all the goroutines wait for the same duration before unlocking, so the approximation using the head and tail should be correct. I guess the
Lines 428 to 429 in 2a7ca15
It looks like the argument of
How did you determine the order? You mentioned logging, but there isn't any in the demo code. I agree with @dr2chase that the demo code doesn't enforce a specific order.
It's right to be reset: each critical section should only get the blame for the time that it was the lock-holder, so each lock-releaser needs to take full blame for the waiters it delayed by advancing their timers to the time of the unlock.
It's a theory based on my understanding of semrelease1.
There's no specific order; I thought you were evaluating the reported order since it's quite similar. But as I mentioned, the issue is always reproducible regardless of the order.
Sorry, here's the updated code:

package main

import (
	"log"
	"os"
	"runtime"
	"runtime/pprof"
	"sync"
	"time"
)

var mtx sync.Mutex
var wg sync.WaitGroup

// acquire takes the shared mutex, holds it for n while sleeping, then releases it.
func acquire(n time.Duration, name string) {
	mtx.Lock()
	defer mtx.Unlock()
	println("start wait", name)
	time.Sleep(n)
	wg.Done()
}

func main() {
	runtime.SetMutexProfileFraction(1) // record every contention event
	wg.Add(5)
	go acquire(1*time.Second, "gowrap1") // gowrap1
	go acquire(2*time.Second, "gowrap2") // gowrap2
	go acquire(3*time.Second, "gowrap3") // gowrap3
	go acquire(4*time.Second, "gowrap4") // gowrap4
	go acquire(5*time.Second, "gowrap5") // gowrap5
	wg.Wait()

	// Write mutex profile to file
	f, err := os.Create("mutex.prof")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()
	if err := pprof.Lookup("mutex").WriteTo(f, 0); err != nil {
		log.Fatal(err)
	}
}
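(For anyone reproducing this: the resulting mutex.prof can be inspected with go tool pprof mutex.prof; the per-call-stack contention times discussed in this thread come from that output.)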
Yes, but when unlocking occurs, the wait times for the remaining waiters are reset.
I didn't say it's entirely wrong, but in the case of
Lines 428 to 429 in 2a7ca15
the wait time of the tail will be 0, which underestimates the tail's wait time. Here is the formula:

dt = t0 - acquiretime = 2s
dtail = 0                    // underestimate
dt += (dtail + dt) / 2 * 1   // one remaining waiter: (0 + 2s) / 2 * 1 = 1s, so dt -> 3s

The dt is 3s, as shown in the diagram.
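To make the numbers concrete, here is a small, self-contained walk-through of that accounting under two assumptions taken from the comments above: each unlock is charged dt0 plus avg(dt0, dtail) times the number of remaining waiters, and every waiter's wait-start is advanced to the unlock time once it has been charged. It is an illustration of the scheme discussed here, not the runtime implementation.

package main

import "fmt"

func main() {
	// Reported acquire order and how long each goroutine holds the lock, in seconds.
	order := []string{"gowrap1", "gowrap5", "gowrap2", "gowrap3", "gowrap4"}
	hold := map[string]float64{"gowrap1": 1, "gowrap2": 2, "gowrap3": 3, "gowrap4": 4, "gowrap5": 5}

	now := 0.0       // current time; all five goroutines start at t=0
	waitStart := 0.0 // when the currently queued waiters began (or resumed) waiting
	for i, name := range order {
		now += hold[name]             // this holder releases the lock at time now
		waiters := len(order) - i - 1 // goroutines still blocked at this release
		blame := 0.0
		if waiters > 0 {
			dt0 := now - waitStart // wait of the goroutine being woken (head)
			dtail := dt0           // waiters were all reset together, so head == tail
			blame = dt0 + (dt0+dtail)/2*float64(waiters-1)
		}
		fmt.Printf("%s unlocks at t=%gs, blamed for %gs of contention\n", name, now, blame)
		waitStart = now // remaining waiters' timers advance to the unlock time
	}
}

This prints 4s for gowrap1 (matching the comment above), 15s for gowrap5, 4s for gowrap2, 3s for gowrap3, and 0s for gowrap4, i.e. the 4s/3s expected for gowrap2/gowrap3 in the report; the reported 3s for gowrap2 is what falls out if dtail collapses to 0 in the gowrap2 step, as argued above.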
Go version
go1.23.9
Output of go env in your module/workspace:

What did you do?
I tried a simple snippet to evaluate the mutex profile:
What did you see happen?
The order in which goroutines acquire the lock is: gowrap1 → gowrap5 → gowrap2 → gowrap3 → gowrap4. However, the result shows an overshoot in gowrap3's contention time and underestimates gowrap2's:

What did you expect to see?
The expected contention time for gowrap2 should be 4 seconds. The contention time for gowrap3 should be 3 seconds. Not sure if this behavior is intentional.