The hardest bug I’ve ever squashed in Go wasn’t a memory leak (the garbage collector’s generally got your back there), but a goroutine leak, in which I was regularly starting goroutines that were getting parked on a wait somewhere rather than exiting cleanly. The overhead of any given one of them was small, but in aggregate eventually enough take down my program.
I recently took to shoring up our codebase by introducing the use of Uber’s goleak in packages at risk of accidentally leaving something running. It runs after a test (or tests of an entire package) and errors if extra goroutines are found.
Luckily, no serious leaks came up, although there were some minor ones. I was impressed that we didn’t find more given tens of thousands of lines of preexisting code and the use of hundreds of third-party modules, which is a testament of the Go ecosystem’s high average quality.
In the cases they are found, leaky modules are hard to fix because they’re in external code and may only be a transitive dependency. We ended up wrapping goleak’s helpers to automatically ignore known offenders:
package prequire
var knownGoroutineLeaks = []goleak.Option{
// `ptesting` starts up some pools that run for the duration of
// the test suite. Ignore pgxpool's background goroutine.
goleak.IgnoreTopFunction("github.com/jackc/pgx/v4/pgxpool.(*Pool).backgroundHealthCheck"),
// OpenCensus boots goroutines on `init`, and even though we only
// have it asa transitive dependency and it's not used, even that's
// enough to have itboot its workers.
//
// There's an open GitHub issue:
//
// https://github.com/census-instrumentation/opencensus-go/issues/1191
//
// I opened a fix, but it may or may not make it in:
//
// https://github.com/census-instrumentation/opencensus-go/pull/1287
goleak.IgnoreTopFunction("go.opencensus.io/stats/view.(*worker).start"),
}
// VerifyNoGoroutineLeak verifies that no goroutines are still around
// after atest case runs, allowing for some extraneous goroutines
// that we can't do muchabout.
func VerifyNoGoroutineLeak(t *testing.T, options ...goleak.Option) {
t.Helper()
goleak.VerifyNone(t, append(knownGoroutineLeaks, options...)...)
}
// VerifyNoGoroutineLeakMain is similar to the above, but runs in a
// special GoTestMain function. It's purpose is to check no leaks once
// at the end of the test runs for an entire package.
func VerifyNoGoroutineLeakMain(m *testing.M, options ...goleak.Option) {
goleak.VerifyTestMain(m, append(knownGoroutineLeaks, options...)...)
}
It works, but it’s something to be careful about – it’ll be far easier and more tempting to add new leaks to the ignored list rather than dig in and fix a root problem.
If you’re writing a package, a couple guidelines for goroutines that’ll benefit users:
init
.Package-level goroutines are a problem because it’s not clear that they have to be stopped, and whose responsibility it is to do so. The superior alternative are goroutines local to a struct
instance, and which has a Stop
or Close
function.
Goroutines in init
are a problem because as noted above, they’ll start even if the package wasn’t directly used. This makes it really non-obvious where they should be shut back down, if there’s even a way of doing so.
Did I make a mistake? Please consider sending a pull request.