Beyond Goroutines: Production Patterns for Go Concurrency

The go keyword: easy. Coordinating ten thousand goroutines without leaks, deadlocks, or unbounded memory: an entire engineering discipline.

This post is the set of patterns I reach for in production Go services — the ones that show up everywhere from inference gateways to event processors. None of these are clever tricks. They are the boring, correct primitives that turn concurrent Go from “a goroutine spray-and-pray” into something you can actually operate.

The mental model

Before any pattern: get the model right.

  • A goroutine is cheap (≈2 KB initial stack), but not free. Ten thousand idle goroutines cost you roughly 20 MB of stack alone, plus real scheduler pressure if they wake frequently.
  • A channel is a synchronization primitive first, a queue second. Treat it like the former; you will write fewer deadlocks.
  • A mutex is correct and fast for protecting state. A channel is correct for transferring ownership. Picking the wrong one for your problem is the #1 source of bad Go concurrency.
  • The context package is how cancellation flows through a system. If a function does anything that can block, it should accept a context.Context as its first argument. No exceptions (sketch below).
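
As a concrete instance of that last rule, here is a cancellable sleep. A minimal sketch using only the standard library:

func sleepCtx(ctx context.Context, d time.Duration) error {
    t := time.NewTimer(d)
    defer t.Stop()
    select {
    case <-t.C:
        return nil // slept the full duration
    case <-ctx.Done():
        return ctx.Err() // the caller gave up first
    }
}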

Pattern 1: errgroup is almost always what you want

If you find yourself spawning N goroutines and waiting for all of them, do not write it by hand. Use golang.org/x/sync/errgroup.

import "golang.org/x/sync/errgroup"

func fetchAll(ctx context.Context, urls []string) ([]Result, error) {
    g, ctx := errgroup.WithContext(ctx)
    results := make([]Result, len(urls))

    for i, url := range urls {
        i, url := i, url // capture loop vars (Go 1.22+ doesn't need this)
        g.Go(func() error {
            r, err := fetch(ctx, url)
            if err != nil {
                return err
            }
            results[i] = r
            return nil
        })
    }
    if err := g.Wait(); err != nil {
        return nil, err
    }
    return results, nil
}

What errgroup gives you that hand-rolled sync.WaitGroup doesn’t:

  • First-error wins. If one goroutine returns an error, the derived context is cancelled, signaling all others to stop.
  • Clean error propagation. No hand-rolled []error slice with a mutex guarding it.
  • Bounded concurrency via g.SetLimit(n).

This single library replaces 80% of the bespoke concurrency code I see in code reviews.
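
SetLimit in particular deserves a line of code. A minimal sketch, with prefetch standing in for your own task (Go 1.22+, so the loop variable is safe to capture):

g, ctx := errgroup.WithContext(ctx)
g.SetLimit(8) // at most 8 tasks in flight; g.Go blocks until a slot frees
for _, url := range urls {
    g.Go(func() error {
        return prefetch(ctx, url)
    })
}
return g.Wait()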

Pattern 2: Bounded concurrency with semaphore.Weighted

errgroup's SetLimit(n) is fine when each task has equal cost. When tasks have different weights — say, image processing where bigger images cost more memory — use golang.org/x/sync/semaphore:

import "golang.org/x/sync/semaphore"

func processImages(ctx context.Context, images []Image, maxMemoryMB int64) error {
    sem := semaphore.NewWeighted(maxMemoryMB)

    for _, img := range images {
        weight := int64(img.SizeMB)
        if err := sem.Acquire(ctx, weight); err != nil {
            return err // context cancelled while waiting for capacity
        }
        go func(img Image, weight int64) {
            defer sem.Release(weight)
            process(img)
        }(img, weight)
    }

    // Wait for all in-flight work: acquiring the full capacity
    // only succeeds once every goroutine has released its weight.
    return sem.Acquire(ctx, maxMemoryMB)
}

A weighted semaphore is what you actually want for resource-shaped concurrency. “I want at most 4 GB of in-flight image data” is a much better invariant than “I want at most 8 goroutines.”

Pattern 3: Cancellation done right

Cancellation in Go is propagated through context.Context. Long-running loops must check it between iterations, and every blocking channel operation must select on ctx.Done():

func (w *Worker) processBatch(ctx context.Context, batch []Item) error {
    for _, item := range batch {
        select {
        case <-ctx.Done():
            return ctx.Err()
        default:
        }
        if err := w.processItem(ctx, item); err != nil {
            return err
        }
    }
    return nil
}

The two cardinal sins:

  1. Ignoring cancellation entirely — never checking ctx.Done() (or ctx.Err()) means your goroutine keeps running after the caller has given up.
  2. Creating a context.TODO() or context.Background() deep inside a function — you have just severed the cancellation chain. Always plumb the caller's context through.

If a function genuinely cannot accept a context, that is a smell, not a feature.
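
When you are stuck with a third-party call that blocks and takes no context, you can bound the wait, though not the work itself. A sketch, with Result as a placeholder type:

func callWithContext(ctx context.Context, fn func() (Result, error)) (Result, error) {
    type outcome struct {
        r   Result
        err error
    }
    ch := make(chan outcome, 1) // buffered: the goroutine can always send and exit
    go func() {
        r, err := fn()
        ch <- outcome{r, err}
    }()
    select {
    case o := <-ch:
        return o.r, o.err
    case <-ctx.Done():
        var zero Result
        return zero, ctx.Err() // fn keeps running; only our wait is cancelled
    }
}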

Pattern 4: The pipeline

A pipeline is a chain of stages, each a goroutine, each communicating through a channel. It is the right shape for streaming data through a series of transformations:

func pipeline(ctx context.Context, input <-chan Item) <-chan Result {
    parsed := parse(ctx, input)
    enriched := enrich(ctx, parsed)
    output := serialize(ctx, enriched)
    return output
}

func parse(ctx context.Context, in <-chan Item) <-chan Parsed {
    out := make(chan Parsed)
    go func() {
        defer close(out)
        for item := range in {
            p, err := parseItem(item)
            if err != nil {
                continue // or send to error channel
            }
            select {
            case out <- p:
            case <-ctx.Done():
                return
            }
        }
    }()
    return out
}

Three things make pipelines correct:

  • The producer closes the channel. The consumer reads until the channel is closed via range.
  • Every send is wrapped in a select with ctx.Done(). Otherwise a slow consumer plus a cancelled context produces a goroutine leak.
  • Each stage is independent. No stage knows the topology beyond its input and output channel.
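
Consuming the pipeline is then an ordinary range loop that ends when the final channel closes. A usage sketch, with handleResult standing in for real work:

for r := range pipeline(ctx, input) {
    handleResult(r)
}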

Pattern 5: Worker pool with a job channel

When you have a stream of work and a fixed pool of workers, do this:

func runPool(ctx context.Context, numWorkers int, jobs <-chan Job) error {
    g, ctx := errgroup.WithContext(ctx)

    for i := 0; i < numWorkers; i++ {
        g.Go(func() error {
            for {
                select {
                case <-ctx.Done():
                    return ctx.Err()
                case job, ok := <-jobs:
                    if !ok {
                        return nil
                    }
                    if err := process(ctx, job); err != nil {
                        return err
                    }
                }
            }
        })
    }
    return g.Wait()
}

Notice: the dispatcher closes jobs when it has no more work. Each worker exits cleanly when the channel drains. Cancellation through ctx short-circuits everything.
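
For completeness, here is what the dispatcher side can look like. A sketch; the key detail is that the sender, and only the sender, closes jobs:

func dispatch(ctx context.Context, items []Job) error {
    jobs := make(chan Job)
    g, ctx := errgroup.WithContext(ctx)

    g.Go(func() error {
        defer close(jobs) // sender closes; workers drain and exit
        for _, j := range items {
            select {
            case jobs <- j:
            case <-ctx.Done():
                return ctx.Err()
            }
        }
        return nil
    })
    g.Go(func() error {
        return runPool(ctx, 8, jobs)
    })
    return g.Wait()
}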

Pattern 6: sync.Once for lazy initialization

If you have something expensive to initialize and want it computed exactly once across goroutines:

var (
    onceCfg sync.Once
    cfg     *Config
    cfgErr  error
)

func GetConfig() (*Config, error) {
    onceCfg.Do(func() {
        cfg, cfgErr = loadConfig()
    })
    return cfg, cfgErr
}

sync.Once is one of the few primitives where the obvious-looking thing is also the correct thing. Use it. (For Go 1.21+, sync.OnceFunc/OnceValue/OnceValues are even cleaner.)
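
With sync.OnceValues the whole thing collapses to one declaration. A sketch, assuming loadConfig has the signature above:

// loadConfig runs at most once, on the first call; every caller
// gets back the same memoized (*Config, error) pair.
var getConfig = sync.OnceValues(loadConfig)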

Pattern 7: sync.Pool for short-lived allocations

If your hot path allocates the same shape of object thousands of times per second, sync.Pool is the GC-pressure fix:

var bufPool = sync.Pool{
    New: func() any {
        b := make([]byte, 0, 4096)
        return &b
    },
}

func handle(req []byte) {
    bp := bufPool.Get().(*[]byte)
    buf := (*bp)[:0] // keep the capacity, reset the length
    defer func() {
        *bp = buf // store back the possibly grown slice
        bufPool.Put(bp)
    }()
    // ... use buf ...
}

Two rules: pool only what your allocation profile says is hot (pooling rare allocations buys nothing), and never assume anything about a pooled item's contents. The GC may drop pooled items at any time, and a reused item arrives with whatever its last user left in it, so always reset.

The deadlocks you will hit

In rough order of frequency:

  1. Sending to an unbuffered channel with no receiver. Either nobody is listening yet, or the send happens inside a goroutine you Wait() on with no separate consumer draining the channel (see the sketch after this list).
  2. Closing a channel from the receiver side. The convention is that the sender closes. Violate it and you get panics.
  3. Holding a mutex across a channel operation. A receiver waiting for the channel needs to acquire the same mutex; classic.
  4. for-range over a channel that nobody closes. The receiver hangs forever; usually paired with a WaitGroup that never completes.
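
Sin #1 in its most compact form, a deliberately broken sketch:

func broken() {
    ch := make(chan int)
    ch <- 1 // blocks forever: the receive below is never reached
    fmt.Println(<-ch)
}

The fix is a buffered channel or a receiver already running in another goroutine before the send.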

The Go race detector (-race) catches data races but not deadlocks. For deadlocks, your tools are: pprof goroutine dumps in production (/debug/pprof/goroutine?debug=2), and reviewing the lifecycle of every channel before you merge.

Channels vs mutexes: pick correctly

The Go community line is “share memory by communicating.” It is good advice, often misapplied.

  • Use a channel when ownership of a value transfers from one goroutine to another. “Worker A is done with this job; Worker B should pick it up.”
  • Use a mutex when multiple goroutines need to read or modify the same long-lived state. “This counter is incremented from many places.”

A counter behind a channel is awkward and slow. State protected by a mutex is fine. Idiomatic Go uses both.
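
The counter case, concretely. A minimal sketch:

type Counter struct {
    mu sync.Mutex
    n  int64
}

func (c *Counter) Inc() {
    c.mu.Lock()
    c.n++
    c.mu.Unlock()
}

func (c *Counter) Value() int64 {
    c.mu.Lock()
    defer c.mu.Unlock()
    return c.n
}

For something this small, atomic.Int64 from sync/atomic does the same job with less ceremony.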

When not to use Go for concurrency

Honest take: Go’s concurrency model is excellent for I/O-bound and moderately CPU-bound workloads. It is not the right tool for:

  • Tight numerical kernels (use C, Rust, or Cython).
  • Hard-real-time work (the GC is a problem).
  • Single-threaded high-throughput I/O where epoll directly wins (rare; usually Go’s runtime is fine).

Pick the right tool. Most of the time, that is Go. Sometimes it isn’t.

The closing principle

Goroutines are easy. Lifetimes are hard. Every goroutine you start needs an answer to three questions: who waits for it, who cancels it, and where do its errors go? Get those three answers right, and the rest is mostly mechanics.