Performance Profiling

pprof CPU/heap, benchmarking, allocation hunting, optimization

# What is Profiling? (The MRI Analogy)

Imagine you have a patient (your Go application) who comes into the hospital saying they are exhausted, slow, and consuming too much food (CPU & Memory).

A bad doctor blindly guesses the problem and immediately starts doing surgery (rewriting code) hoping they fix the right organ. A good doctor does not guess. They put the patient in an MRI Machine to get a highly detailed scan of exactly which organ is failing.

In Go, that MRI machine is called pprof (Performance Profiler). It allows you to see down to the exact line of code where your CPU time or memory allocations are being wasted.

# Level 1: Benchmarking (Beginner)

Before we can diagnose the whole app, we need to test isolated functions to see if they are slow. Go has a built-in speed-testing framework called Benchmarking.

Benchmarks live in the same _test.go files as your Unit Tests, but instead of func TestX(t *testing.T), you write func BenchmarkX(b *testing.B).

package math_test

import "testing"

// sink stores the result so the compiler cannot discard the concatenation.
var sink string

// We want to test how fast appending strings is
func BenchmarkStringAppend(b *testing.B) {
    // Setup (excluded from the measurement by ResetTimer below)
    base := "hello"

    // Discard the time spent on setup so far
    b.ResetTimer()

    // The loop MUST run b.N times
    for i := 0; i < b.N; i++ {
        sink = base + " world" // Try to optimize this later!
    }
}

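Running it: Open your terminal and run go test -bench=. (the final dot is a pattern that matches every benchmark). Go runs the loop over and over, increasing b.N until the timing is stable (about one second per benchmark by default), and then reports an average such as "41.2 ns/op". Add -benchmem to also see bytes and allocations per operation.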
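To make the "try to optimize this later" idea concrete, here is a minimal sketch that compares two ways of joining strings. It uses testing.Benchmark so it can run as an ordinary program; the same loop bodies could just as well live in BenchmarkX functions in a _test.go file. The names concatPlus, concatBuilder, and sink are hypothetical helpers invented for this example.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// concatPlus joins parts with repeated +=, copying the growing string each time.
func concatPlus(parts []string) string {
	s := ""
	for _, p := range parts {
		s += p
	}
	return s
}

// concatBuilder joins parts with strings.Builder, appending into one buffer.
func concatBuilder(parts []string) string {
	var sb strings.Builder
	for _, p := range parts {
		sb.WriteString(p)
	}
	return sb.String()
}

// sink keeps the compiler from discarding the benchmarked result.
var sink string

func main() {
	parts := make([]string, 100)
	for i := range parts {
		parts[i] = "hello"
	}

	cases := []struct {
		name string
		fn   func([]string) string
	}{
		{"plus", concatPlus},
		{"builder", concatBuilder},
	}

	for _, c := range cases {
		// testing.Benchmark scales b.N the same way `go test -bench` does.
		r := testing.Benchmark(func(b *testing.B) {
			b.ReportAllocs()
			for i := 0; i < b.N; i++ {
				sink = c.fn(parts)
			}
		})
		fmt.Printf("%-8s %8d ns/op %6d allocs/op\n", c.name, r.NsPerOp(), r.AllocsPerOp())
	}
}
```

The exact numbers depend on your machine and Go version, so treat the output as a relative comparison rather than an absolute measurement.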

# Level 2: CPU Profiling in Production (Intermediate)

Benchmarking tests a single function in a vacuum. But what if your live web server is suddenly using 100% CPU on AWS? You need the MRI scan.

Go makes this easy. If your server uses http.DefaultServeMux, you just add a blank import to your main.go. Note that this exposes /debug/pprof/ on the same port as your application, so restrict access to it in production.

package main

import (
    "log"
    "net/http"
    _ "net/http/pprof" // THIS IS THE MAGIC IMPORT!
)

func main() {
    http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
        // ... some slow handler ...
    })

    // pprof automatically attaches itself to the default Mux!
    // ListenAndServe only returns on failure, so log the error and exit.
    log.Fatal(http.ListenAndServe(":8080", nil))
}

With that server running, you can open a new terminal and type (the quotes stop your shell from interpreting the ? character):
go tool pprof "http://localhost:8080/debug/pprof/profile?seconds=10"

pprof will sample your server for 10 seconds, download the profile, and drop you into an interactive prompt. Type top, and it will list the functions that consumed the most CPU during those 10 seconds!

# Level 3: Heap and Memory Profiling (Advanced)

Sometimes your CPU is fine, but your server keeps crashing with "Out of Memory" (OOM) errors. You may have a memory leak!

Using the exact same tool, you can ask for the Heap Profile (the Heap is where Go keeps dynamically allocated memory that outlives a single function call).

// In your terminal:
go tool pprof http://localhost:8080/debug/pprof/heap

(pprof) top
Showing nodes accounting for 1.50GB, 93.75% of 1.60GB total
      flat  flat%   sum%        cum   cum%
    1.20GB 75.00% 75.00%     1.20GB 75.00%  main.CacheUserData
    0.30GB 18.75% 93.75%     0.30GB 18.75%  encoding/json.Unmarshal
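
The heap profile can be read through different sample types:
  • inuse_space: (Default) Shows you what memory is ALIVE right now. Great for finding global maps that keep growing forever (memory leaks).
  • alloc_space: Shows you all memory allocated since the program started, including memory the Garbage Collector has already freed. Great for finding inefficient code that is churning memory too fast. Select it with go tool pprof -sample_index=alloc_space followed by the same URL.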
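The difference between "alive right now" and "allocated in total" can be seen without pprof by reading runtime.MemStats. This is only an analogy: MemStats is process-wide, whereas pprof attributes the bytes to individual functions. In the sketch below, cache, retain, churn, and snapshot are hypothetical names made up for the illustration.

```go
package main

import (
	"fmt"
	"runtime"
)

// cache is a hypothetical package-level map that is never pruned (a leak).
var cache = map[int][]byte{}

// churnSink forces the short-lived buffers in churn onto the heap.
var churnSink []byte

// retain stores n buffers of size bytes in cache, so they stay alive.
func retain(n, size int) {
	for i := 0; i < n; i++ {
		cache[len(cache)] = make([]byte, size)
	}
}

// churn allocates n buffers of size bytes and drops each one right away.
func churn(n, size int) {
	for i := 0; i < n; i++ {
		churnSink = make([]byte, size)
	}
	churnSink = nil
}

// snapshot forces a collection, then returns live and cumulative heap bytes.
// HeapAlloc roughly corresponds to inuse_space, TotalAlloc to alloc_space.
func snapshot() (live, total uint64) {
	runtime.GC()
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.HeapAlloc, m.TotalAlloc
}

func report(label string) {
	live, total := snapshot()
	fmt.Printf("%-12s live=%6d KiB  cumulative=%7d KiB\n", label, live>>10, total>>10)
}

func main() {
	report("start")
	churn(100, 1<<20) // 100 MiB of garbage: cumulative grows, live does not
	report("after churn")
	retain(10, 1<<20) // 10 MiB kept in the map: live grows and stays up
	report("after retain")
}
```

A leak looks like retain (the live figure keeps climbing), while wasteful code looks like churn (the cumulative figure races ahead while the live figure stays flat).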

# Level 4: Escape Analysis & The Garbage Collector (Expert)

Why do we care about "alloc_space" if the Garbage Collector (GC) just cleans it up anyway?

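Because the Garbage Collector is not free. Go's collector does most of its work concurrently with your program, but every cycle still takes CPU time away from your application and includes short "Stop The World" pauses. If your code allocates faster than the collector can keep up, the allocating goroutines are forced to help with collection, which slows them down. Many "CPU Problems" in Go are actually Memory Allocation Problems wearing a disguise!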
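One rough way to see this cost is to count how many GC cycles a piece of code triggers. The sketch below assumes default GC settings; gcCost, allocateGarbage, and garbage are hypothetical names. The pause figure covers only the Stop The World portion, so it understates the total CPU the collector uses.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// garbage is a package-level sink so each buffer is really heap-allocated.
var garbage []byte

// allocateGarbage creates chunks buffers of size bytes, keeping only the last.
func allocateGarbage(chunks, size int) {
	for i := 0; i < chunks; i++ {
		garbage = make([]byte, size)
	}
}

// gcCost reports how many GC cycles completed while work ran,
// and the total stop-the-world pause time of those cycles.
func gcCost(work func()) (cycles uint32, pause time.Duration) {
	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)
	work()
	runtime.ReadMemStats(&after)
	return after.NumGC - before.NumGC,
		time.Duration(after.PauseTotalNs - before.PauseTotalNs)
}

func main() {
	cycles, pause := gcCost(func() { allocateGarbage(512, 1<<20) })
	fmt.Printf("512 MiB of garbage:   %d GC cycles, %v total pause\n", cycles, pause)

	cycles, pause = gcCost(func() {
		sum := 0
		for i := 0; i < 50_000_000; i++ {
			sum += i
		}
		_ = sum
	})
	fmt.Printf("allocation-free loop: %d GC cycles, %v total pause\n", cycles, pause)
}
```

Under default settings the allocating workload should trigger many cycles, while the allocation-free loop usually triggers none.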

Stack vs Heap

When a function runs, its variables are placed on the incredibly fast Stack whenever possible. When the function returns, its stack frame is released at once, leaving no work for the Garbage Collector.

However, if you return a pointer to a variable, the Go Compiler realizes that variable cannot die yet. It "escapes" the function. The compiler is forced to move it to the Heap. Things on the heap stay there until the Garbage Collector sweeps them away.

// 1. STACK: Fast, zero garbage!
func FastCalculation() int {
    x := 10 // Stays inside the function. Dies instantly.
    return x * 2
}

// 2. HEAP: Slow, creates Garbage!
func SlowCalculation() *int {
    x := 10 // The compiler notices we are returning a pointer!
    // "Escape Analysis" forces 'x' to live on the Heap.
    return &x 
}

You can ask the Go compiler exactly what it is putting on the heap by building with the -gcflags flag:
go build -gcflags="-m" main.go
The terminal will print a line ending in "moved to heap: x", prefixed with the file and line number. An expert Go programmer hunts down these heap allocations in hot paths and removes them to keep the GC's workload as small as possible.
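Besides reading the compiler's output, you can measure the effect at run time. The sketch below uses testing.AllocsPerRun to count heap allocations per call for a value-returning and a pointer-returning function. The names stackValue, heapPointer, and allocsPerCall are hypothetical, and the //go:noinline directives are an assumption added so that inlining does not hide the difference.

```go
package main

import (
	"fmt"
	"testing"
)

// stackValue returns a plain value, so x can live on the stack.
//
//go:noinline
func stackValue() int {
	x := 10
	return x * 2
}

// heapPointer returns the address of x, so x must outlive the call
// and escape analysis moves it to the heap.
//
//go:noinline
func heapPointer() *int {
	x := 10
	return &x
}

// Package-level sinks keep the compiler from discarding the results.
var (
	intSink int
	ptrSink *int
)

// allocsPerCall reports the average number of heap allocations per call of f.
func allocsPerCall(f func()) float64 {
	return testing.AllocsPerRun(1000, f)
}

func main() {
	fmt.Println("stackValue  allocs/call:", allocsPerCall(func() { intSink = stackValue() }))
	fmt.Println("heapPointer allocs/call:", allocsPerCall(func() { ptrSink = heapPointer() }))
}
```

In a real project the same check fits naturally into a unit test, which lets you catch a hot-path function that starts allocating after a refactor.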
