Benchmarking: What You Can't Miss in Go 1.24

2025-03-11 Go

For us Go developers, writing less code means fewer opportunities to make mistakes. But how about achieving more by doing less? That sounds appealing, doesn’t it? And how about spending less time writing code whose only job is to stop the compiler from optimising away our benchmarks?

So, what is the new feature in the testing package, and how will it impact writing benchmarks?

Go 1.24 adds a new tool to our testing toolbox: the Loop method on testing.B. It takes over a lot of the boilerplate that we had to write by hand in earlier Go versions to get correct benchmark results.

We no longer need to do mental gymnastics to trick the Go compiler into not optimising benchmarking functions. Let’s look at an example package and write its benchmarks the new way and, for comparison, the old way.

Writing benchmarks the new way

We will start with designing a new package scrabble. The package provides functionality for the Scrabble game. It has one exported function Score. The function takes a word (type string) and returns a number (type int) representing the word’s score. That’s it. A simple package with one exported function.

scrabble.go

func Score(word string) int {
    // logic to calculate the score
}
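
The article leaves the body of Score out. To have something concrete to benchmark, here is a minimal sketch of one possible implementation. It assumes the standard English Scrabble letter values, treats upper and lower case alike, and scores any other character as 0; the helper letterValue is a hypothetical name, and the code sits in package main only so that it compiles on its own.

```go
// In the article's layout this file would declare package scrabble.
package main

import "unicode"

// letterValue returns the assumed point value of a single letter.
// Characters that are not a-z (in either case) are worth 0.
func letterValue(r rune) int {
	switch unicode.ToLower(r) {
	case 'a', 'e', 'i', 'o', 'u', 'l', 'n', 'r', 's', 't':
		return 1
	case 'd', 'g':
		return 2
	case 'b', 'c', 'm', 'p':
		return 3
	case 'f', 'h', 'v', 'w', 'y':
		return 4
	case 'k':
		return 5
	case 'j', 'x':
		return 8
	case 'q', 'z':
		return 10
	}
	return 0
}

// Score sums the values of all letters in word.
func Score(word string) int {
	total := 0
	for _, r := range word {
		total += letterValue(r)
	}
	return total
}
```

With these values, "quirky" scores 10+1+1+1+5+4 = 22, which matches the test data used later.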

As always, we add benchmarks to the scrabble_test package in the file scrabble_test.go. Using the external test package means that, from the beginning, we focus on behaviour (the package API) rather than on implementation details such as private (package-internal) helper functions.

scrabble_test.go

package scrabble_test

We feed the Score function with test data stored in the tt variable (a slice of structs). Each test case contains a description (desc), an input word (type string) and the result we want (want).

The tt variable is declared at package level. This way, tests and benchmarks can share the same input data when they call the scrabble.Score function.

scrabble_test.go

var tt = []struct {
    desc string
    word string
    want int
}{
    {
        desc: "medium, valuable word",
        word: "quirky",
        want: 22,
    },
    {
        desc: "empty input",
        word: "",
        want: 0,
    },
    {
        desc: "entire alphabet available",
        word: "abcdefghijklmnopqrstuvwxyz",
        want: 87,
    },
    // more structs with different test data
}
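
Since tests and benchmarks share tt, a table-driven test over the same data could look like the sketch below. The test function TestScore is not part of the original article. Score is a compact stand-in (assumed standard English letter values, lowercase a-z only) so that the file compiles on its own, and it is called directly instead of through the scrabble package.

```go
package main

import "testing"

// points holds the assumed letter values for 'a' through 'z'.
var points = [26]int{1, 3, 3, 2, 1, 4, 2, 4, 1, 8, 5, 1, 3, 1, 1, 3, 10, 1, 1, 1, 1, 4, 4, 8, 4, 10}

// Score is a stand-in for scrabble.Score; it ignores anything outside a-z.
func Score(word string) int {
	total := 0
	for _, r := range word {
		if r >= 'a' && r <= 'z' {
			total += points[r-'a']
		}
	}
	return total
}

var tt = []struct {
	desc string
	word string
	want int
}{
	{desc: "medium, valuable word", word: "quirky", want: 22},
	{desc: "empty input", word: "", want: 0},
	{desc: "entire alphabet available", word: "abcdefghijklmnopqrstuvwxyz", want: 87},
}

// TestScore runs every case in tt as a named subtest.
func TestScore(t *testing.T) {
	for _, tc := range tt {
		t.Run(tc.desc, func(t *testing.T) {
			if got := Score(tc.word); got != tc.want {
				t.Errorf("Score(%q) = %d, want %d", tc.word, got, tc.want)
			}
		})
	}
}
```

Keeping the expected value in each case lets the test check correctness, while the benchmark only needs the word field.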

As always, we prefix our benchmarking function with Benchmark followed by the benchmarked function name. The benchmarking function takes one parameter—a pointer to the testing.B struct. Nothing new so far. It’s a standard Go naming pattern.

scrabble_test.go

func BenchmarkScore(b *testing.B) {
    // benchmarking code
}

Now it’s time to use the new Loop method. We start with a for loop. In the loop’s scope, we call the benchmarked Score function.

scrabble_test.go

func BenchmarkScore(b *testing.B) {
    for b.Loop() {
        for _, tc := range tt {
            scrabble.Score(tc.word)
        }
    }
}

We no longer need to loop from 0 to b.N. More importantly, b.Loop() makes sure the compiler does not optimise away the body of the loop: the arguments and results of function calls made inside it are kept alive, so scrabble.Score really runs on every iteration.

It means we get accurate results and don’t have to write code to prevent compiler optimisations. All we need to do is make sure functions we benchmark are called in the scope of the for loop.
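
For reference, the pieces above can be assembled into a single file. This is a minimal sketch rather than the article's actual package: Score is a compact stand-in (assumed standard English letter values, lowercase a-z only), everything lives in package main instead of importing scrabble, and the want field is dropped because the benchmark does not use it. b.Loop requires Go 1.24 or later.

```go
package main

import "testing"

// points holds the assumed letter values for 'a' through 'z'.
var points = [26]int{1, 3, 3, 2, 1, 4, 2, 4, 1, 8, 5, 1, 3, 1, 1, 3, 10, 1, 1, 1, 1, 4, 4, 8, 4, 10}

// Score is a stand-in for scrabble.Score; it ignores anything outside a-z.
func Score(word string) int {
	total := 0
	for _, r := range word {
		if r >= 'a' && r <= 'z' {
			total += points[r-'a']
		}
	}
	return total
}

// tt is a trimmed copy of the shared test table.
var tt = []struct {
	desc string
	word string
}{
	{desc: "medium, valuable word", word: "quirky"},
	{desc: "empty input", word: ""},
	{desc: "entire alphabet available", word: "abcdefghijklmnopqrstuvwxyz"},
}

// BenchmarkScore scores the whole table once per iteration,
// so the reported time per op covers all cases in tt together.
func BenchmarkScore(b *testing.B) {
	for b.Loop() {
		for _, tc := range tt {
			Score(tc.word)
		}
	}
}
```

Placed in a file whose name ends in _test.go, this should run with go test -bench=Score.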

Writing benchmarks the “old way”

So, how do we write a benchmark if we use a Go version earlier than 1.24?

At first glance, the function structure is not much different. We have a for loop that runs while i is less than b.N. The catch is that the compiler may optimise the benchmarking function, for example by eliminating a call whose result is never used, and we may end up with incorrect results such as unrealistically fast execution times.

func BenchmarkScore(b *testing.B) {
    for i := 0; i < b.N; i++ {
        for _, tc := range tt {
            scrabble.Score(tc.word)
        }
    }
}

We need to take a few additional steps to prevent the compiler from doing optimisations.

First, the Score function needs to return a value that we can store.

Second, in the BenchmarkScore function’s scope, we declare a local variable got. On each iteration, got is assigned the value returned from the scrabble.Score function.

scrabble_test.go

var Result int

func BenchmarkScore(b *testing.B) {
    var got int
    for i := 0; i < b.N; i++ {
        for _, tc := range tt {
            got = scrabble.Score(tc.word)
        }
    }
    Result = got
}

Third, when the for loop is finished, we assign the final value of the got variable to the global (declared on the package level) exported variable Result.

Because Result is an exported, package-level variable, the compiler is not able to “prove” that no other code will ever read it. That’s why it can’t optimise away the assignments, or the Score calls that produce the assigned values. And that’s the goal: to “trick” the compiler.
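
To compare the approaches on your own machine, the three variants can be placed side by side in one file. This is a sketch under the same assumptions as before: Score is a compact stand-in (assumed standard English letter values, lowercase a-z only), the words slice replaces the tt table, and the benchmark names are made up for illustration. Whether the naive variant is actually distorted depends on the Go version and on what the compiler decides to inline, so no particular numbers are implied.

```go
package main

import "testing"

// points holds the assumed letter values for 'a' through 'z'.
var points = [26]int{1, 3, 3, 2, 1, 4, 2, 4, 1, 8, 5, 1, 3, 1, 1, 3, 10, 1, 1, 1, 1, 4, 4, 8, 4, 10}

// Score is a stand-in for scrabble.Score; it ignores anything outside a-z.
func Score(word string) int {
	total := 0
	for _, r := range word {
		if r >= 'a' && r <= 'z' {
			total += points[r-'a']
		}
	}
	return total
}

var words = []string{"quirky", "", "abcdefghijklmnopqrstuvwxyz"}

// Result is the package-level sink used by the old-style benchmark.
var Result int

// Naive old style: the result is discarded, so the compiler is
// free to remove work it can prove has no effect.
func BenchmarkScoreNaive(b *testing.B) {
	for i := 0; i < b.N; i++ {
		for _, w := range words {
			Score(w)
		}
	}
}

// Old style with a sink: the result flows into Result.
func BenchmarkScoreSink(b *testing.B) {
	var got int
	for i := 0; i < b.N; i++ {
		for _, w := range words {
			got = Score(w)
		}
	}
	Result = got
}

// New style (Go 1.24+): b.Loop keeps the calls alive for us.
func BenchmarkScoreLoop(b *testing.B) {
	for b.Loop() {
		for _, w := range words {
			Score(w)
		}
	}
}
```

Running go test -bench=Score on a file like this prints one line per variant, which makes any suspiciously fast result easy to spot.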

So, now we can appreciate the new, simpler, and less error-prone way to write benchmarks. Note that we can still use the “old” approach and all other methods for stopping, starting, and resetting benchmark timers.
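
The timer methods matter most when a benchmark has setup work. The sketch below contrasts the two styles under some assumptions: buildWords is a hypothetical setup helper, Score is a compact stand-in (assumed standard English letter values, lowercase a-z only), and the benchmark names are made up. In the old style we call b.ResetTimer by hand after the setup. With b.Loop, the timer is reset on the first call to Loop and stopped when it returns false, so setup placed before the loop is not measured.

```go
package main

import (
	"strings"
	"testing"
)

// points holds the assumed letter values for 'a' through 'z'.
var points = [26]int{1, 3, 3, 2, 1, 4, 2, 4, 1, 8, 5, 1, 3, 1, 1, 3, 10, 1, 1, 1, 1, 4, 4, 8, 4, 10}

// Score is a stand-in for scrabble.Score; it ignores anything outside a-z.
func Score(word string) int {
	total := 0
	for _, r := range word {
		if r >= 'a' && r <= 'z' {
			total += points[r-'a']
		}
	}
	return total
}

// buildWords is a hypothetical setup step that prepares n input
// words, each made of one to five repetitions of "quirky".
func buildWords(n int) []string {
	words := make([]string, n)
	for i := range words {
		words[i] = strings.Repeat("quirky", i%5+1)
	}
	return words
}

// Result is the package-level sink for the old-style benchmark.
var Result int

// Old style: exclude the setup from the measurement by hand.
func BenchmarkScoreManualTimer(b *testing.B) {
	words := buildWords(1000)
	b.ResetTimer()
	var got int
	for i := 0; i < b.N; i++ {
		for _, w := range words {
			got = Score(w)
		}
	}
	Result = got
}

// New style (Go 1.24+): setup before the loop is not timed.
func BenchmarkScoreLoopTimer(b *testing.B) {
	words := buildWords(1000)
	for b.Loop() {
		for _, w := range words {
			Score(w)
		}
	}
}
```

b.StartTimer and b.StopTimer remain available for pausing the measurement around work inside the loop.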