Maps

In Go, the capacity of a map is unlimited in theory; in practice, it is limited only by available memory. That is why the built-in cap function doesn't apply to maps.

In the official standard Go runtime implementation, maps are implemented as hashtables internally. Each map/hashtable maintains a backing array to store map entries (key-value pairs). As more and more entries are put into a map, the backing array may become too small to hold them. When that happens, a new, larger backing array is allocated, the current entries (in the old backing array) are moved to it, and the old backing array is discarded.

In the official standard Go runtime implementation, the backing array of a map never shrinks, even if all entries are deleted from the map. This wastes some memory, but in practice it is seldom a problem and is often actually good for program performance.
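One rough way to observe this is to compare heap usage after deleting all entries with heap usage after dropping the map. The following is only a sketch under assumptions: the helper names heapAlloc and measure and the entry count are made up for this illustration, and the exact numbers vary by Go version and platform.

```go
package main

import (
	"fmt"
	"runtime"
)

// heapAlloc (a helper made up for this sketch) forces a GC cycle
// and reports the bytes of heap objects that are still allocated.
func heapAlloc() uint64 {
	runtime.GC()
	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)
	return ms.HeapAlloc
}

// measure fills a map with n entries, deletes every entry, then drops
// the map. It returns the heap usage observed after each of the last
// two steps.
func measure(n int) (afterDelete, afterRelease uint64) {
	m := make(map[int]int)
	for i := 0; i < n; i++ {
		m[i] = i
	}
	for k := range m {
		delete(m, k)
	}
	afterDelete = heapAlloc()
	runtime.KeepAlive(m) // keep the map reachable during the measurement above

	m = nil // drop the only reference to the backing array
	afterRelease = heapAlloc()
	return
}

func main() {
	afterDelete, afterRelease := measure(1 << 20)
	fmt.Println("heap after deleting all entries:", afterDelete)
	fmt.Println("heap after dropping the map:    ", afterRelease)
}
```

If the backing array is indeed kept after the deletions, the first number should be noticeably larger than the second.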

Clear map entries

We could use the following loop to clear all entries in a map:

	for key := range aMap {
		delete(aMap, key)
	}

The loop is specially optimized (except when entries with NaN keys exist), so it executes very fast. However, please note that, as mentioned above, the backing array of the cleared map doesn't shrink after the loop. Then how can the backing array of the map be released? There are two ways:

	aMap = nil
	// or
	aMap = make(map[K]V)

If the backing array of the map is not referenced elsewhere, then the backing array will be collected eventually after being released.

If many new entries will be put into the map after it is cleared, then the former way (clearing with the loop, which keeps the backing array) is preferred; otherwise, the latter ways (releasing the backing array) are preferred.

Since Go 1.21, there is a better way to do this job. Go 1.21 introduced a new built-in function, clear, which can be used to clear all entries in a map, including those with NaN keys.

Note: currently (Go toolchain 1.24), using the built-in clear function to clear a map with at least one entry takes time proportional to the size of the backing array of the map.
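The difference for NaN keys can be seen in a small sketch like the following (it needs Go 1.21 or later). The helper names deleteAll and newMap are made up for this illustration.

```go
package main

import (
	"fmt"
	"math"
)

// deleteAll removes entries with the range-and-delete loop.
func deleteAll(m map[float64]int) {
	for k := range m {
		delete(m, k)
	}
}

// newMap builds a small map that contains one entry with a NaN key.
func newMap() map[float64]int {
	m := map[float64]int{1.5: 1, 2.5: 2}
	m[math.NaN()] = 3 // NaN != NaN, so delete can never match this key
	return m
}

func main() {
	a := newMap()
	deleteAll(a)
	fmt.Println(len(a)) // 1: the NaN entry survives the loop

	b := newMap()
	clear(b)
	fmt.Println(len(b)) // 0: clear removes the NaN entry too
}
```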

`aMap[key]++` is more efficient than `aMap[key] = aMap[key] + 1`

In the statement aMap[key] = aMap[key] + 1, the key is hashed twice, but in the statement aMap[key]++, it is hashed only once.

Similarly, aMap[key] += value is more efficient than aMap[key] = aMap[key] + value.

These claims can be verified with the following benchmark code:

package maps

import "testing"

var m = map[int]int{}

func Benchmark_increment(b *testing.B) {
	for i := 0; i < b.N; i++ {
		m[99]++
	}
}

func Benchmark_plusone(b *testing.B) {
	for i := 0; i < b.N; i++ {
		m[99] += 1
	}
}

func Benchmark_addition(b *testing.B) {
	for i := 0; i < b.N; i++ {
		m[99] = m[99] + 1
	}
}

The benchmark results:

Benchmark_increment-4  11.31 ns/op
Benchmark_plusone-4    11.21 ns/op
Benchmark_addition-4   16.10 ns/op

Pointers in maps

If neither the key type nor the element type of a map contains pointers, then in the scan phase of a GC cycle, the garbage collector will not scan the entries of the map. This can save much time.

This tip is also valid for other kinds of containers in Go, such as slices, arrays and channels.

Using byte arrays instead of short strings as keys

Internally, each string contains a pointer, which points to the underlying bytes of that string. So if the key or element type of a map is a string type, then all the entries of the map need to be scanned in GC cycles.

If we can make sure that the string values used in the entries of a map have a small maximum length, then we can use the array type [N]byte in place of the string type (where N is the maximum string length). Doing this will save much garbage collection scanning time if the number of entries in the map is very large.

For example, in the following code, the entries of mapB contain no pointers, but the (string) keys of mapA contain pointers. So the garbage collector will skip the entries of mapB during the scan phase of a GC cycle.

	var mapA = make(map[string]int, 1 << 16)
	var mapB = make(map[[32]byte]int, 1 << 16)

Please also note that the official standard compiler makes special optimizations on hashing map keys whose sizes are 4 or 8 bytes. So, from the point of view of saving CPU, it is better to use map[[8]byte]V instead of map[[5]byte]V, and it is better to use map[int32]V instead of map[int16]V.
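As an illustration of this tip, short strings could be converted to fixed-size keys as in the following sketch. The type Key, the helper toKey and the maximum length 8 are assumptions made for this example. Note that the zero padding means a string ending in zero bytes maps to the same key as its shorter prefix, so the approach only fits data where that cannot happen.

```go
package main

import "fmt"

// Key is a fixed-size, pointer-free key type. 8 is an assumed
// maximum word length chosen for this sketch.
type Key [8]byte

// toKey copies s into a Key, padding the rest with zero bytes.
// It reports false if s is too long to fit.
func toKey(s string) (k Key, ok bool) {
	if len(s) > len(k) {
		return k, false
	}
	copy(k[:], s)
	return k, true
}

func main() {
	counts := make(map[Key]int)
	for _, w := range []string{"go", "map", "go", "averylongword"} {
		if k, ok := toKey(w); ok {
			counts[k]++ // words that are too long are skipped
		}
	}
	k, _ := toKey("go")
	fmt.Println(counts[k], len(counts)) // 2 2
}
```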

Lower map element modification frequency

In the previous "strings and byte slices" chapter, it was mentioned that a byte-slice-to-string conversion appearing as the index key in a map element retrieval expression doesn't allocate, but such conversions in L-value map element index expressions do allocate.

So sometimes, we could lower the frequency of using such conversions in L-value map element index expressions to improve program performance.
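The difference between the two kinds of expressions can be checked with testing.AllocsPerRun, as in the following sketch. The names lookupAllocs, assignAllocs, counts and sink are made up for this illustration, and the results describe the current official toolchain rather than guaranteed behavior.

```go
package main

import (
	"fmt"
	"testing"
)

var counts = map[string]int{}

var sink int // receives lookup results so that the lookups are not optimized away

// lookupAllocs reports the average number of allocations made by a
// read-only lookup which converts a byte slice to a string key.
func lookupAllocs(w []byte) float64 {
	return testing.AllocsPerRun(1000, func() {
		sink = counts[string(w)]
	})
}

// assignAllocs reports the same for an assignment to the map element.
func assignAllocs(w []byte) float64 {
	return testing.AllocsPerRun(1000, func() {
		counts[string(w)] = 1
	})
}

func main() {
	w := []byte("a-reasonably-long-word")
	fmt.Println("allocations per lookup:    ", lookupAllocs(w))
	fmt.Println("allocations per assignment:", assignAllocs(w))
}
```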

In the following example, the B way (pointer element way) is more performant than the A way. The reason is that the B way rarely modifies element values: the elements in the B way are pointers, and once they are created, they are never changed.

package maps

import "testing"

var wordCounterA = make(map[string]int)
var wordCounterB = make(map[string]*int)
var key = make([]byte, 64)

func IncA(w []byte) {
	wordCounterA[string(w)]++
}

func IncB(w []byte) {
	p := wordCounterB[string(w)]
	if p == nil {
		p = new(int)
		wordCounterB[string(w)] = p
	}
	*p++
}

func Benchmark_A(b *testing.B) {
	for i := 0; i < b.N; i++ {
		for i := range key {
			IncA(key[:i])
		}
	}
}

func Benchmark_B(b *testing.B) {
	for i := 0; i < b.N; i++ {
		for i := range key {
			IncB(key[:i])
		}
	}
}

The benchmark results:

Benchmark_A-4  11600 ns/op  2336 B/op  62 allocs/op
Benchmark_B-4   1543 ns/op     0 B/op   0 allocs/op

Although the B way (pointer element way) is less CPU consuming, it creates many pointers, which increases the burden of pointer scanning in a GC cycle. But generally, the B way is more efficient.

We could use an extra counter table (a slice) and let the map record indexes to the table, to avoid making many allocations and creating many pointers, as the following code shows:

var wordIndexes = make(map[string]int)
var wordCounters []int

func IncC(w []byte) {
	if i, ok := wordIndexes[string(w)]; ok {
		wordCounters[i]++
	} else {
		wordIndexes[string(w)] = len(wordCounters)
		wordCounters = append(wordCounters, 1)
	}
}

func Benchmark_C(b *testing.B) {
	for i := 0; i < b.N; i++ {
		for i := range key {
			IncC(key[:i])
		}
	}
}

The benchmark results:

Benchmark_A-4  11600 ns/op  2336 B/op  62 allocs/op
Benchmark_B-4   1543 ns/op     0 B/op   0 allocs/op
Benchmark_C-4   1609 ns/op     0 B/op   0 allocs/op

In the short term, the C way is almost as performant as the B way. But as it uses far fewer pointers, it is actually more efficient than the B way in the long term.

Please note that the above benchmark results show that the latter two ways both make zero allocations. This is actually not true. Each run of the latter two benchmarks makes less than one allocation on average, and the reported value is truncated to zero. This is a deliberate design of the benchmark reports in the standard packages.

Try to grow a map in one step

If we can predict, at coding time, the maximum number of entries that will be put into a map, we should create the map with the make function and pass that number as the size argument of the make call, to avoid growing the map in multiple steps later.
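The effect of the size argument can be observed by counting allocations, as in the following sketch. The names fill, allocsToFill and sink, and the entry count, are assumptions made for this illustration. The exact counts depend on the Go version.

```go
package main

import (
	"fmt"
	"testing"
)

// fill creates a map with the given size hint and puts n entries into it.
func fill(n, hint int) map[int]int {
	m := make(map[int]int, hint)
	for i := 0; i < n; i++ {
		m[i] = i
	}
	return m
}

var sink map[int]int // keeps the results reachable

// allocsToFill reports the average number of allocations made
// while creating and filling a map of n entries.
func allocsToFill(n, hint int) float64 {
	return testing.AllocsPerRun(100, func() { sink = fill(n, hint) })
}

func main() {
	const n = 10000
	fmt.Println("allocations without a size hint:", allocsToFill(n, 0))
	fmt.Println("allocations with a size hint:   ", allocsToFill(n, n))
}
```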

Use index tables instead of maps whose key types have only a small set of possible values

Some programmers like to use a map with bool keys to reduce the use of verbose if-else code blocks. For example, the following code

	// Within a function ...
	var condition bool
	condition = evaluateCondition()
	...
	if condition {
		counter++
	} else {
		counter--
	}
	...
	if condition {
		f()
	} else {
		g()
	}
	...

could be replaced with

// Package-level maps.
var boolToInt = map[bool]int{true: 1, false: -1}
var boolToFunc = map[bool]func(){true: f, false: g}

	// Within a function ...
	var condition bool
	condition = evaluateCondition()
	...
	counter += boolToInt[condition]
	...
	boolToFunc[condition]()
	...

If many such identical if-else blocks are used in code, using maps with bool keys will remove much boilerplate and make the code look much cleaner. For most use cases, this is generally good. However, as of Go toolchain v1.24.n, the map way is not very efficient in terms of code execution performance. The following benchmarks show the performance differences.

package maps

import "testing"

//go:noinline
func f() {}

//go:noinline
func g() {}

func IfElse(x bool) func() {
	if x {
		return f
	} else {
		return g
	}
}

var m = map[bool]func(){true: f, false: g}

func MapSwitch(x bool) func() {
	return m[x]
}

func Benchmark_IfElse(b *testing.B) {
	for i := 0; i < b.N; i++ {
		IfElse(true)()
		IfElse(false)()
	}
}

func Benchmark_MapSwitch(b *testing.B) {
	for i := 0; i < b.N; i++ {
		MapSwitch(true)()
		MapSwitch(false)()
	}
}

The benchmark results:

Benchmark_IfElse-4      4.155 ns/op
Benchmark_MapSwitch-4  47.46 ns/op

From the benchmark results, we can see that the if-else block way is much more performant than the map-switch way.

For use cases which require high code performance, we can simulate a bool-key map with an index table, with the help of a bool-to-int function. This reduces if-else boilerplate while keeping the simplicity of the map-switch way. The following benchmarks show how to use the index table way.

func b2i(b bool) (r int) {
	if b {
		r = 1
	}
	return
}

var boolMap = [2]func(){g, f}

func Benchmark_BoolMap(b *testing.B) {
	for i := 0; i < b.N; i++ {
		boolMap[b2i(true)]()
		boolMap[b2i(false)]()
	}
}

From the above code, we can see that uses of the index table way are almost as clean as those of the map-switch way, though an extra tiny b2i function is needed. And the following benchmark results show that the index table way is as performant as the if-else block way.

Benchmark_IfElse-4      4.155 ns/op
Benchmark_MapSwitch-4  47.46 ns/op
Benchmark_BoolMap-4     4.135 ns/op
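The same technique could be applied to the counter++/counter-- example shown earlier in this section. The following is a minimal sketch, in which the names deltas and step are made up for this illustration.

```go
package main

import "fmt"

// b2i converts a bool to 0 or 1, as in the benchmark code above.
func b2i(b bool) (r int) {
	if b {
		r = 1
	}
	return
}

// deltas is indexed by b2i(condition): false selects -1, true selects +1.
var deltas = [2]int{-1, 1}

// step adds the delta selected by condition to counter.
func step(counter int, condition bool) int {
	return counter + deltas[b2i(condition)]
}

func main() {
	counter := 0
	for _, c := range []bool{true, true, false, true} {
		counter = step(counter, c)
	}
	fmt.Println(counter) // 2
}
```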

Tapir, the author of Go 101, has been writing the Go 101 series books and maintaining the go101.org website since July 2016.
