Pointers are a confusing topic when learning Go. They provide an explicit way to interact with values in memory. To understand how they work, we will look at how memory is managed in Go. Then we will look at the ways we can use pointers to share values, communicate intent and optimize performance.
Memory Management in Go
How are values stored?
When we declare a variable, we store its value at a location in memory [1]. That location is identified by an address. A pointer is a variable whose value is an address: it points to whatever value is stored at that address.
a := 5
aPtr := &a
aPtrPtr := &aPtr
println(a) // 5
println(aPtr) // 0x01 (illustrative address of a; real addresses differ per run)
println(aPtrPtr) // 0x02 (illustrative address of aPtr)
We use the & operator to get the address of a value.
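To make the relationship between the three variables above concrete, here is a minimal sketch that reads values back through the pointers with the * operator. The helper name readThrough is hypothetical and exists only for this illustration.

```go
package main

import "fmt"

// readThrough follows a pointer-to-pointer back to the original int.
// (Hypothetical helper, named for this sketch only.)
func readThrough(pp **int) int {
	return **pp
}

func main() {
	a := 5
	aPtr := &a       // *int: holds the address of a
	aPtrPtr := &aPtr // **int: holds the address of aPtr

	fmt.Println(*aPtr == a)           // true: the value at aPtr's address is a's value
	fmt.Println(*aPtrPtr == aPtr)     // true: the value at aPtrPtr's address is aPtr
	fmt.Println(readThrough(aPtrPtr)) // 5: two dereferences reach the int
}
```

Each & adds one level of indirection and each * removes one.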
Where are values stored?
When we run a program, a stack and a heap [2] are initialized in memory.
The stack is a linear memory space that grows and shrinks as we call functions. Each function call will allocate a block of memory in the stack called a frame. Values declared within a function will be stored in that function's frame. When a function returns, the frame is removed from the stack, along with any values that were declared.
The heap is memory space used to share values between function calls. A function may return a pointer to one of its values. Since a function's values are cleared when its frame is removed from the stack, a value returned by pointer must live on the heap in order to persist after the frame is removed. The Go compiler makes this decision at compile time, through a step called escape analysis.
Memory management on the stack is simple: as functions complete, their memory is cleared. This is not the case with the heap. As we return more pointers, the heap continues to grow. This is where the garbage collector [3] kicks in to remove unused values from the heap. As the heap consumes more memory, garbage collection events occur. These events can impact performance, since each one can be thought of as a pause to free unused memory before the program proceeds.
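One way to observe the difference between stack and heap values is to count heap allocations. The sketch below is an illustration under a few assumptions: the function names and the sink variable are hypothetical, and the //go:noinline directives are there only to stop the compiler from inlining the functions and optimizing the allocation away. It uses testing.AllocsPerRun from the standard library.

```go
package main

import (
	"fmt"
	"testing"
)

// sink is a hypothetical package-level variable; storing results here keeps
// the compiler from discarding the pointer.
var sink *int

//go:noinline
func returnsValue() int {
	i := 10
	return i // copied out of the frame; nothing has to outlive it
}

//go:noinline
func returnsPointer() *int {
	i := 10
	return &i // i must outlive the frame, so it is allocated on the heap
}

// allocsPerCall reports the average number of heap allocations per call of f.
func allocsPerCall(f func()) float64 {
	return testing.AllocsPerRun(1000, f)
}

func main() {
	fmt.Println("by value:  ", allocsPerCall(func() { _ = returnsValue() }))
	fmt.Println("by pointer:", allocsPerCall(func() { sink = returnsPointer() }))
}
```

Building with go build -gcflags=-m should also print the compiler's escape analysis decisions, which is another way to see which values end up on the heap.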
Examples
Passing by value
func main() {
	a := 5
	inner(a)
	println(a) // 5
}

func inner(i int) {
	i = 10 // changes only inner's copy
}
- When main is run, it allocates a frame in the stack and stores the value 5 in the variable a
- When inner is run, it allocates a frame in the stack and is passed a copy of a's value, storing it in i
- inner proceeds to update i to 10
- inner returns, freeing the frame from the stack, and i is no longer accessible
Because the value of a was copied to the inner function, any changes to it within the inner function are not reflected in main. This is called "passing by value".
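The same copying applies to larger values. As a small sketch (the point type and both function names are hypothetical, chosen for illustration), structs and arrays are also copied in full when passed by value:

```go
package main

import "fmt"

// point is a hypothetical struct used only for this sketch.
type point struct{ x, y int }

// movePoint receives a copy of p; the caller's struct is untouched.
func movePoint(p point) point {
	p.x += 10
	return p
}

// zeroFirst receives a copy of the whole array, not a reference to it.
func zeroFirst(arr [3]int) [3]int {
	arr[0] = 0
	return arr
}

func main() {
	p := point{x: 1, y: 2}
	moved := movePoint(p)
	fmt.Println(p, moved) // {1 2} {11 2}

	nums := [3]int{7, 8, 9}
	zeroed := zeroFirst(nums)
	fmt.Println(nums, zeroed) // [7 8 9] [0 8 9]
}
```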
Passing by reference
func main() {
	a := 5
	inner(&a)
	println(a) // 10
}

func inner(i *int) {
	*i = 10 // writes through the pointer to main's a
}
- When main is run, it allocates a frame in the stack and stores the value 5 in the variable a
- When inner is run, it allocates a frame in the stack and is passed a copy of the address of a's value, the pointer, storing it in i
- inner proceeds to follow the pointer to a's value and mutates it to 10 (this is called dereferencing)
- inner returns, freeing the frame from the stack, and i is no longer accessible
Since inner dereferenced the pointer to get to a's actual value before proceeding to mutate it, that mutation persists.
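Note that the pointer itself is still copied into the callee's frame. The following sketch (function names are hypothetical) contrasts writing through a pointer, which the caller sees, with reassigning the local copy of the pointer, which the caller does not see:

```go
package main

import "fmt"

// swap exchanges the two ints that x and y point to.
func swap(x, y *int) {
	*x, *y = *y, *x
}

// reassign only changes its own copy of the pointer; the caller's
// variable is never written to.
func reassign(p *int) {
	other := 99
	p = &other
	_ = p
}

func main() {
	a, b := 1, 2
	swap(&a, &b)
	fmt.Println(a, b) // 2 1

	reassign(&a)
	fmt.Println(a) // 2
}
```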
Returning a value
func main() {
	a := inner()
	println(a) // 10
}

func inner() int {
	i := 10
	return i // a copy of i is handed back to main
}
- When main is run, it allocates a frame in the stack
- When inner is run, it allocates a frame in the stack and stores the value 10 in the variable i
- inner returns a copy of the value at i and removes the frame from the stack
- The copy of the value at i is stored in the variable a of main
Returning a reference
func main() {
	a := inner()
	println(*a) // 10
}

func inner() *int {
	i := 10
	return &i // i outlives inner's frame, so it lives on the heap
}
- When main is run, it allocates a frame in the stack
- When inner is run, it allocates a frame in the stack and stores the value 10 in the variable i
- Since inner returns a reference to i, the value of i is placed on the heap in order to persist it after the frame is removed from the stack
- inner returns a pointer to the value on the heap, which is then assigned to a in the main frame
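A short sketch of what this means in practice (newCounter is a hypothetical name): every call produces its own value, and each returned pointer stays valid after the function has returned.

```go
package main

import "fmt"

// newCounter returns a pointer to a fresh int on every call.
func newCounter() *int {
	i := 10
	return &i
}

func main() {
	a := newCounter()
	b := newCounter()

	*a = 42 // only the value behind a changes

	fmt.Println(*a, *b) // 42 10
	fmt.Println(a == b) // false: two calls, two distinct values
}
```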
The curious case of slices
Slices are dynamically sized arrays [4]. They seem to behave a lot like pointers when passed around, but they are more accurately represented as a struct containing a header and a pointer to the underlying array of values [5]. The header contains the length and capacity values.
When a slice is passed by value, we can change values in the underlying array but are unable to modify the length and capacity of the caller's slice by appending.
// Changing existing elements is visible to the caller (needs import "fmt").
modify := func(i []int) {
	i[0] = 2
	i[1] = 2
}
a := []int{1, 1}
modify(a)
fmt.Println(a) // [2 2]
// Appending inside the function is not visible to the caller (needs import "fmt").
modify := func(i []int) {
	i = append(i, 2)
}
a := []int{1, 1}
modify(a)
fmt.Println(a) // [1 1]
The reason is that when we pass by value, we make a copy of the header and of the pointer to the underlying array. When we perform an append, we create a new slice with an increased length and, possibly, an increased capacity. The pointer to the underlying array may change as well: if the runtime determines that the block of memory initially assigned is not large enough to hold the new value, the array is copied to a new location in memory. Since this new slice is assigned only to the function's local copy, we cannot observe the change outside the function performing the append.
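To illustrate both cases described above, here is a sketch (appendLocal is a hypothetical helper) that appends to a slice with spare capacity and to one without. With spare capacity the new element is written into the shared array, yet the caller's length still does not change.

```go
package main

import "fmt"

// appendLocal appends to its own copy of the slice header and reports the
// length and capacity that local copy ends up with.
func appendLocal(s []int) (int, int) {
	s = append(s, 2)
	return len(s), cap(s)
}

func main() {
	// Spare capacity: append writes into the shared array without reallocating.
	a := make([]int, 2, 4)
	l, c := appendLocal(a)
	fmt.Println(l, c)           // 3 4
	fmt.Println(len(a), cap(a)) // 2 4
	fmt.Println(a[:3])          // [0 0 2]

	// No spare capacity: append copies the values to a new, larger array.
	b := []int{1, 1}
	l, _ = appendLocal(b)
	fmt.Println(l, len(b)) // 3 2
	fmt.Println(b)         // [1 1]
}
```

Reslicing a up to its capacity reveals the appended element, which shows that the array was shared and only the header was copied.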
// Appending through a pointer to the slice is visible to the caller (needs import "fmt").
modify := func(i *[]int) {
	*i = append(*i, 2)
}
a := []int{1, 1}
modify(&a)
fmt.Println(a) // [1 1 2]
When we pass a pointer and assign the result of the append through that pointer, we store the new slice at the location the pointer refers to, which is observable outside the function.
As a rule of thumb, if we are passing a slice by value, we can modify the values in the underlying array but cannot do anything that would change the length or capacity of the slice [6].
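A common alternative to a slice pointer, sketched here with a hypothetical withTwo function, is to return the new slice and let the caller reassign it. This mirrors how the built-in append is used.

```go
package main

import "fmt"

// withTwo returns the slice produced by append instead of mutating
// the caller's slice header.
func withTwo(s []int) []int {
	return append(s, 2)
}

func main() {
	a := []int{1, 1}
	a = withTwo(a) // the caller decides to keep the new slice
	fmt.Println(a) // [1 1 2]
}
```

The reassignment at the call site makes the change in length visible to anyone reading the code.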
Using pointers
Performance
The theory is that copying values, especially large ones (e.g. slices, structs) [7], can be expensive, so we copy the pointer [8] instead, which has a fixed size. In reality it is not that simple. Copying values on the stack is relatively cheap, and cleanup is automatic when functions return. Compare this to copying values to the heap, where the garbage collector needs to track their usage and clean them up once they become unused.
The only way to determine what is best is to profile the memory and CPU usage [9] of your program. Even then, the gains may not be worthwhile when balanced against readability.
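As a starting point for measuring, here is a rough sketch that compares passing a large struct by value and by pointer using testing.Benchmark from the standard library. The payload type, the function names and the result variable are hypothetical, and the numbers will vary between machines and Go versions, so no output is shown. In a real project, a benchmark in a _test.go file run with go test -bench is the more usual route.

```go
package main

import (
	"fmt"
	"testing"
)

// payload is a hypothetical large struct (8 KiB on 64-bit platforms).
type payload struct {
	data [1024]int
}

//go:noinline
func sumByValue(p payload) int {
	total := 0
	for i := range p.data {
		total += p.data[i]
	}
	return total
}

//go:noinline
func sumByPointer(p *payload) int {
	total := 0
	for i := range p.data {
		total += p.data[i]
	}
	return total
}

// result is a sink so the compiler cannot discard the calls.
var result int

func main() {
	var p payload
	for i := range p.data {
		p.data[i] = i
	}

	byValue := testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			result = sumByValue(p)
		}
	})
	byPointer := testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			result = sumByPointer(&p)
		}
	})

	fmt.Println("by value:  ", byValue.String(), byValue.MemString())
	fmt.Println("by pointer:", byPointer.String(), byPointer.MemString())
}
```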
Mutation
When we pass by reference, that is, provide a pointer to a function, the function can mutate the value it points to.
This can become a problem when reading and debugging code, as any number of functions that take a pointer could be mutating that value, requiring us to jump into each function to investigate whether that is the case. While it may not be the most memory efficient, it may be reasonable to default to passing by value instead to improve readability.
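func iCanAndWillMutate(i *int) { *i++ } // illustrative body: changes the caller's value
func iWillAcceptAndReturnACopy(i int) int { return i + 1 } // illustrative body: the caller's value is untouched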
Represent missing values
A variable with a pointer type can be initialized as nil. This is an explicit way to communicate that the value may be missing and requires a nil check.
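func maybeReturnsSomething() *int {
	return nil // nil signals "no value"
}

func maybeAcceptsSomething(i *int) {
	if i == nil {
		println("i is nil 😢")
		return // without this, the line below would also run for nil
	}
	println("i is something 😊")
}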
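A typical use is telling "not set" apart from a legitimate zero value. The sketch below assumes a hypothetical settings type and a hypothetical retriesOrDefault helper:

```go
package main

import "fmt"

// settings is a hypothetical config type; a nil Retries means "not set".
type settings struct {
	Retries *int
}

// retriesOrDefault returns the configured value, or fallback when it is missing.
func retriesOrDefault(s settings, fallback int) int {
	if s.Retries == nil {
		return fallback
	}
	return *s.Retries
}

func main() {
	zero := 0
	fmt.Println(retriesOrDefault(settings{}, 3))               // 3
	fmt.Println(retriesOrDefault(settings{Retries: &zero}, 3)) // 0
}
```

With a plain int field, an explicit 0 and a missing value would be indistinguishable.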
Conclusion
This post hopefully clears up confusion around pointers and demonstrates how memory is managed in Go with examples. I personally opt to pass by value in most cases, opting into pointers when I wish to communicate a missing value or have identified a performance bottleneck that is worth optimizing.
Footnotes
- https://dev.to/karankumarshreds/memory-allocations-in-go-1bpa
- https://www.ardanlabs.com/blog/2018/12/garbage-collection-in-go-part1-semantics.html
- https://go.dev/blog/slices#:~:text=Passing%20slices%20to%20functions
- https://medium.com/swlh/golang-tips-why-pointers-to-slices-are-useful-and-how-ignoring-them-can-lead-to-tricky-bugs-cac90f72e77b
- https://medium.com/@vCabbage/go-are-pointers-a-performance-optimization-a95840d3ef85