go内存GC介绍之一-天翼云开发者社区

Introduction

This guide is intended to aid advanced Go users in better understanding their application costs by providing insights into the Go garbage collector. It also provides guidance on how Go users may use these insights to improve their applications' resource utilization. It does not assume any knowledge of garbage collection, but does assume familiarity with the Go programming language.

The Go language takes responsibility for arranging the storage of Go values; in most cases, a Go developer need not care about where these values are stored, or why, if at all. In practice, however, these values often need to be stored in computer physical memory and physical memory is a finite resource. Because it is finite, memory must be managed carefully and recycled in order to avoid running out of it while executing a Go program. It's the job of a Go implementation to allocate and recycle memory as needed.

Another term for automatically recycling memory is garbage collection. At a high level, a garbage collector (or GC, for short) is a system that recycles memory on behalf of the application by identifying which parts of memory are no longer needed. The Go standard toolchain provides a runtime library that ships with every application, and this runtime library includes a garbage collector.

Note that the existence of a garbage collector as described by this guide is not guaranteed by the Go specification, only that the underlying storage for Go values is managed by the language itself. This omission is intentional and enables the use of radically different memory management techniques.

Therefore, this guide is about a specific implementation of the Go programming language and may not apply to other implementations. Specifically, this following guide applies to the standard toolchain (the gc Go compiler and tools). Gccgo and Gollvm both use a very similar GC implementation so many of the same concepts apply, but details may vary.

Furthermore, this is a living document and will change over time to best reflect the latest release of Go. This document currently describes the garbage collector as of Go 1.19.

Where Go Values Live

Before we dive into the GC, let's first discuss the memory that doesn't need to be managed by the GC.

For instance, non-pointer Go values stored in local variables will likely not be managed by the Go GC at all, and Go will instead arrange for memory to be allocated that's tied to the lexical scope in which it's created. In general, this is more efficient than relying on the GC, because the Go compiler is able to predetermine when that memory may be freed and emit machine instructions that clean up. Typically, we refer to allocating memory for Go values this way as "stack allocation," because the space is stored on the goroutine stack.

Go values whose memory cannot be allocated this way, because the Go compiler cannot determine its lifetime, are said to escape to the heap. "The heap" can be thought of as a catch-all for memory allocation, for when Go values need to be placed somewhere. The act of allocating memory on the heap is typically referred to as "dynamic memory allocation" because both the compiler and the runtime can make very few assumptions as to how this memory is used and when it can be cleaned up. That's where a GC comes in: it's a system that specifically identifies and cleans up dynamic memory allocations.

There are many reasons why a Go value might need to escape to the heap. One reason could be that its size is dynamically determined. Consider for instance the backing array of a slice whose initial size is determined by a variable, rather than a constant. Note that escaping to the heap must also be transitive: if a reference to a Go value is written into another Go value that has already been determined to escape, that value must also escape.

Whether a Go value escapes or not is a function of the context in which it is used and the Go compiler's escape analysis algorithm. It would be fragile and difficult to try to enumerate precisely when values escape: the algorithm itself is fairly sophisticated and changes between Go releases. For more details on how to identify which values escape and which do not, see the section on eliminating heap allocations.

Tracing Garbage Collection

Garbage collection may refer to many different methods of automatically recycling memory; for example, reference counting. In the context of this document, garbage collection refers to tracing garbage collection, which identifies in-use, so-called live, objects by following pointers transitively.

Let's define these terms more rigorously.

Object—An object is a dynamically allocated piece of memory that contains one or more Go values.
Pointer—A memory address that references any value within an object. This naturally includes Go values of the form *T, but also includes parts of built-in Go values. Strings, slices, channels, maps, and interface values all contain memory addresses that the GC must trace.

Together, objects and pointers to other objects form the object graph. To identify live memory, the GC walks the object graph starting at the program's roots, pointers that identify objects that are definitely in-use by the program. Two examples of roots are local variables and global variables. The process of walking the object graph is referred to as scanning.

This basic algorithm is common to all tracing GCs. Where tracing GCs differ is what they do once they discover memory is live. Go's GC uses the mark-sweep technique, which means that in order to keep track of its progress, the GC also marks the values it encounters as live. Once tracing is complete, the GC then walks over all memory in the heap and makes all memory that is not marked available for allocation. This process is called sweeping.

One alternative technique you may be familiar with is to actually move the objects to a new part of memory and leave behind a forwarding pointer that is later used to update all the application's pointers. We call a GC that moves objects in this way a moving GC; Go has a non-moving GC.

The GC cycle

Because the Go GC is a mark-sweep GC, it broadly operates in two phases: the mark phase, and the sweep phase. While this statement might seem tautological, it contains an important insight: it's not possible to release memory back to be allocated until all memory has been traced, because there may still be an un-scanned pointer keeping an object alive. As a result, the act of sweeping must be entirely separated from the act of marking. Furthermore, the GC may also not be active at all, when there's no GC-related work to do. The GC continuously rotates through these three phases of sweeping, off, and marking in what's known as the GC cycle. For the purposes of this document, consider the GC cycle starting with sweeping, turning off, then marking.

The next few sections will focus on building intuition for the costs of the GC to aid users in tweaking GC parameters for their own benefit.

Understanding costs

The GC is inherently a complex piece of software built on even more complex systems. It's easy to become mired in detail when trying to understand the GC and tweak its behavior. This section is intended to provide a framework for reasoning about the cost of the Go GC and tuning parameters.

To begin with, consider this model of GC cost based on three simple axioms.

The GC involves only two resources: CPU time, and physical memory.
The GC's memory costs consist of live heap memory, new heap memory allocated before the mark phase, and space for metadata that, even if proportional to the previous costs, are small in comparison.

Note: live heap memory is memory that was determined to be live by the previous GC cycle, while new heap memory is any memory allocated in the current cycle, which may or may not be live by the end.
The GC's CPU costs are modeled as a fixed cost per cycle, and a marginal cost that scales proportionally with the size of the live heap.

Note: Asymptotically speaking, sweeping scales worse than marking and scanning, as it must perform work proportional to the size of the whole heap, including memory that is determined to be not live (i.e. "dead"). However, in the current implementation sweeping is so much faster than marking and scanning that its associated costs can be ignored in this discussion.

This model is simple but effective: it accurately categorizes the dominant costs of the GC. However, this model says nothing about the magnitude of these costs, nor how they interact. To model that, consider the following situation, referred to from here on as the steady-state.

The rate at which the application allocates new memory (in bytes per second) is constant.

Note: it's important to understand that this allocation rate is completely separate from whether or not this new memory is live. None of it could be live, all of it could be live, or some of it could be live. (On top of this, some old heap memory could also die, so it's not necessarily the case that if that memory is live, the live heap size grows.)

To put this more concretely, consider a web service that allocates 2 MiB of total heap memory for each request that it handles. During the request, at most 512 KiB of that 2 MiB stays live while the request is in flight, and when the service is finished handling the request, all that memory dies. Now, for the sake of simplicity suppose each request takes about 1 second to handle end-to-end. Then, a steady stream of requests, say 100 requests per second, results in an allocation rate of 200 MiB/s and a 50 MiB peak live heap.
The application's object graph looks roughly the same each time (objects are similarly sized, there's a roughly constant number of pointers, the maximum depth of the graph is roughly constant).

Another way to think about this is that the marginal costs of GC are constant.

Note: the steady-state may seem contrived, but it's representative of the behavior of an application under some constant workload. Naturally, workloads can change even while an application is executing, but typically application behavior looks like a bunch of these steady-states strung together with some transient behavior in between.

Note: the steady-state makes no assumptions about the live heap. It may be growing with each subsequent GC cycle, it may shrink, or it may stay the same. However, trying to encompass all of these situations in the explanations to follow is tedious and not very illustrative, so the guide will focus on examples where the live heap remains constant. The GOGC section explores the non-constant live heap scenario in some more detail.

In the steady-state while the live heap size is constant, every GC cycle is going to look identical in the cost model as long as the GC executes after the same amount of time has passed. That's because in that fixed amount of time, with a fixed rate of allocation by the application, a fixed amount of new heap memory will be allocated. So with the live heap size constant, and that new heap memory constant, memory use is always going to be the same. And because the live heap is the same size, the marginal GC CPU costs will be the same, and the fixed costs will be incurred at some regular interval.

Now consider if the GC were to shift the point at which it runs later in time. Then, more memory would be allocated but each GC cycle would still incur the same CPU cost. However over some other fixed window of time fewer GC cycles would finish, resulting in a lower overall CPU cost. The opposite would be true if the GC decided to start earlier in time: less memory would be allocated and CPU costs would be incurred more often.

This situation represents the fundamental trade-off between CPU time and memory that a GC can make, controlled by how often the GC actually executes. In other words, the trade-off is entirely defined by GC frequency.

One more detail remains to be defined, and that's when the GC should decide to start. Note that this directly sets the GC frequency in any particular steady-state, defining the trade-off. In Go, deciding when the GC should start is the main parameter which the user has control over.

Introduction

Furthermore, this is a living document and will change over time to best reflect the latest release of Go. This document currently describes the garbage collector as of Go 1.19.

Where Go Values Live

Before we dive into the GC, let's first discuss the memory that doesn't need to be managed by the GC.

Tracing Garbage Collection

Let's define these terms more rigorously.

Object—An object is a dynamically allocated piece of memory that contains one or more Go values.
Pointer—A memory address that references any value within an object. This naturally includes Go values of the form *T, but also includes parts of built-in Go values. Strings, slices, channels, maps, and interface values all contain memory addresses that the GC must trace.

The GC cycle

The next few sections will focus on building intuition for the costs of the GC to aid users in tweaking GC parameters for their own benefit.

Understanding costs

To begin with, consider this model of GC cost based on three simple axioms.

The GC involves only two resources: CPU time, and physical memory.
The GC's memory costs consist of live heap memory, new heap memory allocated before the mark phase, and space for metadata that, even if proportional to the previous costs, are small in comparison.

Note: live heap memory is memory that was determined to be live by the previous GC cycle, while new heap memory is any memory allocated in the current cycle, which may or may not be live by the end.
The GC's CPU costs are modeled as a fixed cost per cycle, and a marginal cost that scales proportionally with the size of the live heap.

Note: Asymptotically speaking, sweeping scales worse than marking and scanning, as it must perform work proportional to the size of the whole heap, including memory that is determined to be not live (i.e. "dead"). However, in the current implementation sweeping is so much faster than marking and scanning that its associated costs can be ignored in this discussion.

The rate at which the application allocates new memory (in bytes per second) is constant.

Note: it's important to understand that this allocation rate is completely separate from whether or not this new memory is live. None of it could be live, all of it could be live, or some of it could be live. (On top of this, some old heap memory could also die, so it's not necessarily the case that if that memory is live, the live heap size grows.)

To put this more concretely, consider a web service that allocates 2 MiB of total heap memory for each request that it handles. During the request, at most 512 KiB of that 2 MiB stays live while the request is in flight, and when the service is finished handling the request, all that memory dies. Now, for the sake of simplicity suppose each request takes about 1 second to handle end-to-end. Then, a steady stream of requests, say 100 requests per second, results in an allocation rate of 200 MiB/s and a 50 MiB peak live heap.
The application's object graph looks roughly the same each time (objects are similarly sized, there's a roughly constant number of pointers, the maximum depth of the graph is roughly constant).

Another way to think about this is that the marginal costs of GC are constant.

应用商城

合作伙伴

开发者

支持与服务

了解天翼云

go内存GC介绍之一

Introduction

Where Go Values Live

Tracing Garbage Collection

The GC cycle

Understanding costs

go内存GC介绍之一

Introduction

Where Go Values Live

Tracing Garbage Collection

The GC cycle

Understanding costs

活动

应用商城

合作伙伴

开发者

支持与服务

了解天翼云

go内存GC介绍之一

Introduction

Where Go Values Live

Tracing Garbage Collection

The GC cycle

Understanding costs

go内存GC介绍之一

Introduction

Where Go Values Live

Tracing Garbage Collection

The GC cycle

Understanding costs