How much RAM does Prometheus 2.x need?

I previously looked at ingestion memory for 1.x. How about 2.x?

Prometheus 2.x has a very different ingestion system to 1.x, with many performance improvements. This time I'm also going to take into account the cost of cardinality in the head block. To start with, I took a heap profile of a Prometheus 2.9.2 ingesting from a single target with 100k unique time series.

This gives a good starting point to find the relevant bits of code, but as my Prometheus has only just started, the profile doesn't have quite everything. From here I can start digging through the code to understand what each bit of usage is.

So PromParser.Metric, for example, looks to be the length of the full time series name, while the scrapeCache is a constant cost of 145ish bytes per time series. Under getOrCreateWithID there's a mix of constants, usage per unique label value, usage per unique symbol, and per sample label. The usage under fanoutAppender.commit is from the initial writing of all the series to the WAL, which just hasn't been GCed yet. One thing missing is chunks, which work out as 192B for 128B of data, which is a 50% overhead.

From here I take various worst-case assumptions, for example that half of the space in most lists is unused and that chunks are practically empty. To simplify, I ignore the number of label names, as there should never be many of those. This then works out as about 732B per series, another 32B per label pair, 120B per unique label value, and on top of all that the time series name twice. Last, but not least, all of that must be doubled given how Go garbage collection works.
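The cardinality estimate above can be sketched as a small function. This is my reading of the paragraph, not the calculator's actual source: the function and parameter names are made up for illustration, the "time series name" is assumed to be the number of labels times the average bytes per label pair, and the count of unique label pairs is used as a stand-in for unique label values.

```python
GIB = 2**30


def cardinality_bytes(series, labels_per_series, unique_label_values,
                      bytes_per_label_pair):
    """Rough worst-case head-block memory for series cardinality, in bytes.

    Hypothetical helper: the constants come from the text (732B per series,
    32B per label pair, 120B per unique label value, the series name twice,
    everything doubled for Go garbage collection).
    """
    per_series = 732 + 32 * labels_per_series
    # Assumption: the full series name is roughly labels * bytes per pair.
    name_twice = 2 * labels_per_series * bytes_per_label_pair
    live = series * (per_series + name_twice) + 120 * unique_label_values
    return 2 * live  # doubling for Go GC


estimate = cardinality_bytes(
    series=1_000_000,
    labels_per_series=5,
    unique_label_values=10_000,
    bytes_per_label_pair=20,
)
print(f"{estimate / GIB:.2f} GiB")  # prints "2.04 GiB"
```

With the calculator's default inputs this comes to about 2GiB, which is in line with the figure quoted for a million series.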

That's cardinality. For ingestion we can take the scrape interval, the number of time series, the 50% overhead, typical bytes per sample, and the doubling from GC. Given how head compaction works, we need to allow for up to 3 hours' worth of data.
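As a rough sketch, the ingestion side can be written the same way. This is a literal reading of the factors listed above, simply multiplied together; the function name and parameters are hypothetical, and the calculator may well combine the factors differently.

```python
GIB = 2**30


def ingestion_bytes(series, scrape_interval_s, bytes_per_sample,
                    head_hours=3, chunk_overhead=1.5, gc_factor=2):
    """Rough worst-case memory for samples held in the head block, in bytes.

    Hypothetical helper: assumes no churn, one sample per series per scrape,
    and that all factors from the text multiply together.
    """
    samples_per_second = series / scrape_interval_s
    samples_in_head = samples_per_second * head_hours * 3600
    return samples_in_head * bytes_per_sample * chunk_overhead * gc_factor


estimate = ingestion_bytes(
    series=1_000_000,
    scrape_interval_s=15,
    bytes_per_sample=1.7,
)
print(f"{estimate / GIB:.2f} GiB")
```

With a million series, a 15s scrape interval and 1.7 bytes per sample, this literal reading lands somewhat above the roughly 2.5GiB quoted for the calculator, so treat it as an upper-bound sketch rather than a reproduction of the calculator's arithmetic.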

Rather than having to calculate all of this by hand, I've done up a calculator as a starting point:

<label for="fieldname2_1">Number of Time Series*</label>

<input id="fieldname2_1" class="field number small required" name="fieldname2_1" type="text" value="1000000" aria-label="Number of Time Series">max_over_time(prometheus_tsdb_head_series[1d])

<label for="fieldname3_1">Average Labels Per Time Series*</label>

<input id="fieldname3_1" class="field number small required" name="fieldname3_1" type="text" value="5" aria-label="Average Labels Per Time Series">Don't forget to include target labels, and the metric name

<label for="fieldname4_1">Number of Unique Label Pairs*</label>

<input id="fieldname4_1" class="field number small required" name="fieldname4_1" type="text" value="10000" aria-label="Number of Unique Label Pairs">Across all time series

<label for="fieldname5_1">Average Bytes per Label Pair*</label>

<input id="fieldname5_1" class="field number small required" name="fieldname5_1" type="text" value="20" aria-label="Average Bytes per Label Pair">Including the equals, double quotes, and comma

<label for="fieldname9_1">Cardinality Memory</label>

<label for="fieldname7_1">Scrape Interval (s)</label>

<label for="fieldname8_1">Bytes per Sample</label>

1.70

rate(prometheus_tsdb_compaction_chunk_size_bytes_sum[1d])
/
rate(prometheus_tsdb_compaction_chunk_samples_sum[1d])

<label for="fieldname13_1">Samples per Second</label>

<label for="fieldname1_1">Ingestion Memory</label>

<label for="fieldname10_1">Combined Memory</label>

<label id="fieldname11_1"></label>These values are approximate, and may differ in reality and vary by version. These do not represent all Prometheus memory usage.

</form>

This shows, for example, that a million series costs around 2GiB of RAM in terms of cardinality, plus around 2.5GiB for ingestion with a 15s scrape interval and no churn.

