How much RAM does Prometheus 2.x need

2024-09-10 09:23:41

I previously looked at ingestion memory for 1.x. How about 2.x?

Prometheus 2.x has a very different ingestion system to 1.x, with many performance improvements. This time I'm also going to take into account the cost of cardinality in the head block. To start with I took a profile of a Prometheus 2.9.2 ingesting from a single target with 100k unique time series:

This gives a good starting point to find the relevant bits of code, but as my Prometheus has only just started, the profile doesn't have quite everything. From here I can start digging through the code to understand what each bit of usage is.

So PromParser.Metric, for example, looks to be the length of the full time series name, while the scrapeCache is a constant cost of 145ish bytes per time series. Under getOrCreateWithID there's a mix of constants, usage per unique label value, usage per unique symbol, and usage per sample label. The usage under fanoutAppender.commit is from the initial writing of all the series to the WAL, which just hasn't been GCed yet. One thing missing is chunks, which work out as 192B for 128B of data, a 50% overhead.

From here I take various worst-case assumptions: for example, that half of the space in most lists is unused and that chunks are practically empty. To simplify I ignore the number of label names, as there should never be many of those. This then works out as about 732B per series, another 32B per label pair, 120B per unique label value, and on top of all that the time series name twice. Last, but not least, all of that must be doubled given how Go garbage collection works.
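As a rough sketch, the per-series figures above can be combined into a single estimate. The function name and parameters here are hypothetical, and two assumptions are mine rather than the article's: the length of the time series name is approximated as labels per series times average bytes per label pair, and the 120B cost is applied once per unique label pair.

```python
def cardinality_bytes(series, labels_per_series, unique_label_pairs, bytes_per_label_pair):
    """Estimate head-block memory for cardinality, in bytes.

    Constants (732, 32, 120, the name counted twice, the GC doubling)
    come from the text; how they are combined is an assumption.
    """
    # Assumption: the full series name is roughly all of its label pairs.
    name_bytes = labels_per_series * bytes_per_label_pair
    per_series = 732 + 32 * labels_per_series + 2 * name_bytes
    # Assumption: the 120B per-value cost is charged per unique label pair.
    heap = series * per_series + 120 * unique_label_pairs
    # Doubled to allow for how Go garbage collection works.
    return 2 * heap


if __name__ == "__main__":
    estimate = cardinality_bytes(1_000_000, 5, 10_000, 20)
    print(f"{estimate / 2**30:.2f} GiB")
```

With a million series, 5 labels each, 10,000 unique label pairs and 20 bytes per pair, this comes to a little over 2GiB.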

That's cardinality. For ingestion we can take the scrape interval, the number of time series, the 50% overhead, typical bytes per sample, and the doubling from GC. Given how head compaction works, we need to allow for up to 3 hours' worth of data.
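Read literally, the ingestion factors listed above multiply together as in the sketch below. The function names and default values are hypothetical, and stacking every factor at once is my assumption, so treat the result as an upper bound; a calculator that combines the factors differently will report a lower figure.

```python
def samples_per_second(series, scrape_interval_s):
    """Ingestion rate with no churn: every series is scraped once per interval."""
    return series / scrape_interval_s


def ingestion_bytes(series, scrape_interval_s, bytes_per_sample,
                    chunk_overhead=1.5, head_hours=3.0, gc_factor=2.0):
    """Upper-bound estimate of head-block memory for samples, in bytes.

    chunk_overhead: the 50% chunk overhead from the text.
    head_hours:     up to 3 hours of data held before head compaction.
    gc_factor:      doubling to allow for Go garbage collection.
    """
    rate = samples_per_second(series, scrape_interval_s)
    seconds_in_head = head_hours * 3600
    return rate * bytes_per_sample * chunk_overhead * seconds_in_head * gc_factor


if __name__ == "__main__":
    estimate = ingestion_bytes(1_000_000, 15, 1.7)
    print(f"{estimate / 2**30:.2f} GiB")
```

Passing a smaller `head_hours` or a `chunk_overhead` of 1.0 shows how sensitive the estimate is to each worst-case assumption.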

Rather than having to calculate all of this by hand, I've done up a calculator as a starting point:

<form id="cp_calculatedfieldsf_pform_1" class="" action="https://www.robustperception.io/how-much-ram-does-prometheus-2-x-need-for-cardinality-and-ingestion/?" autocomplete="on" enctype="multipart/form-data" method="post" name="cp_calculatedfieldsf_pform_1" novalidate="novalidate" data-evalequations="1" data-evalequationsevent="2">

 

<label for="fieldname2_1">Number of Time Series*</label>
<input id="fieldname2_1" class="field number small required" name="fieldname2_1" type="text" value="1000000" aria-label="Number of Time Series">max_over_time(prometheus_tsdb_head_series[1d])
 
<label for="fieldname3_1">Average Labels Per Time Series*</label>
<input id="fieldname3_1" class="field number small required" name="fieldname3_1" type="text" value="5" aria-label="Average Labels Per Time Series">Don't forget to include target labels, and the metric name
 
<label for="fieldname4_1">Number of Unique Label Pairs*</label>
<input id="fieldname4_1" class="field number small required" name="fieldname4_1" type="text" value="10000" aria-label="Number of Unique Label Pairs">Across all time series
 
<label for="fieldname5_1">Average Bytes per Label Pair*</label>
<input id="fieldname5_1" class="field number small required" name="fieldname5_1" type="text" value="20" aria-label="Average Bytes per Label Pair">Including the equals, double quotes, and comma
 
<label for="fieldname9_1">Cardinality Memory</label>
<input id="fieldname9_1" class="codepeoplecalculatedfield field small" name="fieldname9_1" readonly="readonly" type="text" value="" aria-label="Cardinality Memory">
 
<label></label>
 
<label for="fieldname7_1">Scrape Interval (s)</label>
 
15
 
 
<label for="fieldname8_1">Bytes per Sample</label>
 
1.70
 
  rate(prometheus_tsdb_compaction_chunk_size_bytes_sum[1d])
/
  rate(prometheus_tsdb_compaction_chunk_samples_sum[1d])
 
<label for="fieldname13_1">Samples per Second</label>
<input id="fieldname13_1" class="codepeoplecalculatedfield field small" name="fieldname13_1" readonly="readonly" type="text" value="" aria-label="Samples per Second">
 
<label for="fieldname1_1">Ingestion Memory</label>
<input id="fieldname1_1" class="codepeoplecalculatedfield field small" name="fieldname1_1" readonly="readonly" type="text" value="" aria-label="Ingestion Memory">
 
<label></label>
 
<label for="fieldname10_1">Combined Memory</label>
<input id="fieldname10_1" class="codepeoplecalculatedfield field small" name="fieldname10_1" readonly="readonly" type="text" value="" aria-label="Combined Memory">
 
<label id="fieldname11_1"></label>These values are approximate, and may differ in reality and vary by version. These do not represent all Prometheus memory usage.
 
 
</form>

 

This shows, for example, that a million series costs around 2GiB of RAM in terms of cardinality, plus around 2.5GiB for ingestion with a 15s scrape interval and no churn.
