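Environment
Host overview
Hardware:
CPU: 2 × 8 cores
Memory: 128 GB
Disks: 2 × 480 GB SSD (system), 12 × 4 TB SATA (data)
Java environment:
Test cluster architecture
The official recommendation is six machines:
3 machines for the ZooKeeper cluster. Lower-spec machines are sufficient here, because Pulsar uses ZooKeeper only for periodic coordination-related and configuration-related tasks, not for basic operations.
3 machines for the BookKeeper cluster and the broker cluster. High-performance machines are recommended here.
ZooKeeper, BookKeeper, and the broker can also be co-located on the same machine, so a Pulsar cluster can be deployed on as few as three machines.
This test uses that minimal three-machine deployment:
ZooKeeper, BookKeeper, and the broker are all deployed on each of 10.33.0.18-10.33.0.20.
The data directories of every component are mounted on SSD.
Startup parameters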
PULSAR_MEM=${PULSAR_MEM:-"-Xms2g -Xmx2g -XX:MaxDirectMemorySize=4g"}
# Garbage collection options
PULSAR_GC=${PULSAR_GC:-"-XX:+UseG1GC -XX:MaxGCPauseMillis=10 -XX:+ParallelRefProcEnabled -XX:+UnlockExperimentalVMOptions -XX:+DoEscapeAnalysis -XX:ParallelGCThreads=32 -XX:ConcGCThreads=32 -XX:G1NewSizePercent=50 -XX:+DisableExplicitGC"}
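Baseline performance test with pulsar-perf
The baseline test fixes the following parameters:
Number of brokers: 3
Number of topics: 1
Number of partitions: 3
Size of a single message: 1024 bytes (1 KB)
Number of messages written/read: 10,000,000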
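As a rough sanity check on the scale of one baseline run, the sketch below works out the raw payload volume implied by these parameters and the ideal duration at a constant rate. The names `total_payload_gib` and `seconds_at_rate` are illustrative, the example rate is arbitrary, and the figures ignore protocol and metadata overhead.

```python
# Baseline parameters taken from the list above.
MESSAGE_SIZE_BYTES = 1024
NUM_MESSAGES = 10_000_000


def total_payload_gib(num_messages: int, message_size_bytes: int) -> float:
    """Raw payload volume in GiB (1 GiB = 1024**3 bytes), ignoring overhead."""
    return num_messages * message_size_bytes / 1024**3


def seconds_at_rate(num_messages: int, rate_msg_per_s: float) -> float:
    """Ideal time to publish num_messages at a perfectly constant rate."""
    return num_messages / rate_msg_per_s


if __name__ == "__main__":
    print(f"payload: {total_payload_gib(NUM_MESSAGES, MESSAGE_SIZE_BYTES):.2f} GiB")
    # 50_000 msg/s is only an example rate, not a value from the test plan.
    print(f"duration at 50000 msg/s: {seconds_at_rate(NUM_MESSAGES, 50_000):.0f} s")
```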
Producer performance test
Configuration parameters
Usage: pulsar-perf produce [options] persistent://prop/ns/my-topic
Options:
-am, --access-mode
Producer access mode
Default: Shared
Possible Values: [Shared, Exclusive, WaitForExclusive]
-au, --admin-url
Pulsar Admin URL
--auth-params
Authentication parameters, whose format is determined by the
implementation of method `configure` in authentication plugin class, for
example "key1:val1,key2:val2" or "{"key1":"val1","key2":"val2"}.
--auth-plugin
Authentication plugin class name
-bb, --batch-max-bytes
Maximum number of bytes per batch
Default: 4194304
-bm, --batch-max-messages
Maximum number of messages per batch
Default: 1000
-b, --batch-time-window
Batch messages in 'x' ms window (Default: 1ms)
Default: 1.0
-bw, --busy-wait
Enable Busy-Wait on the Pulsar client
Default: false
-ch, --chunking
Should split the message and publish in chunks if message size is larger
than allowed max size
Default: false
-z, --compression
Compress messages payload
Default: NONE
Possible Values: [NONE, LZ4, ZLIB, ZSTD, SNAPPY]
-cf, --conf-file
Configuration file
-d, --delay
Mark messages with a given delay in seconds
Default: 0
-k, --encryption-key-name
The public key name to encrypt payload
-v, --encryption-key-value-file
The file which contains the public key to encrypt payload
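-ef, --exit-on-failure
Exit from the process on publish failure (default: disable)
Default: false
-fp, --format-payload
Format %i as a message index in the stream from producer and/or %t as
the timestamp nanoseconds.
Default: false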
-h, --help
Help message
--histogram-file
HdrHistogram output file
--listener-name
Listener name for the broker.
-c, --max-connections
Max number of TCP connections to a single broker
Default: 100
-o, --max-outstanding
Max number of outstanding messages
Default: 0
-p, --max-outstanding-across-partitions
Max number of outstanding messages across partitions
Default: 0
-mk, --message-key-generation-mode
The generation mode of message key, valid options are: [autoIncrement,
random]
-ioThreads, --num-io-threads
Set the number of threads to be used for handling connections to
brokers. The default value is 1.
Default: 1
-m, --num-messages
Number of messages to publish in total. If <= 0, it will keep publishing
Default: 0
-n, --num-producers
Number of producers (per topic)
Default: 1
-threads, --num-test-threads
Number of test threads
Default: 1
-t, --num-topic
Number of topics
Default: 1
-nmt, --numMessage-perTransaction
The number of messages sent by a transaction. (After --txn-enable
setting to true, -nmt takes effect)
Default: 50
-np, --partitions
Create partitioned topics with the given number of partitions, set 0 to
not try to create the topic
-e, --payload-delimiter
The delimiter used to split lines when using payload from a file
Default: \n
-f, --payload-file
Use payload from an UTF-8 encoded text file and a payload will be
randomly selected when publishing messages
-pn, --producer-name
Producer Name
-r, --rate
Publish rate msg/s across topics
Default: 100
--send-timeout
Set the sendTimeout value default 0 to keep compatibility with previous
version of pulsar-perf
Default: 0
--separator
Separator between the topic and topic number
Default: -
-u, --service-url
Pulsar Service URL
-s, --size
Message size (bytes)
Default: 1024
-i, --stats-interval-seconds
Statistics Interval Seconds. If 0, statistics will be disabled
Default: 0
-time, --test-duration
Test duration in secs. If <= 0, it will keep publishing
Default: 0
--tls-allow-insecure
Allow insecure TLS connection
--trust-cert-file
Path for the trusted TLS certificate file
Default: <empty string>
-txn, --txn-enable
Enable or disable the transaction
Default: false
-tto, --txn-timeout
Set the time value of transaction timeout, and the time unit is second.
(After --txn-enable setting to true, --txn-timeout takes effect)
Default: 10
--warmup-time
Warm-up time in seconds (Default: 1 sec)
Default: 1.0
-abort
Abort the transaction. (After --txn-enable setting to true, -abort takes
effect)
Default: false
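Test command
bin/pulsar-perf produce fff-test-topic-p -m 100000 -np 3
Average rate: 99.3 msg/s (0.776 Mbit/s)
Average latency: 2.971 ms; 95% of messages completed within 2.959 ms and 99% within 3.455 ms
When no rate is specified, pulsar-perf defaults to 100 msg/s. The following tests therefore fix the total message count at 10,000,000 and raise the send rate step by step, starting from 10,000 msg/s, to find the rate at which message latency increases markedly.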
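The rate sweep described above could be scripted along the following lines. The exact command used for each step is not given here, so the flag combination is an assumption assembled from the options listed earlier (`-r`, `-m`, `-np`); the rate steps mirror the r values reported in the results, and `produce_command` is an illustrative helper name.

```python
import shlex

TOPIC = "fff-test-topic-p"   # topic name used in the command above
NUM_MESSAGES = 10_000_000    # total messages per run
PARTITIONS = 3               # baseline partition count

# Target rates in msg/s, mirroring the r values in the results.
RATES = [10_000, 20_000, 30_000, 40_000, 60_000, 80_000, 100_000,
         120_000, 140_000, 160_000, 180_000, 200_000, 220_000]


def produce_command(rate: int) -> str:
    """Build one pulsar-perf produce command line for a target rate.

    Assumption: each sweep step only changes -r; all other flags stay fixed.
    """
    args = [
        "bin/pulsar-perf", "produce", TOPIC,
        "-r", str(rate),           # --rate
        "-m", str(NUM_MESSAGES),   # --num-messages
        "-np", str(PARTITIONS),    # --partitions
    ]
    return " ".join(shlex.quote(a) for a in args)


if __name__ == "__main__":
    for rate in RATES:
        print(produce_command(rate))
```

Printing the commands rather than executing them keeps the sketch runnable on a machine without Pulsar installed; each line can be pasted into a shell on a cluster node.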
Results
| r value | Measured rate (msg/s) | Measured average throughput (Mbit/s) | Measured average latency (ms) | 95% of messages below (ms) | 99% of messages below (ms) | 99.9% of messages below (ms) | 99.99% of messages below (ms) | Messages confirmed sent | Messages confirmed received by consumer |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 10000 | 9934.727 | 77.615 | 11.691 | 3.414 | 361.451 | 590.595 | 658.571 | 10000000 | 10000000 |
| 20000 | 19738.584 | 154.208 | 22.245 | 159.198 | 487.189 | 603.715 | 654.691 | 10000000 | 10000000 |
| 30000 | 29424.077 | 229.876 | 36.608 | 315.033 | 563.883 | 663.071 | 688.735 | 10000000 | 10000000 |
| 40000 | 38971.097 | 304.462 | 44.881 | 379.415 | 586.131 | 692.935 | 722.635 | 10000000 | 10000000 |
| 60000 | 57741.831 | 451.108 | 76.165 | 485.649 | 621.431 | 715.247 | 785.311 | 10000000 | 10000000 |
| 80000 | 76069.884 | 594.296 | 97.534 | 526.691 | 632.147 | 695.787 | 715.803 | 10000000 | 10000000 |
| 100000 | 93842.511 | 733.145 | 100.046 | 533.363 | 633.735 | 692.555 | 732.063 | 10000000 | 10000000 |
| 120000 | 111297.094 | 869.509 | 130.838 | 574.791 | 671.707 | 753.891 | 798.607 | 10000000 | 10000000 |
| 140000 | 128220.986 | 1001.726 | 236.493 | 657.551 | 760.715 | 859.659 | 899.635 | 10000000 | 10000000 |
| 160000 | 144684.179 | 1130.345 | 330.656 | 727.843 | 818.287 | 885.539 | 928.547 | 10000000 | 10000000 |
| 180000 | 161055.858 | 1258.249 | 402.923 | 798.135 | 976.223 | 1225.279 | 1307.447 | 10000000 | 10000000 |
| 200000 | 176205.701 | 1376.607 | 995.248 | 2381.615 | 2909.119 | 3254.015 | 3309.071 | 10000000 | 10000000 |
| 220000 | 189699.661 | 1482.029 | 2984.169 | 7031.071 | 7509.055 | 7764.671 | 7799.391 | 9877146 | 9905342 |
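The throughput column appears to follow directly from the measured message rate and the 1024-byte message size, with one Mbit taken as 1024 × 1024 bits. The sketch below checks that reading against three rows of the results; the function name `mbit_per_s` and the choice of rows are illustrative.

```python
def mbit_per_s(rate_msg_per_s: float, message_size_bytes: int = 1024) -> float:
    """Convert a message rate to Mbit/s, taking 1 Mbit as 1024 * 1024 bits."""
    return rate_msg_per_s * message_size_bytes * 8 / (1024 * 1024)


# (measured rate in msg/s, reported throughput) copied from the results.
ROWS = [
    (9934.727, 77.615),
    (93842.511, 733.145),
    (176205.701, 1376.607),
]

if __name__ == "__main__":
    for rate, reported in ROWS:
        computed = mbit_per_s(rate)
        print(f"{rate:>12.3f} msg/s -> {computed:9.3f} computed, {reported:9.3f} reported")
```

The same conversion also matches the first run without a rate flag: 99.3 msg/s at 1024 bytes comes out at roughly 0.776.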
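Discussion of results
Average latency rises steadily with the message rate. Beyond r = 180000, performance becomes unstable: latency climbs sharply, and messages were occasionally lost during the test.
The throughput limit of this minimal Pulsar deployment is therefore around a target rate of 180,000 msg/s (161055.858 msg/s measured). At that setting the average latency is 403 ms, 95% of messages complete within 798 ms, 99% within 976 ms, and 99.9% within 1225 ms.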
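One way to make the "last stable rate" judgement reproducible is to apply an explicit rule to the measured rows. The sketch below is only an illustration: the 1000 ms limit on 99th-percentile latency is an assumed threshold, not one stated in this report, and `Run` and `highest_stable_rate` are hypothetical names. The four rows are copied from the upper end of the results.

```python
from typing import List, NamedTuple, Optional


class Run(NamedTuple):
    target_rate: int   # the -r value
    p99_ms: float      # 99% of messages below this latency
    sent: int          # messages confirmed sent
    received: int      # messages confirmed received by the consumer


RUNS = [
    Run(160_000, 818.287, 10_000_000, 10_000_000),
    Run(180_000, 976.223, 10_000_000, 10_000_000),
    Run(200_000, 2909.119, 10_000_000, 10_000_000),
    Run(220_000, 7509.055, 9_877_146, 9_905_342),
]


def highest_stable_rate(runs: List[Run],
                        p99_limit_ms: float = 1000.0,   # assumed threshold
                        expected: int = 10_000_000) -> Optional[int]:
    """Highest target rate with no lost messages and p99 within the limit."""
    stable = [
        r.target_rate for r in runs
        if r.p99_ms <= p99_limit_ms and r.sent == expected and r.received == expected
    ]
    return max(stable, default=None)


if __name__ == "__main__":
    print(highest_stable_rate(RUNS))
```

With this threshold the rule selects r = 180000, in line with the conclusion above; a stricter or looser latency limit would move the cut-off.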
System CPU and memory usage under this configuration:
Pulsar host node running the producer:
Other Pulsar host nodes:
Consumer performance test
Configuration parameters
Usage: pulsar-perf consume [options] persistent://prop/ns/my-topic
Options:
--acks-delay-millis
Acknowledgements grouping delay in millis
Default: 100
--auth-params
Authentication parameters, whose format is determined by the
implementation of method `configure` in authentication plugin class, for
example "key1:val1,key2:val2" or "{"key1":"val1","key2":"val2"}.
--auth-plugin
Authentication plugin class name
-ac, --auto_ack_chunk_q_full
Auto ack for oldest message on queue is full
Default: false
--batch-index-ack
Enable or disable the batch index acknowledgment
Default: false
-bw, --busy-wait
Enable Busy-Wait on the Pulsar client
Default: false
-cf, --conf-file
Configuration file
-v, --encryption-key-value-file
The file which contains the private key to decrypt payload
-e, --expire_time_incomplete_chunked_messages
Expire time in ms for incomplete chunk messages
Default: 0
-h, --help
Help message
--histogram-file
HdrHistogram output file
--listener-name
Listener name for the broker.
-c, --max-connections
Max number of TCP connections to a single broker
Default: 100
-mc, --max_chunked_msg
Max pending chunk messages
Default: 0
-n, --num-consumers
Number of consumers (per subscription), only one consumer is allowed
when subscriptionType is Exclusive
Default: 1
-ioThreads, --num-io-threads
Set the number of threads to be used for handling connections to
brokers. The default value is 1.
Default: 1
-lt, --num-listener-threads
Set the number of threads to be used for message listeners
Default: 1
-m, --num-messages
Number of messages to consume in total. If <= 0, it will keep consuming
Default: 0
-ns, --num-subscriptions
Number of subscriptions (per topic)
Default: 1
-t, --num-topics
Number of topics
Default: 1
-nmt, --numMessage-perTransaction
The number of messages acknowledged by a transaction. (After
--txn-enable setting to true, -numMessage-perTransaction takes effect
Default: 50
-pm, --pool-messages
Use the pooled message
Default: true
-r, --rate
Simulate a slow message consumer (rate in msg/s)
Default: 0.0
-q, --receiver-queue-size
Size of the receiver queue
Default: 1000
-p, --receiver-queue-size-across-partitions
Max total size of the receiver queue across partitions
Default: 50000
--replicated
Whether the subscription status should be replicated
Default: false
-u, --service-url
Pulsar Service URL
-i, --stats-interval-seconds
Statistics Interval Seconds. If 0, statistics will be disabled
Default: 0
-sp, --subscription-position
Subscription position
Default: Latest
-st, --subscription-type
Subscription type
Default: Exclusive
Possible Values: [Exclusive, Shared, Failover, Key_Shared]
-ss, --subscriptions
A list of subscriptions to consume (for example, sub1,sub2)
Default: [sub]
-time, --test-duration
Test duration in secs. If <= 0, it will keep consuming
Default: 0
--tls-allow-insecure
Allow insecure TLS connection
--trust-cert-file
Path for the trusted TLS certificate file
Default: <empty string>
-txn, --txn-enable
Enable or disable the transaction
Default: false
-tto, --txn-timeout
Set the time value of transaction timeout, and the time unit is second.
(After --txn-enable setting to true, --txn-timeout takes effect)
Default: 10
-abort
Abort the transaction. (After --txn-enable setting to true, -abort takes
effect)
Default: false
-ntxn
The number of opened transactions, 0 means keeping open.(After
--txn-enable setting to true, -ntxn takes effect.)
Default: 0
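Test command
Run the following on every node of the cluster
bin/pulsar-perf produce fff-test-topic-p -r 200000 -np 3
to generate a heavy message load, then run the following on any one node
bin/pulsar-perf consume fff-test-topic-p
and observe the message consumption rate.
Results
The peak rate reached: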
455485.658 msg/s
3558.482 Mbit/s
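System CPU and memory usage on the consumer host at that point:
Discussion of results
In the baseline test, the measured peak consumption rate was
304052.826 msg/s
2375.413 Mbit/s
which is about 2.5 times the production rate.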
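Two different peak figures are quoted in this section, so it may help to check which one the 2.5x ratio corresponds to. The sketch below divides each by an assumed write rate of 176205.701 msg/s, the measured producer rate at r = 200000; treating that value as "the production rate" is an assumption, and the variable names are illustrative.

```python
WRITE_RATE = 176205.701            # msg/s, measured producer rate at r = 200000 (assumed reference)
READ_PEAK_RESULTS = 455485.658     # msg/s, peak quoted under the results
READ_PEAK_DISCUSSION = 304052.826  # msg/s, peak quoted in the discussion


def ratio(read_rate: float, write_rate: float) -> float:
    """How many times faster reads are than writes."""
    return read_rate / write_rate


if __name__ == "__main__":
    print(f"results figure:    {ratio(READ_PEAK_RESULTS, WRITE_RATE):.2f}x")
    print(f"discussion figure: {ratio(READ_PEAK_DISCUSSION, WRITE_RATE):.2f}x")
```

Under that assumption the 455485.658 msg/s figure gives a ratio of roughly 2.6 and the 304052.826 msg/s figure roughly 1.7, so the 2.5x statement seems to correspond to the former.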
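Baseline read/write performance summary
| | Throughput (msg/s) | Throughput (Mbit/s) |
| --- | --- | --- |
| Write performance | 176205.701 | 1376.607 |
| Read performance | 455485.658 | 3558.482 |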
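Read/write performance of the Pulsar test cluster under different parameters
Changing the message size
Starting from the baseline parameters, the size of a single message was set to 512, 1024, 2048, 4096, and 8192 bytes in turn, and messages were sent to the brokers for read and write tests.
Write performance
| Message size (bytes) | Peak throughput (msg/s) | Peak throughput (Mbit/s) |
| --- | --- | --- |
| 512 | 342866.808 | 1339.323 |
| 1024 (baseline) | 176205.701 | 1376.607 |
| 2048 | 85013.314 | 1328.333 |
| 4096 | 43610.857 | 1362.839 |
| 8192 | 18774.898 | 1173.431 |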
Read performance
| Message size (bytes) | Peak throughput (msg/s) | Peak throughput (Mbit/s) |
| --- | --- | --- |
| 512 | 625980.050 | 2445.242 |
| 1024 (baseline) | 455485.658 | 3558.482 |
| 2048 | 317949.245 | 4967.959 |
| 4096 | 209893.782 | 6559.181 |
| 8192 | 120568.939 | 7535.565 |
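Conclusion
The producer-side write throughput (Mbit/s) depends little on message size: from 512 to 4096 bytes the peak stays near 1350 Mbit/s. At 8192 bytes, however, throughput starts to drop (1173.431 Mbit/s).
The consumer-side read throughput (Mbit/s) rises as the message size increases.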
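Another way to read the two tables together is the ratio of read rate to write rate at each message size. The sketch below computes it from the msg/s columns above; the dictionary and function names are illustrative, and the ratio itself is a derived figure rather than something measured directly.

```python
# Peak rates in msg/s, keyed by message size in bytes, copied from the tables.
WRITE = {512: 342866.808, 1024: 176205.701, 2048: 85013.314,
         4096: 43610.857, 8192: 18774.898}
READ = {512: 625980.050, 1024: 455485.658, 2048: 317949.245,
        4096: 209893.782, 8192: 120568.939}


def read_write_ratio(size_bytes: int) -> float:
    """Peak read rate divided by peak write rate for one message size."""
    return READ[size_bytes] / WRITE[size_bytes]


if __name__ == "__main__":
    for size in sorted(WRITE):
        print(f"{size:>5} bytes: read/write = {read_write_ratio(size):.2f}")
```

The ratio grows with every doubling of the message size, which is consistent with write throughput staying roughly flat in Mbit/s while read throughput keeps rising.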
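Multiple producers writing concurrently
On top of the baseline test (1 producer), 3, 4, 5, 6, 7, and 9 producers were started in turn to test Apache Pulsar write performance. Total throughput and per-client throughput were as follows:
| Producers | Peak throughput (msg/s) | Peak throughput (Mbit/s) | Per-producer peak throughput (msg/s) | Per-producer peak throughput (Mbit/s) |
| --- | --- | --- | --- | --- |
| 1 (baseline) | 176205.701 | 1376.607 | 176205.701 | 1376.607 |
| 3 | 271180.516 | 2118.598 | 90393.505 | 706.199 |
| 4 | 271536.500 | 2121.379 | 67884.125 | 530.345 |
| 5 | 199854.577 | 1561.364 | 39970.915 | 312.273 |
| 6 | 266507.055 | 2082.086 | 44417.843 | 347.014 |
| 7 | 176788.802 | 1381.163 | 25255.543 | 197.309 |
| 9 | 184564.584 | 1441.911 | 20507.176 | 160.212 |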
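Conclusion
When the number of producers is a multiple of the number of brokers, total throughput improves somewhat, though it may be capped by the default per-topic throughput limit. Once the number of producers exceeds the number of CPU cores, performance drops instead.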
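The per-producer column and the overall gain over a single producer can be recomputed from the total msg/s figures. The sketch below does that; `per_producer` and `speedup` are illustrative names, and the speedup is a derived figure, not a separate measurement.

```python
# Total peak throughput in msg/s, keyed by number of producers (from the table).
TOTALS = {
    1: 176205.701,
    3: 271180.516,
    4: 271536.500,
    5: 199854.577,
    6: 266507.055,
    7: 176788.802,
    9: 184564.584,
}


def per_producer(n: int) -> float:
    """Average throughput of one producer when n producers run together."""
    return TOTALS[n] / n


def speedup(n: int) -> float:
    """Total throughput with n producers relative to the single-producer run."""
    return TOTALS[n] / TOTALS[1]


if __name__ == "__main__":
    for n in sorted(TOTALS):
        print(f"{n} producers: {per_producer(n):10.3f} msg/s each, {speedup(n):.2f}x total")
```

On these figures the total peaks at about 1.5 times the baseline with 3 or 4 producers, and falls back to roughly the single-producer level with 7.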
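Multiple consumers reading concurrently
On top of the baseline test (1 consumer), 3, 4, 5, 6, 7, and 9 consumers were started in turn to test Apache Pulsar read performance. Total throughput and per-client throughput were as follows:
| Consumers | Peak throughput (msg/s) | Peak throughput (Mbit/s) | Per-consumer peak throughput (msg/s) | Per-consumer peak throughput (Mbit/s) |
| --- | --- | --- | --- | --- |
| 1 (baseline) | 455485.658 | 3558.482 | 455485.658 | 3558.482 |
| 3 | 434838.834 | 3397.178 | 144946.278 | 1132.393 |
| 4 | 456351.338 | 3565.245 | 114087.835 | 891.311 |
| 5 | 468225.325 | 3658.010 | 93645.065 | 731.602 |
| 6 | 503207.172 | 3931.305 | 83867.862 | 655.218 |
| 7 | 479153.602 | 3743.387 | 68450.515 | 534.770 |
| 9 | 488351.764 | 3815.248 | 54261.307 | 423.916 |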
Conclusion
Total throughput is essentially unchanged, possibly because it is capped by the default per-topic throughput limit.
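To put a number on "essentially unchanged", the sketch below expresses each total as a percentage change from the single-consumer baseline. The name `relative_change_pct` is illustrative, and the percentages are derived from the table rather than measured separately.

```python
# Total peak throughput in msg/s, keyed by number of consumers (from the table).
TOTALS = {
    1: 455485.658,
    3: 434838.834,
    4: 456351.338,
    5: 468225.325,
    6: 503207.172,
    7: 479153.602,
    9: 488351.764,
}


def relative_change_pct(n: int) -> float:
    """Change in total throughput versus one consumer, in percent."""
    return (TOTALS[n] / TOTALS[1] - 1) * 100


if __name__ == "__main__":
    for n in sorted(TOTALS):
        print(f"{n} consumers: {relative_change_pct(n):+.1f}%")
```

All runs stay within about -5% to +11% of the baseline, which supports reading the differences as small compared with the change seen in the producer test.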
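Adjusting the number of topic partitions
The topic was configured with 1, 3, 6, 9, 15, and 25 partitions in turn, and read/write performance was tested as follows:
Write performance
| Partitions | Peak throughput (msg/s) | Peak throughput (Mbit/s) |
| --- | --- | --- |
| 1 | 136708.957 | 1068.039 |
| 3 (baseline) | 176205.701 | 1376.607 |
| 6 | 218749.630 | 1708.981 |
| 9 | 236448.887 | 1847.257 |
| 15 | 235339.390 | 1838.589 |
| 25 | 222276.105 | 1736.532 |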
Read performance
| Partitions | Peak throughput (msg/s) | Peak throughput (Mbit/s) |
| --- | --- | --- |
| 1 | 160793.628 | 1256.200 |
| 3 (baseline) | 455485.658 | 3558.482 |
| 6 | 544265.625 | 4252.074 |
| 9 | 543851.390 | 4248.841 |
| 15 | 577095.070 | 4508.555 |
| 25 | 566150.879 | 4423.054 |
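Conclusion
Throughput comes close to its maximum once the number of partitions reaches twice the number of brokers (6 partitions); increasing the partition count further has almost no effect.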
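The diminishing return from extra partitions can be seen by computing the percentage gain at each step of the sweep. The sketch below does this for both tables; `step_gains` is an illustrative helper, and the gains are derived from the msg/s columns above.

```python
from typing import List

PARTITIONS = [1, 3, 6, 9, 15, 25]
# Peak throughput in msg/s for each partition count, copied from the tables.
WRITE = [136708.957, 176205.701, 218749.630, 236448.887, 235339.390, 222276.105]
READ = [160793.628, 455485.658, 544265.625, 543851.390, 577095.070, 566150.879]


def step_gains(values: List[float]) -> List[float]:
    """Percentage change between each pair of consecutive measurements."""
    return [(b / a - 1) * 100 for a, b in zip(values, values[1:])]


if __name__ == "__main__":
    steps = list(zip(PARTITIONS, PARTITIONS[1:]))
    for (lo, hi), w, r in zip(steps, step_gains(WRITE), step_gains(READ)):
        print(f"{lo:>2} -> {hi:>2} partitions: write {w:+6.1f}%, read {r:+6.1f}%")
```

Both series gain strongly up to 6 partitions. Beyond that, write throughput adds a smaller gain at 9 and then declines slightly, while read throughput moves by only a few percent in either direction.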