-
测试数据准备:
teledb=# CREATE TABLE t1(f1 serial not null unique,f2 text,f3 text,f4 text,f5 text,f6 text,f7 text,f8 text,f9 text,f10 text,f11 text,f12 text) distribute by shard(f1); CREATE TABLE teledb=# CREATE TABLE t2(f1 serial not null unique,f2 text,f3 text,f4 text,f5 text,f6 text,f7 text,f8 text,f9 text,f10 text,f11 text,f12 text) distribute by shard(f1); CREATE TABLE teledb=# insert into t1 select t,md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text) from generate_series(1,1000000) as t; INSERT 0 1000000 teledb=# insert into t2 select t,md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text) from generate_series(1,1000000) as t; INSERT 0 1000000 teledb=# analyze t1; ANALYZE teledb=# analyze t2; ANALYZE teledb=# teledb=# \timing Timing is on.
-
优化前SQL执行计划和耗时:
此处是将prefer_olap参数设置为off的场景,该参数默认为on。
teledb=# explain select t1.* from t1,t2 where t1.f1=t2.f1 limit 10; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=0.25..1.65 rows=10 width=367) -> Merge Join (cost=0.25..140446.26 rows=1000000 width=367) Merge Cond: (t1.f1 = t2.f1) -> Remote Subquery Scan on all (dn01,dn02,dn03,dn04,dn05,dn06,dn07,dn08,dn09,dn10) (cost=100.12..434823.13 rows=1000000 width=367) -> Index Scan using t1_f1_key on t1 (cost=0.12..62723.13 rows=1000000 width=367) -> Remote Subquery Scan on all (dn01,dn02,dn03,dn04,dn05,dn06,dn07,dn08,dn09,dn10) (cost=100.12..71823.13 rows=1000000 width=4) -> Index Only Scan using t2_f1_key on t2 (cost=0.12..62723.13 rows=1000000 width=4) (7 rows) Time: 1.372 ms teledb=# explain analyze select t1.* from t1,t2 where t1.f1=t2.f1 limit 10; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=0.25..1.65 rows=10 width=367) (actual time=2675.437..2948.199 rows=10 loops=1) -> Merge Join (cost=0.25..140446.26 rows=1000000 width=367) (actual time=2675.431..2675.508 rows=10 loops=1) Merge Cond: (t1.f1 = t2.f1) -> Remote Subquery Scan on all (dn01,dn02,dn03,dn04,dn05,dn06,dn07,dn08,dn09,dn10) (cost=100.12..434823.13 rows=1000000 width=367) (actual time=1.661..1.704 rows=10 loops=1) -> Remote Subquery Scan on all (dn01,dn02,dn03,dn04,dn05,dn06,dn07,dn08,dn09,dn10) (cost=100.12..71823.13 rows=1000000 width=4) (actual time=2673.761..2673.783 rows=10 loops=1) Planning time: 0.358 ms Execution time: 2973.948 ms (7 rows) Time: 2976.008 ms (00:02.976) teledb=#
可以看到,join+limit写法,在prefer_olap=off的场景下,会拉大量的数据到CN节点进行计算、排序和limit过滤,消耗了大量的网络开销。
-
优化后执行计划和耗时:
设置参数prefer_olap=on(默认值),将join下推到DN节点执行:
teledb=# set prefer_olap to on; SET Time: 0.291 ms teledb=# explain select t1.* from t1,t2 where t1.f1=t2.f1 limit 10; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------- Limit (cost=100.25..101.70 rows=10 width=367) -> Remote Subquery Scan on all (dn01,dn02,dn03,dn04,dn05,dn06,dn07,dn08,dn09,dn10) (cost=100.25..101.70 rows=10 width=367) -> Limit (cost=0.25..1.65 rows=10 width=367) -> Merge Join (cost=0.25..140446.26 rows=1000000 width=367) Merge Cond: (t1.f1 = t2.f1) -> Index Scan using t1_f1_key on t1 (cost=0.12..62723.13 rows=1000000 width=367) -> Index Only Scan using t2_f1_key on t2 (cost=0.12..62723.13 rows=1000000 width=4) (7 rows) Time: 1.061 ms teledb=# explain analyze select t1.* from t1,t2 where t1.f1=t2.f1 limit 10; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=100.25..101.70 rows=10 width=367) (actual time=1.527..3.899 rows=10 loops=1) -> Remote Subquery Scan on all (dn01,dn02,dn03,dn04,dn05,dn06,dn07,dn08,dn09,dn10) (cost=100.25..101.70 rows=10 width=367) (actual time=1.525..1.529 rows=10 loops=1) Planning time: 0.360 ms Execution time: 18.193 ms (4 rows) Time: 19.921 ms
可以看到,prefer_olap设置为on后,计算、排序和limit过滤在DN上完成,返回CN的数量是每个DN节点limit之后的数据,解决了上面SQL优化前大量数据上拉CN节点的问题,性能提升了150倍。
这里也说明,prefer_olap参数使用也和SQL特点有关,如果将prefer_olap设置off,SQL需要拉大量数据到CN计算的话,性能会明显下降,我们在设置prefer_olap参数前一定要做好评估。