通过调大work_mem减少io访问达到提高性能的目的,但内存不可能无限扩大,下面通过改写语句也可以达到提高查询的性能。
例如,下面的SQL,就是将not in改写为anti join的执行计划和耗时:
teledb=# explain select * from t1 left outer join t2 on t1.f2 = t2.f2 where t2.f2 is null;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------
Remote Subquery Scan on all (dn01,dn02,dn03,dn04,dn05,dn06,dn07,dn08,dn09,dn10)
(cost=6405.00..9260.75 rows=1 width=734)
-> Hash Anti Join (cost=6405.00..9260.75 rows=1 width=734)
Hash Cond: (t1.f2 = t2.f2)
-> Remote Subquery Scan on all
(dn01,dn02,dn03,dn04,dn05,dn06,dn07,dn08,dn09,dn10) (cost=100.00..682.00 rows=1000 width=367)
Distribute results by S: f2
-> Seq Scan on t1 (cost=0.00..210.00 rows=1000 width=367)
-> Hash (cost=21940.00..21940.00 rows=50000 width=367)
-> Remote Subquery Scan on all
(dn01,dn02,dn03,dn04,dn05,dn06,dn07,dn08,dn09,dn10) (cost=100.00..21940.00 rows=50000 width=367)
Distribute results by S: f2
-> Seq Scan on t2 (cost=0.00..3240.00 rows=50000 width=367)
(10 rows)
Time: 1.047 ms
teledb=# select * from t1 left outer join t2 on t1.f2 = t2.f2 where t2.f2 is null;
f1 | f2 | f3 | f4 | f5 | f6 | f7 | f8 | f9 | f10 | f11 | f12 | f1 | f2 | f3 | f4 | f5 | f6 | f7 | f8 | f9 | f10 | f11 | f12
----+----+----+----+----+----+----+----+----+-----+-----+-----+----+----+----+----+----+----+----+----+----+-----+-----+-----
(0 rows)
Time: 107.233 ms
也可以改写为not exists写法,改写后的SQL执行计划和耗时如下:
teledb=# explain select * from t1 where not exists( select 1 from t2 where t1.f2=t2.f2);
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
-------------------------
Remote Subquery Scan on all (dn01,dn02,dn03,dn04,dn05,dn06,dn07,dn08,dn09,dn10) (cost=3865.00..4078.75 rows=1 width=367)
-> Hash Anti Join (cost=3865.00..4078.75 rows=1 width=367)
Hash Cond: (t1.f2 = t2.f2)
-> Remote Subquery Scan on all (dn01,dn02,dn03,dn04,dn05,dn06,dn07,dn08,dn09,dn10) (cost=100.00..682.00 rows=1000 width=367)
Distribute results by S: f2
-> Seq Scan on t1 (cost=0.00..210.00 rows=1000 width=367)
-> Hash (cost=5240.00..5240.00 rows=50000 width=33)
-> Remote Subquery Scan on all (dn01,dn02,dn03,dn04,dn05,dn06,dn07,dn08,dn09,dn10) (cost=100.00..5240.00 rows=50000 width=33)
Distribute results by S: f2
-> Seq Scan on t2 (cost=0.00..3240.00 rows=50000 width=33)
(10 rows)
Time: 0.974 ms
teledb=# select * from t1 where not exists( select 1 from t2 where t1.f2=t2.f2);
f1 | f2 | f3 | f4 | f5 | f6 | f7 | f8 | f9 | f10 | f11 | f12
----+----+----+----+----+----+----+----+----+-----+-----+-----
(0 rows)
Time: 42.944 ms
可以看到改写为not exists后效果也很好。