work_mem参数为SQL中排序、hash散列操作时可用的内存,当排序、hash散列操作需要的内存超过work_mem时,会将超出部分写入临时文件,当触发临时文件写时,会产生I/O,大量的临时文件写会严重影响SQL性能。
work_mem默认为4MB,可以针对需要更多work_mem的SQL进行会话级/语句级设置,不建议全局设置过大,避免因内存消耗过多导致的OOM。
下面是用到work_mem参数调整的SQL优化案例:
-
测试数据准备:
teledb=# CREATE TABLE t1(f1 serial not null unique,f2 text,f3 text,f4 text,f5 text,f6 text,f7 text,f8 text,f9 text,f10 text,f11 text,f12 text) distribute by shard(f1); NOTICE: Replica identity is needed for shard table, please add to this table through "alter table" command. CREATE TABLE Time: 70.545 ms teledb=# CREATE TABLE t2(f1 serial not null unique,f2 text,f3 text,f4 text,f5 text,f6 text,f7 text,f8 text,f9 text,f10 text,f11 text,f12 text) distribute by shard(f1); NOTICE: Replica identity is needed for shard table, please add to this table through "alter table" command. CREATE TABLE Time: 61.913 ms teledb=# insert into t1 select t,md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text) from generate_series(1,1000) as t; INSERT 0 1000 Time: 48.866 ms teledb=# insert into t2 select t,md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text) from generate_series(1,50000) as t; INSERT 0 50000 Time: 792.858 ms teledb=# analyze t1; ANALYZE Time: 175.946 ms teledb=# analyze t2; ANALYZE Time: 318.802 ms teledb=#
-
优化前的SQL执行计划和耗时:
默认work_mem为4MB:
teledb=# explain select * from t1 where f2 not in (select f2 from t2); QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------- Remote Subquery Scan on all (dn01,dn02,dn03,dn04,dn05,dn06,dn07,dn08,dn09,dn10) (cost=0.00..2076712.50 rows=500 width=367) -> Seq Scan on t1 (cost=0.00..2076712.50 rows=500 width=367) Filter: (NOT (SubPlan 1)) SubPlan 1 -> Materialize (cost=0.00..4028.00 rows=50000 width=33) -> Remote Subquery Scan on all (dn01,dn02,dn03,dn04,dn05,dn06,dn07,dn08,dn09,dn10) (cost=0.00..3240.00 rows=50000 width=33) -> Seq Scan on t2 (cost=0.00..3240.00 rows=50000 width=33) (7 rows) Time: 0.916 ms teledb=# select * from t1 where f2 not in (select f2 from t2); f1 | f2 | f3 | f4 | f5 | f6 | f7 | f8 | f9 | f10 | f11 | f12 ----+----+----+----+----+----+----+----+----+-----+-----+----- (0 rows) Time: 4226.825 ms (00:04.227)
-
优化后的SQL执行计划和耗时:
设置work_mem为8MB:
teledb=# set work_mem to '8MB'; SET Time: 0.289 ms teledb=# explain select * from t1 where f2 not in (select f2 from t2); QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------- Remote Subquery Scan on all (dn01,dn02,dn03,dn04,dn05,dn06,dn07,dn08,dn09,dn10) (cost=3365.00..3577.50 rows=500 width=367) -> Seq Scan on t1 (cost=3365.00..3577.50 rows=500 width=367) Filter: (NOT (hashed SubPlan 1)) SubPlan 1 -> Remote Subquery Scan on all (dn01,dn02,dn03,dn04,dn05,dn06,dn07,dn08,dn09,dn10) (cost=0.00..3240.00 rows=50000 width=33) -> Seq Scan on t2 (cost=0.00..3240.00 rows=50000 width=33) (6 rows) Time: 0.890 ms teledb=# select * from t1 where f2 not in (select f2 from t2); f1 | f2 | f3 | f4 | f5 | f6 | f7 | f8 | f9 | f10 | f11 | f12 ----+----+----+----+----+----+----+----+----+-----+-----+----- (0 rows) Time: 105.249 ms
可以看到,调整大work_mem前发生了Materialize物化动作,即将结果集存储在了内存或磁盘上;而调大work_mem后,work_mem足够放下filter的数据,不需要再做 Materialize物化,filter 由原来的subplan变成了hash subplan,直接在内存 hash表中filter;
优化后SQL性能提高了40倍,效果明显。