Development Guidelines: Distributed Converged Database HTAP (Private Cloud), Development Manual, Advanced Development

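Add a COMMENT to database objects, especially columns, so that later readers can understand the business meaning and maintain the schema. The session below compares the readability of the same table before and after comments are added (the comments '姓名' and '居住城市' mean "name" and "city of residence"); the commented version is self-explanatory.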

teledb=# \d+ t_oids;
                                              Table "public.t_oids"
 Column |              Type              | Collation | Nullable | Default | Storage  | Stats target | Description 
--------+--------------------------------+-----------+----------+---------+----------+--------------+-------------
 id     | integer                        |           | not null |         | plain    |              | 
 name   | character varying              |           |          |         | extended |              | 
 birth  | timestamp(0) without time zone |           |          |         | plain    |              | 
 city   | character varying              |           |          |         | extended |              | 
Indexes:
    "t_oids_pkey" PRIMARY KEY, btree (id)
Has OIDs: yes
Distribute By: SHARD(id)
Location Nodes: ALL DATANODES
teledb=# comment on column t_oids.name is '姓名';
COMMENT
teledb=# comment on column t_oids.city is '居住城市';
COMMENT
teledb=# \d+ t_oids;
                                              Table "public.t_oids"
 Column |              Type              | Collation | Nullable | Default | Storage  | Stats target | Description 
--------+--------------------------------+-----------+----------+---------+----------+--------------+-------------
 id     | integer                        |           | not null |         | plain    |              | 
 name   | character varying              |           |          |         | extended |              | 姓名
 birth  | timestamp(0) without time zone |           |          |         | plain    |              | 
 city   | character varying              |           |          |         | extended |              | 居住城市
Indexes:
    "t_oids_pkey" PRIMARY KEY, btree (id)
Has OIDs: yes
Distribute By: SHARD(id)
Location Nodes: ALL DATANODES
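
The same approach extends to the table itself, and comments can be read back with a query instead of `\d+`. The following is a minimal sketch that assumes TeleDB accepts the standard PostgreSQL `comment on table` statement and the `obj_description` / `col_description` functions; the comment text is illustrative only.

```sql
-- Assumption: standard PostgreSQL comment syntax and catalog functions are available.
comment on table t_oids is 'demo table for the comment guideline';

-- Read the table comment back.
select obj_description('t_oids'::regclass, 'pg_class');

-- Read the comment of the second column (name) back.
select col_description('t_oids'::regclass, 2);
```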

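Avoid `select *` unless it is really needed. Select only the columns you need, to reduce resource consumption, including but not limited to network bandwidth. In the plans below, the estimated row width drops from 76 to 4 when only `id` is selected.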

teledb=# explain select * from t_oids;
                           QUERY PLAN   
----------------------------------------------------------------
 Remote Fast Query Execution  (cost=0.00..0.00 rows=0 width=0)
   Node/s: dn01, dn02
   ->  Seq Scan on t_oids  (cost=0.00..16.30 rows=630 width=76)
(3 rows)

teledb=# explain select id from t_oids;
                          QUERY PLAN   
---------------------------------------------------------------
 Remote Fast Query Execution  (cost=0.00..0.00 rows=0 width=0)
   Node/s: dn01, dn02
   ->  Seq Scan on t_oids  (cost=0.00..16.30 rows=630 width=4)
(3 rows)

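When updating, add a `<>` check where possible so that rows that already hold the target value are skipped, for example: update table_a set column_b = c where column_b <> c;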

teledb=# update t_oids set city = '测试';
UPDATE 4
teledb=# select xmin,* from t_oids;
 xmin | id | name |        birth        | city 
------+----+------+---------------------+------
 1181 |  1 | 张三 | 2000-12-01 00:00:00 | 测试
 1147 |  3 | 王五 | 2004-09-01 00:00:00 | 测试
 1147 |  4 | 陈六 | 2022-01-01 00:00:00 | 测试
 1181 |  2 | 李四 | 1997-03-24 00:00:00 | 测试
(4 rows)

teledb=# update t_oids set city = '测试';
UPDATE 4
teledb=# select xmin,* from t_oids;
 xmin | id | name |        birth        | city 
------+----+------+---------------------+------
 1182 |  1 | 张三 | 2000-12-01 00:00:00 | 测试
 1182 |  2 | 李四 | 1997-03-24 00:00:00 | 测试
 1148 |  3 | 王五 | 2004-09-01 00:00:00 | 测试
 1148 |  4 | 陈六 | 2022-01-01 00:00:00 | 测试
(4 rows)

teledb=# update t_oids set city = '测试' where city != '测试';
UPDATE 0
teledb=# select xmin,* from t_oids;
 xmin | id | name |        birth        | city 
------+----+------+---------------------+------
 1182 |  1 | 张三 | 2000-12-01 00:00:00 | 测试
 1182 |  2 | 李四 | 1997-03-24 00:00:00 | 测试
 1148 |  3 | 王五 | 2004-09-01 00:00:00 | 测试
 1148 |  4 | 陈六 | 2022-01-01 00:00:00 | 测试

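The resulting data is the same in all three cases, but the conditional update touches no rows (UPDATE 0, and xmin is unchanged). It therefore creates no new row versions and leaves no dead tuples for vacuum to reclaim.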
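
One caveat with the `<>` check: when the column is NULL, `city <> '测试'` evaluates to NULL rather than true, so those rows are skipped. If rows with a NULL value should also be updated, a possible variant, assuming TeleDB supports the standard PostgreSQL `is distinct from` predicate, is:

```sql
-- Assumption: "is distinct from" behaves as in standard PostgreSQL.
-- Updates rows whose city differs from the target value, including rows where city is NULL,
-- and still skips rows that already hold the target value.
update t_oids set city = '测试' where city is distinct from '测试';
```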

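Break up transactions that contain many SQL statements, or avoid putting the statements in a single transaction at all. Keep each transaction as small as possible so that it locks as few resources as possible, which reduces lock waits and deadlocks.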

Session 1 updates every row without committing, which locks all rows:
```
teledb=# begin;
BEGIN
teledb=# update t_oids set city = 'city';
UPDATE 4
```
Session 2 waits:
```
teledb=# update t_oids set city = 'session2';
```
Session 3 waits:
```
teledb=# update t_oids set city = 'session3';
```
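If session 1 updates in batches instead, sessions 2 and 3 can complete part of their work earlier. This avoids long lock waits and a pile-up of sessions holding system resources. Use this approach when performing full-table updates.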
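
A batched version of session 1 might look like the sketch below. Splitting by ranges of `id` and the batch size of two rows are illustrative assumptions; in practice the ranges would be chosen to suit the table size.

```sql
-- Batch 1: only rows with id 1..2 are locked, and only until this commit.
begin;
update t_oids set city = 'city' where id between 1 and 2 and city <> 'city';
commit;

-- Batch 2: rows with id 3..4. Other sessions can update rows 1..2 in the meantime.
begin;
update t_oids set city = 'city' where id between 3 and 4 and city <> 'city';
commit;
```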

For bulk data loading, use copy rather than insert to improve write speed.
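
As a rough illustration, a CSV file could be loaded into `t_oids` as follows. The file paths are hypothetical placeholders, and the sketch assumes the standard PostgreSQL `copy` options and the psql `\copy` meta-command are available in TeleDB.

```sql
-- Server-side load: the path is hypothetical and must be readable by the database server process.
copy t_oids (id, name, birth, city) from '/path/to/t_oids.csv' with (format csv);

-- Client-side load from psql: the file is read on the client machine.
\copy t_oids (id, name, birth, city) from 't_oids.csv' with (format csv)
```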

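For reporting queries, or queries that generate base data, use a materialized view (MATERIALIZED VIEW) to persist a data snapshot periodically. This avoids repeatedly running the same query against multiple tables, especially tables with heavy read/write traffic. Materialized views also support REFRESH MATERIALIZED VIEW CONCURRENTLY, so the view can be refreshed while it is being queried.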

teledb=# select count(1) from teledb_pg1;
  count  
---------
 1000000
(1 row)

Time: 57.713 ms
teledb=# create materialized view count_view as select count(1) as num from teledb_pg1;
SELECT 1
Time: 107.264 ms
teledb=# select num from count_view;
   num   
---------
 1000000
(1 row)

Time: 0.548 ms

Querying the materialized view is roughly a hundred times faster (0.548 ms versus 57.713 ms).

When the underlying data changes, refresh the materialized view as follows.

teledb=# insert into teledb_pg1 select t,md5(random()::text) from generate_series(1,1000000) as t;
INSERT 0 1000000
Time: 2817.703 ms (00:02.818)
teledb=# select count(1) from teledb_pg1;
  count  
---------
 2000000
(1 row)

Time: 147.231 ms
teledb=# refresh materialized view count_view;
REFRESH MATERIALIZED VIEW
Time: 259.462 ms
teledb=# select num from count_view;
   num   
---------
 2000000
(1 row)

Time: 0.555 ms
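
The refresh above can also be run with the CONCURRENTLY option. In standard PostgreSQL this requires at least one unique index on the materialized view; the sketch below assumes TeleDB follows the same rule, and the index name `count_view_num_uidx` is made up for illustration.

```sql
-- Assumption: as in PostgreSQL, a unique index on the materialized view is required first.
create unique index count_view_num_uidx on count_view (num);

-- Refresh without blocking concurrent reads of count_view.
refresh materialized view concurrently count_view;
```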

For complex statistical queries, consider window functions.
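
For example, a window function can compute per-group statistics in one pass without a self-join or subquery. This sketch uses the `t_oids` table from earlier and assumes standard PostgreSQL window function syntax; the column aliases are illustrative.

```sql
-- For each row, number people within the same city by birth date
-- and attach the total head count of that city.
select id, name, city, birth,
       row_number() over (partition by city order by birth) as rn_in_city,
       count(*)     over (partition by city)                as city_total
from t_oids;
```
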
When joining two tables, join on the distribution key where possible.
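
A join of this kind might look like the sketch below. `t_oids_detail` and its `remark` column are hypothetical; the table is assumed to be distributed by SHARD(id), the same as `t_oids`. Running it under `explain` is one way to check how the join is executed.

```sql
-- t_oids_detail is a hypothetical table, assumed to be distributed by SHARD(id) like t_oids.
-- Both sides are joined on their distribution key (id).
explain
select a.id, a.name, b.remark
from t_oids a
join t_oids_detail b on a.id = b.id;
```
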
On the distribution key, use a unique index instead of a primary key:
```
teledb=# create unique index t_oid_id_uidx on t_oids using btree(id);
CREATE INDEX
```
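A unique index costs much less to maintain later than a primary key.

If a unique index cannot be created on the distribution key, create an ordinary index to improve query efficiency: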
```
teledb=# create index t_oids_name_idx on t_oids using btree(name);
CREATE INDEX
```
Only with such indexes in place is a join between two tables efficient when it returns a small number of rows.

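Do not create foreign keys on columns.

Note
TeleDB does not currently support foreign key constraints across multiple DNs (data nodes).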
