1. Transaction Histories
1.1 并发(Concurrency)
- 交叉并发方式
- 单处理机系统;
- 并行事务实际上是轮流交叉串行执行的;
- 同时并发方式
- 多处理机系统;
- 每个处理机可以运行一个事务,实现多事务并行;
1.2 并发操作会出现的DB问题
- Lost Update(丢失修改):两个事务T1、T2读统一数据并修改;
- Non-Repeatable Read(不可重复读):T1读完后,T2更新,使得T1无法再现前次读取的结果;
- Dirty Read(读脏数据):脏读就是指当一个事务正在访问数据,并且对数据进行了修改,而这种修改还没有提交到数据库中,这时,另外一个事务也访问这个数据,然后使用了这个数据。
1.3 并发控制技术
- 事务(Transaction)
- 封锁(Locking)
1.4 封锁
操作数据前先加锁,在T释放锁前其他事务不能对其进行更新
1.4.1 锁类型
- 排他锁(写锁,X锁):只准T事务读或修改数据,指导T释放此锁;
- 共享锁(读锁,S锁):T只能读Data,不能修改Data;其他事务只能对Data再加S锁,不能加X锁。
1.4.2 事务锁之间的相容矩阵
1.5 Locking Protocol
约定何时加S/X锁,以及持续/施放时间等。
1.5.1 一级封锁协议
内容:修改Data前先加X锁,事务结束后才释放锁,R操作不加锁。
特点:防止“Lost Updata”;
1.5.2 二级封锁协议
内容:修改Data前加X锁,读Data前加S锁,读完后立即释放锁。
特点:防止“Lost Updata”,保证“Repeat Read”一致;
1.5.3 三级封锁协议
内容:修改Data前加X锁,读Data前加S锁,事务结束后再释放S锁。
特点:可避免前述三种并发控制问题。
2. Notations Transaction Histories
2.1 Notations of Atomic Read and Write in DB
- Ri(A) : transaction T i reads data A. Ri(A) can be mapped as SQL:
select val into :pgmval1 from Table1 where uniqueid = A ;
- Wj(B) : transaction T j writes data B. Wj(B) can be mapped as SQL:
update Table1 set val = :pgmval1 where uniqueid = B ;
- Expand Form : Ri(A, val1) Wi(B, val2)
2.2 举例
Question: Two accounts A and B have balance 50 to begin with. T1 is adding up of two accounts, T2 is transferring 30 units from A to B.
Wrong Answer: R2(A, 50) W2(A, 20) R1(A, 20) R1(B, 50) R2(B, 50) W2(B, 80) C1 C2
Reason: T1: balance total 70 (A+B=20+50), Wrong ! T2: A=20; B=80 Right !
Right Answer:
- R2(A, 50) W2(A, 20) R2(B, 50) W2(B, 80) C2 R1(A, 20) R1(B, 80) C1
- R1(A, 50) R1(B, 50) C1 R2(A, 50) W2(A, 20) R2(B, 50) W2(B, 80) C2
Reason: T1: balance total 100, T2: A=20; B=80
3. Conflicting Operations
3.1 Define
- Two operations Xi(A) and Yj(B) in a history are said to be conflict
if and only if the three conditions hold:
- A = B. 相同对象上的操作才会有冲突。
- i ≠ j. 不同事务发出的两个操作才可能冲突。
- One of the two operations X or Y is a write, (Other can be R or W.)
3.2 Precedence Graph
A precedence graph for a history H is a directed graph denoted by PG(H). (事务调度顺序图)
- Vertices of PG(H)
correspond to the transactions Ti that have COMMITTED in H, transaction Ti which exists as an operation in H.
- Edge Ti -> Tj exists in PG(H)
when two conflicting operations Xi and Yj occur in that order in H. Ti -> Tj should be interpreted to mean that Ti must precede Tj in any equivalent serial history S (H).
Note: When a pair of operations conflict in H for COMMITTED transactions, we can draw the corresponding direct arc( 回路) in the PG(H).
3.3 死锁的解决方法
3.3.1 死锁预防
- 一次封锁法:每个事务必须一次将所有要使用的数据加锁。
缺陷:降低了并发度。
- 顺序封锁法:预定一个加锁顺序规则,所有的事务都按此顺序加锁。
缺点:维护资源封锁顺序成本高。
优点:数据并发度高。
3.3.2 死锁等待与解除
- 超时法: 测加锁时间 ≥ 规定时限,认为出现死锁。
特点:处理简单;但误判较多,时限不易确定。
- 等待图法:有向图=G(T,V),T结点集表示正在运行的事务,V边集表示事务等待情况。
特点:若图中出现回路,则系统中出现了死锁。
4. Two Phase Locking
4.1 Define
Locks taken in released following three rules.
- Before Ti read a data Ri(A), scheduler attempts to Read Lock the Data A on it’s behalf, RLi(A). Also before Wi(A), try Write Lock, WLi(A).(写尝试前加写锁,读前尝试加读锁)
- If conflicting lock on Data exists, requesting Ti must WAIT.(如果加锁冲突就等待)
- There are two phases to locking, the growing and the shrinking phase.(封锁有两个阶段,加锁和解锁)
When locks are released : RUi(A) or WUi(A)
Slock A Slock B Slock C Unlock A Unlock B Unlock C
| <---- Grow phase — >| <------ Shrink phase ------- >|systems is known as Two-Phase Locking, or 2PL
二段锁协议:所有事务须分两个阶段对 Data 加锁、解锁
- 获得锁(gain) 阶段:事务可获得任何类型数据的锁,且不能释放。
- 释放锁(shrink) 阶段:事务可释放任何类型数据的锁,但不能再申请。
4.2 Note
- Rule (3) implies can release locks before Commit.(规定三意味着解锁一定在提交之前)
- A transaction never conflict with its own locks !(一个事务不会和自身的锁相冲突)
- Ti with a WL doesn’t need a RL (WL more powerful than RL).(拥有写锁的事务不需要额外的读锁,因为写锁比读锁的约束力更强)
- Locking is defined to guarantee that a circuit in the Precedence Graph can never occur.(封锁被定义去保证PG图(事务调度顺序图)中不可能出现回路)
- 并发执行+两段锁协议 两段锁协议 ⇔任何并发调度均为可串行化的 任何并发调度均为可串行化的
- 遵守两段锁协议的事务其并发调度为可串行化,但也可能发生死锁,原因是未一次将所用数据全部封锁。
4.3 Deadlock
- Ti lock an item A, forces any other Tx want to A “come later”.(Ti封锁项目A,促使其他事务晚点再试)
- WAIT rules of locking means NEITHER Tx CAN EVER GO FORWARD AGAIN. This is a Side effect (副作用) of 2PL ----Dead Lock.(等待规则意味着Tx事务无法继续向前进行。死锁是二段锁协议的副作用)
- When a deadlock occurs, scheduler will recognize it and force one of the Txs involved to Abort.(当死锁发生时,调度器将认出它并停止其中一个相关进程)
5. Level of Isolation
(1)Read Uncommitted (also called as the “dirty reads” isolation level)
(2)Read Committed
(3)Repeatable Read
(4)Serializable
- Read Uncommitted
- no Read Lock, 相当于 “dirty read”。
- Read Committed
- take W locks and hold to EOT.
take R locks and release immediately after read.
- Repeatable Read
- In Repeatable Read Isolation, it provides long-term locking for all data, and
short-term locks only for something called “Read Predicate Locks”.
- Serializable
- All data read and written have RLs and WLs, and held long-term until EOT
6. 粒度
6.1 概述
- 封锁粒度
- 物理单元:页(数据页、索引页)、块等。
- 逻辑单元:属性值/组、元组、关系、索引项/整个索引、库
- “ 封锁粒度、并发度、并发控制开销 ” 的关系:细、高、大
6.2 多粒度封锁
一个系统中支持多种封锁粒度,形成一棵多粒度封锁树。
eg: 三级粒度封锁树:
6.3 多粒度封锁协议
允许树中每个结点独立加锁;其所有后裔结点可被加同样类型的锁。
一个Data object 两种封锁方式:
- 显式封锁
- 直接加到数据对象上的封锁
- 隐式封锁
- 该数据对象没有独立加锁,是由于其上级结点加锁而使该数据对象加上了锁。