Expanding Volumes
Note:
When expanding distributed replicated and distributed dispersed volumes, the number of bricks being added must be a multiple of the replica or disperse count. For example, to expand a distributed replicated volume with a replica count of 2, the bricks must be added in multiples of 2 (such as 4, 6, 8, and so on); a minimal example follows.
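For instance, assuming a hypothetical replica-2 volume named test-volume and two new bricks on hosts server5 and server6 (both names are made up for illustration), the two bricks are supplied together in a single command:
# gluster volume add-brick test-volume server5:/exp5 server6:/exp6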
1. If the expansion uses a new host that is not yet part of the trusted storage pool, probe it first
# gluster peer probe server4
Probe successful
2. Add the brick
# gluster volume add-brick test-volume server4:/exp4
Add Brick successful
3. Check the volume information with the following command
# gluster volume info test-volume
Volume Name: test-volume
Type: Distribute
Status: Started
Number of Bricks: 4
Bricks:
Brick1: server1:/exp1
Brick2: server2:/exp2
Brick3: server3:/exp3
Brick4: server4:/exp4
4. Rebalance the volume to ensure the files are distributed to the new brick, as sketched below; for details on the rebalance operation, see the "Rebalancing Volumes" section further down.
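Assuming the same test-volume as above, starting a data rebalance looks like this (a sketch; see the rebalance section below for the fix-layout variant and status checking):
# gluster volume rebalance test-volume start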
Replacing a Brick
Replacing a brick in a replicated / distributed replicated volume
The following are the steps to replace brick Server1:/home/gfs/r2_0 with Server1:/home/gfs/r2_5 in volume r2, which has a replica count of 2:
Initial volume information:
Volume Name: r2
Type: Distributed-Replicate
Volume ID: 24a0437a-daa0-4044-8acf-7aa82efd76fd
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: Server1:/home/gfs/r2_0
Brick2: Server2:/home/gfs/r2_1
Brick3: Server1:/home/gfs/r2_2
Brick4: Server2:/home/gfs/r2_3
1. Make sure there is no data on Server1:/home/gfs/r2_5 (a quick check is sketched below)
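One way to verify this (an illustrative check, not part of the original procedure): confirm the new brick directory is empty and carries no leftover trusted.* xattrs:
# ls -A /home/gfs/r2_5
# getfattr -d -m. -e hex /home/gfs/r2_5
Neither command should report any files or trusted.glusterfs.* attributes on a clean, unused brick directory.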
2. If the brick to be replaced is running, find the brick's pid and kill it
# gluster volume status
Status of volume: r2
Gluster process Port Online Pid
------------------------------------------------------------------------------
Brick Server1:/home/gfs/r2_0 49152 Y 5342
Brick Server2:/home/gfs/r2_1 49153 Y 5354
Brick Server1:/home/gfs/r2_2 49154 Y 5365
Brick Server2:/home/gfs/r2_3 49155 Y 5376
# kill -15 5342
# gluster volume status
Status of volume: r2
Gluster process Port Online Pid
------------------------------------------------------------------------------
Brick Server1:/home/gfs/r2_0 N/A N 5342 <<---- brick is not running, others are running fine.
Brick Server2:/home/gfs/r2_1 49153 Y 5354
Brick Server1:/home/gfs/r2_2 49154 Y 5365
Brick Server2:/home/gfs/r2_3 49155 Y 5376
3. Mount the gluster volume (in this example: /mnt/r2) and set up metadata so that the data will be synced to the new brick (in this case, from Server2:/home/gfs/r2_1 to Server1:/home/gfs/r2_5). On the mount point, create a directory that does not already exist and then delete it; set a metadata attribute with setfattr and then remove it; then check whether the surviving replica of the brick being replaced has pending xattrs. These operations mark the entry and metadata changelogs, telling the self-heal daemon to heal from /home/gfs/r2_1 to /home/gfs/r2_5.
mkdir /mnt/r2/<name-of-nonexistent-dir>             <<---- create a directory that did not exist before...
rmdir /mnt/r2/<name-of-nonexistent-dir>             <<---- ...and remove it, marking an entry changelog
setfattr -n trusted.non-existent-key -v abc /mnt/r2 <<---- set a dummy metadata xattr...
setfattr -x trusted.non-existent-key /mnt/r2        <<---- ...and remove it, marking a metadata changelog
getfattr -d -m. -e hex /home/gfs/r2_1               <<---- check pending xattrs on the surviving replica
# file: home/gfs/r2_1
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.r2-client-0=0x000000000000000300000002 <<---- xattrs are marked from source brick Server2:/home/gfs/r2_1
trusted.afr.r2-client-1=0x000000000000000000000000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x0000000100000000000000007ffffffe
trusted.glusterfs.volume-id=0xde822e25ebd049ea83bfaa3c4be2b440
4. Heal info for the volume will show that "/" needs healing. (Depending on the workload there may be more entries, but "/" must be present.)
# gluster volume heal r2 info
Brick Server1:/home/gfs/r2_0
Status: Transport endpoint is not connected
Brick Server2:/home/gfs/r2_1
/
Number of entries: 1
Brick Server1:/home/gfs/r2_2
Number of entries: 0
Brick Server2:/home/gfs/r2_3
Number of entries: 0
5. Replace the brick using the "commit force" option. Once self-heal completes, the pending changelogs are cleared and heal info shows 0 entries.
# gluster volume replace-brick r2 Server1:/home/gfs/r2_0 Server1:/home/gfs/r2_5 commit force
volume replace-brick: success: replace-brick commit successful
# gluster volume status
Status of volume: r2
Gluster process Port Online Pid
------------------------------------------------------------------------------
Brick Server1:/home/gfs/r2_5 49156 Y 5731 <<<---- new brick is online
Brick Server2:/home/gfs/r2_1 49153 Y 5354
Brick Server1:/home/gfs/r2_2 49154 Y 5365
Brick Server2:/home/gfs/r2_3 49155 Y 5376
# getfattr -d -m. -e hex /home/gfs/r2_1
getfattr: Removing leading '/' from absolute path names
# file: home/gfs/r2_1
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.r2-client-0=0x000000000000000000000000 <<---- Pending changelogs are cleared.
trusted.afr.r2-client-1=0x000000000000000000000000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x0000000100000000000000007ffffffe
trusted.glusterfs.volume-id=0xde822e25ebd049ea83bfaa3c4be2b440
# gluster volume heal r2 info
Brick Server1:/home/gfs/r2_5
Number of entries: 0
Brick Server2:/home/gfs/r2_1
Number of entries: 0
Brick Server1:/home/gfs/r2_2
Number of entries: 0
Brick Server2:/home/gfs/r2_3
Number of entries: 0
Rebalancing Volumes (rebalance)
The layout is fixed for a given directory. Even after new bricks are added to the volume, newly created files in existing directories will still be distributed only among the original bricks. The command gluster volume rebalance <volname> fix-layout start fixes the layout information so that files can also be created on the newly added bricks. When this command is issued, all cached file stat information is revalidated.
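Continuing the test-volume example from the expansion steps (the volume name is the only assumption here), a fix-layout run is started like this:
# gluster volume rebalance test-volume fix-layout start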
As of GlusterFS 3.6, the assignment of files to bricks takes brick sizes into account. For example, a 20TB brick will be assigned twice as many files as a 10TB brick. In versions before 3.6, the two bricks were treated as equal regardless of size and were assigned an equal share of files.
A fix-layout rebalance only fixes the layout changes and does not migrate data. If you want to migrate the existing data, use the gluster volume rebalance <volume> start command to rebalance data across the servers.
Check the status of the rebalance operation with the following command:
# gluster volume rebalance test-volume status
Node Rebalanced-files size scanned status
--------- ---------------- ---- ------- -----------
617c923e-6450-4065-8e33-865e28d9428f 416 1463 312 in progress
Triggering Self-Heal
The self-heal daemon runs in the background; it diagnoses problems and automatically starts self-healing every 10 minutes on the files that need healing.
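To confirm that the daemon is running on a node, one simple check (an illustration, not from the original text) is to look for its process; the daemon process is named glustershd:
# ps aux | grep glustershd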
1. Trigger self-heal only on the files that need healing:
# gluster volume heal test-volume
Heal operation on volume test-volume has been successful
2. Trigger self-heal on all the files of the volume:
# gluster volume heal test-volume full
Heal operation on volume test-volume has been successful
3. View the list of files that need healing:
# gluster volume heal test-volume info
Brick server1:/gfs/test-volume_0
Number of entries: 0
Brick server2:/gfs/test-volume_1
Number of entries: 101
/95.txt
/32.txt
/66.txt
/35.txt
/18.txt
/26.txt
/47.txt
/55.txt
/85.txt
...
4. View the list of files that have been self-healed:
# gluster volume heal test-volume info healed
Brick Server1:/gfs/test-volume_0
Number of entries: 0
Brick Server2:/gfs/test-volume_1
Number of entries: 69
/99.txt
/93.txt
/76.txt
/11.txt
/27.txt
/64.txt
/80.txt
/19.txt
/41.txt
/29.txt
/37.txt
/46.txt
...
5. View the list of files for which self-heal failed:
# gluster volume heal test-volume info failed
Brick Server1:/gfs/test-volume_0
Number of entries: 0
Brick Server2:/gfs/test-volume_3
Number of entries: 72
/90.txt
/95.txt
/77.txt
/71.txt
/87.txt
/24.txt
...
6. View the list of files that are in split-brain state:
# gluster volume heal test-volume info split-brain
Brick Server1:/gfs/test-volume_2
Number of entries: 12
/83.txt
/28.txt
/69.txt
...
Brick Server2:/gfs/test-volume_3
Number of entries: 12
/83.txt
/28.txt
/69.txt
...