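Troubleshooting an xtrabackup backup failure caused by the MySQL redo log being overwritten before it could be copied during a physical backup.
An xtrabackup backup failed in production with the following error: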
xtrabackup: error: log block numbers mismatch:
xtrabackup: error: expected log block no. 273665700, but got no. 277859996 from the log file.
xtrabackup: error: it looks like InnoDB log has wrapped around before xtrabackup could process all records due to either log copying being too slow,
or log files being too small.
xtrabackup: Error: xtrabackup_copy_logfile() failed.
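The xtrabackup output already states the cause of the failure: log block numbers mismatch. After XtraBackup had sequentially copied the data at the end of the redo log and went back to copy from the start of the redo log again, it found that the log block number at the start was not contiguous with the one it had just read at the tail.
"expected log block no. 273665700, but got no. 277859996 from the log file.": the redo block that should have been read next was no. 273665700, but only no. 277859996 could be read.
Why the redo read failed: "it looks like InnoDB log has wrapped around before xtrabackup could process all records due to either log copying being too slow, or log files being too small." Either xtrabackup was reading the redo log too slowly, or the redo log files were too small. In both cases the copy could not keep up with the speed at which the redo files were being cycled, so the required redo blocks had already been overwritten before they were read.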
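As a rough sanity check on the numbers in the error message, the gap between the expected and the actual block number shows how far the log had advanced past the point xtrabackup needed. The sketch below assumes 512-byte redo log blocks (an assumption about the InnoDB on-disk format, not something stated in the error output); the constant and function names are illustrative only.

```python
# Block numbers copied from the xtrabackup error message.
EXPECTED_BLOCK_NO = 273665700
ACTUAL_BLOCK_NO = 277859996

# Assumption: each InnoDB redo log block is 512 bytes.
LOG_BLOCK_SIZE = 512


def overwritten_gap(expected, actual, block_size=LOG_BLOCK_SIZE):
    """Return (gap in blocks, gap in bytes) between two log block numbers."""
    blocks = actual - expected
    return blocks, blocks * block_size


gap_blocks, gap_bytes = overwritten_gap(EXPECTED_BLOCK_NO, ACTUAL_BLOCK_NO)
print(f"gap: {gap_blocks} blocks, {gap_bytes} bytes "
      f"(~{gap_bytes / 1024 ** 3:.2f} GiB)")
```

With these numbers the gap is 4194296 blocks, roughly 2 GiB under the 512-byte assumption. If that is close to the instance's total redo log size, it would be consistent with the log having wrapped around once before the block could be copied.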
Solutions:
1. Check the redo log related system variables:
innodb_log_file_size
innodb_log_files_in_group
The configured total redo log size is innodb_log_file_size * innodb_log_files_in_group. If it is too small, increase it and restart the instance.
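2. Check the machine load and hardware resource usage (network, disk, CPU, memory) of the instance around the time of the backup. If there is a resource bottleneck, it needs to be addressed: for example, if other resource-intensive tasks are running, migrate them elsewhere as appropriate, or reschedule the backup to a relatively low-traffic period.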
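To judge whether the redo log from step 1 is "too small", it can help to estimate how long the log takes to wrap around at the write rate seen during the backup window. The following is a minimal sketch: the function names are hypothetical, and the sample values (two 48 MiB files, 2 MiB/s of redo writes) are made up for illustration rather than taken from the failing instance. The write rate would have to be measured separately, for example by sampling the Innodb_os_log_written status counter twice and dividing the difference by the interval.

```python
MIB = 1024 * 1024


def redo_log_capacity(innodb_log_file_size, innodb_log_files_in_group):
    """Total redo log size in bytes, as described in step 1."""
    return innodb_log_file_size * innodb_log_files_in_group


def seconds_until_wrap(capacity_bytes, redo_bytes_per_second):
    """Approximate time for the circular redo log to be overwritten once."""
    if redo_bytes_per_second <= 0:
        raise ValueError("redo write rate must be positive")
    return capacity_bytes / redo_bytes_per_second


# Hypothetical example values, not measurements from the incident.
capacity = redo_log_capacity(48 * MIB, 2)
wrap_seconds = seconds_until_wrap(capacity, 2 * MIB)
print(f"capacity: {capacity // MIB} MiB, wraps in about {wrap_seconds:.0f} s")
```

If the estimated wrap time is short compared with how long xtrabackup can fall behind while copying data files, either a larger redo log or a quieter backup window is worth considering.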