Last_Error: Could not execute Write_rows event on table db1.t1; Duplicate entry '28646' for key 't1.PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log mysql-bin.000052, end_log_pos 437
二、MySQL 8.0 + Group Replication + 持续写入数据期间执行重建后,change master to && start group_replication 报错:
1 2 3 4 5
2020-08-21T14:51:09.977606+08:0061 [System] [MY-010597] [Repl] 'CHANGE MASTER TO FOR CHANNEL 'group_replication_applier' executed'. Previous state master_host='<NULL>', master_port= 0, master_log_file='', master_log_pos= 4, master_bind=''. New state master_host='<NULL>', master_port= 0, master_log_file='', master_log_pos= 4, master_bind=''. 2020-08-21T14:51:09.987494+08:0061 [ERROR] [MY-013124] [Repl] Slave SQL for channel 'group_replication_applier': Slave failed to initialize relay log info structure from the repository, Error_code: MY-013124 2020-08-21T14:51:09.987542+08:0061 [ERROR] [MY-011534] [Repl] Plugin group_replication reported: 'Error while starting the group replication applier thread' 2020-08-21T14:51:09.987651+08:007 [ERROR] [MY-011669] [Repl] Plugin group_replication reported: 'Unable to initialize the Group Replication applier module.' 2020-08-21T14:51:09.987831+08:007 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] The member is leaving a group without being on one.'
要解释这个问题,首先要弄清楚 xtrabackup 2.4 和 8.0 的区别。
xtrabackup 2.4 和 8.0 区别
google 查到 xtrabackup 8.0 与 2.4 版本行为有所不同:
Xtrabackup 2.4 备份后生成的 xtrabackup_binlog_info 文件记录的 GTID 信息是准确的,但是备份恢复后 show master status 显示的 GTID 是不准确的。
Xtrabackup 8.0 在备份只有 InnoDB 表的实例时,xtrabackup_binlog_info 文件记录的 GTID 信息不一定是准确的,但是备份恢复后 show master status 显示的 GTID 是准确的。
Xtrabackup 8.0 在备份有非 InnoDB 表格的实例时,xtrabackup_binlog_info 文件记录的 GTID 信息是准确的,备份恢复后 show master status 显示的 GTID 也是准确的。
The log_status table provides information that enables an online backup tool to copy the required log files without locking those resources for the duration of the copy process.
When the log_status table is queried, the server blocks logging and related administrative changes for just long enough to populate the table, then releases the resources. The log_status table informs the online backup which point it should copy up to in the source’s binary log and gtid_executed record, and the relay log for each replication channel. It also provides relevant information for individual storage engines, such as the last log sequence number (LSN) and the LSN of the last checkpoint taken for the InnoDB storage engine.
从上述手册描述可知,performance_schema.log_status是 MySQL 8.0 提供给在线备份工具获取复制信息的表格,查询该表时,mysql server 将阻止日志的记录和相关的更改来获取足够的时间以填充该表,然后释放资源。
$ mysqlbinlog -vv /data/mysql/mysql-bin.000096 /*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/; /*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/; DELIMITER /*!*/; # at 4 #20082711:26:47 server id 575010000 end_log_pos 124 CRC32 0xb026e372 Start: binlog v 4, server v 8.0.19-10 created 20082711:26:47 # Warning: this binlog is either in useor was not closed properly. BINLOG ' 9ydHXw/Q9EUieAAAAHwAAAABAAQAOC4wLjE5LTEwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAEwANAAgAAAAABAAEAAAAYAAEGggAAAAICAgCAAAACgoKKioAEjQA CgFy4yaw '/*!*/; # at 124 #20082711:26:47 server id 575010000 end_log_pos 195 CRC32 0xad060415 Previous-GTIDs # 6b437e80-e5d5-11ea-88e3-52549922fdbb:1-465503 SET @@SESSION.GTID_NEXT= 'AUTOMATIC' /* added by mysqlbinlog *//*!*/; DELIMITER ; # End of log file /*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/; /*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;
问题定位
坑一:MySQL 8.0 + Semi-Sync 重建问题
xenon 原有的重建逻辑是适配于 MySQL 5.6、5.7 的(重建过程中 xenon 进程存活),一直无问题:
根据报错信息 Slave failed to initialize relay log info structure from the repository看,应该是 xtrabackup 重建后的数据目录保留了 slave 复制信息导致的,尝试在启动组复制前执行 reset slave或reset slave all即可解决。
总结
Xtrabackup 2.4 备份后生成的 xtrabackup_binlog_info 文件记录的 GTID 信息是准确的,但是备份恢复后 show master status 显示的 GTID 是不准确的。
Xtrabackup 8.0 在备份只有 InnoDB 表的实例时,xtrabackup_binlog_info 文件记录的 GTID 信息不一定是准确的,但是备份恢复后 show master status 显示的 GTID 是准确的。
Xtrabackup 8.0 在备份有非 InnoDB 表格的实例时,xtrabackup_binlog_info 文件记录的 GTID 信息是准确的,备份恢复后 show master status 显示的 GTID 也是准确的。
使用 Xtrabackup 8.0 重建集群节点后,无需执行 reset master & set gtid_purged 操作。