MySQL的主主互备结合KEEPALIVED实现高可用

xiaoxiao2021-02-27 136

MySQL的主主互备结合KEEPALIVED实现高可用试验环境：主： localhost102 192.168.0.102（CentOS6.6）从属：localhost105 192.168.0.102（CentOS6.6）

VIP： 192.168.1.208

Keepalived: keepalived-1.2.20

该过程的第一部分就是master记录二进制日志。在每个事务更新数据完成之前，master在二日志记录这些改变。MySQL将事务写入二进制日志。在事件写入二进制日志完成后，master通知存储引擎提交事务。下一步就是slave将master的binary log拷贝到它自己的中继日志。首先，slave开始一个工作线程——I/O线程。I/O线程在master上打开一个普通的连接，然后开始binlog dump process。Binlog dump process从master的二进制日志中读取事件，如果已经同步了master，它会睡眠并等待master产生新的事件。I/O线程将这些事件写入中继日志。 SQL slave thread（SQL从线程）处理该过程的最后一步。SQL线程从中继日志读取事件，并重放其中的事件而更新slave的数据，使其与master中的数据一致。只要该线程与I/O线程保持一致，中继日志通常会位于OS的缓存中，所以中继日志的开销很小。

主主同步就是两台机器互为主的关系，在任何一台机器上写入都会同步。

若mysql主机开启了防火墙，需要关闭防火墙或创建规则。

1、修改MySQL配置文件

两台MySQL均要开启binlog日志功能，开启方法：在MySQL配置文件[MySQLd]段中加上log-bin=MySQL-bin选项，两台MySQL的server-ID不能一样

MySQL的主主互备模式配置，只有一个主（写），另外一个备是用来备用的主（读）。然后也可以加多个slave机器（读）。情况1：MASTER和salve都是新库的情况，如果MASTER不是新库，有数据的话，就使用MySQLdump或者其他备份工具flush tables with read lock;备份完成，然后恢复salve库之后。再搭建主主复制 1.1 master的my.CNF文件 vi /etc/my.cnf [mysqld] log-bin = mysql-bin server-id=1111 修改之后保存，并重启服务 1.2 sal replive的my.CNF文件 vi /etc/my.cnf [mysqld] log-bin = mysql-bin server-id=2222 修改之后保存，并重启服务保证slave和master的server的ID不一样 1.3确保主主的二台机器开启了binlog日志功能 [root@localhost102 softwares]# mysql> show variables like "%log_bin%"; +---------------------------------+----------------------------------+ | Variable_name | Value | +---------------------------------+----------------------------------+ | log_bin | ON | | log_bin_basename | /usr/local/mysql/mysql_bin | | log_bin_index | /usr/local/mysql/mysql_bin.index | | log_bin_trust_function_creators | OFF | | log_bin_use_v1_row_events | OFF | | sql_log_bin | ON | +---------------------------------+----------------------------------+ 6 rows in set (0.01 sec) [mysql@localhost105 ~] mysql> show variables like "%log_bin%"; +---------------------------------+----------------------------------+ | Variable_name | Value | +---------------------------------+----------------------------------+ | log_bin | ON | | log_bin_basename | /usr/local/mysql/mysql_bin | | log_bin_index | /usr/local/mysql/mysql_bin.index | | log_bin_trust_function_creators | OFF | | log_bin_use_v1_row_events | OFF | | sql_log_bin | ON | +---------------------------------+----------------------------------+ 6 rows in set (0.00 sec) 1.4创建用户用于传输副本，并且配置192.168.0.102和192.168.0.105互为主。实现主主的功能。 1.4.1将192.168.0.102设为192.168.0.105的主服务器。在192.168.0.102上新建用户，这个用户可以了解主备机器，需要有replication的权限用mysql客服端连接到master主节点，创建用户repl并授予权限 [root@localhost102 softwares]# hostname -i 192.168.0.102 [root@localhost102 ~]# mysql -uroot -p --socket=/usr/local/mysql/mysql.sock mysql> create user repl identified by '123456'; Query OK, 0 rows affected (0.05 sec) mysql> grant replication slave on *.* to repl; Query OK, 0 rows affected (0.07 sec) 获取master上binary log的坐标（检查点）选中这个坐标，相当于检查点，在这个检查点之后的数据变化都会写到binary log文件里面，而这个检查点之前的数据不会写到bin log文件里面。会被忽略。注意:在获取检查点的必须要停止所有在执行的SQL语句，并将缓存写人到磁盘 1 ：打开master节点上，执行flush tables with read lock 语句来将所有的表缓存写到磁盘并阻塞随后的写人操作 mysql> flush tables with read lock; Query OK, 0 rows affected (0.01 sec) 2: 执行show master status 语句来确定当前的bin log文件名和检查点的位置 mysql> show master status; +------------------+----------+--------------+------------------+-------------------+ | File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set | +------------------+----------+--------------+------------------+-------------------+ | mysql_bin.000002 | 839 | | | | +------------------+----------+--------------+------------------+-------------------+ 1 row in set (0.00 sec) 并记录相关数据，mysql_bin.000002 position=839等等信息 3：得到bin log的检查点后，可以将表解锁 MySQL>unlock tables; 在192.168.0.105上，将192.168.0.102设置为自己的主服务器 [mysql@localhost105 ~]$ hostname -i 192.168.0.105 [mysql@localhost105 ~]$ mysql -uroot -p --socket=/usr/local/mysql/mysql.sock --port=3306 Enter password: 连接到slave，执行change master to 命令来告诉slave如何进行连接到master，并找到bin log文件，这里假设主节点的 IP为10.63.62.175,。只会复制mysql_bin.000002日志中839之后产生的日志。之前的日志不会复制。由于二台机器都是新的，没有数据。所以不用全备份把数据同步。再搭建复制。 MySQL> change master to master_host = '192.168.0.102',master_user = 'repl',master_password ='123456', master_log_file ='mysql_bin.000002', master_log_pos=839; Query OK, 0 rows affected, 2 warnings (0.08 sec) 开启备库192.168.0.105复制 mysql> start slave; Query OK, 0 rows affected (0.12 sec) 查看状态，如果Slave_IO_Running: 和Slave_SQL_Running: 进程状态是yes，Last_IO_Error没有错误等等信息就可以判断备的复制已经配置好了。只要主更新数据的命令，那么备库就会接受一样的命令。 mysql> show slave status\G *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 192.168.0.102 Master_User: repl Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mysql_bin.000002 Read_Master_Log_Pos: 839 Relay_Log_File: localhost105-relay-bin.000002 Relay_Log_Pos: 320 Relay_Master_Log_File: mysql_bin.000002 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 839 Relay_Log_Space: 534 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 0 Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 10 Master_UUID: f78533a4-7751-11e7-9a27-000c29b31319 Master_Info_File: /usr/local/mysql/data/master.info SQL_Delay: 0 SQL_Remaining_Delay: NULL Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates Master_Retry_Count: 86400 Master_Bind: Last_IO_Error_Timestamp: Last_SQL_Error_Timestamp: Master_SSL_Crl: Master_SSL_Crlpath: Retrieved_Gtid_Set: Executed_Gtid_Set: Auto_Position: 0 Replicate_Rewrite_DB: Channel_Name: Master_TLS_Version: 1 row in set (0.00 sec) 在192.168.0.102上，将192.168.0.105设为自己的主服务器。在192.168.0.105上建binlog传输的用户 [mysql@localhost105 ~]$ mysql> create user repl identified by '123456'; Query OK, 0 rows affected (0.05 sec) mysql> grant replication slave on *.* to repl; Query OK, 0 rows affected (0.07 sec) mysql> show status; +------------------+----------+--------------+------------------+-------------------+ | File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set | +------------------+----------+--------------+------------------+-------------------+ | mysql_bin.000003 | 747 | | | | +------------------+----------+--------------+------------------+-------------------+ 1 row in set (0.00 sec) 在192.168.0.102机器上，将192.168.0.105设为自己的主服务器。把192.168.0.105机器的binlog状态信息配置在192.168.0.102上。这样192.168.0.102就知道从那里开始接受192.168.0.105的更新命令。 [mysql@localhost102 ~]$ mysql> change master to master_host = '192.168.0.105',master_user = 'repl',master_password ='123456', master_log_file ='mysql_bin.000003', master_log_pos=747; Query OK, 0 rows affected, 2 warnings (0.10 sec) mysql> start slave; Query OK, 0 rows affected (0.01 sec) mysql> show slave status\G *************************** 1. row *************************** Slave_IO_State: Connecting to master Master_Host: 192.168.0.105 Master_User: repl Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mysql_bin.000003 Read_Master_Log_Pos: 747 Relay_Log_File: localhost102-relay-bin.000001 Relay_Log_Pos: 4 Relay_Master_Log_File: mysql_bin.000003 Slave_IO_Running: Connecting Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 747 Relay_Log_Space: 154 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: NULL Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 2003 Last_IO_Error: error connecting to master 'repl@192.168.0.105:3306' - retry-time: 60 retries: 1 Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 0 Master_UUID: Master_Info_File: /usr/local/mysql/data/master.info SQL_Delay: 0 SQL_Remaining_Delay: NULL Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates Master_Retry_Count: 86400 Master_Bind: Last_IO_Error_Timestamp: 170802 17:47:50 Last_SQL_Error_Timestamp: Master_SSL_Crl: Master_SSL_Crlpath: Retrieved_Gtid_Set: Executed_Gtid_Set: Auto_Position: 0 Replicate_Rewrite_DB: Channel_Name: Master_TLS_Version: 1 row in set (0.00 sec) 有错误，Slave_IO_Running: Connecting状态说明io线程有错误，Last_IO_Errno: 2003 Last_IO_Error: error connecting to master 'repl@192.168.0.105:3306'要解决这个错误。初步怀疑应该是防火墙的问题。 2.排查过程 (1) 在配置文件中排查了与bindaddress和skip-networking与相关参数后，均没有配置； (2) 于是排查服务器的防火墙，发现防火墙处于开启状态 (3) 关闭防火墙后，重新执行start slave;发现主从配置成功先看状态 [root@localhost105 ~]# service iptables status; [mysql@localhost105 ~]$ /etc/init.d/iptables stop 再次查看状态： mysql> show slave status\G *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 192.168.0.102 Master_User: repl Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mysql_bin.000002 Read_Master_Log_Pos: 839 Relay_Log_File: localhost105-relay-bin.000002 Relay_Log_Pos: 320 Relay_Master_Log_File: mysql_bin.000002 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 839 Relay_Log_Space: 534 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 0 Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 10 Master_UUID: f78533a4-7751-11e7-9a27-000c29b31319 Master_Info_File: /usr/local/mysql/data/master.info SQL_Delay: 0 SQL_Remaining_Delay: NULL Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates Master_Retry_Count: 86400 Master_Bind: Last_IO_Error_Timestamp: Last_SQL_Error_Timestamp: Master_SSL_Crl: Master_SSL_Crlpath: Retrieved_Gtid_Set: Executed_Gtid_Set: Auto_Position: 0 Replicate_Rewrite_DB: Channel_Name: Master_TLS_Version: 1 row in set (0.00 sec) 状态Slave_IO_Running: Yes和状态Slave_SQL_Running: Yes都是yes，都是正常状态，Last_IO_Error: 为空，没有错误。 2.1互为主的配置已经完成，现在需要进行相互测试，看数据是否会同步。如上述均正确配置，现在任何一台MySQL上更新数据都会同步到另一台MySQL。在192.168.0.102机器上创建一个testdb数据库,看192.168.0.105是否也会创建这个数据库。 mysql> create database testdb; Query OK, 1 row affected (0.01 sec) mysql> show databases; +--------------------+ | Database | +--------------------+ | information_schema | | mysql | | performance_schema | | sys | | testdb | +--------------------+ 5 rows in set (0.00 sec) 在192.168.0.105机器上查看，如果有testdb数据库。 mysql> show databases; +--------------------+ | Database | +--------------------+ | information_schema | | mysql | | performance_schema | | sys | | testdb | +--------------------+ 5 rows in set (0.00 sec) 在192.168.0.105机器上创建一个testdb_new数据库,看192.168.0.102是否也会创建这个数据库。 mysql> create database testdb_new; Query OK, 1 row affected (0.01 sec) mysql>show databases; +--------------------+ | Database | +--------------------+ | information_schema | | mysql | | performance_schema | | sys | | testdb | | testdb_new | +--------------------+ 6 rows in set (0.00 sec) 查看192.168.0.102是否有test_new这个数据库 mysql> show databases; +--------------------+ | Database | +--------------------+ | information_schema | | mysql | | performance_schema | | sys | | testdb | +--------------------+ 5 rows in set (0.00 sec) 并没有test_new这个数据库，说明有问题， mysql> SHOW SLAVE STATUS\G *************************** 1. row *************************** Slave_IO_State: Connecting to master Master_Host: 192.168.0.105 Master_User: repl Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mysql_bin.000003 Read_Master_Log_Pos: 747 Relay_Log_File: localhost102-relay-bin.000001 Relay_Log_Pos: 4 Relay_Master_Log_File: mysql_bin.000003 Slave_IO_Running: Connecting Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 747 Relay_Log_Space: 368 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: NULL Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 2003 Last_IO_Error: error connecting to master 'repl@192.168.0.105:3306' - retry-time: 60 retries: 37 Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 0 Master_UUID: Master_Info_File: /usr/local/mysql/data/master.info SQL_Delay: 0 SQL_Remaining_Delay: NULL Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates Master_Retry_Count: 86400 Master_Bind: Last_IO_Error_Timestamp: 170802 18:23:55 Last_SQL_Error_Timestamp: Master_SSL_Crl: Master_SSL_Crlpath: Retrieved_Gtid_Set: Executed_Gtid_Set: Auto_Position: 0 Replicate_Rewrite_DB: Channel_Name: Master_TLS_Version: 1 row in set (0.00 sec) 应该还是防火墙的问题，这是为什么呢？上面已经配置关闭防火墙了。再找原因。设置SELINUX=disabled [root@localhost105 ~]# vi /etc/selinux/config SELINUX=disabled [root@localhost102 ~]# vi /etc/selinux/config SELINUX=disabled [root@localhost105 ~]# service iptables status; Table: filter Chain INPUT (policy ACCEPT) num target prot opt source destination 1 ACCEPT all -- 0.0.0.0/0 0.0.0.0/0 state RELATED,ESTABLISHED 2 ACCEPT icmp -- 0.0.0.0/0 0.0.0.0/0 3 ACCEPT all -- 0.0.0.0/0 0.0.0.0/0 4 ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 state NEW tcp dpt:22 5 REJECT all -- 0.0.0.0/0 0.0.0.0/0 reject-with icmp-host-prohibited Chain FORWARD (policy ACCEPT) num target prot opt source destination 1 REJECT all -- 0.0.0.0/0 0.0.0.0/0 reject-with icmp-host-prohibited Chain OUTPUT (policy ACCEPT) num target prot opt source destination [root@localhost105 ~]# service iptables stop [root@localhost102 ~]# service iptables stop iptables: Setting chains to policy ACCEPT: filter [ OK ] iptables: Flushing firewall rules: [ OK ] iptables: Unloading modules: [ OK ] [root@localhost105 ~]# [root@localhost105 ~]# [root@localhost105 ~]# [root@localhost105 ~]# service iptables status; iptables: Firewall is not running. [root@localhost102 ~]# service iptables status; iptables: Firewall is not running. 在192.168.0.105上需要然后stop slave; start slave; 就搞定了。这样才关闭防火墙 mysql> stop slave; Query OK, 0 rows affected (0.00 sec) mysql> start slave; Query OK, 0 rows affected (0.07 sec) 然后在查看192.168.0.102是否存在testdb_new数据库。如果存在说明互为主的测试成功！ 3.1下载和安装keepalived,(主备都要安装keepalived)（下载官方网址：http://www.keepalived.org/download.html）一、概述　　keepalived介绍：Keepalived的作用是检测服务器的状态，如果有一台服务器死机，或工作出现故障，Keepalived将检测到，并将有故障的服务器从系统中剔除，当服务器工作正常后Keepalived自动将服务器加入到服务器群中，这些工作全部自动完成，不需要人工干涉，需要人工做的只是修复故障的服务器。第一步：安装KEEPALIVED 方法一：使用百胜安装KEEPALIVED，需要安装EPEL释放源 [root@localhost102 softwares]#rpm -ivh http://mirrors.opencas.cn/epel/6/i386/epel-release-6-8.noarch.rpm [root@localhost102 softwares]#rpm -ivh http：/ /mirrors.opencas.cn/epel/6/i386/epel-release-6-8.noarch.rpm [root@localhost102 softwares]#yum -y install keepalived 查看KEEPALIVED相关目录 [root@slave ~]# ls /usr/sbin/keepalived /usr/sbin/keepalived [root@slave ~]# ls /etc/init.d/keepalived /etc/init.d/keepalived [root@slave ~]# ls /etc/keepalived/keepalived.conf /etc/keepalived/keepalived.conf 方法二：从KEEPALIVED网站官方http://www.keepalived.org下载源代码包compile-安装 1，下载KEEPALIVED最新版 [root@localhost102 softwares]#wget -c http://www.keepalived.org/software/keepalived-1.2.20.tar.gz [root@localhost102 softwares]#tar -zxvf keepalived-1.2.20.tar.gz -C /opt 2安装编译环境： [root@localhost102 softwares]# yum install gcc* [root@localhost102 softwares]# yum install gcc gcc-c++ openssl openssl-devel popt-devel kernel-devel pcre-devel libnl-devel 二台服务器都要以root用户安装openssl-devel包，不然会报!!! OpenSSL is not properly installed on your system. !!!和!!! Can not include OpenSSL headers files.的错误。 [root@localhost102 softwares]#yum -y install openssl-devel 二台服务器以root用户安装libnl libnl-devel，不然会报错：*** WARNING - this build will not support IPVS with IPv6. Please install libnl/libnl-3 dev libraries to support IPv6 with IPVS [root@localhost102 keepalived-1.2.20]# yum -y install libnl libnl-devel 二台服务器以root用户安装libnfnetlink-devel，不然会报错configure: error: libnfnetlink headers missing [root@localhost102 keepalived-1.2.20]#yum install -y libnfnetlink-devel [root@localhost102 keepalived-1.2.20]#yum install pcre-devel openssl-devel popt-devel libnl-devel 3解压并安装的keepalived [root@localhost102 keepalived-1.2.20]# uname -a Linux localhost102 2.6.32-504.el6.x86_64 #1 SMP Wed Oct 15 04:27:16 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux 查看内核版本为2.6.32-504.el6.x86_64。--with-kernel-dir参数是内核版本目录，这个是重要的参数，这个参数并不表示我们要把KEEPALIVED统进内核，而是指使用内核源码里面的头文件，也就是包括目录，--prefix参数是keepalived安装路径 [root@localhost102 keepalived-1.2.20]# ./configure --prefix=/usr/local/keepalived --with-kernel-dir=/usr/src/kernels/2.6.32-504.el6.x86_64 如果报错了，解决错误之后需要重新执行./configure命令。重新加载新的。 4 [root@localhost102 keepalived-1.2.20]#make && make install keepalived warning: ‘sgid_num’ may be used uninitialized in this function 如果有许多警告，就在执行一次make && make install ，可能警告会不见了。 5 查看KEEPALIVED相关的文件 [root@master keepalived-1.2.20]# ls /etc/keepalived/ keepalived.conf samples [root@master keepalived-1.2.20]# ls /etc/init.d/keepalived /etc/init.d/keepalived 6 链接在/ usr /本地/ KEEPALIVED / sbin目录/ KEEPALIVED到/ sbin目录/目录 [root@master keepalived-1.2.20]# ln -s /usr/local/keepalived/sbin/keepalived /sbin/ 7设置KEEPALIVED启动级别 [root@master keepalived-1.2.20]# chkconfig --add keepalived [root@master keepalived-1.2.20]# chkconfig --level 35 keepalived on 4.1配置keepalived.conf文件 [root@localhost102 init.d]# vi /usr/local/keepalived/etc/keepalived/keepalived.conf localhost102主的配置为: [root@localhost102 bin]# cat /usr/local/keepalived/etc/keepalived/keepalived.conf ! Configuration File for keepalived global_defs { notification_email { root@huangmingming.cn 470950247@qq.com #自己的接收邮箱 } notification_email_from keepalived@localhost smtp_server 127.0.0.1 #本机的回环地址为邮件服务器 smtp_connect_timeout 30 #邮件的连接超时时间 router_id LVS_DEVEL } vrrp_instance HA_1 { state BACKUP #master和slave都建议配置为BACKUP,即是BACKUP-BACKUP模式，这样的话，如果某一台机器挂了，VIP就会漂移到另外一台 #机器。如果挂了的那台机器dba恢复完成之后，恢复的那台机器就是以slave加入复制。如果说主挂了，备变成主 #之后，原来的主不会去抢VIP成为主，而是以slave备加入复制。如果把master的状态设置为master和slave的状态设置 #为BACKUP，即是MASTER-BACKUP模式，如果说主挂了，备变成主.然后dba恢复原来的主之后，原来主把VIP强抢过来， #又一次成为主。这样如果经常挂机，VIP就会经常漂移。不利稳定。 interface eth0 #指定HA检测的网络接口 virtual_router_id 80 #虚拟路由标识，主备必须相同 priority 100 #定义优先级，slave设置90，只要小于100就可以 advert_int 1 #设定master和slave之间同步检查的时间间隔 nopreempt #不抢占模式。只在优先级高的机器上设置即可 authentication { auth_type PASS auth_pass 1111 } virtual_ipaddress { #设置虚拟IP，可以设置多个，每行一个。指定这192.168.1.208个IP为VIP。就是接收web服务器的VIP。 192.168.1.208/24 dev eth0 #MySQL对外服务的IP，即，要求VIP安装前应该是个ping不通的，启动服务之后，就可以ping通 } } virtual_server 192.168.1.208 3306 { delay_loop 2 #每隔2秒查询real server状态 lb_algo wrr #lvs 算法 lb_kinf DR #LVS模式（Direct Route） persistence_timeout 50 protocol TCP #网络传输协议 real_server 192.168.0.102 3306 { #监听本机的IP，192.168.0.102为本机的IP地址，3306为MySQL默认的端口 weight 1 notify_down /usr/local/keepalived/bin/mysqlstatus_check.sh #MySQL服务器如果故障，需要执行的脚本 TCP_CHECK { connect_timeout 10 #10秒无响应超时 bingto 192.168.1.208 nb_get_retry 3 delay_before_retry 3 connect_port 3306 #连接的端口为MySQL的3306默认端口 } } } KEEPALIVED检测脚本，当其中一台MySQL的服务出现故障向下掉时，实现自动切换到正常的MySQL的服务器继续提供服务。和配置文件的notify_down的脚本路径要一致。MySQL的服务出现故障需要执行的脚本 [root@localhost102 init.d]# vim /usr/local/keepalived/bin/mysql.sh #如果MySQL服务器故障了，就杀死keepalived服务。这样VIP #!/bin/bash pkill keepalived 给新建的脚本加执行权限 [root@localhost102 init.d]#chmod 755 /usr/local/keepalived/bin/mysqlstatus_check.sh 3.1.3配置备机的keepalived.conf文件 [root@localhost105 bin]# cat /usr/local/keepalived/etc/keepalived/keepalived.conf ! Configuration File for keepalived global_defs { notification_email { root@huangmingming.cn 470950247@qq.com } notification_email_from keepalived@localhost smtp_server 127.0.0.1 smtp_connect_timeout 30 router_id LVS_DEVEL } vrrp_instance HA_1 { state BACKUP #master和slave都建议配置为BACKUP,即是BACKUP-BACKUP模式，这样的话，如果某一台机器挂了，VIP就会漂移到另外一台 #机器。如果挂了的那台机器dba恢复完成之后，恢复的那台机器就是以slave加入复制。如果说主挂了，备变成主 #之后，原来的主不会去抢VIP成为主，而是以slave备加入复制。如果把master的状态设置为master和slave的状态设置 #为BACKUP，即是MASTER-BACKUP模式，如果说主挂了，备变成主.然后dba恢复原来的主之后，原来主把VIP强抢过来， #又一次成为主。这样如果经常挂机，VIP就会经常漂移。不利稳定。 interface eth0 #指定HA检测的网络接口 virtual_router_id 80 #虚拟路由标识，主备相同 priority 90 #定义优先级，slave设置90 advert_int 1 #设定master和slave之间同步检查的时间间隔 authentication { auth_type PASS auth_pass 1111 } virtual_ipaddress { #设置虚拟IP，可以设置多个，每行一个，要求VIP安装前应该是个ping不通的，启动服务之后，就可以ping通 192.168.1.208/24 dev eth0 #MySQL对外服务的IP，即VIP } } virtual_server 192.168.1.208 3306 { delay_loop 2 lb_algo wrr lb_kinf DR persistence_timeout 50 protocol TCP real_server 192.168.0.105 3306 { #监听本机的IP weight 1 notify_down /usr/local/keepalived/bin/mysqlstatus_check.sh TCP_CHECK { connect_timeout 10 bingto 192.168.1.208 nb_get_retry 3 delay_before_retry 3 connect_port 3306 } } } KEEPALIVED检测脚本，当其中一台MySQL的服务出现故障向下掉时，实现自动切换到正常的MySQL的服务器继续提供服务。和配置文件的notify_down的脚本路径要一致。MySQL的服务出现故障需要执行的脚本 [root@localhost105 init.d]# vim /usr/local/keepalived/bin/mysqlstatus_check.sh #如果MySQL服务器故障了，就杀死keepalived服务。这样VIP #!/bin/bash pkill keepalived 给新建的脚本加执行权限 [root@localhost105 init.d]#chmod 755 /usr/local/keepalived/bin/mysqlstatus_check.sh #4.1启动keepalived服务（二台机器都要） [root@localhost102 keepalived-1.2.20]# pwd /opt/keepalived-1.2.20 [root@localhost102 keepalived-1.2.20]# ll total 568 -rw-rw-r-- 1 1000 1000 41 Feb 5 2016 AUTHOR drwxrwxr-x 2 1000 1000 4096 Aug 3 20:25 bin #存放着启动keepalived的程序 -rw-rw-r-- 1 1000 1000 209382 Apr 3 2016 ChangeLog -rw-r--r-- 1 root root 31631 Aug 3 20:18 config.log -rwxr-xr-x 1 root root 27120 Aug 3 20:18 config.status -rwxrwxr-x 1 1000 1000 185566 Mar 6 2016 configure -rw-rw-r-- 1 1000 1000 19068 Mar 6 2016 configure.ac -rw-rw-r-- 1 1000 1000 830 Feb 5 2016 CONTRIBUTORS -rw-rw-r-- 1 1000 1000 18092 Feb 5 2016 COPYING drwxrwxr-x 5 1000 1000 4096 Mar 21 2016 doc drwxrwxr-x 3 1000 1000 4096 Aug 3 20:23 genhash -rw-rw-r-- 1 1000 1000 1601 Feb 5 2016 INSTALL -rwxrwxr-x 1 1000 1000 5598 Feb 5 2016 install-sh drwxrwxr-x 9 1000 1000 4096 Aug 3 20:18 keepalived -rw-r--r-- 1 root root 5230 Aug 3 20:18 keepalived.spec -rw-rw-r-- 1 1000 1000 5233 Feb 5 2016 keepalived.spec.in drwxrwxr-x 2 1000 1000 4096 Aug 3 20:23 lib -rw-r--r-- 1 root root 2438 Aug 3 20:18 Makefile -rw-rw-r-- 1 1000 1000 2453 Mar 21 2016 Makefile.in -rw-rw-r-- 1 1000 1000 1007 Feb 5 2016 README -rw-rw-r-- 1 1000 1000 20 Feb 5 2016 TODO -rw-rw-r-- 1 1000 1000 7 Mar 7 2016 VERSION 启动命令 [root@localhost102 keepalived-1.2.20]# /opt/keepalived-1.2.20/bin/keepalived -f /usr/local/keepalived/etc/keepalived/keepalived.conf 查看进程是否存在 [root@localhost102 keepalived-1.2.20]# ps -ef | grep keepalived root 14864 1 0 20:05 ? 00:00:00 /opt/keepalived-1.2.20/bin/keepalived -f /usr/local/keepalived/etc/keepalived/keepalived.conf root 14865 14864 0 20:05 ? 00:00:00 /opt/keepalived-1.2.20/bin/keepalived -f /usr/local/keepalived/etc/keepalived/keepalived.conf root 14866 14864 0 20:05 ? 00:00:00 /opt/keepalived-1.2.20/bin/keepalived -f /usr/local/keepalived/etc/keepalived/keepalived.conf 制作快捷启动在源文件的目录./keepalived-1.2.20/keepalived/etc/init.d下有3个三个快捷启动的文件。 cd /opt/keepalived-1.2.20/bin # cp ./keepalived-1.2.20/keepalived/etc/init.d/keepalived /etc/init.d/ # cp /usr/local/keepalived/etc/keepalived/keepalived.conf /etc/keepalived/ # cp ./keepalived-1.2.20/keepalived/etc/sysconfig/keepalived /etc/sysconfig/keepalived ＃keepalived的快捷启动，必须要执行以上三步，且文件路径必须一致，也不知道什么原因， # 可能是init.d中的keepalived中已经指明了吧这样可以执行service keepalived [start | stop | reload | restart ],这样很方便启动和停止keepalived服务。查看启动的日志 [root@localhost102 support-files]# tail -f /var/log/messages Aug 4 20:05:56 localhost102 Keepalived[14862]: Starting Keepalived v1.2.20 (08/03,2017) Aug 4 20:05:56 localhost102 Keepalived[14864]: Starting Healthcheck child process, pid=14865 Aug 4 20:05:56 localhost102 Keepalived[14864]: Starting VRRP child process, pid=14866 Aug 4 20:05:56 localhost102 Keepalived_vrrp[14866]: Registering Kernel netlink reflector Aug 4 20:05:56 localhost102 Keepalived_vrrp[14866]: Registering Kernel netlink command channel Aug 4 20:05:56 localhost102 Keepalived_vrrp[14866]: Registering gratuitous ARP shared channel Aug 4 20:05:56 localhost102 Keepalived_vrrp[14866]: Opening file '/usr/local/keepalived/etc/keepalived/keepalived.conf'. #使用那个keepalived配置文件启动 Aug 4 20:05:56 localhost102 Keepalived_healthcheckers[14865]: Registering Kernel netlink reflector Aug 4 20:05:56 localhost102 Keepalived_vrrp[14866]: Using LinkWatch kernel netlink reflector... Aug 4 20:05:56 localhost102 Keepalived_healthcheckers[14865]: Registering Kernel netlink command channel Aug 4 20:05:56 localhost102 Keepalived_healthcheckers[14865]: Opening file '/usr/local/keepalived/etc/keepalived/keepalived.conf'. Aug 4 20:05:56 localhost102 Keepalived_healthcheckers[14865]: Unknown keyword 'lb_kinf' #keepalived.conf配置文件里面的不知名参数 Aug 4 20:05:56 localhost102 Keepalived_healthcheckers[14865]: Unknown keyword 'bingto' #keepalived.conf配置文件里面的参数 Aug 4 20:05:56 localhost102 Keepalived_healthcheckers[14865]: Unknown keyword 'nb_get_retry' #keepalived.conf配置文件里面的参数 Aug 4 20:05:56 localhost102 Keepalived_vrrp[14866]: VRRP_Instance(HA_1) Entering BACKUP STATE Aug 4 20:05:56 localhost102 Keepalived_healthcheckers[14865]: Using LinkWatch kernel netlink reflector... Aug 4 20:05:56 localhost102 Keepalived_healthcheckers[14865]: Activating healthchecker for service [192.168.0.102]:3306 #机器[192.168.0.102]:3306的端口就是MySQLD服务的默认端口，说明192.168.0.102机器的mysqld的服务可用 Aug 4 20:05:59 localhost102 Keepalived_vrrp[14866]: VRRP_Instance(HA_1) Transition to MASTER STATE Aug 4 20:06:00 localhost102 Keepalived_vrrp[14866]: VRRP_Instance(HA_1) Entering MASTER STATE #已经成为主服务器状态第五步：授权VIP的根用户权限授权远程主机可以通过VIP登录的MySQL，并测试数据复制功能.VIP只能连接主服务器。 [root@localhost102 keepalived]# mysql> grant all on *.* to root@'192.168.1.208' identified by '123456'; mysql> flush privileges; [root@localhost105 keepalived]# mysql> grant all on *.* to root@'192.168.1.208' identified by '123456'; mysql> flush privileges; 第六步，故障测试当前状态：localhost102为主，localhost105为从用VIP只能连接localhost102的MySQL服务器，说明只有主服务器才可以拥有VIP，用VIP连接不上localhost105的MySQL等备服务器。VIP就是MySQL对外服务的IP，web服务器连接VIP。备服务器的Keepalived服务的日志信息： [root@localhost102 support-files]# tail -f /var/log/messages Aug 4 07:03:21 localhost102 Keepalived_vrrp[14731]: VRRP_Instance(HA_1) Transition to MASTER STATE Aug 4 07:03:21 localhost102 Keepalived_vrrp[14731]: VRRP_Instance(HA_1) Entering MASTER STATE [root@localhost102 keepalived-1.2.20]# mysql -uroot -h192.168.1.208 -p -P3306 Enter password: Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 2332 Server version: 5.7.18-log MySQL Community Server (GPL) Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective mysql> mysql> show variables like 'hostname%'; +---------------+--------------+ | Variable_name | Value | +---------------+--------------+ | hostname | localhost102 | +---------------+--------------+ 1 row in set (0.02 sec) 说明localhost102是主服务器备服务器的Keepalived日志： [root@localhost105 keepalived]# tail -f /var/log/messages Aug 4 07:03:21 localhost105 Keepalived_vrrp[25720]: VRRP_Instance(HA_1) Entering BACKUP STATE [root@localhost105 keepalived-1.2.20]# mysql -uroot -h192.168.1.208 -p -P3306 在备通过VIP是连接不上localhost105服务器。现在进行故障测试，手工停到localhost102主服务器。然后查看状态：关闭主服务器: [root@localhost102 keepalived-1.2.20]# /usr/local/mysql/bin/mysqladmin -uroot -p shutdown Enter password: 2017-08-04T12:52:17.832119Z mysqld_safe mysqld from pid file /usr/local/mysql/mysql.pid ended [11] Done /usr/local/mysql/bin/mysqld_safe -defaults-file=/etc/my.cnf (wd: /opt/keepalived-1.2.20/bin) (wd now: /opt/keepalived-1.2.20) 查看localhost102信息： [root@localhost102 support-files]# tail -f /var/log/messages Aug 4 20:05:23 localhost102 Keepalived[14729]: Stopping Aug 4 20:05:23 localhost102 kernel: IPVS: __ip_vs_del_service: enter Aug 4 20:05:23 localhost102 Keepalived_healthcheckers[14730]: Removing service [192.168.0.102]:3306 from VS [192.168.1.208]:3306 #停止MySQL服务 Aug 4 20:05:23 localhost102 Keepalived_healthcheckers[14730]: Stopped Aug 4 20:05:23 localhost102 Keepalived_vrrp[14731]: VRRP_Instance(HA_1) sent 0 priority #设置优先级为0.最低级别 Aug 4 20:05:24 localhost102 Keepalived_vrrp[14731]: Stopped Aug 4 20:05:24 localhost102 Keepalived[14729]: Stopped Keepalived v1.2.20 (08/03,2017) #停止Keepalived服务这是同时也看localhost105的状态： [root@localhost105 support-files]# tail -f /var/log/messages Aug 4 20:05:24 localhost105 Keepalived_vrrp[26589]: VRRP_Instance(HA_1) Transition to MASTER STATE Aug 4 20:05:27 localhost105 Keepalived_vrrp[26589]: VRRP_Instance(HA_1) Entering MASTER STATE #成为主的状态可以使用VIP连接服务器： [root@localhost105 mysql]# mysql -uroot -h192.168.1.208 -p -P3306 Enter password: Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 618 Server version: 5.7.18-log MySQL Community Server (GPL) Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective mysql> show variables like 'hostname%'; +---------------+--------------+ | Variable_name | Value | +---------------+--------------+ | hostname | localhost105 | +---------------+--------------+ 1 row in set (0.02 sec) 故障之后，localhost105成为新的主服务器，提供对外服务功能！即是以前的主故障了，也不影响MySQL服务！ [root@localhost102 mysql]# mysql -uroot -h192.168.1.208 -p -P3306 没有反映。连接不上这时需要恢复，加入复制的一员。slave 结论：使用keepalived提供双主模式（一主，一备用主）的高可用，但是只有一个主服务器是写。故障切换，不用担心服务不可用。都是自动完成高可用。

转载请注明原文地址: https://www.6miu.com/read-13919.html

技术

最新回复(0)