为了账号安全,请及时绑定邮箱和手机立即绑定

centos下mysql高可用架构MHA搭建及测试故障转移

标签:
MySQL


MHA项目网站

https://code.google.com/p/mysql-master-ha/

一.环境介绍

1.主机部署

manager机:10.10.54.154

master机:10.10.54.156

slave1机:10.10.54.155(备用master)

slave2机:10.10.54.157

2.大致步骤

A.首先用ssh-keygen实现四台主机之间相互免密钥登录

B.安装MHAmha4mysql-node,mha4mysql-manager 软件包

C.建立master,slave1,slave2之间主从复制

D.管理机manager上配置MHA文件

E.masterha_check_ssh工具验证ssh信任登录是否成功

F.masterha_check_repl工具验证mysql复制是否成功

G.启动MHA manager,并监控日志文件

H.测试master(156)宕机后,是否会自动切换

3.说明:下面中括号中的主机名说明了当前操作是在哪台机子上进行的

二.首先用ssh-keygen实现四台主机之间相互免密钥登录

[manager机]

shell> ssh-keygen -t rsa -b 2048

shell> scp-copy-id root@10.10.54.155

shell> scp-copy-id root@10.10.54.156

shell> scp-copy-id root@10.10.54.157

在另外三台机子重复此步骤,使四台机子中的任何两台之间可以免密码登录

三.安装MHAmha4mysql-node,mha4mysql-manager 软件包

1.四台主机上安装MHAmha4mysql-node

[manager,master,slave1,slave2]

shell> yum update

shell> yum -y install perl-DBD-MySQL ncftp

shell> wget http://mysql-master-ha.googlecode.com/files/mha4mysql-node-0.53-0.noarch.rpm

sehll> rpm -ivh mha4mysql-node-0.53-0.noarch.rpm

2.在manager机子上安装mha4mysql-manager

[manager]

shell> yum install perl

shell> yum install cpan

shell> rpm -ivh mha4mysql-manager-0.53-0.el6.noarch.rpm

error:

perl(Config::Tiny) is needed by mha4mysql-manager-0.53-0.noarch

perl(Log::Dispatch) is needed by mha4mysql-manager-0.53-0.noarch

perl(Log::Dispatch::File) is needed by mha4mysql-manager-0.53-0.noarch

perl(Log::Dispatch::Screen) is needed by mha4mysql-manager-0.53-0.noarch

perl(Parallel::ForkManager) is needed by mha4mysql-manager-0.53-0.noarch

perl(Time::HiRes) is needed by mha4mysql-manager-0.53-0.noarch

[solution]

shell> wget ftp://ftp.muug.mb.ca/mirror/centos/5.10/os/x86_64/CentOS/perl-5.8.8-41.el5.x86_64.rpm

shell> wget ftp://ftp.muug.mb.ca/mirror/centos/6.5/os/x86_64/Packages/compat-db43-4.3.29-15.el6.x86_64.rpm

shell> wget http://downloads.naulinux.ru/pub/NauLinux/6x/i386/sites/School/RPMS/perl-Log-Dispatch-2.27-1.el6.noarch.rpm

shell> wget http://dl.fedoraproject.org/pub/epel/6/i386/perl-Parallel-ForkManager-0.7.9-1.el6.noarch.rpm

shell> wget http://dl.fedoraproject.org/pub/epel/6/i386/perl-Mail-Sender-0.8.16-3.el6.noarch.rpm

shell> wget http://dl.fedoraproject.org/pub/epel/6/i386/perl-Mail-Sendmail-0.79-12.el6.noarch.rpm

shell> wget http://mirror.centos.org/centos/6/os/x86_64/Packages/perl-Time-HiRes-1.9721-136.el6.x86_64.rpm

shell> rpm -ivh perl-Parallel-ForkManager-0.7.9-1.el6.noarch.rpm perl-Log-Dispatch-2.27-1.el6.noarch.rpm perl-Mail-Sender-0.8.16-3.el6.noarch.rpm perl-Mail-Sendmail-0.79-12.el6.noarch.rpm perl-Time-HiRes-1.9721-136.el6.x86_64.rpm

shell> rpm -ivh mha4mysql-manager-0.53-0.el6.noarch.rpm

四.建立master,slave1,slave2之间主从复制

[master:156]

1.shell> vim /etc/my.cnf

#server-id 改为1

server-id=1

log-bin=mysql-bin

binlog_format=mixed

#授权操作

2.mysql> GRANT ALL PRIVILEGES ON *.* TO 'rep'@'10.10.54.%' IDENTIFIED BY 'rep123';

mysql> flush privileges;

3.mysql> show master status;

[slave1,slave2]

4.change master操作

mysql> change master to

master_host='10.10.54.156',

master_port=3306,

master_user='rep',

master_password='rep123',

master_log_file='mysql-bin.000001',

master_log_pos=112;

注意:slave1机子上也要授权,因为这个是备用master

[slave1:155]

5.mysql> GRANT ALL PRIVILEGES ON *.* TO 'rep'@'10.10.54.%' IDENTIFIED BY 'rep123';

[master,slave1,slave2]

6.查看主从复制是否成功的一些命令

mysql> start slave;

mysql> stop slave;

mysql> reset slave;

mysql> show slave status\G;

五.所有主机上设置复制权限帐号

mysql> GRANT ALL PRIVILEGES ON *.* TO 'mha_rep'@'10.10.2.10' IDENTIFIED BY '123456';

六.manager上配置MHA文件,管理各个节点

[manager:154]

shell> mkdir -p /masterha/app1

shell> mkdir /etc/masterha

shell> vim /etc/masterha/app1.cnf

[server default]

user=mha_rep    ##mysql管理用戶名

password=123456

manager_workdir=/masterha/app1   #目录可以任意指定

manager_log=/masterha/app1/manager.log

remote_workdir=/masterha/app1

ssh_user=root     #ssh免密钥登录的帐号名

repl_user=rep     #mysql复制帐号,用来在主从机之间同步二进制日志等

repl_password=rep123

ping_interval=1    #ping间隔时间,用来检测master是否正常

[server1]

hostname=10.10.54.155

#ssh_port=9999

master_binlog_dir=/data/ndb   #mysql数据库目录

candidate_master=1    #master机宕掉后,优先启用这台作为新master

[server2]

hostname=10.10.54.156

#ssh_port=9999

master_binlog_dir=/data/ndb

candidate_master=1

[server3]

hostname=10.10.54.157

#ssh_port=9999

master_binlog_dir=/data/ndb

no_master=1     #设置no_master=1使主机不能成为新master

七.验证ssh信任登陆和mysql主从复制是否成功

1.masterha_check_ssh 验证ssh信任登陆

[manager:154]

shell> masterha_check_ssh --conf=/etc/masterha/app1.cnf

Sun Mar  2 17:45:38 2014 - [debug]   ok.

Sun Mar  2 17:45:38 2014 - [info] All SSH connection tests passed successfully.

2.masterha_check_repl 验证mysql复制是否成功

[manager:154]

shell> masterha_check_repl --conf=/etc/masterha/app1.cnf

---------------------------------------------------------

Sun Mar  2 13:16:57 2014 - [info] Slaves settings check done.

Sun Mar  2 13:16:57 2014 - [info]

10.10.54.156 (current master)

 +--10.10.54.155

 +--10.10.54.157

...

MySQL Replication Health is OK.

---------------------------------------------------------------

八.启动MHA manager,并监控日志文件

[manager:154]

shell> nohup masterha_manager --conf=/etc/masterha/app1.cnf > /tmp/mha_manager.log 2>&1

shell> tail -f /masterha/app1/manager.log

---------------------------------------------------------------

 10.10.54.156 (current master)

 +--10.10.54.155

 +--10.10.54.157

...

Sun Mar  2 13:09:25 2014 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..

-----------------------------------------------------------------

监控的manager.log文件表明MHA运行良好,正在 "waiting until MySQL doesn't respond"

九.测试master(156)宕机后,是否会自动切换

1.测试自动切换是否成功

当掉master机子

shell> /etc/init.d/myqld stop

当掉master后,manager上的监控文件/masterha/app1/manager.log显示错误信息,表示不能自动切换:

[error]

-----------------------------------------------------------

Sun Mar  2 13:13:46 2014 - [error][/usr/share/perl5/vendor_perl/MHA/ManagerUtil.pm, ln178] Got ERROR: Use of uninitialized value $msg in scalar chomp at /usr/share/perl5/vendor_perl/MHA/ManagerConst.pm line 90.

-----------------------------------------------------------

解决这个错误是在文件/usr/share/perl5/vendor_perl/MHA/ManagerConst.pm 第90行(chomp $msg)前加入一行:

$msg = "" unless($msg);

好了,错误解决了,下面我们再次重复上面步骤:

master上mysql服务:shell> /etc/init.d/mysqld stop

再次查看manager机子上监控文件内容

shell> tail -f tail -f /masterha/app1/manager.log

日志文件显示:

-----------------------------------------------------------

----- Failover Report -----

app1: MySQL Master failover 10.10.54.156 to 10.10.54.155 succeeded

Master 10.10.54.156 is down!

Check MHA Manager logs at mycentos4:/masterha/app1/manager.log for details.

Started automated(non-interactive) failover.

The latest slave 10.10.54.155(10.10.54.155:3306) has all relay logs for recovery.

Selected 10.10.54.155 as a new master.

10.10.54.155: OK: Applying all logs succeeded.

10.10.54.157: This host has the latest relay log events.

Generating relay diff files from the latest slave succeeded.

10.10.54.157: OK: Applying all logs succeeded. Slave started, replicating from 10.10.54.155.

10.10.54.155: Resetting slave info succeeded.

Master failover to 10.10.54.155(10.10.54.155:3306) completed successfully.

--------------------------------------------------------

2.切换成功后,检查replication状态

[master:156]

shell> /etc/init.d/mysqld start

[manager:154]

shell> masterha_check_repl --conf=/etc/masterha/app1.cnf

--------------------------------------------------------------

Sun Mar  2 13:22:11 2014 - [info] Slaves settings check done.

Sun Mar  2 13:22:11 2014 - [info] 

10.10.54.155 (current master)

+--10.10.54.156

+--10.10.54.157

...

MySQL Replication Health is OK.

---------------------------------------------------------------

上面的"10.10.54.155 (current master)" 这句表明master成功切换到155机子上

十.上一步测试之后,新master机为155,宕掉155机子,再次测试故障转移

1.启动管理节点

shell> nohup masterha_manager --conf=/etc/masterha/app1.cnf > /tmp/mha_manager.log 2>&1

2.启动日志检测,然后当掉新master(155),然后查看监控文件变化

shell> tail -f /masterha/app1/manager.log

3.当掉155机子(即新的master)

shell> /etc/init.d/mysqld stop

4.查看manager主机上的监控文件变化

[error][/usr/share/perl5/vendor_perl/MHA/MasterFailover.pm, ln295] Last failover was done at 2014/03/02 13:02:47. Current time is too early to do failover again. If you want to do failover, manually remove /masterha/app1/app1.failover.complete and run this script again.

错误解决办法

1.日志文件提示切换master过快,需要删除/masterha/app1/app1.failover.complete

1.删除app1.failover.complete

shell> rm /masterha/app1/app1.failover.complete

5.重新测试:

master转移成功,重新转为156机子

--------------------------------------------------------

Master 10.10.54.155 is down!

Check MHA Manager logs at mycentos4:/masterha/app1/manager.log for details.

Started automated(non-interactive) failover.

The latest slave 10.10.54.156(10.10.54.156:3306) has all relay logs for recovery.

Selected 10.10.54.156 as a new master.

10.10.54.156: OK: Applying all logs succeeded.

10.10.54.157: This host has the latest relay log events.

Generating relay diff files from the latest slave succeeded.

10.10.54.157: OK: Applying all logs succeeded. Slave started, replicating from 10.10.54.156.

10.10.54.156: Resetting slave info succeeded.

Master failover to 10.10.54.156(10.10.54.156:3306) completed successfully.

-----------------------------------------------------------

//附:故障转移后,用命令恢复原来的master

[manager:154]

1.在旧master上执行

mysql> reset master;

mysql> change master to master_host='10.10.54.155', master_port=3306, master_user='rep', master_password='rep123', master_log_file='mysql-bin.000031', master_log_pos=112;

mysql> start slave;      #暂时先把旧master变为从

2.然后在manager节点上:

[manager:154]

shell> masterha_master_switch --master_state=alive --conf=/etc/masterha/app1.cnf

##master成功切换回

©著作权归作者所有:来自51CTO博客作者ZIJIAN1012的原创作品,如需转载,请注明出处,否则将追究法律责任

mysql故障转移MHA


点击查看更多内容
1人点赞

若觉得本文不错,就分享一下吧!

评论

作者其他优质文章

正在加载中
感谢您的支持,我会继续努力的~
扫码打赏,你说多少就多少
赞赏金额会直接到老师账户
支付方式
打开微信扫一扫,即可进行扫码打赏哦
今天注册有机会得

100积分直接送

付费专栏免费学

大额优惠券免费领

立即参与 放弃机会
意见反馈 帮助中心 APP下载
官方微信

举报

0/150
提交
取消