I. High-Availability Cluster Framework
Resource types (illustrated in the crmsh sketch below):
primitive (native): a basic, standalone resource
group: a resource group containing multiple primitive resources
clone: a cloned resource, running on multiple nodes at once
master/slave: a master/slave resource, a special clone with two roles
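These four types map directly onto crmsh configure commands, which this article uses later. A hypothetical sketch only; the resource names and parameters here are illustrative and not part of the setup below:

crm(live)configure# primitive vip ocf:heartbeat:IPaddr params ip=192.168.1.90  # primitive: a single resource
crm(live)configure# group websvc vip websrv           # group: primitives kept and started together, in order
crm(live)configure# clone cl_ping p_ping              # clone: the same resource instantiated on every node
crm(live)configure# ms ms_drbd p_drbd meta master-max=1   # master/slave: a clone with two roles (e.g. DRBD)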
Resource constraint types:
Location constraint: defines a resource's preference for particular nodes
Colocation constraint: defines whether resources prefer to run on the same node
Order constraint: defines dependencies in the start-up order of multiple resources
Common HA cluster working models:
A/P: two nodes, active/passive, a master/standby model
A/A: two nodes, active/active, a master/master model
N-M: N > M; N nodes running M services. Assuming each active node runs one service, the number of active nodes is M and the number of standby nodes is N-M.
When the cluster partitions (split-brain), resource isolation is required. There are two isolation levels:
STONITH: node-level isolation, achieved by cutting a node's power or forcing it to reboot
fencing: resource-level isolation, for example sending a signal to a switch so that data can no longer pass through a given port
When the cluster partitions, a partition whose quorum votes amount to less than half of the total loses quorum, and the cluster applies a resource-control policy (the no-quorum-policy, e.g. stop or ignore, used later in this article) to decide what happens to the resources that partition holds.
II. Building an HA Cluster on CentOS 7
CentOS 7 (corosync v2 + pacemaker)
Full-lifecycle cluster management tools:
pcs: agent-based (pcsd)
crmsh: agentless (pssh)
1. Prerequisites for cluster configuration
Time must be synchronized, the nodes must be able to reach each other by the hostnames they are currently using, and you should decide whether a quorum (arbitration) device will be used.
# Change the hostname on both hosts
# (192.168.1.114 is ns2.xinfeng.com; 192.168.1.113 is ns3.xinfeng.com)
[root@ns2 ~]# hostnamectl set-hostname ns2.xinfeng.com
[root@ns2 ~]# uname -n
ns2.xinfeng.com
[root@ns2 ~]# vim /etc/hosts
192.168.1.114 ns2.xinfeng.com
192.168.1.113 ns3.xinfeng.com
# Synchronize the time
[root@ns2 ~]# ntpdate s1a.time.edu.cn
[root@ns2 ~]# ssh 192.168.1.113 'date';date
The authenticity of host '192.168.1.113 (192.168.1.113)' can't be established.
ECDSA key fingerprint is 09:f9:39:8c:35:4d:ba:2d:13:4f:3c:9c:b1:58:54:ec.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '192.168.1.113' (ECDSA) to the list of known hosts.
root@192.168.1.113's password:
2016年 05月 28日 星期六 13:18:07 CST
2016年 05月 28日 星期六 13:18:07 CST
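The summary at the end of this article also calls for passwordless SSH between the nodes, which the session above does not yet show (note the password prompt). A minimal sketch, assuming the default key paths and run on each node in turn:

# Generate a passphrase-less key pair and push the public key to the peer
[root@ns2 ~]# ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
[root@ns2 ~]# ssh-copy-id root@ns3.xinfeng.com
# Verify: this should no longer prompt for a password
[root@ns2 ~]# ssh ns3.xinfeng.com 'date'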
2. Install pcs and start the cluster
#192.168.1.113
[root@ns3 ~]# yum install pcs
#192.168.1.114
[root@ns2 ~]# yum install pcs
# Use ansible to start the service and enable it at boot
[root@ns2 ~]# vim /etc/ansible/hosts
[ha]
192.168.1.114
192.168.1.113
[root@ns2 ~]# ansible ha -m service -a 'name=pcsd state=started enabled=yes'
192.168.1.113 | SUCCESS => {
    "changed": false,
    "enabled": true,
    "name": "pcsd",
    "state": "started"
}
192.168.1.114 | SUCCESS => {
    "changed": true,
    "enabled": true,
    "name": "pcsd",
    "state": "started"
}
# Use ansible to confirm the service is running
[root@ns2 ~]# ansible ha -m shell -a 'systemctl status pcsd'
192.168.1.114 | SUCCESS | rc=0 >>
● pcsd.service - PCS GUI and remote configuration interface
   Loaded: loaded (/usr/lib/systemd/system/pcsd.service; enabled; vendor preset: disabled)
   Active: active (running) since 六 2016-05-28 13:36:19 CST; 2min 32s ago
 Main PID: 2736 (pcsd)
   CGroup: /system.slice/pcsd.service
           ├─2736 /bin/sh /usr/lib/pcsd/pcsd start
           ├─2740 /bin/bash -c ulimit -S -c 0 >/dev/null 2>&1 ; /usr/bin/ruby -I/usr/lib/pcsd /usr/lib/pcsd/ssl.rb
           └─2741 /usr/bin/ruby -I/usr/lib/pcsd /usr/lib/pcsd/ssl.rb

5月 28 13:36:16 ns2.xinfeng.com systemd[1]: Starting PCS GUI and remote configuration interface...
5月 28 13:36:19 ns2.xinfeng.com systemd[1]: Started PCS GUI and remote configuration interface.

192.168.1.113 | SUCCESS | rc=0 >>
● pcsd.service - PCS GUI and remote configuration interface
   Loaded: loaded (/usr/lib/systemd/system/pcsd.service; enabled; vendor preset: disabled)
   Active: active (running) since 六 2016-05-28 13:35:26 CST; 3min 24s ago
 Main PID: 2620 (pcsd)
   CGroup: /system.slice/pcsd.service
           ├─2620 /bin/sh /usr/lib/pcsd/pcsd start
           ├─2624 /bin/bash -c ulimit -S -c 0 >/dev/null 2>&1 ; /usr/bin/ruby -I/usr/lib/pcsd /usr/lib/pcsd/ssl.rb
           └─2625 /usr/bin/ruby -I/usr/lib/pcsd /usr/lib/pcsd/ssl.rb

5月 28 13:35:24 ns3.xinfeng.com systemd[1]: Starting PCS GUI and remote configuration interface...
5月 28 13:35:26 ns3.xinfeng.com systemd[1]: Started PCS GUI and remote configuration interface.
# Set a password for the hacluster user
[root@ns2 ~]# ansible ha -m shell -a 'echo "123" | passwd --stdin hacluster'
192.168.1.113 | SUCCESS | rc=0 >>
更改用戶 hacluster 的密碼 。
passwd:所有的身份驗證令牌已經成功更新。

192.168.1.114 | SUCCESS | rc=0 >>
更改用戶 hacluster 的密碼 。
passwd:所有的身份驗證令牌已經成功更新。
# Authenticate the nodes with the hacluster user and the password set above;
# mind your iptables rules here, or the nodes will fail to communicate
[root@ns2 ~]# pcs cluster auth ns2.xinfeng.com ns3.xinfeng.com
Username: hacluster
Password:
ns2.xinfeng.com: Authorized
ns3.xinfeng.com: Authorized
# Set up a cluster named xinfengcluster containing the two nodes
[root@ns2 ~]# pcs cluster setup --name xinfengcluster ns2.xinfeng.com ns3.xinfeng.com
Shutting down pacemaker/corosync services...
Redirecting to /bin/systemctl stop  pacemaker.service
Redirecting to /bin/systemctl stop  corosync.service
Killing any remaining services...
Removing all cluster configuration files...
ns2.xinfeng.com: Succeeded
ns3.xinfeng.com: Succeeded
Synchronizing pcsd certificates on nodes ns2.xinfeng.com, ns3.xinfeng.com...
ns2.xinfeng.com: Success
ns3.xinfeng.com: Success
Restaring pcsd on the nodes in order to reload the certificates...
ns2.xinfeng.com: Success
ns3.xinfeng.com: Success
# Inspect the generated config file
[root@ns2 ~]# cat /etc/corosync/corosync.conf
totem {                              # cluster communication settings
    version: 2                       # config version
    secauth: off                     # whether authentication/encryption is enabled
    cluster_name: xinfengcluster     # cluster name
    transport: udpu                  # transport protocol; udpu, or udp
}

nodelist {                           # all nodes in the cluster
    node {
        ring0_addr: ns2.xinfeng.com
        nodeid: 1                    # node ID
    }

    node {
        ring0_addr: ns3.xinfeng.com
        nodeid: 2
    }
}

quorum {                             # quorum voting
    provider: corosync_votequorum    # the voting system
    two_node: 1                      # whether this is a two-node cluster
}

logging {                            # logging
    to_logfile: yes                  # write a dedicated log file
    logfile: /var/log/cluster/corosync.log   # log file location
    to_syslog: yes                   # also log to syslog
}
# Start the cluster
[root@ns2 ~]# pcs cluster start --all
ns3.xinfeng.com: Starting Cluster...
ns2.xinfeng.com: Starting Cluster...
# Check that the ns2.xinfeng.com node is up
[root@ns2 ~]# corosync-cfgtool -s
Printing ring status.
Local node ID 1
RING ID 0
        id      = 192.168.1.114
        status  = ring 0 active with no faults
# Check that the ns3.xinfeng.com node is up
[root@ns3 ~]# corosync-cfgtool -s
Printing ring status.
Local node ID 2
RING ID 0
        id      = 192.168.1.113
        status  = ring 0 active with no faults
# View the cluster membership
[root@ns2 ~]# corosync-cmapctl | grep members
runtime.totem.pg.mrp.srp.members.1.config_version (u64) = 0
runtime.totem.pg.mrp.srp.members.1.ip (str) = r(0) ip(192.168.1.114)
runtime.totem.pg.mrp.srp.members.1.join_count (u32) = 1
runtime.totem.pg.mrp.srp.members.1.status (str) = joined
runtime.totem.pg.mrp.srp.members.2.config_version (u64) = 0
runtime.totem.pg.mrp.srp.members.2.ip (str) = r(0) ip(192.168.1.113)
runtime.totem.pg.mrp.srp.members.2.join_count (u32) = 1
runtime.totem.pg.mrp.srp.members.2.status (str) = joined
[root@ns2 ~]# pcs status
Cluster name: xinfengcluster
WARNING: no stonith devices and stonith-enabled is not false
Last updated: Sat May 28 14:38:23 2016
Last change: Sat May 28 14:33:15 2016 by hacluster via crmd on ns2.xinfeng.com
Stack: corosync
Current DC: ns2.xinfeng.com (version 1.1.13-10.el7_2.2-44eb2dd) - partition with quorum   # the DC is the cluster-wide arbitration node
2 nodes and 0 resources configured

Online: [ ns2.xinfeng.com ns3.xinfeng.com ]

Full list of resources:

PCSD Status:
  ns2.xinfeng.com: Online
  ns3.xinfeng.com: Online

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled
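The comment above about iptables matters: pcs cluster auth fails if the nodes cannot reach each other's pcsd. As a hedged sketch, assuming firewalld is in use (its shipped "high-availability" service definition covers, among others, pcsd on TCP 2224 and corosync on UDP 5404-5405; this step is not part of the original session):

# Run on both nodes
[root@ns2 ~]# firewall-cmd --permanent --add-service=high-availability
[root@ns2 ~]# firewall-cmd --reload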
3. Configure the cluster with crmsh
Install the yum repository from openSUSE's network:ha-clustering project.
CentOS 6
cd /etc/yum.repos.d/
wget http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-6/network:ha-clustering:Stable.repo
cd
yum -y install crmsh
CentOS 7
cd /etc/yum.repos.d/
wget http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-7/network:ha-clustering:Stable.repo
cd
yum -y install crmsh
Configure crmsh on one of the hosts (192.168.1.114):
# Show the current cluster status
[root@ns2 yum.repos.d]# crm status
Last updated: Sat May 28 16:52:52 2016
Last change: Sat May 28 14:33:15 2016 by hacluster via crmd on ns2.xinfeng.com
Stack: corosync
Current DC: ns2.xinfeng.com (version 1.1.13-10.el7_2.2-44eb2dd) - partition with quorum
2 nodes and 0 resources configured

Online: [ ns2.xinfeng.com ns3.xinfeng.com ]

Full list of resources:
4. Install httpd on both nodes
[root@ns2 ~]# ansible ha -m shell -a 'yum install httpd -y'
[root@ns2 ~]# echo "<h1>ns2.xinfeng.com</h1>" > /var/www/html/index.html
[root@ns3 ~]# echo "<h1>ns3.xinfeng.com</h1>" > /var/www/html/index.html
# Test that the service starts and the pages are served correctly
[root@ns2 ~]# ansible ha -m service -a 'name=httpd state=started enabled=yes'
# On CentOS 6 the service must be stopped AND disabled at boot; the service
# is about to become a cluster resource, so make sure it neither runs nor starts at boot
[root@ns2 ~]# ansible ha -m service -a 'name=httpd state=stopped enabled=no'
# On CentOS 7 the service must be stopped but left enabled at boot
[root@ns2 ~]# ansible ha -m service -a 'name=httpd state=stopped enabled=yes'
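The "pages are served correctly" check is not shown above; a quick test while httpd is still running might look like this (the responses assume the index pages just created):

[root@ns2 ~]# curl http://192.168.1.114
<h1>ns2.xinfeng.com</h1>
[root@ns2 ~]# curl http://192.168.1.113
<h1>ns3.xinfeng.com</h1>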
5. Configure the cluster
The VIP is 192.168.1.91 and the service is httpd; the VIP and httpd will both be configured as cluster resources.
[root@ns2 ~]# crm
crm(live)# ra                     # enter the resource-agent section
crm(live)ra# classes              # list the available resource classes
lsb
ocf / .isolation heartbeat openstack pacemaker
service
stonith
systemd
crm(live)ra# list systemd         # list services manageable via the systemd class; httpd is among them
NetworkManager  NetworkManager-wait-online  auditd  brandbot
corosync  cpupower  crond  dbus
display-manager  dm-event  dracut-shutdown  ebtables
emergency  exim  firewalld  getty@tty1
httpd  ip6tables  iptables  irqbalance
kdump  kmod-static-nodes  ldconfig  libvirtd
lvm2-activation  lvm2-lvmetad  lvm2-lvmpolld  lvm2-monitor
lvm2-pvscan@8:2  microcode  network  pacemaker
pcsd  plymouth-quit  plymouth-quit-wait  plymouth-read-write
plymouth-start  polkit  postfix  rc-local
rescue  rhel-autorelabel  rhel-autorelabel-mark  rhel-configure
rhel-dmesg  rhel-import-state  rhel-loadmodules  rhel-readonly
rsyslog  sendmail  sshd  sshd-keygen
syslog  systemd-ask-password-console  systemd-ask-password-plymouth  systemd-ask-password-wall
systemd-binfmt  systemd-firstboot  systemd-fsck-root  systemd-hwdb-update
systemd-initctl  systemd-journal-catalog-update  systemd-journal-flush  systemd-journald
systemd-logind  systemd-machine-id-commit  systemd-modules-load  systemd-random-seed
systemd-random-seed-load  systemd-readahead-collect  systemd-readahead-done  systemd-readahead-replay
systemd-reboot  systemd-remount-fs  systemd-shutdownd  systemd-sysctl
systemd-sysusers  systemd-tmpfiles-clean  systemd-tmpfiles-setup  systemd-tmpfiles-setup-dev
systemd-udev-trigger  systemd-udevd  systemd-update-done  systemd-update-utmp
systemd-update-utmp-runlevel  systemd-user-sessions  systemd-vconsole-setup  tuned
wpa_supplicant
crm(live)ra# cd
crm(live)# configure              # define a resource named webip with IP 192.168.1.91
crm(live)configure# primitive webip ocf:heartbeat:IPaddr params ip=192.168.1.91
crm(live)configure# show
node 1: ns2.xinfeng.com
node 2: ns3.xinfeng.com
primitive webip IPaddr \
        params ip=192.168.1.91
property cib-bootstrap-options: \
        have-watchdog=false \
        dc-version=1.1.13-10.el7_2.2-44eb2dd \
        cluster-infrastructure=corosync \
        cluster-name=xinfengcluster
crm(live)configure# verify        # validate; it errors because no fencing (STONITH) device is defined
ERROR: error: unpack_resources:  Resource start-up disabled since no STONITH resources have been defined
error: unpack_resources:  Either configure some or disable STONITH with the stonith-enabled option
error: unpack_resources:  NOTE: Clusters with shared data need STONITH to ensure data integrity
Errors found during check: config not valid
crm(live)configure# property stonith-enabled=false   # turn off the STONITH requirement
crm(live)configure# verify        # validate again
crm(live)configure# commit        # commit so the configuration takes effect
crm(live)configure# cd
crm(live)# status
Last updated: Sat May 28 17:41:46 2016
Last change: Sat May 28 17:41:31 2016 by root via cibadmin on ns2.xinfeng.com
Stack: corosync
Current DC: ns2.xinfeng.com (version 1.1.13-10.el7_2.2-44eb2dd) - partition with quorum
2 nodes and 1 resource configured

Online: [ ns2.xinfeng.com ns3.xinfeng.com ]

Full list of resources:

 webip  (ocf::heartbeat:IPaddr):  Started ns2.xinfeng.com   # the VIP has started on ns2
crm(live)# quit
bye
[root@ns2 ~]# ip addr             # verify that the VIP is configured on the NIC
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eno16777728: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:91:57:d1 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.114/24 brd 192.168.1.255 scope global dynamic eno16777728
       valid_lft 5918sec preferred_lft 5918sec
    inet 192.168.1.91/24 brd 192.168.1.255 scope global secondary eno16777728
       valid_lft forever preferred_lft forever
    inet6 fe80::20c:29ff:fe91:57d1/64 scope link
       valid_lft forever preferred_lft forever
# Switch the current node to standby
[root@ns2 ~]# crm
crm(live)# node
crm(live)node# standby
crm(live)node# cd
crm(live)# status
Last updated: Sat May 28 17:45:41 2016
Last change: Sat May 28 17:45:34 2016 by root via crm_attribute on ns2.xinfeng.com
Stack: corosync
Current DC: ns2.xinfeng.com (version 1.1.13-10.el7_2.2-44eb2dd) - partition with quorum
2 nodes and 1 resource configured

Node ns2.xinfeng.com: standby
Online: [ ns3.xinfeng.com ]

Full list of resources:

 webip  (ocf::heartbeat:IPaddr):  Started ns3.xinfeng.com
# Bring the node back online
crm(live)# node online
crm(live)# status
Last updated: Sat May 28 17:46:40 2016
Last change: Sat May 28 17:46:37 2016 by root via crm_attribute on ns2.xinfeng.com
Stack: corosync
Current DC: ns2.xinfeng.com (version 1.1.13-10.el7_2.2-44eb2dd) - partition with quorum
2 nodes and 1 resource configured

Online: [ ns2.xinfeng.com ns3.xinfeng.com ]

Full list of resources:

 webip  (ocf::heartbeat:IPaddr):  Started ns3.xinfeng.com
# Define the httpd resource, named webser
[root@ns2 ~]# crm
crm(live)# configure
crm(live)configure# primitive webser systemd:httpd
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# cd
crm(live)# status
Last updated: Sat May 28 17:50:15 2016
Last change: Sat May 28 17:49:56 2016 by root via cibadmin on ns2.xinfeng.com
Stack: corosync
Current DC: ns2.xinfeng.com (version 1.1.13-10.el7_2.2-44eb2dd) - partition with quorum
2 nodes and 2 resources configured

Online: [ ns2.xinfeng.com ns3.xinfeng.com ]

Full list of resources:

 webip  (ocf::heartbeat:IPaddr):  Started ns3.xinfeng.com
 webser (systemd:httpd):  Started ns2.xinfeng.com
# Put both resources into the webhttp group; the start order is webip, then webser
crm(live)# configure
crm(live)configure# group webhttp webip webser
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# cd
crm(live)# status
Last updated: Sat May 28 17:52:48 2016
Last change: Sat May 28 17:52:41 2016 by root via cibadmin on ns2.xinfeng.com
Stack: corosync
Current DC: ns2.xinfeng.com (version 1.1.13-10.el7_2.2-44eb2dd) - partition with quorum
2 nodes and 2 resources configured

Online: [ ns2.xinfeng.com ns3.xinfeng.com ]

Full list of resources:

 Resource Group: webhttp
     webip  (ocf::heartbeat:IPaddr):  Started ns3.xinfeng.com
     webser (systemd:httpd):  Started ns3.xinfeng.com
Because this is a two-node cluster, losing quorum (fewer than half of the total votes) would normally keep resources from running or being transferred. There are four ways to deal with this:
1. Add a ping node (see the sketch after the session below).
2. Add a quorum (arbitration) disk.
3. Use an odd number of cluster nodes.
4. Simply ignore the loss of quorum.
Here I use the fourth approach:
[root@ns2 ~]# crm
crm(live)# configure
crm(live)configure# property no-quorum-policy=ignore
crm(live)configure# verify
crm(live)configure# commit
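For comparison, option 1 would add a ping resource cloned to every node so that each node can judge its own connectivity. A minimal sketch with a hypothetical gateway 192.168.1.1 (the names and parameters here are illustrative, not part of this setup):

crm(live)configure# primitive p_ping ocf:pacemaker:ping \
    params host_list=192.168.1.1 multiplier=100 \
    op monitor interval=10s timeout=60s
crm(live)configure# clone cl_ping p_ping    # run the ping check on all nodes
crm(live)configure# commit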
Ignoring quorum alone is not enough, though: the resources are not yet monitored, so if a resource fails it still will not be transferred.
For now we can only test whether the resources have started on ns3.xinfeng.com, for example as shown below.
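A simple way to do that from any machine on the LAN is to hit the VIP and see which node answers (a hypothetical check; the output assumes ns3 currently holds both resources, as in the status above):

$ curl http://192.168.1.91
<h1>ns3.xinfeng.com</h1>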
To monitor a resource, the monitoring operation must be defined together with the resource in the configure-level primitive command, so first delete the previously defined resources and then redefine them.
[root@ns2 ~]# crm
crm(live)# resource
crm(live)resource# show
 Resource Group: webhttp
     webip  (ocf::heartbeat:IPaddr):  Started
     webser (systemd:httpd):  Started
crm(live)resource# stop webhttp   # stop all the resources
crm(live)resource# show
 Resource Group: webhttp
     webip  (ocf::heartbeat:IPaddr):  (target-role:Stopped) Stopped
     webser (systemd:httpd):  (target-role:Stopped) Stopped
crm(live)configure# edit          # (back at the configure level) edit the resource definitions
node 1: ns2.xinfeng.com \
        attributes standby=off
node 2: ns3.xinfeng.com
primitive webip IPaddr \          # delete
        params ip=192.168.1.91    # delete
primitive webser systemd:httpd    # delete
group webhttp webip webser \      # delete
        meta target-role=Stopped  # delete
property cib-bootstrap-options: \
        have-watchdog=false \
        dc-version=1.1.13-10.el7_2.2-44eb2dd \
        cluster-infrastructure=corosync \
        cluster-name=xinfengcluster \
        stonith-enabled=false \
        no-quorum-policy=ignore
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# cd
crm(live)# status
Last updated: Sat May 28 23:32:01 2016
Last change: Sat May 28 23:31:49 2016 by root via cibadmin on ns2.xinfeng.com
Stack: corosync
Current DC: ns2.xinfeng.com (version 1.1.13-10.el7_2.2-44eb2dd) - partition with quorum
2 nodes and 0 resources configured

Online: [ ns2.xinfeng.com ns3.xinfeng.com ]

Full list of resources:
6. Redefine the resources with monitoring
crm(live)# configure
# Monitor every 60 seconds with a 20-second timeout; the timeout must not be
# shorter than the agent's advised minimum, or verify will complain
crm(live)configure# primitive webip ocf:heartbeat:IPaddr params ip=192.168.1.91 op monitor timeout=20s interval=60s
crm(live)configure# primitive webser systemd:httpd op monitor timeout=20s interval=60s
crm(live)configure# group webhttp webip webser
crm(live)configure# property no-quorum-policy=ignore
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# cd
crm(live)# status
Last updated: Sat May 28 23:41:03 2016
Last change: Sat May 28 23:40:36 2016 by root via cibadmin on ns2.xinfeng.com
Stack: corosync
Current DC: ns2.xinfeng.com (version 1.1.13-10.el7_2.2-44eb2dd) - partition with quorum
2 nodes and 2 resources configured

Online: [ ns2.xinfeng.com ns3.xinfeng.com ]

Full list of resources:

 Resource Group: webhttp
     webip  (ocf::heartbeat:IPaddr):  Started ns2.xinfeng.com
     webser (systemd:httpd):  Started ns2.xinfeng.com
To test it, stop the service; within one monitor interval (60 seconds) the cluster notices the failure and the service is automatically started again.
[root@ns2 ~]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eno16777728: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:91:57:d1 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.114/24 brd 192.168.1.255 scope global dynamic eno16777728
       valid_lft 6374sec preferred_lft 6374sec
    inet 192.168.1.91/24 brd 192.168.1.255 scope global secondary eno16777728
       valid_lft forever preferred_lft forever
    inet6 fe80::20c:29ff:fe91:57d1/64 scope link
       valid_lft forever preferred_lft forever
[root@ns2 ~]# service httpd stop
Redirecting to /bin/systemctl stop  httpd.service
Next, edit the configuration file and add a few garbage lines so that the service cannot start:
[root@ns2 ~]# vim /etc/httpd/conf/httpd.conf
[root@ns2 ~]# service httpd stop
Redirecting to /bin/systemctl stop  httpd.service
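If you prefer a non-interactive way to break the config, something like this has the same effect (purely illustrative; the directive name is made up):

[root@ns2 ~]# echo "ThisIsNotAValidDirective" >> /etc/httpd/conf/httpd.conf
[root@ns2 ~]# httpd -t    # the syntax check now fails, so the monitor's restart attempt will too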
The service successfully fails over to ns3.
7. [Note] Once the httpd service has been repaired, remember to clear the resource's failure records, or the resource cannot be started.
crm(live)# resource
crm(live)resource# cleanup webser   # clear webser's previous failure records
Cleaning up webser on ns2.xinfeng.com, removing fail-count-webser
Cleaning up webser on ns3.xinfeng.com, removing fail-count-webser
Waiting for 2 replies from the CRMd.. OK
crm(live)resource# show
 webip  (ocf::heartbeat:IPaddr):  Started
 webser (systemd:httpd):  Started
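If you are driving the cluster with pcs rather than crmsh, the equivalent cleanup is:

[root@ns2 ~]# pcs resource cleanup webser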
8. Define resource constraints
Delete the group resource first:
[root@ns2 ~]# crm
crm(live)# configure
crm(live)configure# delete webhttp   # delete the webhttp group
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# cd
crm(live)# status
Last updated: Sun May 29 09:22:57 2016
Last change: Sun May 29 09:22:48 2016 by root via cibadmin on ns2.xinfeng.com
Stack: corosync
Current DC: ns3.xinfeng.com (version 1.1.13-10.el7_2.2-44eb2dd) - partition with quorum
2 nodes and 2 resources configured

Online: [ ns2.xinfeng.com ns3.xinfeng.com ]

Full list of resources:

 webip  (ocf::heartbeat:IPaddr):  Started ns2.xinfeng.com
 webser (systemd:httpd):  Started ns3.xinfeng.com
Colocation constraint
[root@ns2 ~]# crm
crm(live)# configure
crm(live)configure# colocation webser_with_webip inf: webser webip   # require webser and webip to run on the same node
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# cd
crm(live)# status
Last updated: Sun May 29 09:40:36 2016
Last change: Sun May 29 09:40:28 2016 by root via cibadmin on ns2.xinfeng.com
Stack: corosync
Current DC: ns3.xinfeng.com (version 1.1.13-10.el7_2.2-44eb2dd) - partition with quorum
2 nodes and 2 resources configured

Online: [ ns2.xinfeng.com ns3.xinfeng.com ]

Full list of resources:

 webip  (ocf::heartbeat:IPaddr):  Started ns2.xinfeng.com   # both resources are now running on ns2
 webser (systemd:httpd):  Started ns2.xinfeng.com
Order constraint
crm(live)# configure
# webip starts before webser: a mandatory ordering, start webip first, then webser
crm(live)configure# order webip_before_webser mandatory: webip webser
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# show xml   # inspect the full definitions in XML
Location constraint
# First check where the resources currently run
crm(live)# status
Last updated: Sun May 29 09:48:42 2016
Last change: Sun May 29 09:44:35 2016 by root via cibadmin on ns2.xinfeng.com
Stack: corosync
Current DC: ns3.xinfeng.com (version 1.1.13-10.el7_2.2-44eb2dd) - partition with quorum
2 nodes and 2 resources configured

Online: [ ns2.xinfeng.com ns3.xinfeng.com ]

Full list of resources:

 webip  (ocf::heartbeat:IPaddr):  Started ns2.xinfeng.com
 webser (systemd:httpd):  Started ns2.xinfeng.com
# Define a location constraint so the resources prefer ns3
crm(live)# configure
# webservice_pref_node2: webip's preference for ns3 has a score of 100
crm(live)configure# location webservice_pref_node2 webip 100: ns3.xinfeng.com
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# cd
crm(live)# status
Last updated: Sun May 29 10:12:12 2016
Last change: Sun May 29 10:12:04 2016 by root via cibadmin on ns2.xinfeng.com
Stack: corosync
Current DC: ns3.xinfeng.com (version 1.1.13-10.el7_2.2-44eb2dd) - partition with quorum
2 nodes and 2 resources configured

Online: [ ns2.xinfeng.com ns3.xinfeng.com ]

Full list of resources:

 webip  (ocf::heartbeat:IPaddr):  Started ns3.xinfeng.com   # as expected, both resources have moved to ns3
 webser (systemd:httpd):  Started ns3.xinfeng.com
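For reference, the same three constraints can also be expressed with pcs; a sketch (scores and names mirror the crmsh definitions above):

[root@ns2 ~]# pcs constraint colocation add webser with webip INFINITY
[root@ns2 ~]# pcs constraint order webip then webser
[root@ns2 ~]# pcs constraint location webip prefers ns3.xinfeng.com=100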
III. Summary
1. After a resource's underlying service has been repaired, always clear the resource's failure records, or the resource cannot be started.
2. For a two-node HA cluster built with corosync + pacemaker, set the global properties that disable STONITH devices and ignore the loss of quorum (fewer than half of the total votes).
3. Watch out for the impact of SELinux and iptables on the services.
4. Make sure the nodes resolve each other through /etc/hosts.
5. Keep the nodes' time synchronized.
6. Set up passwordless (key-based) communication between the nodes.
Original article by N17_信風. If you repost it, please credit the source: http://www.www58058.com/16656