CentOS Stream 9 : Pacemaker : Add a Node : Server World

Pacemaker : Add a Node

2023/12/05

This page shows how to add a node to an existing cluster.

As an example, [node03] is added as a new node to the existing cluster.

                       +--------------------+
                       | [  ISCSI Target  ] |
                       |    dlp.srv.world   |
                       +----------+---------+
                         10.0.0.30|
                                  |
+----------------------+          |          +----------------------+
| [  Cluster Node#1  ] |10.0.0.51 | 10.0.0.52| [  Cluster Node#2  ] |
|   node01.srv.world   +----------+----------+   node02.srv.world   |
+----------------------+          |          +----------------------+
                                  |
                                  |10.0.0.53
                      +-----------------------+
                      | [  Cluster Node#3  ]  |
                      |   node03.srv.world    |
                      +-----------------------+

[1]	On the node to be added, install Pacemaker beforehand by following steps [1] and [2] of the linked installation procedure.
[2]	Add the new node to the existing cluster.

[root@node01 ~]#

pcs status

Cluster name: ha_cluster
Cluster Summary:
  * Stack: corosync (Pacemaker is running)
  * Current DC: node01.srv.world (version 2.1.6-10.1.el9-6fdc9deea29) - partition with quorum
  * Last updated: Tue Dec  5 14:37:00 2023 on node01.srv.world
  * Last change:  Tue Dec  5 14:33:19 2023 by root via cibadmin on node01.srv.world
  * 2 nodes configured
  * 1 resource instance configured

Node List:
  * Online: [ node01.srv.world node02.srv.world ]

Full List of Resources:
  * scsi-shooter        (stonith:fence_scsi):    Started node01.srv.world

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

# authenticate the new node

[root@node01 ~]#

pcs host auth node03.srv.world

Username: hacluster
Password:
node03.srv.world: Authorized

# add the new node

[root@node01 ~]#

pcs cluster node add node03.srv.world

No addresses specified for host 'node03.srv.world', using 'node03.srv.world'
Disabling sbd...
node03.srv.world: sbd disabled
Sending 'corosync authkey', 'pacemaker authkey' to 'node03.srv.world'
node03.srv.world: successful distribution of the file 'corosync authkey'
node03.srv.world: successful distribution of the file 'pacemaker authkey'
Sending updated corosync.conf to nodes...
node01.srv.world: Succeeded
node02.srv.world: Succeeded
node03.srv.world: Succeeded
node01.srv.world: Corosync configuration reloaded
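
To confirm that the new node was registered, the node count reported by [pcs status] can be compared before and after the addition. The following is a minimal sketch that assumes the output format shown above; [configured_nodes] is a hypothetical helper name, not part of pcs.

```shell
#!/bin/sh
# Hypothetical helper: read `pcs status` output from stdin and print
# the number found on the "N nodes configured" line.
configured_nodes() {
    awk '/ nodes? configured/ { print $2; exit }'
}

# Usage on a cluster node (assumes pcs is installed):
#   pcs status | configured_nodes
```

After the step above, the value would be expected to change from 2 to 3, although the new node stays offline until its cluster services are started.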

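[3]	Update the fence device settings. If SCSI fencing is configured for the fence device as in this example, first log in to the shared storage for the fence device on the new node and install the SCSI fence agent (see steps [2] and [3] of the fence device setup procedure). Then update the fence device settings as follows.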

# update the host list of the fence device

[root@node01 ~]#

pcs stonith update scsi-shooter pcmk_host_list="node01.srv.world node02.srv.world node03.srv.world"

[root@node01 ~]#

pcs stonith config scsi-shooter

Resource: scsi-shooter (class=stonith type=fence_scsi)
  Attributes: scsi-shooter-instance_attributes
    devices=/dev/disk/by-id/wwn-0x6001405f89df433c1ce4390afc6e0bad
    pcmk_host_list="node01.srv.world node02.srv.world node03.srv.world"
  Meta Attributes: scsi-shooter-meta_attributes provides=unfencing
  Operations: monitor: scsi-shooter-monitor-interval-60s interval=60s
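
The check above can also be scripted. This is a minimal sketch that assumes [pcmk_host_list] is printed as a double-quoted value, as in the output shown here; [fence_list_has_host] is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read `pcs stonith config <id>` output from stdin
# and succeed only if the given host is listed in pcmk_host_list.
fence_list_has_host() {
    host=$1
    # Extract the quoted list, split it into one host per line,
    # then require an exact whole-line match.
    sed -n 's/.*pcmk_host_list="\([^"]*\)".*/\1/p' |
        tr ' ' '\n' |
        grep -Fxq -- "$host"
}

# Usage on a cluster node (assumes pcs is installed):
#   pcs stonith config scsi-shooter | fence_list_has_host node03.srv.world && echo listed
```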

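[4]	If resources are already configured in the existing cluster, each resource needs its own preparation so that the new node can become active correctly on failover. For example, if LVM shared storage is configured as in the linked example, the new node must be set up in advance to recognize the LVM shared storage.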

[root@node03 ~]#

iscsiadm -m discovery -t sendtargets -p 10.0.0.30

10.0.0.30:3260,1 iqn.2022-01.world.srv:dlp.target01
10.0.0.30:3260,1 iqn.2022-01.world.srv:dlp.target02

[root@node03 ~]#

iscsiadm -m node --login --target iqn.2022-01.world.srv:dlp.target02

[root@node03 ~]#

iscsiadm -m session -o show

tcp: [1] 10.0.0.30:3260,1 iqn.2022-01.world.srv:dlp.target01 (non-flash)
tcp: [2] 10.0.0.30:3260,1 iqn.2022-01.world.srv:dlp.target02 (non-flash)

[root@node03 ~]#

lvmdevices --adddev /dev/sdb1

[root@node03 ~]#

lvm pvscan --cache --activate ay

  pvscan[17327] PV /dev/vda2 online, VG cs is complete.
  pvscan[17327] PV /dev/sdb1 ignore foreign VG.
  pvscan[17327] VG cs run autoactivation.
  2 logical volume(s) in volume group "cs" now active
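
Before registering the LVM device, it can help to confirm that the expected iSCSI target really has an active session. The following is a minimal sketch that assumes the session list format shown above; [iscsi_logged_in] is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read `iscsiadm -m session` output from stdin and
# succeed only if a tcp session exists for the given target IQN.
iscsi_logged_in() {
    target=$1
    # Field 4 of each "tcp: [N] portal,tpgt iqn" line is the target name.
    awk -v t="$target" '$1 == "tcp:" && $4 == t { found = 1 } END { exit !found }'
}

# Usage on the new node (assumes iscsiadm is installed):
#   iscsiadm -m session | iscsi_logged_in iqn.2022-01.world.srv:dlp.target02 && echo ok
```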

[5]	The same applies to other resources. For example, if Apache httpd is configured as in the linked example, apply the settings from [1] of that page on the new node.
[6]	After the setup for every resource is complete, start the cluster services on the new node.

# start the new node

[root@node01 ~]#

pcs cluster start node03.srv.world

node03.srv.world: Starting Cluster...

[root@node01 ~]#

pcs cluster enable node03.srv.world

node03.srv.world: Cluster Enabled

[root@node01 ~]#

pcs status

Cluster name: ha_cluster
Cluster Summary:
  * Stack: corosync (Pacemaker is running)
  * Current DC: node02.srv.world (version 2.1.6-10.1.el9-6fdc9deea29) - partition with quorum
  * Last updated: Tue Dec  5 15:31:11 2023 on node01.srv.world
  * Last change:  Tue Dec  5 15:30:11 2023 by hacluster via crmd on node02.srv.world
  * 3 nodes configured
  * 5 resource instances configured

Node List:
  * Online: [ node01.srv.world node02.srv.world node03.srv.world ]

Full List of Resources:
  * scsi-shooter        (stonith:fence_scsi):    Started node01.srv.world
  * Resource Group: ha_group:
    * lvm_ha    (ocf:heartbeat:LVM-activate):    Started node01.srv.world
    * httpd_fs  (ocf:heartbeat:Filesystem):      Started node01.srv.world
    * httpd_vip (ocf:heartbeat:IPaddr2):         Started node01.srv.world
    * website   (ocf:heartbeat:apache):  Started node01.srv.world

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
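
A node may take a few seconds to join after [pcs cluster start]. The following is a minimal sketch of a polling check that assumes the [Online: [ ... ]] line format shown above; [node_online] and [wait_online] are hypothetical helper names, and the retry count and 2-second interval are arbitrary choices.

```shell
#!/bin/sh
# Hypothetical helper: read `pcs status` output from stdin and succeed
# only if the given node appears on the "Online: [ ... ]" line.
node_online() {
    node=$1
    awk -v n="$node" '
        /\* Online: \[/ { for (i = 1; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }'
}

# Hypothetical helper: poll `pcs status` until the node is online
# or the number of tries (default 30) is used up.
wait_online() {
    node=$1
    tries=${2:-30}
    while [ "$tries" -gt 0 ]; do
        if pcs status 2>/dev/null | node_online "$node"; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 2
    done
    return 1
}

# Usage on a cluster node (assumes pcs is installed):
#   wait_online node03.srv.world && echo "node03 is online"
```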

[7]	Run fencing and verify that the resources fail over to the new node correctly.

[root@node03 ~]#

pcs stonith fence node01.srv.world

Node: node01.srv.world fenced

[root@node03 ~]#

pcs status

Cluster name: ha_cluster
Cluster Summary:
  * Stack: corosync (Pacemaker is running)
  * Current DC: node03.srv.world (version 2.1.6-10.1.el9-6fdc9deea29) - partition with quorum
  * Last updated: Tue Dec  5 16:18:09 2023 on node01.srv.world
  * Last change:  Tue Dec  5 16:15:44 2023 by hacluster via crmd on node03.srv.world
  * 3 nodes configured
  * 5 resource instances configured

Node List:
  * Online: [ node01.srv.world node02.srv.world node03.srv.world ]

Full List of Resources:
  * scsi-shooter        (stonith:fence_scsi):    Started node03.srv.world
  * Resource Group: ha_group:
    * lvm_ha    (ocf:heartbeat:LVM-activate):    Started node03.srv.world
    * httpd_fs  (ocf:heartbeat:Filesystem):      Started node03.srv.world
    * httpd_vip (ocf:heartbeat:IPaddr2):         Started node03.srv.world
    * website   (ocf:heartbeat:apache):  Started node03.srv.world

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
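
To check the failover result without reading the list by eye, the resource lines can be filtered for anything not started on the expected node. This is a minimal sketch that assumes the resource line format shown above; [resources_not_on] is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read `pcs status` output from stdin and print the
# name of every resource that is not "Started" on the given node.
# Empty output means all listed resources run on that node.
resources_not_on() {
    node=$1
    awk -v n="$node" '
        /\(ocf:|\(stonith:/ && !($NF == n && $(NF - 1) == "Started") { print $2 }'
}

# Usage on a cluster node (assumes pcs is installed):
#   pcs status | resources_not_on node03.srv.world
```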

[8]	To remove a node, run the following.

[root@node01 ~]#

pcs status

Cluster name: ha_cluster
Cluster Summary:
  * Stack: corosync (Pacemaker is running)
  * Current DC: node02.srv.world (version 2.1.6-10.1.el9-6fdc9deea29) - partition with quorum
  * Last updated: Tue Dec  5 15:31:11 2023 on node01.srv.world
  * Last change:  Tue Dec  5 15:30:11 2023 by hacluster via crmd on node02.srv.world
  * 3 nodes configured
  * 5 resource instances configured

Node List:
  * Online: [ node01.srv.world node02.srv.world node03.srv.world ]

Full List of Resources:
  * scsi-shooter        (stonith:fence_scsi):    Started node01.srv.world
  * Resource Group: ha_group:
    * lvm_ha    (ocf:heartbeat:LVM-activate):    Started node01.srv.world
    * httpd_fs  (ocf:heartbeat:Filesystem):      Started node01.srv.world
    * httpd_vip (ocf:heartbeat:IPaddr2):         Started node01.srv.world
    * website   (ocf:heartbeat:apache):  Started node01.srv.world

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

[root@node01 ~]#

pcs cluster node remove node03.srv.world

Destroying cluster on hosts: 'node03.srv.world'...
node03.srv.world: Successfully destroyed cluster
Sending updated corosync.conf to nodes...
node01.srv.world: Succeeded
node02.srv.world: Succeeded
node01.srv.world: Corosync configuration reloaded

# update the host list of the fence device

[root@node01 ~]#

pcs stonith update scsi-shooter pcmk_host_list="node01.srv.world node02.srv.world"

[root@node01 ~]#

pcs status

Cluster name: ha_cluster
Cluster Summary:
  * Stack: corosync (Pacemaker is running)
  * Current DC: node02.srv.world (version 2.1.6-10.1.el9-6fdc9deea29) - partition with quorum
  * Last updated: Tue Dec  5 16:23:21 2023 on node01.srv.world
  * Last change:  Tue Dec  5 16:22:57 2023 by root via cibadmin on node01.srv.world
  * 2 nodes configured
  * 5 resource instances configured

Node List:
  * Online: [ node01.srv.world node02.srv.world ]

Full List of Resources:
  * scsi-shooter        (stonith:fence_scsi):    Started node01.srv.world
  * Resource Group: ha_group:
    * lvm_ha    (ocf:heartbeat:LVM-activate):    Started node01.srv.world
    * httpd_fs  (ocf:heartbeat:Filesystem):      Started node01.srv.world
    * httpd_vip (ocf:heartbeat:IPaddr2):         Started node01.srv.world
    * website   (ocf:heartbeat:apache):  Started node01.srv.world

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
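
The host list passed to [pcs stonith update] after a removal is simply the previous list without the removed node. The following is a minimal sketch of building that string; [remove_host] is a hypothetical helper name, and the command is only printed, not executed.

```shell
#!/bin/sh
# Hypothetical helper: print the remaining hosts, space separated,
# after dropping the first argument from the rest of the arguments.
remove_host() {
    drop=$1
    shift
    out=
    for h in "$@"; do
        [ "$h" = "$drop" ] && continue
        out="${out:+$out }$h"
    done
    printf '%s\n' "$out"
}

# Usage: build the new list, then review the command before running it.
#   list=$(remove_host node03.srv.world node01.srv.world node02.srv.world node03.srv.world)
#   echo "pcs stonith update scsi-shooter pcmk_host_list=\"$list\""
```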