SQL Server 2025 : Failover Cluster Instance2026/03/19 |
|
Configure SQL Server Failover Cluster Instance. This example is based on the environment like follows.
+-------------------+
| [dlp.srv.world] |
| ISCSI Target |
+---------+---------+
10.0.0.30|
|
+----------------------+ | +----------------------+
| [ node01.srv.world ] |10.0.0.51 | 10.0.0.52| [ node02.srv.world ] |
| CLVM +----------+----------+ CLVM |
| SQL Server | | | SQL Server |
+----------------------+ | +----------------------+
VIP:10.0.0.50
|
| [1] |
Configure basic Cluster setting on all Nodes of Failover Cluster Instance, refer to here. |
| [2] |
Configure Fence Device on all Nodes of Failover Cluster Instance, refer to here. |
| [3] |
Configure LVM shared storage on all Nodes of Failover Cluster Instance, refer to here. |
| [4] | Disable SQL Server on all Nodes of Failover Cluster Instance and also install SQL Server for HA package. |
|
[root@node01 ~]#
systemctl disable --now mssql-server [root@node01 ~]# dnf -y install mssql-server-ha
[root@node01 ~]#
vi /etc/hosts # add cluster nodes # [MSSQL_HA] ⇒ any name that is used as SQL server resource name 10.0.0.50 MSSQL_HA 10.0.0.51 node01 node01.srv.world 10.0.0.52 node02 node02.srv.world |
| [5] | Add filesystem resource and copy SQL Server Data to it on a Node that shared storage is active. |
|
[root@node01 ~]# pcs status
Cluster name: ha_cluster
Cluster Summary:
* Stack: corosync (Pacemaker is running)
* Current DC: node01.srv.world (version 2.1.10-2.el9-5693eaeee) - partition with quorum
* Last updated: Thu Mar 19 09:58:11 2026 on node01.srv.world
* Last change: Thu Mar 19 09:50:37 2026 by root via root on node01.srv.world
* 2 nodes configured
* 2 resource instances configured
Node List:
* Online: [ node01.srv.world node02.srv.world ]
Full List of Resources:
* scsi-shooter (stonith:fence_scsi): Started node01.srv.world
* Resource Group: ha_group:
* lvm_ha (ocf:heartbeat:LVM-activate): Started node01.srv.world
Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled
[root@node01 ~]#
cp -pR /var/opt/mssql/data /var/opt/mssql/data.bk
# create filesystem resource [root@node01 ~]# pcs resource create mssql_fs ocf:heartbeat:Filesystem device=/dev/vg_ha/lv_ha directory=/var/opt/mssql/data fstype=ext4 group ha_group --future
pcs resource status
* Resource Group: ha_group:
* lvm_ha (ocf:heartbeat:LVM-activate): Started node01.srv.world
* mssql_fs (ocf:heartbeat:Filesystem): Started node01.srv.world
[root@node01 ~]#
[root@node01 ~]# pcs constraint colocation add mssql_fs with lvm_ha
df -hT /var/opt/mssql/data Filesystem Type Size Used Avail Use% Mounted on /dev/mapper/vg_ha-lv_ha ext4 9.8G 24K 9.3G 1% /var/opt/mssql/data[root@node01 ~]# cp -pR /var/opt/mssql/data.bk/* /var/opt/mssql/data/ [root@node01 ~]# chown mssql:mssql /var/opt/mssql/data [root@node01 ~]# ll /var/opt/mssql/data total 76516 -rw-rw----. 1 mssql mssql 256 Mar 19 09:55 Entropy.bin drwx------. 2 root root 16384 Mar 19 09:49 lost+found -rw-rw----. 1 mssql mssql 4915200 Mar 19 09:56 master.mdf -rw-rw----. 1 mssql mssql 2097152 Mar 19 09:56 mastlog.ldf -rw-rw----. 1 mssql mssql 8388608 Mar 19 09:56 modellog.ldf -rw-rw----. 1 mssql mssql 8388608 Mar 19 09:56 model.mdf -rw-rw----. 1 mssql mssql 14548992 Mar 19 09:56 model_msdbdata.mdf -rw-rw----. 1 mssql mssql 524288 Mar 19 09:56 model_msdblog.ldf -rw-rw----. 1 mssql mssql 2359296 Mar 19 09:56 model_replicatedmaster.ldf -rw-rw----. 1 mssql mssql 4915200 Mar 19 09:56 model_replicatedmaster.mdf -rw-rw----. 1 mssql mssql 14548992 Mar 19 09:56 msdbdata.mdf -rw-rw----. 1 mssql mssql 524288 Mar 19 09:56 msdblog.ldf -rw-rw----. 1 mssql mssql 8388608 Mar 19 09:55 tempdb2.ndf -rw-rw----. 1 mssql mssql 8388608 Mar 19 09:55 tempdb3.ndf -rw-rw----. 1 mssql mssql 8388608 Mar 19 09:55 tempdb4.ndf -rw-rw----. 1 mssql mssql 8388608 Mar 19 09:55 tempdb5.ndf -rw-rw----. 1 mssql mssql 8388608 Mar 19 09:55 tempdb6.ndf -rw-rw----. 1 mssql mssql 8388608 Mar 19 09:55 tempdb7.ndf -rw-rw----. 1 mssql mssql 8388608 Mar 19 09:55 tempdb8.ndf -rw-rw----. 1 mssql mssql 8388608 Mar 19 09:55 tempdb.mdf -rw-rw----. 1 mssql mssql 8388608 Mar 19 09:55 templog.ldf |
| [6] | Start SQL Server and add an account for Pacemaker on a Node that shared storage is active. |
|
[root@node01 ~]# systemctl start mssql-server [root@node01 ~]# sqlcmd -C -S localhost -U SA Password: # comfirm the servername and change it to cluster resource name 1> select @@servername 2> go ----------------------------------- node01 (1 rows affected) 1> exec sp_dropserver 'node01' 2> exec sp_addserver 'MSSQL_HA', 'local' 3> go 1> use master; 2> go # add [hacluster] user 1> create login hacluster with PASSWORD= N'password'; 2> go # add [sysadmin] role 1> alter server role [sysadmin] add member hacluster; 2> go 1> exit[root@node01 ~]# systemctl stop mssql-server
|
| [7] | On all Nodes, add SQL Server account info that you created above. |
|
[root@node01 ~]# echo 'hacluster' > /var/opt/mssql/secrets/passwd [root@node01 ~]# echo 'password' >> /var/opt/mssql/secrets/passwd [root@node01 ~]# chown root:root /var/opt/mssql/secrets/passwd [root@node01 ~]# chmod 600 /var/opt/mssql/secrets/passwd |
| [8] | Add SQL Server and Virtual IP address to Cluster Resource. |
|
[root@node01 ~]# pcs resource create MSSQL_HA ocf:mssql:fci op start interval=60s group ha_group --future [root@node01 ~]# pcs resource create VIP ocf:heartbeat:IPaddr2 ip=10.0.0.50 cidr_netmask=24 op monitor interval=30s group ha_group --future [root@node01 ~]# pcs resource status
* Resource Group: ha_group:
* lvm_ha (ocf:heartbeat:LVM-activate): Started node01.srv.world
* mssql_fs (ocf:heartbeat:Filesystem): Started node01.srv.world
* MSSQL_HA (ocf:mssql:fci): Started node01.srv.world (Monitoring)
* VIP (ocf:heartbeat:IPaddr2): Started node01.srv.world
[root@node01 ~]# pcs constraint colocation add MSSQL_HA with mssql_fs [root@node01 ~]# pcs constraint config
Colocation Constraints:
resource 'lvm_ha' with resource 'scsi-
shooter'
score=INFINITY
resource 'mssql_fs' with resource
'lvm_ha'
score=INFINITY
resource 'MSSQL_HA' with resource
'mssql_fs'
score=INFINITY
[root@node01 ~]# pcs resource status
* Resource Group: ha_group:
* lvm_ha (ocf:heartbeat:LVM-activate): Started node01.srv.world
* mssql_fs (ocf:heartbeat:Filesystem): Started node01.srv.world
* MSSQL_HA (ocf:mssql:fci): Started node01.srv.world
* VIP (ocf:heartbeat:IPaddr2): Started node01.srv.world
|
| [9] | It's OK. Stop active node manually and verify failover normally. |
|
[root@node01 ~]# pcs resource status
* Resource Group: ha_group:
* lvm_ha (ocf:heartbeat:LVM-activate): Started node01.srv.world
* mssql_fs (ocf:heartbeat:Filesystem): Started node01.srv.world
* MSSQL_HA (ocf:mssql:fci): Started node01.srv.world
* VIP (ocf:heartbeat:IPaddr2): Started node01.srv.world
[root@node01 ~]# sqlcmd -C -S 10.0.0.50 -U SA -Q 'select @@servername' Password: ------------------------------------- MSSQL_HA (1 rows affected) # stop active node [root@node01 ~]# pcs cluster stop node01.srv.world
# on another node [root@node02 ~]# pcs resource status
* Resource Group: ha_group:
* lvm_ha (ocf:heartbeat:LVM-activate): Started node02.srv.world
* mssql_fs (ocf:heartbeat:Filesystem): Started node02.srv.world
* MSSQL_HA (ocf:mssql:fci): Started node02.srv.world
* VIP (ocf:heartbeat:IPaddr2): Started node02.srv.world
[root@node02 ~]# sqlcmd -C -S 10.0.0.50 -U SA -Q 'select @@servername' Password: ---------------------------------- MSSQL_HA (1 rows affected) |
| Sponsored Link |
|
|