SAN Storage Failover and Failback PowerShell Scripts for Failover Cluster
Script template to simplify failover and failback operations of SAN storage replication between sites for disaster recovery or drill tests
General storage failover and failback PowerShell template for Failover Cluster (e.g. Hyper-V) with an easy-to-use interactive console menu
Note: This is a template to ease development. The storage-vendor-specific parts of the scripts have to be coded by yourself. Alternatively, professionals can be engaged on a freelancing platform to develop them.
Introduction
There sometimes comes a need to simplify complex operations, in this case failover and failback of SAN storage replication between sites (e.g. production and DR), so that operators or less technically confident colleagues can perform them more easily during disasters or drill tests. This template has been created for that purpose.
Written primarily in PowerShell, this package contains a set of SAN storage failover and failback scripts for Microsoft Failover Cluster (including Hyper-V clusters), plus vendor-neutral pseudo code for the SAN storage operations (to be modified to support different SAN vendors). Besides storage failover and failback, services running on top of the storage, such as databases and virtual machines, can also be catered for.
Moreover, it features a user-friendly interactive console menu, where complex operations are handled by the scripts in the backend.
Serving as a general template for implementation, the package requires one to fill in the blanks (the SAN-vendor-specific commands) to fit one's own use. Although the Failover Cluster parts and the interactive console menu are already there, further coding and testing cannot be avoided.
The scripts were designed to be as reusable (e.g. parameterized) as possible, so that I can more easily adapt them to different projects. They have also been released as an open-source project in the GitHub repository ws-storage-failover. IT pros are encouraged to fork it for their own SAN storage systems and, if possible, contribute their modified bits in return, assisting others who might want to implement it for their own choice of SAN storage systems.
Features
Script failover and failback between replicated SAN storage with Microsoft Failover Cluster (Hyper-V and others)
Operator-friendly interactive console menus through which failover, failback, validation and status reporting can be performed
Steps can be performed individually or all at once
Scripted in PowerShell (Windows operations) and pseudo code (SAN-specific operations) as a template to be modified or customized
Define variables or fill in parameters in SAN-Parameters.xml and SAN-Parameters.ps1
Pseudo code of SAN operations is provided inside <# #> block comments, to be replaced as required per the command-line reference of the SAN storage
Speed up development of automated/manual SAN failover/failback with this set of scripts
Leverage the command-line interface that SAN storage systems usually provide over SSH, which makes it possible to command them from scripts using plink.exe (from PuTTY)
Default scenario is for storage systems located in two sites with SAN-level replication
Console outputs are logged under the Logs folder, with error messages stored separately
Open-source project on GitHub, encouraging forking
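The plink.exe approach mentioned in the feature list can be sketched roughly as follows. This is an illustration under stated assumptions, not code from the package: the function names Get-PlinkArgument and Invoke-SanCommand, their parameters and the default install path are all hypothetical. Only the plink options used (-ssh, -batch, -l, -pw) are standard PuTTY ones.

```powershell
# Hypothetical helper: builds the plink.exe argument list for one SAN command.
function Get-PlinkArgument {
    param(
        [Parameter(Mandatory)] [string] $SanHost,
        [Parameter(Mandatory)] [string] $User,
        [Parameter(Mandatory)] [string] $Password,
        [Parameter(Mandatory)] [string] $Command
    )
    # -ssh   : force the SSH protocol
    # -batch : disable interactive prompts, so an uncached host key fails instead of hanging
    # -l/-pw : login name and password
    @('-ssh', '-batch', '-l', $User, '-pw', $Password, $SanHost, $Command)
}

# Hypothetical wrapper: runs one command on the SAN and returns its output lines.
function Invoke-SanCommand {
    param(
        [string] $PlinkPath = 'C:\Program Files\PuTTY\plink.exe',  # assumed install location
        [Parameter(Mandatory)] [string] $SanHost,
        [Parameter(Mandatory)] [string] $User,
        [Parameter(Mandatory)] [string] $Password,
        [Parameter(Mandatory)] [string] $Command
    )
    if (-not (Test-Path -LiteralPath $PlinkPath -PathType Leaf)) {
        throw "plink.exe not found at '$PlinkPath'."
    }
    $plinkArgs = Get-PlinkArgument -SanHost $SanHost -User $User -Password $Password -Command $Command
    $output = & $PlinkPath @plinkArgs 2>&1
    if ($LASTEXITCODE -ne 0) {
        throw "Command '$Command' on $SanHost failed with exit code $LASTEXITCODE."
    }
    $output
}
```

A call might look like Invoke-SanCommand -SanHost 'san-prod' -User 'sanadmin' -Password $pw -Command 'lsvdisk', where the host name and account are placeholders. Keeping the argument building separate from the call makes it possible to inspect the command line without contacting a SAN.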
Requirements
SAN storage systems with SAN replication enabled (2 sites — production and DR sites assumed) with hosts running Failover Cluster (e.g. Hyper-V) in each site
A Windows client with PowerShell 3.0 or above to run the console menu (Windows 8.1 and Windows Server 2012 R2 come with version 4.0, which satisfies this)
PuTTY should be installed in the location specified in SAN-Parameters.ps1
A one-time manual SSH connection to the SAN storage systems in both sites (with PuTTY or plink.exe) may be required in order to cache their host keys in the registry
Run PowerShell as Administrator prior to running the Console Menu script
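Some of the requirements above can be checked up front. The following is a minimal sketch, assuming a hypothetical function Test-SanPrerequisite and an assumed default plink.exe path (in the package, the PuTTY location comes from SAN-Parameters.ps1). It covers only the PowerShell version, the presence of plink.exe and elevation.

```powershell
# Hypothetical pre-flight check: returns a list of problems (nothing when all is well).
function Test-SanPrerequisite {
    param(
        [string] $PlinkPath = 'C:\Program Files\PuTTY\plink.exe'  # assumed location
    )
    $problems = @()

    # PowerShell 3.0 or above
    if ($PSVersionTable.PSVersion.Major -lt 3) {
        $problems += "PowerShell 3.0 or above is required (found $($PSVersionTable.PSVersion))."
    }

    # plink.exe from PuTTY
    if (-not (Test-Path -LiteralPath $PlinkPath -PathType Leaf)) {
        $problems += "plink.exe not found at '$PlinkPath'."
    }

    # Elevation; the check uses Windows-only APIs, so it is skipped elsewhere
    if ($env:OS -eq 'Windows_NT') {
        $identity  = [Security.Principal.WindowsIdentity]::GetCurrent()
        $principal = New-Object Security.Principal.WindowsPrincipal($identity)
        $adminRole = [Security.Principal.WindowsBuiltInRole]::Administrator
        if (-not $principal.IsInRole($adminRole)) {
            $problems += 'PowerShell is not running as Administrator.'
        }
    }

    $problems
}
```

Wrapping the call as @(Test-SanPrerequisite) always yields an array, so its Count can be tested before the menu is shown.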
Script File Structure
|   SAN-Console_Menu.ps1      // Menu for selecting among operations (failover, failback, validation, etc.)
|
+---Logs                      // Console output and errors are recorded separately here
|       ...
|
+---Parameters
|       SAN-Parameters.ps1    // Script options can be specified here (PowerShell)
|       SAN-Parameters.xml    // SAN replication options specified here (XML)
|
+---Scripts
        SAN-Failback.ps1      // Failback operation subscript called by Console Menu script
        SAN-Failover.ps1      // Failover operation subscript called by Console Menu script
        SAN-GetStatus.ps1     // Status querying subscript called by Console Menu script
        SAN-Plink.ps1         // Functions which command SAN using plink.exe from PuTTY
        SAN-Variables.ps1     // Variables from XML, PS1 parameter files and SAN are further processed here
        SAN-Validate.ps1      // Validation subscript called by Console Menu script to confirm settings are valid
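To illustrate how an XML parameter file can feed script variables, here is a small sketch. The XML layout below (SanParameters, Site, ReplicationGroup, Lun, and the attribute names) is invented for illustration and may not match the real SAN-Parameters.xml. The addresses are documentation-range placeholders.

```powershell
# Hypothetical layout; the real SAN-Parameters.xml in the package may differ.
$sampleXml = @'
<SanParameters>
  <Site role="Production" address="192.0.2.10" />
  <Site role="DR" address="192.0.2.20" />
  <ReplicationGroup id="RG-CSV1">
    <Lun>CSV1</Lun>
    <Lun>CSV2</Lun>
  </ReplicationGroup>
</SanParameters>
'@

# For a file on disk: $config = [xml](Get-Content -Path $xmlPath -Raw)
$config = [xml]$sampleXml

# Index the SAN addresses by site role for easy lookup
$sites = @{}
foreach ($site in $config.SanParameters.Site) {
    $sites[$site.role] = $site.address
}

$groupId = $config.SanParameters.ReplicationGroup.id
$luns    = @($config.SanParameters.ReplicationGroup.Lun)

"DR SAN address : $($sites['DR'])"
"Group $groupId : $($luns -join ', ')"
```

Keeping site- and LUN-specific values in XML and script behaviour options in the ps1 file, as the structure above suggests, means the subscripts themselves need no edits when the same template is reused on another project.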
Getting Started
Edit SAN-Parameters.ps1 and SAN-Parameters.xml to set options such as the user account for SSH communication with your SAN.
All pseudo code (SAN-specific operations) inside <# #> should be replaced with the implementation for your SAN vendor. For example, change Echo <# display LUN #> to the actual command, such as lsvdisk (IBM/Lenovo Storwize), lun show (NetApp), volcoll (HPE Nimble), etc.
Perform further development or customization according to your needs. Two things should help along the way:
The comments in the scripts can be followed as a guide
Names of functions, variables and echo messages are self-explanatory
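The replacement of a pseudo-code placeholder can be pictured as below. This is only a sketch: the lookup table and the function Get-DisplayLunCommand are hypothetical, and the command strings are simply the ones named in the text above, to be verified against each vendor's command-line reference.

```powershell
# Vendor commands as named in the text above; verify each against the vendor's CLI reference.
$displayLunCommand = @{
    'Storwize' = 'lsvdisk'    # IBM/Lenovo Storwize
    'NetApp'   = 'lun show'   # NetApp
    'Nimble'   = 'volcoll'    # HPE Nimble
}

# Hypothetical helper: returns the command string for the chosen vendor.
function Get-DisplayLunCommand {
    param([Parameter(Mandatory)] [string] $Vendor)
    if (-not $displayLunCommand.ContainsKey($Vendor)) {
        throw "No 'display LUN' command defined for vendor '$Vendor'."
    }
    $displayLunCommand[$Vendor]
}

# Template line:      Echo <# display LUN #>
# After replacement:  the string below is what would be sent to the SAN over SSH
Get-DisplayLunCommand -Vendor 'NetApp'
```

In practice most people will hard-code the commands for their single vendor. A table like this mainly pays off when one fork has to serve several storage models.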
Core Functions
A. Failover from Site 1 to Site 2 (e.g. Production to DR)
1. Hyper-V Cluster
End Replication — from Production Site to DR Site
Create Clone from Replicated LUN in DR Site
Present Cloned LUN to DR Site Host, Take Disk Online, Run Replication from Production Site to DR Site
Import VM in DR Site
Disconnect VM Network Adapter in DR Site
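The Windows-side steps of the Hyper-V failover (take disk online, import VM, disconnect VM network adapter) map onto standard Storage and Hyper-V cmdlets. The sketch below assumes a hypothetical function name and parameters and is not the package's own implementation. It needs a Hyper-V host with the cloned LUN already presented.

```powershell
# Hypothetical sketch of the Windows-side failover steps on a DR site Hyper-V host.
function Import-ClonedLunVM {
    param(
        [Parameter(Mandatory)] [int]    $DiskNumber,    # Windows disk number of the cloned LUN
        [Parameter(Mandatory)] [string] $VMConfigPath   # VM configuration file on that disk
    )
    # Take the disk online and make it writable
    Set-Disk -Number $DiskNumber -IsOffline $false
    Set-Disk -Number $DiskNumber -IsReadOnly $false

    # Register the VM in place from its configuration file
    $vm = Import-VM -Path $VMConfigPath

    # Disconnect every virtual network adapter of the imported VM
    Get-VMNetworkAdapter -VM $vm | Disconnect-VMNetworkAdapter

    $vm
}
```

The disk number and configuration path would normally be discovered (for example with Get-Disk) rather than typed in. They are passed as parameters here to keep the sketch short.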
2. General Failover Cluster
End Replication — from Production Site to DR Site
Create Clone from Replicated LUN in DR Site
Present Cloned LUN to DR Site Host, Take Disk Online, Run Replication from Production Site to DR Site
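Since steps can be performed individually or all at once, one way to organize them is an ordered table of named script blocks. The sketch below uses the three general Failover Cluster failover steps as labels. The function Invoke-Step is hypothetical, and each action is a placeholder that only returns a short status string.

```powershell
# Step labels follow the list above; the actions are placeholders, not real SAN calls.
$failoverSteps = @(
    @{ Name = 'End Replication (Production to DR)';         Action = { 'replication ended' } }
    @{ Name = 'Create Clone from Replicated LUN (DR)';      Action = { 'clone created' } }
    @{ Name = 'Present Cloned LUN, Disk Online, Replicate'; Action = { 'clone presented' } }
)

# Hypothetical runner: executes the chosen steps in the order given, or all of them.
function Invoke-Step {
    param(
        [Parameter(Mandatory)] [object[]] $Steps,
        [int[]] $Number   # 1-based step numbers; omit to run every step in order
    )
    if (-not $PSBoundParameters.ContainsKey('Number')) {
        $Number = 1..$Steps.Count
    }
    foreach ($n in $Number) {
        if ($n -lt 1 -or $n -gt $Steps.Count) {
            throw "No such step: $n"
        }
        $step = $Steps[$n - 1]
        Write-Host "[$n/$($Steps.Count)] $($step.Name)"
        & $step.Action   # a terminating error here aborts the remaining steps
    }
}
```

Invoke-Step -Steps $failoverSteps runs the whole sequence, while Invoke-Step -Steps $failoverSteps -Number 2 runs only the clone step. A console menu can map each menu entry to one of these two forms.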
B. Failback from Site 2 to Site 1 (e.g. DR to Production)
1. Hyper-V Cluster
Stop and Delete Replication Group — from Production Site to DR Site
Stop and Delete Replication Group — from DR Site to Production Site
Stop VM in Production Site
Stop CSV in Production Site
Delete VM in Production Site
Unpresent LUN of CSV from Production Site Hosts
Delete LUN of CSV in Production Site
Create LUN of CSV in Production Site
Create Replication Group — from DR Site to Production Site
Add LUN to Replication Group — from DR Site to Production Site
Run Replication Group — from DR Site to Production Site
Stop VM in DR Site
Take Disk Offline in DR Site Host
Unpresent Cloned LUN from DR Site Host
Stop and Delete Replication Group — from DR Site to Production Site
Present LUN of Replicated CSV to Production Site Hosts
Run CSV in Production Site
Import VM in Production Site
Disconnect VM Network Adapters in Production Site
Run VM in Production Site
Create Replication Group — from Production Site to DR Site
Add LUN to Replication Group — from Production Site to DR Site
Run Replication Group — from Production Site to DR Site
2. General Failover Cluster
Stop and Delete Replication Group — from Production Site to DR Site
Stop and Delete Replication Group — from DR Site to Production Site
Stop CSV in Production Site
Unpresent LUN of CSV from Production Site Hosts
Delete LUN of CSV in Production Site
Create LUN of CSV in Production Site
Create Replication Group — from DR Site to Production Site
Add LUN to Replication Group — from DR Site to Production Site
Run Replication Group — from DR Site to Production Site
Take Disk Offline in DR Site Host
Unpresent Cloned LUN from DR Site Host
Stop and Delete Replication Group — from DR Site to Production Site
Present LUN of Replicated CSV to Production Site Hosts
Run CSV in Production Site
Create Replication Group — from Production Site to DR Site
Add LUN to Replication Group — from Production Site to DR Site
Run Replication Group — from Production Site to DR Site
Limitations
There is no one-size-fits-all solution — modification is inevitable
Not all error messages are recorded separately in the error log file; some errors exist only in the main log. Outputs and errors encountered in the menu are not logged
SAN storage credentials are stored in the ps1 configuration file in clear text (protect the file properly)
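For those who want to avoid clear-text credentials in their own fork, one possible alternative is to keep a PSCredential in a separate file with Export-Clixml. This is a sketch, not part of the package: the function names and file path are hypothetical. On Windows the exported password is protected with DPAPI, so only the same user on the same machine can read it back.

```powershell
# Hypothetical helper: stores the SAN account in a file, password not in clear text.
function Save-SanCredential {
    param(
        [Parameter(Mandatory)] [string] $User,
        [Parameter(Mandatory)] [System.Security.SecureString] $Password,
        [Parameter(Mandatory)] [string] $Path
    )
    $credential = New-Object System.Management.Automation.PSCredential($User, $Password)
    # On Windows, Export-Clixml encrypts the password with DPAPI (current user, current machine)
    $credential | Export-Clixml -Path $Path
}

# Hypothetical helper: loads the credential saved above.
function Get-SanCredential {
    param([Parameter(Mandatory)] [string] $Path)
    Import-Clixml -Path $Path
}

# One-time setup (path is an example):
#   Save-SanCredential -User 'sanadmin' -Password (Read-Host 'Password' -AsSecureString) -Path '.\san-cred.xml'
# At run time, plink.exe still needs the plain-text value:
#   $plain = (Get-SanCredential -Path '.\san-cred.xml').GetNetworkCredential().Password
```

This reduces exposure at rest only. The password is still passed to plink.exe on the command line, so file permissions and host hardening remain necessary.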
Download
To download the latest version of ws-storage-failover, visit its homepage.
Release History
Originally published at https://tech.wandersick.com on May 30, 2018.