Wednesday, August 19, 2009

Removing Sun[TM] Cluster 3.x node and cluster software packages

http://sunsolve.sun.com/search/document.do?assetkey=1-61-230779-1
Document Audience: SPECTRUM
Document ID: 230779
Old Document ID: (formerly 50093)
Title: Removing Sun[TM] Cluster 3.x node and cluster software packages
Copyright Notice: Copyright © 2009 Sun Microsystems, Inc. All Rights Reserved
Update Date: Thu Dec 18 00:00:00 MST 2008

Solution Type Technical Instruction

Solution 230779 : Removing Sun[TM] Cluster 3.x node and cluster software packages


Related Categories


Home>Product>Software>Enterprise Computing

Description
There are many instances where a cluster node needs to be redeployed and its cluster software removed so that its resources can be reallocated. This document addresses that need.
This document describes a 3-node scalable topology configuration running Solaris[TM] 9 and Sun[TM] Cluster 3.1 Update 2.
The nodes are referred to as node1, node2 and node3.
There are 4 resource groups configured:
logical-rg (SUNW.LogicalHostname)
dg1-rg (SUNW.HAStoragePlus)
shareaddr-rg (SUNW.SharedAddress)
apache-rg (SUNW.apache)
Since Sun Cluster 3.0 Update 3, cluster packages can be removed using scinstall -r.
The procedure below removes node2 and uses scinstall -r in the final step.
Notes:
If you plan to completely remove the cluster software from all cluster nodes, please refer to Infodoc < Solution: 217563 > for a more succinct procedure that does not involve removing one node at a time.
This procedure assumes that at least one quorum device is configured for the cluster, which is true in most cases. If it is not, at least one quorum device must be configured before the first of the three nodes can be removed. Please refer to document < Solution: 203650 > for further details.
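Before starting, it can help to confirm the current cluster membership, quorum configuration and installed release from any active node. A minimal pre-check, assuming the example node names used in this document:
# scstat -n (all three nodes should be listed as cluster members)
# scstat -q (confirms that a quorum device is configured and quorum is healthy)
# scinstall -pv (shows the installed Sun Cluster release and package versions)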

Steps to Follow
1. Migrate resource groups and device groups off node2 to the other nodes (a quick verification example follows the command below).

# scswitch -S -h node2
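Once the evacuation completes, a quick check is to confirm that node2 is no longer hosting anything; a minimal sketch of such a check:
# scstat -g (node2 should not show any resource group in the Online state)
# scstat -D (node2 should not be listed as the primary for any device group)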
2. Delete node2 from all resource groups.
* Start with the scalable resource groups, followed by the failover resource groups.
* Gather configuration information by running the following commands:
# scrgadm -pv | grep "Res Group Nodelist"
# scconf -pv | grep "Node ID"
# scrgadm -pvv | grep "NetIfList.*value"
* Scalable Resource Group(s)
- Set the maximum and desired primaries to the appropriate number
# scrgadm -c -g apache-rg -y maximum_primaries="2" \
-y desired_primaries="2"
- Set the remaining node names on the scalable resource group
# scrgadm -c -g apache-rg -h node1,node3
- Remove node2 from the node list of the failover resource group that holds the shared address
# scrgadm -c -g shareaddr-rg -h node1,node3
* Failover Resource Group(s)
- Set the remaining node names on the failover resource groups
# scrgadm -c -g logical-rg -h node1,node3
# scrgadm -c -g dg1-rg -h node1,node3
- Check for IPMP groups affected
# scrgadm -pvv -g logical-rg | grep -i netiflist
# scrgadm -pvv -g shareaddr-rg | grep -i netiflist
- Update IPMP groups affected
# scrgadm -c -j logicalhost \
-x netiflist=sc_ipmp0@1,sc_ipmp0@3
# scrgadm -c -j shared-address \
-x netiflist=sc_ipmp0@1,sc_ipmp0@3
* Verify changes to the resource groups (a quick grep check follows these commands)
# scrgadm -pvv -g apache-rg | grep -i nodelist
# scrgadm -pvv -g apache-rg | grep -i netiflist
# scrgadm -pvv -g shareaddr-rg | grep -i nodelist
# scrgadm -pvv -g shareaddr-rg | grep -i netiflist
# scrgadm -pvv -g logical-rg | grep -i nodelist
# scrgadm -pvv -g logical-rg | grep -i netiflist
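With the changes above in place, node2 should no longer appear in any resource group node list, and (assuming node2 carries node ID 2, as the @1/@3 netiflist values above imply) no netiflist entry should still reference it. A quick grep-based check:
# scrgadm -pvv | grep -i nodelist | grep node2 (expect no output)
# scrgadm -pvv | grep -i netiflist | grep @2 (expect no output)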
3. Delete node2 from all disk device groups (a verification example appears at the end of this step)
* Solaris Volume Manager
- Check for the disksets affected
# scconf -pv | grep -i "Device group" | grep node2
# scstat -D
- Remove node from diskset nodelist
# metaset -s setname -d -h nodelist (use -f if needed)
* VERITAS Volume Manager
- Check for diskgroups affected
# scconf -pv | grep -i "Device group" | grep node2
# scstat -D
- Remove node from diskgroup nodelist
# scconf -r -D name=dg1,nodelist=node2
* Raw Disk Device Group
- Remember to change the desired number of secondaries to 1
- On any active remaining node(s), identify the device groups connected
# scconf -pvv | grep node2 | grep "Device group node list"
- Determine raw device
# scconf -pvv | grep Disk
- Disable the localonly property of each Local_Disk
# scconf -c -D name=rawdisk-device-group,localonly=false
- Verify that the localonly property is disabled
# scconf -pvv | grep "Disk"
- Remove node from raw device
# scconf -r -D name=rawdisk-device-group,nodelist=node2
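Whichever volume manager is in use, the device group node lists should no longer reference node2 once this step is complete; a minimal verification, reusing the grep from above:
# scconf -pvv | grep node2 | grep "Device group node list" (expect no output)
# scstat -D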
Steps 4-6 are not applicable to 2-node clusters.
4. Remove all fully connected quorum devices.
- Check quorum disk information
# scconf -pv | grep Quorum
- Remove quorum disk
# scconf -r -q globaldev=d
5. Remove all fully connected storage devices from node2. Use any method that will block access from node2 to the shared storage:
- vxdiskadm to suppress access from VxVM
- cfgadm -c unconfigure
- LUN masking/mapping methods if applicable
- physical cable removal if allowed
6. Add back the quorum devices (a worked example with a hypothetical device name follows the command below).
# scconf -a -q globaldev=d,node=node1,node=node3
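As a concrete illustration, if the quorum disk were DID device d4 (a hypothetical name; substitute the device shown earlier by scconf -pv | grep Quorum), the add and verification might look like:
# scconf -a -q globaldev=d4,node=node1,node=node3
# scstat -q (the expected quorum votes should be present again)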
7. Place the node being removed into maintenance state.
* Shut down node2
# shutdown -g0 -y -i0
* On remaining node
# scconf -c -q node=node2,maintstate
* Verify quorum status
# scstat -q
8. Remove all logical transport connections from the node being removed (a verification example follows this step's commands)
* Check for interconnect configuration
# scstat -W
# scconf -pv | grep cable
# scconf -pv | grep adapter
* Remove the cable configuration
# scconf -r -m endpoint=node2:qfe0
# scconf -r -m endpoint=node2:qfe1
* Remove adapter configuration
# scconf -r -A name=qfe0,node=node2
# scconf -r -A name=qfe1,node=node2
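Once the cable and adapter entries are gone, the interconnect status should only list paths between the remaining nodes; a short verification sketch:
# scstat -W (only node1-node3 paths should remain)
# scconf -pv | grep -i cable | grep node2 (expect no output)
# scconf -pv | grep -i adapter | grep node2 (expect no output)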
9. For 2-node clusters only, remove the quorum disk.
* If not already done, shut down the node to be uninstalled.
# shutdown -g0 -y -i0
* On the remaining node, put the node to be removed into maintenance mode
# scconf -c -q node=node2,maintstate
* Place cluster in installmode
# scconf -c -q installmode
* Remove quorum disk
# scconf -r -q globaldev=d
* Verify quorum status
# scstat -q
10. Remove the node from the cluster software configuration.
* # scconf -r -h node=node2
* # scstat -n
11. Remove the cluster software (a post-removal check follows the final command).
* If not already done, shut down the node to be uninstalled.
# shutdown -g0 -y -i0
* Reboot the node into non-cluster mode.
ok> boot -x
* Remove all globally mounted file systems except /global/.devices from /etc/vfstab
* Uninstall the Sun Cluster software from the node
# scinstall -r
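Once scinstall -r completes, the cluster packages should be gone from the node. One rough way to confirm this, assuming the Sun Cluster packages on this system all carry names beginning with SUNWsc:
# pkginfo | grep SUNWsc (expect no Sun Cluster framework packages to be listed)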
If it is desirable to remove the last node of the cluster, all resource groups and device groups must first be removed completely. Please follow the procedure below (a combined example for steps 1-4 is shown after step 4):
1. Offline all resource groups (RGs):
# scswitch -F -g resource-group[,...]
2. Disable all configured resources:
# scswitch -n -j resource[,...]
3. Remove all resources from the resource groups:
# scrgadm -r -j resource
4. Remove the now-empty resource groups:
# scrgadm -r -g resource-group
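As a combined sketch of steps 1-4, using the resource group names from the example configuration earlier in this document; logicalhost and shared-address are the only resource names shown above, so the apache and HAStoragePlus resources (whose names are not given in this document) would be disabled and removed in the same way:
# scswitch -F -g apache-rg,shareaddr-rg,logical-rg,dg1-rg (offline all resource groups)
# scswitch -n -j logicalhost,shared-address (disable the named resources)
# scrgadm -r -j logicalhost (remove each resource)
# scrgadm -r -j shared-address
# scrgadm -r -g logical-rg (remove each now-empty resource group)
# scrgadm -r -g shareaddr-rg
(repeat scrgadm -r -g for apache-rg and dg1-rg once their resources have been removed)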
5. Remove the global mounts and the "/node@nodeid" mount options from the /etc/vfstab file.
6. Remove all device groups:
# scstat -D (to get a list of device groups)
# scswitch -F -D device-group-name (to take the device group offline)
# scconf -r -D name=device-group-name (to remove/unregister the device group)
NOTE: If there are any "rmt" devices, they must be removed with the command:
# /usr/cluster/dtk/bin/dcs_config -c remove -s rmt/1
This assumes that you have the "SUNWscdtk" package installed. If you do not, you will need to install it in order to remove the rmt/XX entries, or "scinstall -r" will fail.
The SUNWscdtk package is the diagnostics tool for Sun Cluster and is not available on the Cluster CD; you need to get it from the following URL:
http://suncluster.eng/service/tools.html
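Before running scinstall -r on this last node, it may be worth checking whether any rmt device groups are still registered; a simple check along these lines:
# scconf -pvv | grep -i rmt (any rmt/XX entries listed here would need the dcs_config removal described above)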
Uninstall the Sun Cluster 3.x software:
* If not already done, shut down the node.
# shutdown -g0 -y -i0
* Reboot the node into non-cluster mode.
ok> boot -x
* Finally, remove the Sun Cluster 3.x software:
# scinstall -r

Product
Sun Cluster Geographic Edition 3.1 8/05
Solaris Cluster 3.2
Sun Cluster 3.1
Sun Cluster 3.1 Data Services Agents
Sun Cluster Agents 3.1 9/04
Sun Cluster Agents 3.1 4/04
Sun Cluster Agents 3.1 10/03
Sun Cluster Agents 3.1 05/03
Sun Cluster 3.1 9/04
Sun Cluster 3.1 8/05
Sun Cluster 3.1 7/05
Sun Cluster 3.1 4/04
Sun Cluster 3.1 10/03 for SunPlex Systems
Sun Cluster 3.0
Sun Cluster 3.0 7/01
Sun Cluster 3.0 5/02
Sun Cluster 3.0 12/01

Keywords
remove, removal, Cluster, node, scinstall, 3.x, ccr, resources
