Copy of Dwai Lahiri's Sun Cluster Cheat Sheet - 2

Sun Cluster Set up

Don't mix PCI and SBus SCSI devices

Quorum Device Rules
  • A quorum device must be available to both nodes in a 2-node cluster
  • quorum device info is maintained globally in the CCR db
  • a quorum device can contain user data
  • Max and optimal number of votes contributed by quorum devices should be N-1 (where N == number of nodes in the cluster)
  • If the number of quorum devices equals or exceeds the number of nodes, the cluster cannot come up when too many quorum devices have failed, even if all nodes are available
  • quorum devices are not required in clusters with more than 2 nodes, but recommended for higher cluster availability
  • quorum devices are manually configured after Sun Cluster s/w installation is done
  • quorum devices are configured using DID devices

Quorum Math and Consequences

A running cluster is always aware of (Math):

  • Total possible Q votes (number of nodes + disk quorum votes)
  • Total present Q votes (number of booted nodes + available quorum device votes)
  • Total needed Q votes (more than 50% of possible votes)

Consequences:

  • Node that cannot find adequate Q votes will freeze, waiting for other nodes to join the cluster
  • Node that is booted in the cluster but can no longer find the needed number of votes kernel panics
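
The vote arithmetic above can be sketched as a few shell functions. This is a minimal illustration, not a Sun Cluster tool: the function names (possible_votes, needed_votes, has_quorum) are made up for this example, and "needed" is taken to be a strict majority of the possible votes. The second case shows why quorum device votes should stay at N-1.

```shell
#!/bin/sh
# Illustrative quorum arithmetic only; these function names are hypothetical,
# they are not Sun Cluster commands.

# possible_votes NODES QD_VOTES: total possible votes
possible_votes() {
    echo $(( $1 + $2 ))
}

# needed_votes POSSIBLE: strict majority of the possible votes
needed_votes() {
    echo $(( $1 / 2 + 1 ))
}

# has_quorum PRESENT POSSIBLE: succeeds if the present votes reach the majority
has_quorum() {
    [ "$1" -ge "$(needed_votes "$2")" ]
}

# 2 nodes + 1 quorum device vote (N-1 rule): one node plus the device keeps quorum.
p=$(possible_votes 2 1)
has_quorum 2 "$p" && echo "1 node + quorum device: quorum held ($(needed_votes "$p") of $p needed)"

# 2 nodes + 2 quorum device votes: both nodes are up but both devices have failed.
p=$(possible_votes 2 2)
has_quorum 2 "$p" || echo "2 nodes, no devices: quorum lost ($(needed_votes "$p") of $p needed)"
```

On a live cluster the real vote counts come from scstat -q; the numbers here are hand-picked examples.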

installmode Flag — allows for cluster nodes to be rebooted after/during initial installation without causing the other (active) node(s) to panic.

Cluster status
Reporting the cluster membership and quorum vote information

# /usr/cluster/bin/scstat -q

Verifying cluster configuration info

# scconf -p

Run scsetup to correct any configuration mistakes and/or to:
  • add or remove quorum disks
  • add, remove, enable, disable cluster transport components
  • register/unregister vxVM device groups
  • add/remove node access from a VxVM device group
  • change cluster private host names
  • change cluster name

Shutting down cluster on all nodes

# scshutdown -y -g 15

# scstat #verifies cluster status

Cluster Daemons

lahirdx@aescib1:/home/../lahirdx > ps -ef|grep cluster|grep -v grep
root 4 0 0 May 07 ? 352:39 cluster
root 111 1 0 May 07 ? 0:00 /usr/cluster/lib/sc/qd_userd
root 120 1 0 May 07 ? 0:00 /usr/cluster/lib/sc/failfastd
root 123 1 0 May 07 ? 0:00 /usr/cluster/lib/sc/clexecd
root 124 123 0 May 07 ? 0:00 /usr/cluster/lib/sc/clexecd
root 1183 1 0 May 07 ? 46:45 /usr/cluster/lib/sc/rgmd
root 1154 1 0 May 07 ? 0:07 /usr/cluster/lib/sc/rpc.fed
root 1125 1 0 May 07 ? 23:49 /usr/cluster/lib/sc/sparcv9/rpc.pmfd
root 1153 1 0 May 07 ? 0:03 /usr/cluster/lib/sc/cl_eventd
root 1152 1 0 May 07 ? 0:04 /usr/cluster/lib/sc/cl_eventlogd
root 1336 1 0 May 07 ? 2:17 /var/cluster/spm/bin/scguieventd -d
root 1174 1 0 May 07 ? 0:03 /usr/cluster/bin/pnmd
root 1330 1 0 May 07 ? 0:01 /usr/cluster/lib/sc/scdpmd
root 1339 1 0 May 07 ? 0:00 /usr/cluster/lib/sc/cl_ccrad

  • FF Panic rule: failfast will shut down the node (panic the kernel) if the specified daemon is not restarted within 30 seconds
  • cluster: System proc created by the kernel to encapsulate the kernel threads that make up the core kernel range of operations. It directly panics the kernel if it's sent a KILL signal (SIGKILL). Other signals have no effect.
  • clexecd: Used by cluster kernel threads to execute userland cmds (such as the run_reserve and dofsck cmds). It is also used to run cluster cmds remotely (eg: scshutdown). A failfast driver panics the kernel if this daemon is killed and not restarted in 30 seconds.
  • cl_eventd: Registers and forwards cluster events (eg: nodes entering and leaving the cluster). From SC 3.1 10/03 onward, user apps can register themselves to receive cluster events. The daemon is automatically respawned by rpc.pmfd if it is killed.
  • rgmd: The resource group mgr, which manages the state of all cluster-unaware applications. A failfast driver panics the kernel if this daemon is killed and not restarted in 30 seconds.
  • rpc.fed: The "fork-and-exec" daemon, which handles reqs from rgmd to spawn methods for specific data services. The FF panic rule applies: failfast panics the node if this daemon is killed and not restarted in 30 seconds.
  • scguieventd: Processes cluster events for the SunPlex or Sun Cluster Mgr GUI, so that the display can be updated in real time. It is not automatically restarted if it stops. If you are having trouble with SunPlex or Sun Cluster Mgr, you might have to restart the daemon or reboot the specific node.
  • rpc.pmfd: The process monitoring facility. It is used as a general mechanism to initiate restarts and failure action scripts for some cluster f/w daemons, and for most app daemons and app fault monitors. The FF panic rule applies.
  • pnmd: The public network mgt daemon, which manages n/w status info received from the local IPMP daemon (in.mpathd) running on each node in the cluster. It is automatically restarted by rpc.pmfd if it dies.
  • scdpmd: The multi-threaded DPM (disk path monitoring) daemon runs on each node and is started by an rc script when the node boots. It monitors the availability of the logical paths visible through the various multipath drivers (MPxIO, HDLM, PowerPath, etc.). Automatically restarted by rpc.pmfd if it dies.
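
A quick way to spot a missing daemon is to compare ps output against the names in the listing above. The sketch below is only an illustration: missing_daemons is a hypothetical helper, not a cluster command, and the expected list is an assumption copied from that listing. It leaves out the kernel "cluster" process and scguieventd, which is only needed for the GUI.

```shell
#!/bin/sh
# Print every expected cluster daemon that does not appear in the
# `ps -ef` output read from stdin. Hypothetical helper for illustration.

# Names taken from the ps listing in this section (assumed to be complete).
EXPECTED="qd_userd failfastd clexecd rgmd rpc.fed rpc.pmfd cl_eventd cl_eventlogd pnmd scdpmd cl_ccrad"

missing_daemons() {
    ps_out=$(cat)
    for d in $EXPECTED; do
        # Match the daemon as the last path component of the command,
        # followed by either an argument or the end of the line.
        printf '%s\n' "$ps_out" | grep -E "/$d( |\$)" >/dev/null || echo "$d"
    done
}
```

Typical use on a node would be: ps -ef | missing_daemons. No output means every listed daemon was found.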

Validating basic cluster config
  • The sccheck (/usr/cluster/bin/sccheck) cmd validates the cluster configuration.
  • The reports it generates are stored under /var/cluster/sccheck.

Disk Path Monitoring
  • scdpm -p all:all prints all disk paths in the cluster and their status
  • scinstall -pv checks the cluster installation status — package revisions, patches applied, etc.
  • Cluster release file: /etc/cluster/release

Shutting down cluster

scshutdown -y -g 30

Booting nodes in non-cluster mode

boot -x

Placing node in maintenance mode

scconf -c -q node=,maintstate

Reset the maintenance mode by rebooting the node or running

scconf -c -q reset

By placing a node in maintenance mode, we reduce the number of reqd. quorum votes and ensure that cluster operation is not disrupted as a result.

SunPlex or Sun Cluster Manager is available over HTTPS on port 3000.

VxVM Rootdg requirements for Sun Cluster
  • vxio major number has to be identical on all nodes of the cluster (check for vxio entry in /etc/name_to_major)
  • vxvm installed on all nodes physically connected to shared storage: on non-storage nodes, VxVM can be used to encapsulate and mirror the boot disk. If not using VxVM on a non-storage node, use SVM. All that is required in such a case is that the vxio major number be identical to that on all other nodes of the cluster (add an entry in the /etc/name_to_major file).
  • VxVM license is reqd. on all nodes not connected to an A5x00 StorEdge array.
  • Std rootdg created on all nodes where VxVM is installed. Options to initialize rootdg on each node are:
    • Encap the boot disk so it can be mirrored. This preserves all data and creates volumes inside rootdg to encap /global/.devices/node@#
    • If the disk has more than 5 slices on it, it cannot be encap'ed.
    • Initialize other local disks into rootdg.
  • Unique volume name and minor number across the nodes for the /global/.devices/node@# file system if the boot disk is encap'ed: the /global/.devices/node@# file system must be on devices with a unique name on each node, because it is mounted as a global file system and so is visible on every node. The normal Solaris OS /etc/mnttab logic predates global file systems and still demands that each device have a unique major/minor number. VxVM doesn't support changing the minor numbers of individual volumes, so the entire disk group has to be re-minored.

Use the following command:

# vxdg [ -g diskgroup ] [ -f ] reminor [diskgroup ] new-base-minor

From the vxdg man pages:

     reminor   Changes the base minor number for  a  disk  group,
               and  renumbers  all devices in the disk group to a
               range starting at that number.  If the device  for
               a  volume  is  open,  then  the  old device number
               remains in effect until the system is rebooted  or
               until  the disk group is deported and re-imported.
               Also, if you close an open volume, then  the  user
               can   execute  vxdg reminor  again  to  cause  the
               renumbering to take effect  without  rebooting  or
               reimporting.

               A new device number may also overlap with  a  tem-
               porary  renumbering for a volume device. This also
               requires a reboot or reimport for the  new  device
               numbering to take effect.  A temporary renumbering
               can happen in the following situations:  when  two
               volumes  (for  example,  volumes  in two different
               disk groups) share the same  permanently  assigned
               device number, in which case one of the volumes is
               renumbered temporarily to use an alternate  device
               number; or when the persistent device number for a
               volume was changed, but the active  device  number
               could  not be changed to match.  The active number
               may be left unchanged after  a  persistent  device
               number change either because the volume device was
               open, or because the new number was in use as  the
               active device number for another volume.

               vxdg fails if you try to use a  range  of  numbers
               that  is  currently  in use as a persistent (not a
               temporary) device number.  You can  force  use  of
               the  number range with use of the -f option.  With
               -f, some device renumberings may not  take  effect
               until  a  reboot or a re-import (just as with open
               volumes).  Also, if you force volumes in two  disk
               groups  to use the same device number, then one of
               the volumes is temporarily renumbered on the  next
               reboot.   Which volume device is renumbered should
               be considered random, except that  device  number-
               ings in the rootdg disk group take precedence over
               all others.

               The -f option should be used  only  when  swapping
               the  device number ranges used by two or more disk
               groups.  To swap the number ranges  for  two  disk
               groups,  you  would  use  -f  when renumbering the
               first disk group to use the range  of  the  second
               disk  group.  Renumbering the second disk group to
               the first range does not require the use of -f.
  • Sun Cluster does not work with Veritas DMP. DMP can be disabled before installing the software by putting in dummy symlinks, etc.
  • scvxinstall is a shell script that automates VxVM installation in a Sun Clustered environment
  • scvxinstall automates the following things:
    • tries to disable DMP (vxdmp)
    • installs correct cluster package
    • automatically negotiates a vxio major number and properly edits /etc/name_to_major
    • automates rootdg initialization process and encapsulates boot disk
    • gives different device names for the /global/.devices/node@# volumes on each side
    • edits the vfstab properly for this same volume. The problem is this particular line has DID device on it, and VxVM doesn't understand DID devices.
    • installs a script to "reminor" the rootdg on the reboot
    • reboots the node so that VxVM operates properly
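
The requirement that the vxio major number be identical on all nodes can be checked by comparing the vxio line of each node's /etc/name_to_major. The sketch below assumes the files have already been copied to one machine; vxio_major and same_vxio_major are hypothetical helper names, and the file names in the usage note are placeholders.

```shell
#!/bin/sh
# Compare the vxio major number across copies of /etc/name_to_major.
# Helper names are hypothetical; this is a sketch, not part of VxVM or Sun Cluster.

# vxio_major FILE: print the major number from the vxio line of FILE
vxio_major() {
    awk '$1 == "vxio" { print $2 }' "$1"
}

# same_vxio_major FILE...: succeed only if every file has a vxio entry
# and all of the entries carry the same major number
same_vxio_major() {
    first=$(vxio_major "$1")
    [ -n "$first" ] || return 1
    for f in "$@"; do
        [ "$(vxio_major "$f")" = "$first" ] || return 1
    done
}
```

For example, with copies saved as node1.name_to_major and node2.name_to_major: same_vxio_major node1.name_to_major node2.name_to_major && echo "vxio major matches".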

Copyright 1994-2009 Sun Microsystems, Inc.