{section}
{column:width=75%}
This page contains the following information:
----
{toc:type=list|style=none|minLevel=5|maxLevel=5|indent=0px}
----
h5. Sun Cluster Setup
Do not mix PCI and SBus SCSI devices.

h5. Quorum Device Rules
* A quorum device must be available to both nodes in a 2\-node cluster.
* Quorum device information is maintained globally in the CCR database.
* A quorum device can contain user data.
* The maximum and optimal number of votes contributed by quorum devices is N \- 1 (where N is the number of nodes in the cluster).
* If the number of quorum devices is >= the number of nodes, the cluster cannot come up easily if too many quorum devices have failed or are in error.
* Quorum devices are not required in clusters with more than 2 nodes, but they are recommended for higher cluster availability.
* Quorum devices are configured manually after the Sun Cluster software installation is complete.
* Quorum devices are configured using DID devices.

h6. Quorum Math and Consequences
A running cluster is always aware of the following (math):
* Total possible quorum votes (number of nodes + quorum device votes)
* Total present quorum votes (number of booted nodes + available quorum device votes)
* Total needed quorum votes (more than 50% of the possible votes)

Consequences:
* A node that cannot find enough quorum votes freezes and waits for other nodes to join the cluster.
* A node that is already booted into the cluster but can no longer find the needed number of votes panics the kernel.
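
The vote arithmetic above can be sketched as a few shell functions. This is only an illustration, not a Sun Cluster tool: the function names are hypothetical, and it assumes that "needed" means a strict majority (more than half) of the possible votes.

```shell
#!/bin/sh
# Sketch of the quorum arithmetic (hypothetical helper names).
# Assumption: the needed vote count is a strict majority of the possible votes.

# votes_possible NODES QD_VOTES -> total possible quorum votes
votes_possible() {
    echo $(( $1 + $2 ))
}

# votes_needed POSSIBLE -> smallest count that is more than half of POSSIBLE
votes_needed() {
    echo $(( $1 / 2 + 1 ))
}

# has_quorum PRESENT POSSIBLE -> exit status 0 when PRESENT reaches the needed count
has_quorum() {
    [ "$1" -ge "$(votes_needed "$2")" ]
}
```

For a 2\-node cluster with one quorum device vote, {{votes_possible 2 1}} gives 3 and {{votes_needed 3}} gives 2, so one surviving node that still holds the quorum device (2 votes present) keeps quorum, while a node alone with 1 vote does not.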

{{installmode}} flag: allows cluster nodes to be rebooted during or after initial installation without causing the other (active) nodes to panic.

h5. Cluster status
h6. Reporting the cluster membership and quorum vote information

{panel}
{{# */usr/cluster/bin/scstat \-q*}}
{panel}

h6. Verifying cluster configuration info

{panel}
{{# *scconf \-p*}}
{panel}

h6. Run {{scsetup}} to correct any configuration mistakes and/or to:
* add or remove quorum disks
* add, remove, enable, disable cluster transport components
* register/unregister VxVM device groups
* add/remove node access from a VxVM device group
* change cluster private host names
* change cluster name

h6. Shutting down cluster on all nodes

{panel}
{{# *scshutdown \-y \-g 15*}}
{panel}

{panel}
{{# *scstat*}} #verifies cluster status
{panel}

h6. Cluster Daemons

{panel}
{{lahirdx@aescib1:/home/../lahirdx > *ps \-ef|grep cluster|grep \-v grep*}}
{{root 4 0 0 May 07 ? 352:39 cluster}}
{{root 111 1 0 May 07 ? 0:00 /usr/cluster/lib/sc/qd_userd}}
{{root 120 1 0 May 07 ? 0:00 /usr/cluster/lib/sc/failfastd}}
{{root 123 1 0 May 07 ? 0:00 /usr/cluster/lib/sc/clexecd}}
{{root 124 123 0 May 07 ? 0:00 /usr/cluster/lib/sc/clexecd}}
{{root 1183 1 0 May 07 ? 46:45 /usr/cluster/lib/sc/rgmd}}
{{root 1154 1 0 May 07 ? 0:07 /usr/cluster/lib/sc/rpc.fed}}
{{root 1125 1 0 May 07 ? 23:49 /usr/cluster/lib/sc/sparcv9/rpc.pmfd}}
{{root 1153 1 0 May 07 ? 0:03 /usr/cluster/lib/sc/cl_eventd}}
{{root 1152 1 0 May 07 ? 0:04 /usr/cluster/lib/sc/cl_eventlogd}}
{{root 1336 1 0 May 07 ? 2:17 /var/cluster/spm/bin/scguieventd \-d}}
{{root 1174 1 0 May 07 ? 0:03 /usr/cluster/bin/pnmd}}
{{root 1330 1 0 May 07 ? 0:01 /usr/cluster/lib/sc/scdpmd}}
{{root 1339 1 0 May 07 ? 0:00 /usr/cluster/lib/sc/cl_ccrad}}
{panel}

* FF panic rule: the failfast driver shuts down the node (panics the kernel) if the specified daemon is killed and not restarted within 30 seconds.
* {{cluster}}: System process created by the kernel to encapsulate the kernel threads that make up the core kernel range of operations. It directly panics the kernel if it is sent a {{KILL}} signal ({{SIGKILL}}). Other signals have no effect.
* {{clexecd}}: Used by cluster kernel threads to execute userland commands (such as the {{run_reserve}} and {{dofsck}} commands). It is also used to run cluster commands remotely (e.g. {{scshutdown}}). A failfast driver panics the kernel if this daemon is killed and not restarted within 30 seconds.
* {{cl_eventd}}: Registers and forwards cluster events (e.g. nodes entering and leaving the cluster). Starting with SC 3.1 10/03, user applications can register themselves to receive cluster events. The daemon is automatically respawned by {{rpc.pmfd}} if it is killed.
* {{rgmd}}: The resource group manager, which manages the state of all cluster\-unaware applications. A failfast driver panics the kernel if this daemon is killed and not restarted within 30 seconds.
* {{rpc.fed}}: The "fork\-and\-exec" daemon, which handles requests from {{rgmd}} to spawn methods for specific data services. A failfast driver panics the kernel if this daemon is killed and not restarted within 30 seconds.
* {{scguieventd}}: Processes cluster events for the SunPlex or Sun Cluster Manager GUI, so that the display can be updated in real time. It is not automatically restarted if it stops. If you are having trouble with SunPlex or Sun Cluster Manager, you might have to restart the daemon or reboot the specific node.
* {{rpc.pmfd}}: The process monitoring facility. It is used as a general mechanism to initiate restarts and failure action scripts for some cluster framework daemons, and for most application daemons and application fault monitors. The FF panic rule applies.
* {{pnmd}}: The public network management daemon, which manages network status information received from the local IPMP daemon ({{in.mpathd}}) running on each node in the cluster. It is automatically restarted by {{rpc.pmfd}} if it dies.
* {{scdpmd}}: The multi\-threaded disk path monitoring (DPM) daemon, which runs on each node and is started by an rc script when the node boots. It monitors the availability of the logical paths that are visible through the various multipath drivers (MPxIO, HDLM, PowerPath, etc.). It is automatically restarted by {{rpc.pmfd}} if it dies.
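
As a rough aid for the daemon list above, the following sketch reports which daemons are absent from a {{ps \-ef}} listing supplied on stdin. The function name {{missing_daemons}} and the choice of daemons to check are illustrative assumptions, not an official health check.

```shell
#!/bin/sh
# Sketch: list expected cluster daemons that do not appear in `ps -ef` output.
# EXPECTED_DAEMONS is an illustrative selection taken from the notes above.
EXPECTED_DAEMONS="clexecd cl_eventd rgmd rpc.fed rpc.pmfd pnmd scdpmd"

# missing_daemons: reads a ps listing on stdin, prints one missing daemon per line
missing_daemons() {
    ps_output=$(cat)    # capture the listing once so it can be scanned per daemon
    for d in $EXPECTED_DAEMONS; do
        # A daemon counts as present when the basename of some field equals its name.
        if ! printf '%s\n' "$ps_output" | awk -v d="$d" '
            { for (i = 1; i <= NF; i++) { n = split($i, p, "/"); if (p[n] == d) found = 1 } }
            END { exit (found ? 0 : 1) }'
        then
            echo "$d"
        fi
    done
}
```

On a cluster node this would be run as {{ps \-ef | missing_daemons}}. Empty output means every daemon in the list was found.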

h6. Validating basic cluster config
* The {{sccheck}} command ({{/usr/cluster/bin/sccheck}}) validates the cluster configuration.
* It stores the reports it generates in {{/var/cluster/sccheck}}.

h6. Disk Path Monitoring
* {{*scdpm \-p all:all*}} prints all disk paths in the cluster and their status
* {{*scinstall \-pv*}} checks the cluster installation status — package revisions, patches applied, etc.
* Cluster release file: {{/etc/cluster/release}}

h6. Shutting down cluster

{panel}
{{*scshutdown \-y \-g 30*}}
{panel}

h6. Booting nodes in non\-cluster mode
{panel}
{{*boot \-x*}}
{panel}

h6. Placing node in maintenance mode
{panel}
{{*scconf \-c \-q node=,maintstate*}}
{panel}

h6. Reset the maintenance mode by rebooting the node or running
{panel}
{{*scconf \-c \-q reset*}}
{panel}

Placing a cluster node in maintenance mode reduces the number of required quorum votes, which ensures that cluster operation is not disrupted while that node is out of service.

SunPlex Manager (or Sun Cluster Manager) is available over HTTPS on port 3000.

h5. VxVM Rootdg requirements for Sun Cluster

* The {{vxio}} major number has to be identical on all nodes of the cluster (check for the {{vxio}} entry in {{/etc/name_to_major}}).
* VxVM must be installed on all nodes physically connected to shared storage. On non\-storage nodes, VxVM can be used to encapsulate and mirror the boot disk. If you are not using VxVM on a non\-storage node, use SVM. All that is required in such a case is that the {{vxio}} major number be identical to that on all other nodes of the cluster (add an entry to the {{/etc/name_to_major}} file).
* A VxVM license is required on all nodes not connected to an A5x00 StorEdge array.
* A standard rootdg must be created on all nodes where VxVM is installed. Options to initialize rootdg on each node are:
** Encapsulate the boot disk so it can be mirrored. This preserves all data and creates volumes inside rootdg, including one that encapsulates {{/global/.devices/node@#}}.
** If the disk has more than 5 slices on it, it cannot be encapsulated.
** Initialize other local disks into rootdg.
* Unique volume name and minor number across the nodes for the {{/global/.devices/node@#}} file system if the boot disk is encapsulated. The {{/global/.devices/node@#}} file system must be on devices with a unique name on each node, because it is mounted as a global file system. The normal Solaris OS {{/etc/mnttab}} logic predates global file systems and still demands that each device have a unique major/minor number. VxVM doesn't support changing the minor numbers of individual volumes, so the entire disk group has to be re\-minored.

To re\-minor a disk group, use the following command:

{panel}
{{# *vxdg \[ \-g diskgroup \] \[ \-f \] reminor \[diskgroup \] new\-base\-minor*}}
{panel}

From the {{vxdg}} man pages:

{noformat}
reminor Changes the base minor number for a disk group,
and renumbers all devices in the disk group to a
range starting at that number. If the device for
a volume is open, then the old device number
remains in effect until the system is rebooted or
until the disk group is deported and re-imported.
Also, if you close an open volume, then the user
can execute vxdg reminor again to cause the
renumbering to take effect without rebooting or
reimporting.

A new device number may also overlap with a tem-
porary renumbering for a volume device. This also
requires a reboot or reimport for the new device
numbering to take effect. A temporary renumbering
can happen in the following situations: when two
volumes (for example, volumes in two different
disk groups) share the same permanently assigned
device number, in which case one of the volumes is
renumbered temporarily to use an alternate device
number; or when the persistent device number for a
volume was changed, but the active device number
could not be changed to match. The active number
may be left unchanged after a persistent device
number change either because the volume device was
open, or because the new number was in use as the
active device number for another volume.

vxdg fails if you try to use a range of numbers
that is currently in use as a persistent (not a
temporary) device number. You can force use of
the number range with use of the -f option. With
-f, some device renumberings may not take effect
until a reboot or a re-import (just as with open
volumes). Also, if you force volumes in two disk
groups to use the same device number, then one of
the volumes is temporarily renumbered on the next
reboot. Which volume device is renumbered should
be considered random, except that device number-
ings in the rootdg disk group take precedence over
all others.
The -f option should be used only when swapping
the device number ranges used by two or more disk
groups. To swap the number ranges for two disk
groups, you would use -f when renumbering the
first disk group to use the range of the second
disk group. Renumbering the second disk group to
the first range does not require the use of -f.
{noformat}
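
The requirement that the {{vxio}} major number match on every node can be checked with a small helper like the one below. This is a sketch: the function name {{driver_major}} is hypothetical, and it assumes the usual {{/etc/name_to_major}} layout of one driver name and major number per line.

```shell
#!/bin/sh
# Sketch: print the major number recorded for a driver in a
# name_to_major-style file ("driver major" pairs, one per line).
# Arguments: $1 = driver name, $2 = file (defaults to /etc/name_to_major).
# Exit status is non-zero when the driver has no entry.
driver_major() {
    awk -v drv="$1" '$1 == drv { print $2; found = 1; exit }
                     END { exit (found ? 0 : 1) }' "${2:-/etc/name_to_major}"
}
```

Running {{driver_major vxio}} on each node and comparing the printed values shows whether the entries agree. A non\-zero exit status means the node has no {{vxio}} entry yet.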

* Sun Cluster does not work with Veritas DMP. DMP can be disabled before installing the software by putting dummy symlinks in place, etc.
* {{scvxinstall}} is a shell script that automates VxVM installation in a Sun Cluster environment.
* {{scvxinstall}} automates the following things:
** tries to disable DMP ({{vxdmp}})
** installs the correct cluster package
** automatically negotiates a {{vxio}} major number and properly edits {{/etc/name_to_major}}
** automates the rootdg initialization process and encapsulates the boot disk
** gives different device names to the {{/global/.devices/node@#}} volumes on each node
** edits the vfstab properly for this same volume (the problem is that this particular line has a DID device on it, and VxVM doesn't understand DID devices)
** installs a script to "reminor" the rootdg on reboot
** reboots the node so that VxVM operates properly
{column}
{section}

Copyright 1994-2009 Sun Microsystems, Inc.