... The most critical step for me was to stop following the TSM manual where it says that all configuration files and all scripts for starting and stopping the TSM scheduler _must_ be on shared storage. This simply doesn't work.
The {{dsm.opt}} file for each TSM node (note that a TSM node is different to, and _not_ the same thing as, a cluster node!) can and generally should be on shared storage, mainly for consistency. The scripts for starting, stopping and probing the TSM services, however, need to be local and present on every node at all times. The cluster framework needs the scripts to be available everywhere in order to add the resource into the cluster. If the script wasn't available on all nodes when I tried to create the resource, the cluster spat the dummy.
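Since the framework refuses to create the resource unless the scripts exist on every node, a quick pre-flight check on each node can save a failed attempt. The following is a minimal sketch and not part of my setup: the function name {{check_local_script}} is made up, and it only looks at the node it runs on, so it has to be run on each cluster node in turn. Comparing the printed checksums confirms that the copies are identical.

```shell
#!/bin/sh
# Hypothetical helper: verify that a cluster script exists on this node,
# is executable, and print its checksum for comparison across nodes.
check_local_script() {
    script="$1"
    if [ ! -f "$script" ]; then
        echo "missing: $script" >&2
        return 1
    fi
    if [ ! -x "$script" ]; then
        echo "not executable: $script" >&2
        return 1
    fi
    # cksum prints "<crc> <size> <file>"; identical copies give the same crc and size
    cksum "$script"
}

# Example (run on every node):
#   check_local_script /etc/init.d/dsm.scheduler.cluster.sh
```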
After setting up the scripts and manually testing the TSM client to make sure the configuration was correct on all nodes, it was possible to add a new resource to the cluster of type {{SUNW.gds}}, the generic data service. To add the scripts as a GDS resource into the cluster, the following command does the job:
{panel}
{{# *clresource create -g www-rg -t SUNW.gds \*}}
{{*-p Start_command="/etc/init.d/dsm.scheduler.cluster.sh /zones/webdata/tsm/dsm.opt start" \*}}
{{*-p Probe_command="/etc/init.d/dsm.scheduler.cluster.sh webdata probe" \*}}
{{*-p Stop_command="/etc/init.d/dsm.scheduler.cluster.sh webdata stop" \*}}
{{*-p Network_aware=false webdata-backup-rs*}}
{panel}
So in this example, the script {{/etc/init.d/dsm.scheduler.cluster.sh}} is located on local storage on every node and is identical across all of them. The file {{/zones/webdata/tsm/dsm.opt}} is located on shared storage and moves between nodes in the event of a failover. Note that only the start command is given the full path to {{dsm.opt}}. The probe and stop commands are given the string {{webdata}} instead, which is enough because the probe merely searches the process list for that string and the stop does nothing. When the resource group starts on a different node, the script is run and the resource comes online. Curiously, the {{dsmcad}} daemon process doesn't need to be killed in the event of a failover. The cluster framework seems to take care of this, killing the process and allowing a clean failover. Also, making the resource not network aware removed the need for a logical host name for the resource group.
The script to start, stop, and probe the TSM client is shown below. The script could definitely be done better, but it works. I've also noticed that it may be possible to start and stop the scheduler process, {{dsmc}}, directly from the script. I haven't tried this, though I'm sure it would work. Note that I include this script for informational purposes only; I don't promise that it will work for you.
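As one example of how it could be done better: the process checks in the script rely on a chain of {{grep -v}} filters, which the script's own comments admit is fragile. A minimal sketch of an alternative follows. It is untested on a cluster, and the function name {{count_dsmcad}} is made up. It reads a {{ps -ef}} style listing on standard input and counts only the lines that mention both {{dsmcad}} and the given string, so editors and the wrapper script drop out without being listed one by one.

```shell
#!/bin/sh
# Hypothetical helper: count dsmcad processes for one option file.
# Reads a process listing (for example the output of `ps -ef`) on stdin.
# Assumption: an awk that supports -v; on Solaris that means nawk or
# /usr/xpg4/bin/awk rather than the old /usr/bin/awk.
count_dsmcad() {
    optfile="$1"
    # index() is a plain substring match, so the dots and slashes in the
    # path are not treated as regular-expression characters.
    awk -v opt="$optfile" '
        index($0, "dsmcad") && index($0, opt) { n++ }
        END { print n + 0 }
    '
}

# Usage sketch for a probe branch. The listing is captured first so that
# the awk process itself can never appear in it:
#   LISTING=`ps -ef`
#   N=`printf '%s\n' "$LISTING" | count_dsmcad "$DSM_CONFIG"`
#   if test "$N" -gt 0 ; then exit 0 ; else exit 1 ; fi
```

The same count could replace the {{-eq 1}} and {{-gt 1}} tests in the start branch.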
{panel}
{{#!/bin/ksh}}

{{# Generally, we should start up with something like this:}}
{{# /opt/tivoli/tsm/client/ba/bin/dsmcad -optfile=/zstorage/build-test/tsm/dsm.opt}}

{{# set the necessary environment variables so that TSM doesn't vomit}}
{{LC_CTYPE="en_US"}}
{{export LC_CTYPE}}
{{LANG="en_US"}}
{{export LANG}}
{{LC_LANG="en_US"}}
{{export LC_LANG}}
{{LC_ALL="en_US"}}
{{export LC_ALL}}

{{# work out which argument is the command and which the config file}}
{{case "$1" in}}
{{  'start'|'stop'|'probe')}}
{{    COMMAND=$1}}
{{    DSM_CONFIG=$2}}
{{    ;;}}
{{  *)}}
{{    COMMAND=$2}}
{{    DSM_CONFIG=$1}}
{{esac}}

{{# now check what we want to do.}}
{{case "$COMMAND" in}}
{{  'start')}}
{{    # echo "starting" }}
{{    # Refuse to start if the option file is not there (e.g. shared storage not mounted)}}
{{    if test ! -f "$DSM_CONFIG" ; then}}
{{      echo "Config file $DSM_CONFIG does not exist, exiting." }}
{{      exit 1}}
{{    fi}}
{{    export DSM_CONFIG}}
{{    # Check if there is already a dsmcad process running, if so, ignore the start command}}
{{    PS=`ps -ef | grep -v grep | grep -v vi | grep -v probe | grep -v zoneadmd | grep -v "dsm.scheduler.cluster.sh" | grep -c "$DSM_CONFIG"` }}
{{    if test "$PS" -eq "1" ; then}}
{{      echo "dsmcad is already started for $DSM_CONFIG, will not start another." }}
{{      ps -ef | grep -v grep | grep -v vi | grep -v probe | grep -v zoneadmd | grep -v "dsm.scheduler.cluster.sh" | grep "$DSM_CONFIG" }}
{{      exit 0}}
{{    elif test "$PS" -gt "1" ; then}}
{{      echo "Seems to be too many processes running for dsmcad for $DSM_CONFIG, please check it."}}
{{      exit 1}}
{{    fi}}

{{    /opt/tivoli/tsm/client/ba/bin/dsmcad -optfile="$DSM_CONFIG"}}
{{    if test "$?" -ne "0" ; then}}
{{      echo "Failed to start the dsm scheduler, exiting" }}
{{      exit 1}}
{{    fi}}
{{    ;; }}

{{  'stop')}}
{{    # echo "stopping" }}
{{    # For the most part, we ignore a stop command as the dsmcad should work out itself}}
{{    # that it has to stop its child process when the directory with its password}}
{{    # isn't available.}}
{{    exit 0}}
{{    ;;}}

{{  'probe')}}
{{    # echo "probing" }}
{{    # WARNING: The following would produce a bug if "vi" is in the arguments...}}
{{    # So make sure you avoid it, OK?}}
{{    PS=`ps -ef | grep -v grep | grep -v vi | grep -v probe | grep -v zoneadmd | grep -c "$DSM_CONFIG"` }}
{{    if test "$PS" -gt "0" ; then}}
{{      # echo "Found $PS processes" }}
{{      exit 0}}
{{    else}}
{{      echo "Found no processes" }}
{{      exit 1}}
{{    fi}}
{{    ;;}}

{{  *)}}
{{    # otherwise an invalid command was received, vomit.}}
{{    echo "options { start | stop | probe }" }}
{{    exit 1}}
|