Other Installation Issues

Other Sun Grid Engine Installation Issues

This section identifies additional considerations for installing Sun Grid Engine software: installing the Linux Motif libraries that qmon requires, and installing Grid Engine on hosts that use IP Multipathing (IPMP).

Verifying and Installing Linux Motif Libraries

Newer Linux systems do not always include the libXm.so.2 Motif library. Without it, the precompiled Linux qmon binary cannot run.

To correct this problem, follow these steps:

  1. Check if the libraries are already present.
    % ls -l /usr/X11R6/lib/libXm*
    

    If /usr/X11R6/lib/libXm.so.2 points to a libXm.so.2.x version, you are done. Note that a symbolic link to /usr/X11R6/lib/libXm.so.3 does not work.
    If the libraries are not present, continue with the following steps.

  2. Download the corresponding openmotif libraries from http://www.ist.co.uk/DOWNLOADS/motif_download.html or from the SUSE 9.1 distribution (an additional rpm file called openmotif21-* is available).

  3. Install the missing libraries as root.
    For SUSE 9.1, you install the openmotif21-* package like any other package. For packages downloaded from http://www.ist.co.uk, install the libraries as shown in the following example.
    # rpm -i --prefix /tmp/test --force \
          openmotif-2.1.31-2_IST-JDS2003.i386.rpm
    # cd /tmp/test/OpenMotif-2.1.31/lib
    # cp libXm.so.2.1 /usr/X11R6/lib
    # cd /usr/X11R6/lib
    # ln -s libXm.so.2.1 libXm.so.2
    


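  4. Test qmon by checking that its shared libraries resolve. The output should show a path for libXm.so.2 rather than "not found".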
    % ldd `which qmon`
    
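The checks in steps 1 and 4 can be sketched as two small shell functions. This is a minimal sketch, not part of the Grid Engine distribution: the function names check_motif_link and missing_libs are hypothetical, and the sketch assumes that the readlink utility is available and that ldd marks unresolved libraries with the text "not found".

```shell
#!/bin/sh
# Hypothetical helpers; names and output strings are illustrative only.

# Classify the libXm.so.2 entry in a library directory.
# $1 = directory to inspect (defaults to /usr/X11R6/lib, as in step 1)
# Prints one of: missing, not-a-symlink, ok, wrong-version.
check_motif_link() {
    dir=${1:-/usr/X11R6/lib}
    link="$dir/libXm.so.2"
    if [ ! -e "$link" ] && [ ! -L "$link" ]; then
        echo missing
        return 1
    fi
    # readlink prints the link target; it fails if the entry is not a symlink
    target=$(readlink "$link" 2>/dev/null) || target=
    if [ -z "$target" ]; then
        echo not-a-symlink
        return 1
    fi
    # Compare only the file name part of the target
    case "${target##*/}" in
        libXm.so.2.*) echo ok ;;
        *)            echo wrong-version; return 1 ;;
    esac
}

# Read ldd output on stdin and print the name of each unresolved library.
missing_libs() {
    awk '/not found/ { print $1 }'
}
```

For example, check_motif_link with no argument inspects /usr/X11R6/lib, and ldd `which qmon` | missing_libs prints nothing when every library that qmon needs can be resolved.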

Installing the Grid Engine on a System With IPMP

This section describes how to install the Grid Engine software on hosts with the Solaris Operating Environment IP Multipathing (IPMP) technology.

What Is IP Multipathing?

IP Multipathing is a technology that allows TCP/IP interfaces to be grouped for failover and load balancing purposes. If an interface within an IP Multipathing group fails, the interface is disabled and its IP address is relocated to another interface in the group. Outbound IP traffic is distributed across the interfaces of a group. For further details on IP Multipathing, refer to the Solaris Operating Environment documentation at http://docs.sun.com/app/docs/doc/816-4554/ipmptm-1.

Issues Between IPMP and Grid Engine

When you start the Grid Engine daemons on a machine whose main interface is part of an IPMP group, error messages appear. Because IPMP load balancing distributes connections across the interfaces in the group, IP packets can arrive at the receiving end looking as if they come from a different host than the one associated with the main interface.

For example, consider a machine with three interfaces named qfe0, qfe1, and qfe2, with the IP addresses 10.1.1.1, 10.1.1.2, and 10.1.1.3 respectively. (IPMP would also need an extra test address for each interface, but this example ignores that requirement.) Each of these addresses has a host name associated with it. The hosts table looks like the following example:

10.1.1.1 sge
10.1.1.2 sge-qfe1
10.1.1.3 sge-qfe2

The machine's host name is sge. When a connection is established from sge to another machine, it might go through sge, sge-qfe1, or sge-qfe2. Upon installation, Grid Engine will only recognize sge. When Grid Engine receives a connection request from sge-qfe2, it closes the connection because the request is not from one of the authorized (or known) nodes.

To solve this problem, use the host_aliases file to tell Grid Engine that sge, sge-qfe1, and sge-qfe2 all refer to the same machine. See the sge_h_aliases man page for details. The host_aliases file in this case would look like this:

sge sge-qfe1 sge-qfe2
Note
If you make any changes to the $SGE_ROOT/$SGE_CELL/common/host_aliases file, you must stop and restart all running Grid Engine daemons (sge_qmaster and sge_execd). To do this, log in as root to all your Grid Engine hosts and enter these commands:
/etc/init.d/sgemaster stop
/etc/init.d/sgeexecd stop
/etc/init.d/sgemaster start
/etc/init.d/sgeexecd start
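
To illustrate what a host_aliases line expresses, the following sketch looks up a host name in such a file and prints the primary name it maps to. This is only an illustration of the mapping, not how Grid Engine resolves names internally. The function name primary_name is hypothetical, and the sketch assumes one machine per line with the primary name first, and that blank lines and lines beginning with # can be skipped.

```shell
#!/bin/sh
# Hypothetical helper: map a host name to the first name on its
# host_aliases line. Unknown names are printed unchanged.
# $1 = path to a host_aliases file, $2 = host name to look up
primary_name() {
    awk -v h="$2" '
        /^[[:space:]]*(#|$)/ { next }          # assumption: skip comments and blank lines
        {
            for (i = 1; i <= NF; i++)
                if ($i == h) { print $1; found = 1; exit }
        }
        END { if (!found) print h }            # no alias entry: name stands for itself
    ' "$1"
}
```

With the example file above, primary_name $SGE_ROOT/$SGE_CELL/common/host_aliases sge-qfe2 prints sge, which is the name that Grid Engine recognizes after installation.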

Installing the Grid Engine Master Node With IPMP

You can work around the IPMP error messages on the master node in two ways:

  • Ignore the error messages during installation. This method is operating system independent (except for MS Windows).
  • Temporarily disable IPMP on the interface associated with the machine's host name. This method only works on systems running at least Version 8 of the Solaris OS.

Ignoring the Error Messages

To ignore the error messages, follow these steps:

  1. Run the inst_sge -m command, ignoring the error messages that appear while the daemons start up.

  2. Shut down the daemons with the /etc/init.d/sgemaster stop and /etc/init.d/sgeexecd stop commands.
    Because of the networking errors, some daemons fail to shut down and must be killed with the kill -9 command. To see which daemons failed to shut down, use this command: ps -e | grep sge_.

  3. Install the host_aliases file in the $SGE_ROOT/$SGE_CELL/common directory.

  4. Restart the daemons with the /etc/init.d/sgemaster start and /etc/init.d/sgeexecd start commands.
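
The cleanup in step 2 can be sketched as a filter that extracts the process IDs of any remaining Grid Engine daemons from ps -e output. This is a minimal sketch: the function name sge_pids is hypothetical, and it assumes the usual ps -e layout in which the process ID is the first column and the command name is the last.

```shell
#!/bin/sh
# Hypothetical helper: read `ps -e` output on stdin and print the PID of
# every process whose command name starts with sge_.
sge_pids() {
    awk '$NF ~ /^sge_/ { print $1 }'
}
```

After running the stop commands, ps -e | sge_pids lists the daemons that are still running. If the list is not empty, those process IDs are the ones to pass to kill -9, as described in step 2.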

Temporarily Disabling IPMP

To temporarily disable IPMP, follow these steps:

  1. Identify the interface associated with the machine's host name.

  2. Verify that the interface has IPMP enabled by using the ifconfig interface | grep groupname command.

  3. Take note of the group name.

  4. Disable IPMP with this command: ifconfig interface group "" .

  5. Install the Grid Engine master node.

  6. Install the host_aliases file in the $SGE_ROOT/$SGE_CELL/common directory.

  7. Restart the daemons by running the /etc/init.d/sgemaster and /etc/init.d/sgeexecd scripts with stop and then start.

  8. Re-enable IPMP using the following command: ifconfig interface group groupname, where groupname is the group name that you noted in step 3.
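
Steps 2, 3, and 8 depend on recording the group name before IPMP is disabled. The following sketch extracts it from ifconfig output so that it can be saved in a shell variable. The function name ipmp_group is hypothetical, and the sketch assumes that ifconfig reports the group as the word groupname followed by the name, which is what the grep in step 2 relies on.

```shell
#!/bin/sh
# Hypothetical helper: read `ifconfig interface` output on stdin and print
# the IPMP group name, or nothing if the interface is not in a group.
ipmp_group() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i == "groupname") { print $(i + 1); exit }
    }'
}
```

For the interface qfe0 from the earlier example, you might save the name with group=`ifconfig qfe0 | ipmp_group`, disable IPMP with ifconfig qfe0 group "" as in step 4, and after step 7 re-enable it with ifconfig qfe0 group "$group". If the variable is empty, the interface was not in an IPMP group and there is nothing to restore.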

Installing Grid Engine on an Execution Host With IPMP

Once the host_aliases file is installed and the Grid Engine daemons are restarted, you can simply start the execution host installation without further problems.

Enabling Administrative and Submit Hosts With IPMP

You have two choices when enabling these hosts with IPMP:

  • Follow the same procedure used for the execution host (updating the host_aliases file before installation).
  • Add all the host names associated with the administrative or submit host with one of the following commands:
    • For the administrative host:
      qconf -ah <hostname> <alias 1> <alias 2> ...
      
    • For the submit host:
      qconf -as <hostname> <alias 1> <alias 2> ...
      
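If a host_aliases file already lists the names for each machine, the second option can be sketched as a dry run that prints one qconf command per line of that file. This is a sketch under assumptions: the function name print_qconf_cmds is hypothetical, the argument layout simply mirrors the syntax shown above and should be checked against the qconf man page for your release, and comment and blank lines are assumed to be skippable. The function only prints the commands and does not run them.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) a qconf command for each line of
# a host_aliases file.
# $1 = qconf option (-ah for administrative hosts, -as for submit hosts)
# $2 = path to a host_aliases file
print_qconf_cmds() {
    awk -v opt="$1" '
        /^[[:space:]]*(#|$)/ { next }   # assumption: skip comments and blank lines
        { $1 = $1; print "qconf", opt, $0 }   # $1 = $1 normalizes spacing
    ' "$2"
}
```

With the example file from this section, print_qconf_cmds -ah $SGE_ROOT/$SGE_CELL/common/host_aliases prints qconf -ah sge sge-qfe1 sge-qfe2. Review the printed commands before running any of them.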

Copyright 1994-2009 Sun Microsystems, Inc.