Release Notes
This document provides information about the following:
New Features
Sun Grid Engine 6.2u3 has the following new features:
- Inspect
- SDM Cloud Service Adapter
- SDM Simple Install
- Exclusive Scheduling
- Power Saving
- Microsoft Windows Vista Display Support
Inspect
Sun Grid Engine Inspect is a monitoring tool that provides a flexible interface for viewing current and historical data about your Sun Grid Engine cluster(s) and the Service Domain Manager.
For more information on the feature, see Inspect.
For information on known issues and limitations, see below.
SDM Cloud Service Adapter
The Cloud Service Adapter lets you expand the cluster to run jobs on additional virtual nodes in the Amazon Elastic Compute Cloud (EC2).
This new feature is a service adapter for the SDM system that provisions additional resources for use in an SDM system from the cloud (Amazon EC2 service). These cloud resources are provisioned and deprovisioned on demand based on configurable parameter settings and are integrated into the system via a VPN (virtual private network). In conjunction with the SDM Grid Engine Service Adapter, this allows for automated scaling of a Grid Engine cluster based on the load in the cluster.
This feature can also be used to implement a power saving mechanism that powers off unused machines in the system and starts them again dynamically when the need arises. See Using the SDM Cloud Service Adapter for Power Saving.
To use this feature, you must meet a number of prerequisites. See Prerequisites and Restrictions for full details. You also need an Amazon Web Services account with access to the EC2 service. See http://aws.amazon.com/ec2.
For more information, see SDM Cloud Service Adapter.
SDM Simple Install
It is now possible to install and run an SDM system with only one JVM per (managed or master) host. Previously, the system used up to three separate JVMs per host. This new feature simplifies installation, configuration, and maintenance.
For more information, see How to Install SDM.
Exclusive Scheduling
The exclusive scheduling feature allows you to request that a job be given exclusive, non-shared access to hosts.
For more information, see Configuring Exclusive Scheduling.
Power Saving
The power saving feature allows you to use the SDM Cloud Service Adapter to implement power saving.
For more information, see Using the SDM Cloud Service Adapter for Power Saving.
Microsoft Windows Vista Display Support
With the release of Sun Grid Engine 6.2u3, the display_win_gui feature is now fully supported. display_win_gui can now be used to display a job GUI on the visible Desktop, on both the 32- and 64-bit versions of Windows Vista (Enterprise and Ultimate Edition), and on Windows Server 2008.
This feature allows a Sun Grid Engine job to request the display_win_gui complex variable, which launches a GUI on the currently visible Desktop on the Windows host that displays job information. This works only if the job is a native Windows application.
Updated Features
GUI Installer
The GUI Installer now includes the following features:
- The ability to adjust several host configuration options directly on the host selection screen through new pop-up functionality.
- A continue button at the end of the installation process that allows you to install additional hosts based upon the settings specified for already-installed hosts.
- Improved Java detection and version checking in the start_gui_installer script.
- Auto-detection of the JVM library.
- The properties for any hosts that are not successfully installed are automatically saved so that you may attempt to install them again at a later time.
- An improved installation report.
- The capability to build an environment-independent demo version of the GUI Installer that emulates the installation process.
For more information, see Installing the Software With the GUI Installer and Installing the Software From the Command-Line.
Supported Platforms and Patches
Sun Grid Engine 6.2u3
For information about the supported platforms and patches for Sun Grid Engine 6.2u3, see Patch Matrix.
Sun Grid Engine 6.2
The following are the supported platforms and operating system patches for the Sun Grid Engine 6.2 release and subsequent update releases:
Known Issues and Limitations
The following are the known issues and limitations of Sun Grid Engine 6.2u3:
- GUI Installer
- Inspect
- Service Domain Manager
- Accounting and Reporting Console
- Other
- Two Files Missing From 6.2u3 Packages for SDM and Inspect
- Installing and Testing SoyLatte JDK 6
- Using Sun Grid Engine 6.2u3 with an Existing Grid Engine Cluster
- Missing OpenMotif Library for QMON on Mac OS X
- LD_LIBRARY_PATH Settings and DRMAA
- File Descriptor Limit for Master Host
- Limiting the Number of Dynamic Client Events
- Berkeley DB Requires That the Database Files Reside on the Local Disk in Certain Situations
- Busy QMON With Large Array Task Numbers
- Resource Reservation Only Considers Pending Jobs
- Log Files for Automatic Installation
- Mandatory Update of libm.so.3.5 on All Windows Hosts
Two Files Missing From 6.2u3 Packages for SDM and Inspect
| Note - If you do not use the Service Domain Manager or Inspect features, you can ignore this issue. |
The juti.jar and jgdi.jar files are missing from the common patch packages. The affected patch IDs are 139849-05 for tar/gz format and 139835-05 for pkgadd. The juti.jar and jgdi.jar files are required to run Service Domain Manager or Sun Grid Engine Inspect features.
You can get the files from the Sun Grid Engine Inspect package. Follow these steps:
1. Download the Sun Grid Engine Inspect package from the SDLC download site.
2. Copy the juti.jar and jgdi.jar files into $SGE_ROOT/lib.
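The copy step can be sketched as a small shell function. This is only an illustration: the location of the two jar files inside the unpacked Inspect package is an assumption, so pass whatever directory actually contains them.

```shell
#!/bin/sh
# Sketch: copy the two missing jar files into the Grid Engine lib directory.
# Assumption: $1 is the directory of the unpacked Inspect package that holds
# juti.jar and jgdi.jar (the real layout may differ); $2 is $SGE_ROOT/lib.
copy_missing_jars() {
    src_dir=$1
    dest_dir=$2
    for jar in juti.jar jgdi.jar; do
        if [ ! -f "$src_dir/$jar" ]; then
            echo "not found: $src_dir/$jar" >&2
            return 1
        fi
        cp "$src_dir/$jar" "$dest_dir/" || return 1
    done
}

# Example call (the source path is hypothetical):
#   copy_missing_jars /tmp/inspect/lib "$SGE_ROOT/lib"
```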
Installing and Testing SoyLatte JDK 6
If the JMX thread does not work with the SoyLatte JDK 6 libjvm.dylib, use the Apple /Library/Java/Home/../Libraries/libjvm.dylib instead.
Using Sun Grid Engine 6.2u3 with an Existing Grid Engine Cluster
You can install the Sun Grid Engine 6.2u3 software in an environment that has an existing Sun Grid Engine cluster. To run the Sun Grid Engine 6.2u3 software in parallel with an existing Sun Grid Engine environment, do the following:
- Use a different $SGE_ROOT directory and different TCP ports for the master daemon and the execution daemons.
- When using AIX, do not install a system-wide startup script during manual or automatic installation. If you install a system-wide startup script, it will overwrite the older Grid Engine startup script for the master daemon and the execution daemons.
- If you decide to install two execution daemons on one host, use a gid_range that differs from the global/local cluster configuration.
- On Microsoft Windows systems, you can only install the optional Grid Engine Helper Service for one Grid Engine instance. If you have already installed this service for a previous release of Sun Grid Engine, you may not use it for Sun Grid Engine 6.2u3. This means that you will not be able to run Sun Grid Engine 6.2u3 jobs that require a GUI on the Windows desktop.
- Verify that variables are pointing to the correct instance of Grid Engine. Specifically, check the port settings, the PATH variable, and the LD_LIBRARY_PATH variable. For Solaris and Linux, it is not necessary to set the LD_LIBRARY_PATH variable.
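One way to perform the last check is to print the relevant variables after sourcing the settings file of the instance you intend to use. The sketch below assumes the ports are exported as SGE_QMASTER_PORT and SGE_EXECD_PORT; if your site defines the ports in the services database instead, those two variables will show as NOT_SET.

```shell
#!/bin/sh
# Sketch: print the variables that decide which Grid Engine instance
# the current shell talks to. Unset variables are shown as NOT_SET.
show_sge_env() {
    for var in SGE_ROOT SGE_CELL SGE_QMASTER_PORT SGE_EXECD_PORT PATH LD_LIBRARY_PATH; do
        # Indirect lookup of the variable named in $var.
        eval "value=\${$var:-NOT_SET}"
        printf '%s=%s\n' "$var" "$value"
    done
}

show_sge_env
```

Run it once in a shell for each instance and compare the output. SGE_ROOT and the port values should differ between the two clusters.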
Missing OpenMotif Library for QMON on Mac OS X
The default Mac OS X installation does not include the OpenMotif library that QMON needs. You can get the OpenMotif library for the PowerPC and x86 architectures from various web sites, such as http://www.ist-inc.com/DOWNLOADS/openmotif_download.html. You can also find information about how to install packages that have been ported to Mac OS X at http://www.macports.org.
LD_LIBRARY_PATH Settings and DRMAA
When you use Java bindings with DRMAA, verify that the LD_LIBRARY_PATH is set correctly.
| Note - If you are using a 32-bit Java Virtual Machine (JVM), you must set the LD_LIBRARY_PATH to the 32-bit shared DRMAA library (for example, $SGE_ROOT/lib/sol-sparc), even when your application actually runs on a 64-bit operating system platform. |
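As an illustration of the note above, the following sketch builds an LD_LIBRARY_PATH value that puts the 32-bit DRMAA library directory first. The helper name drmaa_library_path and the /opt/sge root are hypothetical; the sol-sparc architecture directory is taken from the example in the note.

```shell
#!/bin/sh
# Sketch: build an LD_LIBRARY_PATH value with the DRMAA library directory
# for a given architecture in front of any existing entries.
# $1 = Grid Engine root, $2 = architecture directory, $3 = current value (may be empty)
drmaa_library_path() {
    printf '%s/lib/%s%s\n' "$1" "$2" "${3:+:$3}"
}

# Example for a 32-bit JVM on Solaris SPARC (root directory is hypothetical):
drmaa_library_path /opt/sge sol-sparc /usr/lib
# prints: /opt/sge/lib/sol-sparc:/usr/lib

# To apply it in the current shell:
#   LD_LIBRARY_PATH=$(drmaa_library_path "$SGE_ROOT" sol-sparc "$LD_LIBRARY_PATH")
#   export LD_LIBRARY_PATH
```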
File Descriptor Limit for Master Host
You should set a high file descriptor limit in the kernel configuration on hosts that are designated to run the sge_qmaster daemon. You might want to set a high file descriptor limit on the shadow master hosts as well. A large number of available file descriptors enables the communication system to keep connections open instead of having to constantly close and reopen them. If you have many execution hosts, a high file descriptor limit significantly improves performance. Set the file descriptor limit to a number that is higher than the number of intended execution hosts. You should also make room for concurrent client requests, in particular for jobs submitted with qsub -sync or when you are running DRMAA sessions that maintain a steady communication connection with the master daemon. Refer to your operating system documentation for information about how to set the file descriptor limit.
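A rough way to sanity-check the sizing rule is to compare the shell's file descriptor limit with the number of execution hosts plus some headroom for clients. The sketch below is only an estimate: the host and client counts are hypothetical, and ulimit -n reports the soft limit of the current shell, which is not necessarily the limit that the running sge_qmaster process has.

```shell
#!/bin/sh
# Sketch: estimate the descriptors needed and compare with the soft limit.
# $1 = intended execution hosts
# $2 = expected concurrent clients (qsub -sync jobs, DRMAA sessions)
required_fds() {
    echo $(( $1 + $2 ))
}

fd_limit_sufficient() {
    limit=$(ulimit -n)
    [ "$limit" = "unlimited" ] && return 0
    [ "$limit" -gt "$(required_fds "$1" "$2")" ]
}

# Hypothetical sizing: 500 execution hosts and 100 concurrent clients.
if fd_limit_sufficient 500 100; then
    echo "file descriptor limit looks sufficient"
else
    echo "consider raising the file descriptor limit above $(required_fds 500 100)"
fi
```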
Limiting the Number of Dynamic Client Events
The number of concurrent dynamic event clients is limited by the number of file descriptors. The default is 99. Dynamic event clients are jobs submitted with the qsub -sync command and DRMAA sessions. You can limit the number of dynamic event clients with the qmaster_params global cluster configuration setting. Set this parameter to MAX_DYN_EC=n. See the sge_conf(5) man page for more information.
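For illustration, the sketch below prints the configuration line for a hypothetical limit of 200 and, on a host where qconf is available, shows the current qmaster_params value. To change the setting, edit the global configuration with qconf -mconf. If qmaster_params already holds other values, add MAX_DYN_EC to them rather than replacing them.

```shell
#!/bin/sh
# Sketch: format the qmaster_params line for a given client limit.
qmaster_params_line() {
    printf 'qmaster_params MAX_DYN_EC=%s\n' "$1"
}

# The value 200 is a hypothetical example, not a recommendation.
qmaster_params_line 200
# prints: qmaster_params MAX_DYN_EC=200

# Show the current setting when run on a Grid Engine admin host.
if command -v qconf >/dev/null 2>&1; then
    qconf -sconf global | grep qmaster_params
fi
```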
Berkeley DB Requires That the Database Files Reside on the Local Disk in Certain Situations
Berkeley DB requires that the database files reside on a local disk unless sge_qmaster runs on Solaris 10 and uses an NFSv4 mount. (Fully NFSv4-compliant clients and servers from other vendors are also supported but have not yet been tested.) If sge_qmaster cannot run on the file server that is intended to store the spooling data (for example, if you want to use the shadow master facility), you can use a Berkeley DB RPC server. The RPC server runs on the file server and connects with the Berkeley DB sge_qmaster instance. However, the Berkeley DB RPC server uses an insecure protocol for this communication and therefore presents a security problem. Do not use the RPC server method if you are concerned about security at your site. Instead, spool to disks that are local to the sge_qmaster host and, for fail-over, use a high availability solution such as Sun Cluster, which maintains host-local file access in the fail-over case.
Busy QMON With Large Array Task Numbers
If large array task numbers are used, you should use "compact job array display" in the QMON Job Control dialog box customization. Otherwise the QMON GUI will cause high CPU load and show poor performance.
Resource Reservation Only Considers Pending Jobs
Resource reservation currently takes only pending jobs into account. Consequently, jobs that are in a hold state due to the submit options -a time and -hold_jid joblist, and are thus not pending, do not get reservations. Such jobs are treated as if the -R n submit option were specified for them.
Log Files for Automatic Installation
The automatic installation option does not provide full diagnostic information in case of installation failures. If the installation process aborts, check for the presence and the contents of an installation log file in one of the following locations:
- $SGE_ROOT/$SGE_CELL/common/install_logs/qmaster_hostname_install_timestamp.log
- $SGE_ROOT/$SGE_CELL/common/install_logs/execd_hostname_install_timestamp.log
- $SGE_ROOT/$SGE_CELL/common/install_logs/shadowd_hostname_install_timestamp.log
- $SGE_ROOT/$SGE_CELL/common/install_logs/bdb_hostname_install_timestamp.log
- /tmp/install.pid
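The locations above can be checked with a short helper such as the following sketch, which lists any matching log files, newest first. The function name is hypothetical, and the fallback values /opt/sge and default are assumptions for the case in which SGE_ROOT and SGE_CELL are not set.

```shell
#!/bin/sh
# Sketch: list automatic-installation log files, newest first.
# $1 = Grid Engine root directory, $2 = cell name
list_install_logs() {
    log_dir="$1/$2/common/install_logs"
    if [ -d "$log_dir" ]; then
        ls -1t "$log_dir"/*_install_*.log 2>/dev/null
    fi
    # Logs named /tmp/install.<pid> may also exist.
    ls -1dt /tmp/install.* 2>/dev/null
    return 0
}

# Fallback values are assumptions; source the cluster settings file first.
list_install_logs "${SGE_ROOT:-/opt/sge}" "${SGE_CELL:-default}"
```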
Mandatory Update of libm.so.3.5 on All Windows Hosts
With the release of Sun Grid Engine 6.2u3, you must update libm.so.3.5 from the version delivered with Microsoft Services for UNIX (SFU) or Microsoft Subsystem for UNIX-based Applications (SUA) to a newer version.
SFU systems are the following:
- Microsoft Windows Server 2003
- Windows XP Professional with at least Service Pack 1
- Windows 2000 Server with at least Service Pack 3
- Windows 2000 Professional with at least Service Pack 3
SUA systems are the following:
- Microsoft Windows Server 2003 Release 2
- Windows Server 2008
- Windows Vista Enterprise
- Windows Vista Ultimate
To update libm.so.3.5, do the following:
- Download the package (ftp://ftp.interopsystems.com/pkgs/3.5/libm-current-bin.tgz) and extract it to a temporary directory.
  This package is meant to be used with a package manager, but you can also extract it manually.
- Make a backup of libm.so.3.5 by doing the following:
  - If you are using SFU, as the local Administrator on each SFU host, make a backup of /usr/lib/libm.so.3.5.
  - If you are using SUA, make a backup of /usr/lib/x86/libm.so.3.5 on each SUA host.
- Copy the libm.so.3.5 from the downloaded package by doing the following:
  - If you are using SFU, copy the libm.so.3.5 from the downloaded package to /usr/lib.
  - If you are using SUA, copy the libm.so.3.5 from the downloaded package to /usr/lib/x86.
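The backup and copy steps can be sketched as one shell function that takes the library directory as a parameter. This is only an illustration: the function name and the .orig backup suffix are assumptions, and the path of libm.so.3.5 inside the extracted package is not specified here, so pass the actual path.

```shell
#!/bin/sh
# Sketch: back up the existing libm.so.3.5 and install the newer copy.
# $1 = path to the libm.so.3.5 extracted from the downloaded package
# $2 = library directory: /usr/lib on SFU hosts, /usr/lib/x86 on SUA hosts
update_libm() {
    new_lib=$1
    lib_dir=$2
    target="$lib_dir/libm.so.3.5"
    if [ ! -f "$new_lib" ]; then
        echo "not found: $new_lib" >&2
        return 1
    fi
    # Keep the old library under a .orig suffix (suffix is an assumption).
    if [ -f "$target" ]; then
        cp -p "$target" "$target.orig" || return 1
    fi
    cp "$new_lib" "$target"
}

# Example on an SFU host, run as the local Administrator
# (the extraction directory is hypothetical):
#   update_libm /tmp/libm-pkg/usr/lib/libm.so.3.5 /usr/lib
```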
New Features From Recent Releases
Sun Grid Engine 6.2u2 Features
- GUI Installer
- Microsoft Windows Vista Support
- Job Submission Verifier (JSV)
- Consumable Resources Per Job
- jemalloc Library
Sun Grid Engine 6.2 Features
The following are the new features in Sun Grid Engine 6.2:
- Multi-Clustering With Service Domain Manager
- Improved Scalability and Job Throughput
- Advance Reservations
- New Support for Interactive Jobs
- Multi-Cluster Support for Accounting and Reporting Console
- Solaris SMF Support
- Sun Service Tags Support
- Ability to Request the Master and Slave Queues for Parallel Jobs
- New Unix Resource Limits Support
- New Upgrade Procedure
Important Information About Sun Grid Engine 6.2u4
The Sun Grid Engine 6.2u4 update release is a minor update to the Sun Grid Engine 6.2u3 release.
| Note - When you update to Sun Grid Engine 6.2u4, you must reset the permissions on one file. After you complete the primary update tasks (stop old cluster, update binaries, start updated cluster), type the following command as root with settings sourced: $ chmod 600 $SGE_ROOT/$SGE_CELL/common/jmx/management.properties |
The Sun Grid Engine 6.2u4 update provides the following significant change:
- The qsub command has a new experimental switch. Use qsub -tc to limit the number of concurrently running array tasks of that job. See also the max_aj_tasks parameter in the global configuration for global settings of array task concurrency. For Sun Grid Engine 6.2u4, the -tc switch is only available for the qsub command. This switch will be added to the qmon and qalter commands beginning with the 6.2u5 release.
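As an illustration of the -tc switch, the following sketch assembles a submit command for an array job with a cap on concurrently running tasks. The helper name and the script name array_job.sh are hypothetical, and the helper only prints the command so that it can be reviewed before being run on a submit host.

```shell
#!/bin/sh
# Sketch: print a qsub command for an array job with a concurrency cap.
# $1 = number of array tasks, $2 = maximum tasks running at once, $3 = job script
array_submit_command() {
    echo "qsub -t 1-$1 -tc $2 $3"
}

# 100 tasks, at most 10 running at the same time (script name is hypothetical).
array_submit_command 100 10 array_job.sh
# prints: qsub -t 1-100 -tc 10 array_job.sh
```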
For more information about the Sun Grid Engine 6.2u4 release, see the patch README in the software.

