Sun Grid Engine Information Center
Planning the Installation
Whether you have installed previous versions of the Sun Grid Engine software or this is your first time, you must do some planning before you extract and install the software. This section describes the decisions that you must make and, wherever possible, gives you criteria on which you can base your decisions. This section consists of the following topics:
Decisions That You Must Make
You must make several decisions before you can plan the installation:
Gather the Necessary Information
Before you install the Grid Engine software, you must plan how to achieve the results that fit your environment. This section helps you make the decisions that affect the rest of the procedure. Write down your installation plan in a table similar to the following example. You can view the worksheet alone (for printing).
If you are going to install Grid Engine 6.2 on a Windows system, acquire and install Microsoft Services For UNIX. See Microsoft Services For UNIX for more information.
If you are going to install Grid Engine 6.2 on a Windows system, create the required Certificate Security Protocol (CSP) certificates before installing Grid Engine. See How to Install a CSP-Secured System for information about CSP certificates.
Check Other Grid Engine Installation Issues for applicability.
Disk Space Requirements
The Grid Engine software directory tree has the following fixed disk space requirements:
The ideal disk space for Grid Engine system spool directories is as follows:
The spool directories of the master host and of the execution hosts are configurable and need not reside under the default location, sge-root.
$SGE_ROOT Directory
You must create a directory into which to load the contents of the distribution media. This directory is called the root directory, or $SGE_ROOT. When the Grid Engine system is running, this directory stores the current cluster configuration and all other data that must be spooled to disk.
Use a valid path name for the directory that is network-accessible on all hosts. For example, if the file system is mounted using automounter, set $SGE_ROOT to /usr/SGE6, not to /tmp_mnt/usr/SGE6.
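As a rough illustration of the automounter caveat, the following Python sketch rejects a candidate root path that lies under the automounter's temporary mount prefix. The function name is_portable_sge_root is hypothetical, and treating /tmp_mnt as the only prefix to avoid is an assumption taken from the example above; neither is part of Grid Engine.

```python
from pathlib import PurePosixPath

# Assumption: /tmp_mnt is the automounter's temporary mount prefix, as in the
# example in the text. Other sites may use a different prefix.
AUTOMOUNT_PREFIX = PurePosixPath("/tmp_mnt")


def is_portable_sge_root(path: str) -> bool:
    """Return True if path is absolute and does not lie under the automounter prefix."""
    candidate = PurePosixPath(path)
    if not candidate.is_absolute():
        return False
    # Compare the prefix against the path itself and every ancestor directory.
    return AUTOMOUNT_PREFIX not in (candidate, *candidate.parents)


print(is_portable_sge_root("/usr/SGE6"))          # True
print(is_portable_sge_root("/tmp_mnt/usr/SGE6"))  # False
```

The check is purely textual. It does not confirm that the path is actually reachable from every host, which still has to be verified on the hosts themselves.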
The $SGE_ROOT directory is the top level of the Grid Engine software directory tree. On startup, each Grid Engine software component in a cell needs read access to the $SGE_ROOT/$SGE_CELL/common directory. When Grid Engine software is installed as a single cluster, the value of $SGE_CELL is default. For ease of installation and administration, this directory should be readable on all hosts on which you intend to run the Grid Engine software installation procedure. For example, you can select a directory that is available across a network file system, such as NFS. If you choose to select file systems that are local to the hosts, you must copy the installation directory to each host before you start the installation procedure for the particular machine. See File Access Permissions for a description of required permissions.
Directory Organization
When determining the directory organization, you must decide the following:
By default, the installation procedure installs the Grid Engine software, man pages, spool areas, and the configuration files in a directory hierarchy under the installation directory as shown in the following figure. If you accept this default behavior, you should install or select a directory with the access permissions that are described in File Access Permissions.
Figure – Sample Directory Hierarchy
You can choose to put the spool areas in other locations during the primary installation. See Configuring Queues for more detailed instructions.
Cells
You can set up the Grid Engine system as a single cluster or as a collection of loosely coupled clusters called cells. The $SGE_CELL environment variable indicates the cluster being referenced. When the Grid Engine system is installed as a single cluster, $SGE_CELL is not set, and the value default is assumed for the cell value.
Cluster Name
The $SGE_CLUSTER_NAME environment variable supports unique naming of the cluster. Unlike the $SGE_CELL variable, there are restrictions on $SGE_CLUSTER_NAME. If you decide to use Grid Engine SMF services on Solaris 10 or later hosts, you must select a new $SGE_CLUSTER_NAME. This name becomes part of the name of the Sun Grid Engine SMF services. The $SGE_CLUSTER_NAME is also used to distinguish multiple rc files for different clusters.
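To illustrate the cell default described under Cells, the following Python sketch builds the $SGE_ROOT/$SGE_CELL/common path from a set of environment variables, falling back to the cell name default when $SGE_CELL is not set. The function name common_dir and the cell name cell2 are hypothetical, and treating an empty $SGE_CELL the same as an unset one is an assumption.

```python
import os
from pathlib import PurePosixPath


def common_dir(env=None) -> PurePosixPath:
    """Build $SGE_ROOT/$SGE_CELL/common, using 'default' when SGE_CELL is unset."""
    env = os.environ if env is None else env
    root = env["SGE_ROOT"]  # raises KeyError if SGE_ROOT is missing
    # Assumption: an empty SGE_CELL is treated the same as an unset one.
    cell = env.get("SGE_CELL") or "default"
    return PurePosixPath(root) / cell / "common"


print(common_dir({"SGE_ROOT": "/usr/SGE6"}))                       # /usr/SGE6/default/common
print(common_dir({"SGE_ROOT": "/usr/SGE6", "SGE_CELL": "cell2"}))  # /usr/SGE6/cell2/common
```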
User Names
For the Grid Engine system to verify that users submitting jobs have permission to submit them on the desired execution hosts, users' names must be identical on the submit and execution hosts. You might therefore have to change user names on some machines, because Grid Engine user names map directly to system user accounts.
Installation Accounts
You can install the Grid Engine software either as the root user or as an unprivileged user, for example, your own user account. However, if you install the software when you are logged in as an unprivileged user, the installation allows only that user to run Grid Engine jobs. Access is denied to all other accounts. Installing the software when you are logged in as root resolves this restriction. However, root permission is required for the complete installation procedure. Also, if you install as an unprivileged user, you are not allowed to use the qrsh, qtcsh, or qmake commands, nor can you run tightly integrated parallel jobs.
File Access Permissions
If you install the software logged in as root, you might have a problem configuring root read/write access for all hosts on a shared file system. Therefore, you might have problems putting the $SGE_ROOT files onto a network-wide file system. You can force Grid Engine software to run all Grid Engine system components through a non-root administrative user account, for example sgeadmin. With this setup, this particular user needs only read/write access to the shared $SGE_ROOT file system. The installation procedure asks whether files should be created and owned by an administrative user account. If you answer "Yes" and provide a valid user name, files are created by this user. Otherwise, the user name under which you run the installation procedure is used. Create an administrative user, and answer "Yes" to this question. Make sure in all cases that the account used for file handling on all hosts has read/write access to the $SGE_ROOT directory. Also, the installation procedure assumes that the host from which you access the Grid Engine software distribution media can write to the $SGE_ROOT directory.
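One way to spot-check the read/write requirement before installing is a small script run as the administrative account on each host. The Python sketch below is only an illustration: the function name has_read_write_access is hypothetical, and a temporary directory stands in for $SGE_ROOT. Note that os.access reports what the local permission bits allow, so on a network file system the result may not match what the file server actually permits.

```python
import os
import tempfile


def has_read_write_access(directory: str) -> bool:
    """Return True if the current user can read, write, and enter directory."""
    return os.path.isdir(directory) and os.access(
        directory, os.R_OK | os.W_OK | os.X_OK
    )


# Demonstration against a temporary directory standing in for $SGE_ROOT.
with tempfile.TemporaryDirectory() as stand_in_root:
    print(has_read_write_access(stand_in_root))
```

In practice, you would pass the real $SGE_ROOT path and repeat the check on every host that handles files in that directory.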
Network Services
Determine whether your site's network services are defined in an NIS database or in an /etc/services file that is local to each workstation. If your site uses NIS, determine the host name of your NIS server so that you can add entries to the NIS services map. The Grid Engine system services are sge_execd and sge_qmaster. To add the services to your NIS map, choose reserved, unused port numbers. The following examples show sge_qmaster and sge_execd entries.
sge_qmaster 6444/tcp
sge_execd 6445/tcp
Master Host
The master host controls the Grid Engine system. This host runs the master daemon sge_qmaster. The master host must comply with the following requirements:
Shadow Master Hosts
These hosts back up the functionality of sge_qmaster in case the master host or the master daemon fails. To be a shadow master host, a machine must have the following characteristics:
The shadow master host facility is activated for a host as soon as these conditions are met. You do not need to restart the Grid Engine system daemons to make a host into a shadow master host.
Spool Directories under the Root Directory
During the installation of the master host, you must specify the location of a spooling directory. This directory is used to spool jobs from execution hosts that do not have a local spooling directory.
You do not need to export these directories to other machines. However, exporting the entire $SGE_ROOT tree and making it write-accessible for the master host and all execution hosts makes administration easier.
Choosing Between Classic Spooling and Database Spooling
During the installation, you are given the option to choose between classic spooling and Berkeley DB spooling. If you choose Berkeley DB spooling, you are then given the option to spool to a local directory or to a separate host, known as a Berkeley DB spooling server. Using a Berkeley DB spooling server might provide better performance than classic spooling. Part of this performance increase is because the master host can make non-blocking writes to the database, but has to make blocking writes to the text file used by classic spooling.
Also consider file format and data integrity. Writing to the Berkeley DB provides a greater level of data integrity than writing to a text file. However, a text file stores data in a format that you can read and edit. Normally, you do not need to read these files, but the spooling directory contains the messages from the system daemons, which can be useful for debugging.
Database Server and Spooling Host
The master host can store its configuration and state to a Berkeley DB spooling database. The spooling database can be installed on the master server or on a separate host. When the Berkeley DB spools into a local directory on the master host, the performance is better. If you want to set up a shadow master host, you need to use a separate Berkeley DB spooling server (host). In this case, you have to choose a host with a configured RPC service. The master host connects through RPC to the Berkeley DB.
With the introduction of NFS4 software available with the Solaris 10 operating system, you can use Berkeley DB spooling on a network file system. You could not use Berkeley DB spooling on previous NFS versions. This allows a shadow master host installation that spools to Berkeley DB without setting up an additional Berkeley DB spooling server.
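The rules in this section about when a separate Berkeley DB spooling server is needed can be condensed into a small decision function. The Python sketch below is only a summary aid: the function name, the parameter names, and the string values "classic" and "berkeleydb" are hypothetical labels and do not correspond to installer options.

```python
def needs_spooling_server(spooling: str, shadow_master: bool,
                          nfs4_shared: bool = False) -> bool:
    """Return True if a separate Berkeley DB spooling server is required.

    spooling      -- "classic" or "berkeleydb" (hypothetical labels)
    shadow_master -- whether a shadow master host is planned
    nfs4_shared   -- whether the spool area lives on an NFS4 file system
    """
    if spooling not in ("classic", "berkeleydb"):
        raise ValueError(f"unknown spooling method: {spooling!r}")
    if spooling == "classic":
        return False  # classic spooling never needs a spooling server
    if not shadow_master:
        return False  # Berkeley DB can spool to a local directory
    # Berkeley DB plus a shadow master: NFS4 removes the need for a server.
    return not nfs4_shared


print(needs_spooling_server("berkeleydb", shadow_master=True))                    # True
print(needs_spooling_server("berkeleydb", shadow_master=True, nfs4_shared=True))  # False
print(needs_spooling_server("classic", shadow_master=True))                       # False
```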
If you choose to use Berkeley DB spooling without a shadow master, you do not need to set up a separate spooling server. Likewise, if you choose not to use Berkeley DB spooling, you can set up a shadow master host without setting up a separate spooling server. Once you determine whether you need a separate spooling server, you will also need to determine the location for the spooling directory. The spooling directory must be local to the spooling server. A default value for the location of the spooling directory is recommended during installation, but this default value is not suitable when the file server is different from the master host. The requirements for the Berkeley DB spooling host are similar to the requirements for the master host:
Execution Hosts
Execution hosts run the jobs that users submit to the Grid Engine system. An execution host must first be set up as an administration host. You run an installation script on each execution host. For more information, see How to Install Execution Hosts.
Group IDs
You need to provide a range of IDs that will be assigned dynamically for jobs. The range must be big enough to provide enough numbers for the maximum number of Grid Engine jobs running at a single moment on a single host. A group ID is assigned to each Grid Engine job to monitor the resource utilization of the job. Each job will be assigned a unique ID while it is running. For example, a range of 20000-20100 allows 100 jobs to run concurrently on a single host. You can change the group ID range for your cluster configuration at any time, but the values in the UNIX group ID range must be unused on your system.
Administration Hosts
Operators and managers of the Grid Engine system use administration hosts to perform administrative tasks such as reconfiguring queues or adding Grid Engine users. The master host installation script automatically makes the master host an administration host. During the master host installation process, you can add other administration hosts. You can also manually add administration hosts on the master host at any time after installation.
Submit Hosts
Jobs can be submitted and controlled from submit hosts. The master host installation script automatically makes the master host a submit host.
Cluster Queues
The installation procedure creates a default cluster queue structure, which is suitable for getting acquainted with the system. The default queue can be removed after installation.
Consider the following when determining a queue structure:
For more detailed information on administering cluster queues, see Configuring Queues.
Scheduler Profiles
You can choose from three scheduler profiles during the installation process: normal, high, and max. You can use these predefined profiles as a starting point for Grid Engine tuning. Using these profiles, you can optimize the scheduler for one or more of the following:
You can choose from three scheduler profiles:
For more information on how to configure scheduling, see Administering the Scheduler.
Installation Method
Several methods are available for installing the Grid Engine software:
To decide which installation method you should use, consider the following factors.
Check the Other Installation Issues Appendix
If you are installing Grid Engine on a Linux system or on a system with IPMP, see Other Grid Engine Installation Issues for important information.

Comments (6)
Aug 27, 2008
Dean_Stanton says:
The long SMF commands do not print fully (from my firefox, even with landscape page format). I suggest breaking them with a backslash and a newline.
Also, when I tried these commands on a Solaris 10 11/06 SPARC system, I got this error
UX: roleadd: ERROR: solaris.smf.manage.sge is not a valid profile name. Choose another.
Oct 13, 2008
surajp says:
I have fixed the first comment. I will look into the error message encountered for Solaris 10 11/06 and update the doc accordingly.
Dec 05, 2008
SKODell says:
Hi,
I also receive the error "UX: roleadd: ERROR: solaris.smf.manage.sge is not a valid profile name."
Is there a description somewhere regarding how this profile should be created?
My environment is:
SunOS snv_91 i86pc i386 i86xpv
Thank you.
Sean
====
Dec 16, 2008
Lubomir.Petrik says:
Hi Dean and SKODell,
I've updated the instructions. There was a missing step to create the authorization/profile in the first place.
Also I hope that the documentation is now more clear about what the sge_smf role can do for you.
Regards,
Lubos.
Jan 14, 2009
Dean_Stanton says:
#PlanningtheInstallation-SpoolDirectoriesundertheRootDirectory says
and
In the former, "qmaster" is literal. But in the latter, "exec-host" is metasyntax meaning the name of the execution host. There is no typography to clarify this, such as putting exec-host in italics.
Feb 16, 2009
surajp says:
The replaceable "exec-host" has been changed to "execution-host-name".