Automating Grid Engine Functions Through DRMAA


Automating Grid Engine Functions Through the Distributed Resource Management Application API

You can automate Sun Grid Engine functions by writing scripts that run Sun Grid Engine commands and parse the results. However, for more consistent and efficient results, you can use the C or Java language and the Distributed Resource Management Application API. This section introduces the DRMAA concept and explains how to use it with the C and Java languages.

Introduction to Distributed Resource Management Application API (DRMAA)

The Distributed Resource Management Application API (DRMAA, which is pronounced like "drama") is an Open Grid Forum specification to standardize job submission, monitoring, and control in Distributed Resource Management Systems (DRMS). The objective of the DRMAA Working Group was to produce an API that would be easy to learn, easy to implement, and that would enable useful application integrations with DRMS in a standard way.

The DRMAA specification is language, platform, and DRMS agnostic. A wide variety of systems should be able to implement the DRMAA specification. To provide additional guidance for DRMAA implementation in specific languages, the DRMAA Working Group also produced several DRMAA language binding specifications. These specifications define what a DRMAA implementation should resemble in a given language.

The DRMAA specification is currently at version 1.0. The DRMAA Java Language Binding Specification is also at version 1.0, as is the DRMAA C Language Binding Specification. Sun Grid Engine provides implementations of both the 1.0 Java language binding and the 1.0 C language binding. For more information about the DRMAA 1.0 specification, see the language-specific binding specifications on the Open Grid Forum DRMAA Working Group Web Site.

Developing With the C Language Binding

Important Files for the C Language Binding

To use the DRMAA C language binding implementation included with Sun Grid Engine, you need to know where to find the important files. The most important file is the DRMAA header file, which you include from your C application to make the DRMAA functions available to your application. The DRMAA header file resides at $SGE_ROOT/include/drmaa.h, where $SGE_ROOT defaults to /usr/SGE. For detailed reference information about the DRMAA functions, see section 3 of the Sun Grid Engine man pages, located in the $SGE_ROOT/man directory. To compile and link your application, use the DRMAA shared library at $SGE_ROOT/lib/$SGE_ARCH/libdrmaa.so.

Including the DRMAA Header File

To use the DRMAA functions in your application, every source file that uses a DRMAA function must include the DRMAA header file. To include the DRMAA header file in your source file, add the following line to your source code:

#include "drmaa.h"

Compiling Your C Application

When you compile your DRMAA application, you need to include some additional compiler directives to direct the compiler and linker to use DRMAA. The following directions apply to the Sun Studio Compiler Collection and to gcc. These instructions might not apply for other compilers and linkers. Consult the documentation for your specific compiler and linker products.

You must include the following two directives:

  • Tell the compiler to include the DRMAA header file by adding the following statement to the compiler command line:
    -I$SGE_ROOT/include
    
  • Tell the linker to include the DRMAA library by adding the following statement to the compiler and/or linker command line:
    -ldrmaa
    

You also need to verify that the $SGE_ROOT/lib/$SGE_ARCH directory is included in your library search path. The path is LD_LIBRARY_PATH on the Solaris Operating Environment and Linux. The $SGE_ROOT/lib/$SGE_ARCH directory is not included automatically when you set your environment using the settings.sh or settings.csh files.
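
Because the settings files do not add the library directory, it has to be added by hand. The following is a minimal sketch for sh-compatible shells, not part of the Sun Grid Engine distribution: the function name add_drmaa_lib_path is hypothetical, and the sketch assumes that the architecture string can be obtained from the $SGE_ROOT/util/arch script.

```shell
# Hypothetical helper: prepend the DRMAA library directory to
# LD_LIBRARY_PATH, keeping any entries that are already there.
# Arguments: $1 = Sun Grid Engine root directory, $2 = architecture string.
add_drmaa_lib_path() {
    drmaa_lib_dir="$1/lib/$2"
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$drmaa_lib_dir:"*)
            ;;  # directory is already in the search path, nothing to do
        *)
            LD_LIBRARY_PATH="$drmaa_lib_dir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
            ;;
    esac
    export LD_LIBRARY_PATH
}

# Typical use after sourcing settings.sh, which sets SGE_ROOT:
#   add_drmaa_lib_path "$SGE_ROOT" "`$SGE_ROOT/util/arch`"
```

Calling the function more than once is harmless, because the directory is only added if it is not already present.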

Example - Compiling Your C Application Using Sun Studio Compiler

The following example shows how you would compile your DRMAA application using the Sun Studio Compiler. The following assumptions apply:

  • You are using the csh shell on a Solaris host.
  • Sun Grid Engine is installed in /sge.
  • The DRMAA application is stored in app.c.

Sample commands would look like the following:

% source /sge/default/common/settings.csh
% cc -I/sge/include -ldrmaa app.c
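
The same build can be sketched for gcc from an sh-compatible shell. This is an illustration under stated assumptions rather than a tested recipe: the function name drmaa_gcc_command is hypothetical, the -L option is an addition that the Sun Studio example does not show (it tells the linker where libdrmaa.so resides), and -ldrmaa is placed after the source file, as some linkers expect.

```shell
# Hypothetical helper: print the gcc command line for building a DRMAA
# application so that it can be inspected before it is run.
# Arguments: $1 = Sun Grid Engine root, $2 = architecture string,
#            $3 = C source file, $4 = name of the executable to create.
drmaa_gcc_command() {
    echo "gcc -I$1/include -L$1/lib/$2 -o $4 $3 -ldrmaa"
}

# Print the command:
#   drmaa_gcc_command "$SGE_ROOT" "`$SGE_ROOT/util/arch`" app.c app
# Run it once it looks right (assumes the paths contain no spaces):
#   `drmaa_gcc_command "$SGE_ROOT" "$SGE_ARCH" app.c app`
```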

Running Your C Application

To run your compiled DRMAA application, verify the following:

  • The $SGE_ROOT/lib/$SGE_ARCH directory must be included in the library search path (LD_LIBRARY_PATH on the Solaris Operating Environment and Linux). The $SGE_ROOT/lib/$SGE_ARCH directory is not included automatically when you set your environment using the settings.sh or settings.csh files.
  • You must be logged into a machine that is a Sun Grid Engine submit host. If the machine is not a Sun Grid Engine submit host, all DRMAA function calls will fail, returning DRMAA_ERRNO_DRM_COMMUNICATION_FAILURE.
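
The submit host requirement can be checked before the application is started. The sketch below is only a starting point: the function name host_in_list is hypothetical, and it assumes that the qconf -ss command prints one submit host name per line and that the local host name is spelled the same way in that list (short name versus fully qualified name).

```shell
# Hypothetical helper: succeed if the host name in $1 appears as a whole
# line in the newline-separated list of host names in $2.
host_in_list() {
    printf '%s\n' "$2" | grep -Fqx -- "$1"
}

# Possible use on a configured system:
#   if host_in_list "`hostname`" "`qconf -ss`"; then
#       echo "This machine is a submit host"
#   else
#       echo "This machine is not listed as a submit host"
#   fi
```

Matching whole lines avoids treating a host name as found when it is only a prefix of another entry.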

C Application Examples

The following examples illustrate some application interactions that use the C language bindings. You can find additional examples in the "How To" section of the Grid Engine Community Site.

Example - Starting and Stopping a Session

Every call to a DRMAA function returns an error code. If everything goes well, that code is DRMAA_ERRNO_SUCCESS. If an error occurs, an appropriate error code is returned.

Every DRMAA function also takes at least two parameters:

  • A string to populate with an error message in case of an error
  • An integer representing the maximum length of the error string

On line 8, the example calls drmaa_init(). This function sets up the DRMAA session and must be called before most other DRMAA functions. Some functions, like drmaa_get_contact(), can be called before drmaa_init(), but these functions only provide general information. Any function that performs an action, such as drmaa_run_job() or drmaa_wait(), must be called after drmaa_init() returns. If such a function is called before drmaa_init() returns, it will return the error code DRMAA_ERRNO_NO_ACTIVE_SESSION.

The drmaa_init() function creates a session and starts an event client listener thread. The session is used for organizing jobs submitted through DRMAA, and the thread is used to receive updates from the queue master about the state of jobs and the system in general. Once drmaa_init() has been called successfully, the calling application must also call drmaa_exit() before terminating. If an application does not call drmaa_exit() before terminating, the queue master might be left with a dead event client handle, which can decrease queue master performance.

At the end of the program, on line 17, drmaa_exit() cleans up the session and stops the event client listener thread. Most other DRMAA functions must be called before drmaa_exit(). Some functions, like drmaa_get_contact(), can be called after drmaa_exit(), but these functions only provide general information. Any function that performs an action, such as drmaa_run_job() or drmaa_wait(), must be called before drmaa_exit() is called. If such a function is called after drmaa_exit() is called, it will return the error code DRMAA_ERRNO_NO_ACTIVE_SESSION.

01: #include <stdio.h>
02: #include "drmaa.h"
03: 
04: int main(int argc, char **argv) {
05:    char error[DRMAA_ERROR_STRING_BUFFER];
06:    int errnum = 0;
07: 
08:    errnum = drmaa_init(NULL, error, DRMAA_ERROR_STRING_BUFFER);
09: 
10:    if (errnum != DRMAA_ERRNO_SUCCESS) {
11:       fprintf(stderr, "Could not initialize the DRMAA library: %s\n", error);
12:       return 1;
13:    }
14: 
15:    printf("DRMAA library was started successfully\n");
16:    
17:    errnum = drmaa_exit(error, DRMAA_ERROR_STRING_BUFFER);
18: 
19:    if (errnum != DRMAA_ERRNO_SUCCESS) {
20:       fprintf(stderr, "Could not shut down the DRMAA library: %s\n", error);
21:       return 1;
22:    }
23: 
24:    return 0;
25: }

Example - Running a Job

The following code segment shows how to use the DRMAA C binding to submit a job to Sun Grid Engine. The beginning and end of this program are the same as in the preceding example. The differences are on lines 16 through 59. On line 16, DRMAA allocates a job template. A job template is a structure used to store information about a job to be submitted. The same template can be reused for multiple calls to drmaa_run_job() or drmaa_run_bulk_job().

On line 22, the DRMAA_REMOTE_COMMAND attribute is set. This attribute tells DRMAA where to find the program to run. Its value is the path to the executable. The path can be relative or absolute. If relative, the path is relative to the DRMAA_WD attribute, which defaults to the user's home directory. For this program to work, the script sleeper.sh must be in your default path.

On line 32, the DRMAA_V_ARGV attribute is set. This attribute tells DRMAA what arguments to pass to the executable. For more information on DRMAA attributes, see the drmaa_attributes man page.

On line 43, drmaa_run_job() submits the job. DRMAA places the ID assigned to the job into the character array that is passed to drmaa_run_job(). The job is now running as though submitted by qsub. At this point, calling drmaa_exit() or terminating the program will have no effect on the job.

To clean things up, the job template is deleted on line 54. This frees the memory DRMAA set aside for the job template, but has no effect on submitted jobs.

Finally, on line 61, drmaa_exit() is called. The drmaa_exit() call is outside of the if structure started on line 18 because once drmaa_init() has been called, drmaa_exit() must be called before terminating, regardless of whether the intervening calls succeed.

01: #include <stdio.h>
02: #include "drmaa.h"
03: 
04: int main(int argc, char **argv) {
05:    char error[DRMAA_ERROR_STRING_BUFFER];
06:    int errnum = 0;
07:    drmaa_job_template_t *jt = NULL;
08: 
09:    errnum = drmaa_init(NULL, error, DRMAA_ERROR_STRING_BUFFER);
10: 
11:    if (errnum != DRMAA_ERRNO_SUCCESS) {
12:       fprintf(stderr, "Could not initialize the DRMAA library: %s\n", error);
13:       return 1;
14:    }
15: 
16:    errnum = drmaa_allocate_job_template(&jt, error, DRMAA_ERROR_STRING_BUFFER);
17: 
18:    if (errnum != DRMAA_ERRNO_SUCCESS) {
19:       fprintf(stderr, "Could not create job template: %s\n", error);
20:    }
21:    else {
22:       errnum = drmaa_set_attribute(jt, DRMAA_REMOTE_COMMAND, "sleeper.sh",
23:                                     error, DRMAA_ERROR_STRING_BUFFER);
24: 
25:       if (errnum != DRMAA_ERRNO_SUCCESS) {
26:          fprintf(stderr, "Could not set attribute \"%s\": %s\n",
27:                   DRMAA_REMOTE_COMMAND, error);
28:       }
29:       else {
30:          const char *args[2] = {"5", NULL};
31:          
32:          errnum = drmaa_set_vector_attribute(jt, DRMAA_V_ARGV, args, error,
33:                                               DRMAA_ERROR_STRING_BUFFER);
34:       }
35:       
36:       if (errnum != DRMAA_ERRNO_SUCCESS) {
37:          fprintf(stderr, "Could not set attribute \"%s\": %s\n",
38:                   DRMAA_REMOTE_COMMAND, error);
39:       }
40:       else {
41:          char jobid[DRMAA_JOBNAME_BUFFER];
42: 
43:          errnum = drmaa_run_job(jobid, DRMAA_JOBNAME_BUFFER, jt, error,
44:                                  DRMAA_ERROR_STRING_BUFFER);
45: 
46:          if (errnum != DRMAA_ERRNO_SUCCESS) {
47:             fprintf(stderr, "Could not submit job: %s\n", error);
48:          }
49:          else {
50:             printf("Your job has been submitted with id %s\n", jobid);
51:          }
52:       } /* else */
53: 
54:       errnum = drmaa_delete_job_template(jt, error, DRMAA_ERROR_STRING_BUFFER);
55: 
56:       if (errnum != DRMAA_ERRNO_SUCCESS) {
57:          fprintf(stderr, "Could not delete job template: %s\n", error);
58:       }
59:    } /* else */
60: 
61:    errnum = drmaa_exit(error, DRMAA_ERROR_STRING_BUFFER);
62: 
63:    if (errnum != DRMAA_ERRNO_SUCCESS) {
64:       fprintf(stderr, "Could not shut down the DRMAA library: %s\n", error);
65:       return 1;
66:    }
67: 
68:    return 0;
69: }
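
The job submission examples on this page pass the argument 5 to a script named sleeper.sh, which is not listed here. As a rough stand-in for experimenting, the following sketch assumes only that the script sleeps for the number of seconds given as its first argument. The function name sleeper, the default of 60 seconds, and the message texts are assumptions made for illustration.

```shell
# Hypothetical stand-in for the body of sleeper.sh: report the start time,
# sleep for the number of seconds in $1 (default 60), then report the end time.
sleeper() {
    seconds="${1:-60}"
    echo "Sleeping for $seconds seconds, starting at: `date`"
    sleep "$seconds"
    echo "Finished at: `date`"
}

# To turn this into a script, save it as sleeper.sh with a first line of
#   #!/bin/sh
# and a last line of
#   sleeper "$@"
# and then make the file executable.
```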

Developing With the Java Language Binding

Important Files for the Java Language Binding

To use the DRMAA Java language binding implementation included with Sun Grid Engine, you need to know where to find the important files. The most important file is the DRMAA JAR file $SGE_ROOT/lib/drmaa.jar. To compile your DRMAA application, you must include the DRMAA JAR file in your CLASSPATH. The DRMAA classes are documented in the DRMAA Javadoc, located in the $SGE_ROOT/doc/javadocs directory. To access the Javadocs, open the file $SGE_ROOT/doc/javadocs/index.html in your browser. When you are ready to run your application, you also need the DRMAA shared library, $SGE_ROOT/lib/$SGE_ARCH/libdrmaa.so, which provides the required native routines.

Importing the DRMAA Java Classes and Packages

To use the DRMAA classes in your application, your classes should import the DRMAA classes or packages. In most cases, only the classes in the org.ggf.drmaa package will be used. You can import these classes individually or using a wildcard package import. In some rare cases, you might need to reference the Sun Grid Engine DRMAA implementation classes found in the com.sun.grid.drmaa package. In those cases, you can import the classes individually or you can import all the classes in a given package. The names of the com.sun.grid.drmaa classes do not overlap with the org.ggf.drmaa classes, so you can import both packages without creating a namespace collision.

Compiling Your Java Application

To compile your DRMAA application, you must include the $SGE_ROOT/lib/drmaa.jar file in your CLASSPATH. The drmaa.jar file will not be included automatically when you set your environment using the settings.sh or settings.csh files.
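
As an alternative to setting CLASSPATH in the environment, the JAR file can be named on the javac command line. The following sh sketch prints such a command. The function name drmaa_javac_command and the separate output directory for class files are assumptions made for illustration.

```shell
# Hypothetical helper: print the javac command line for compiling a DRMAA
# application against drmaa.jar.
# Arguments: $1 = Sun Grid Engine root, $2 = output directory for classes,
#            remaining arguments = Java source files.
drmaa_javac_command() {
    sge_root="$1"
    out_dir="$2"
    shift 2
    echo "javac -classpath $sge_root/lib/drmaa.jar -d $out_dir $*"
}

# Print the command (the output directory should exist before it is run):
#   drmaa_javac_command "$SGE_ROOT" classes Howto1.java
```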

How to Use DRMAA With NetBeans 5.x

To use the DRMAA classes with your NetBeans 5.0 or 5.5 project, follow these steps:

  1. Click mouse button 3 on the project node and select Properties.

  2. Determine whether your project generates a build file or uses an existing file.
    • If your project uses a generated build file:
      1. Select Libraries in the left column.
      2. Click Add Library.
      3. Click Manage Libraries in the Libraries dialog box.
      4. Click New Library in the Library Management dialog box.
      5. Type DRMAA in the Library Name field in the New Library dialog box.
      6. Click OK to dismiss the New Library dialog box.
      7. Click Add JAR/Folder.
      8. Browse to the $SGE_ROOT/lib directory in the file chooser dialog box and select the drmaa.jar file.
      9. Click Add JAR/Folder to dismiss the file chooser dialog box.
      10. Click OK to dismiss the Library Management dialog box.
      11. Select the DRMAA library and click Add Library to dismiss the Libraries dialog box.
    • If your project uses an existing build file:
      1. Select Java Sources Classpath in the left column.
      2. Click Add JAR/Folder.
      3. Browse to the $SGE_ROOT/lib directory in the file chooser dialog box and select the drmaa.jar file.
      4. Click Choose to dismiss the file chooser dialog box.
  3. Click OK to dismiss the properties dialog box.

  4. Verify that the DRMAA shared library is in the library search path.
    To run your application from NetBeans, the DRMAA shared library file $SGE_ROOT/lib/$SGE_ARCH/libdrmaa.so must be included in the library search path (LD_LIBRARY_PATH on the Solaris Operating Environment and Linux). The $SGE_ROOT/lib/$SGE_ARCH directory is not included automatically when you set your environment using the settings.sh or settings.csh files. To set up the path for the shared library, perform one of the following:
    • Set up your environment in the shell before launching NetBeans.
    • Add lines such as the following to the netbeans-root/etc/netbeans.conf file to set up the environment:
      # Setup environment for SGE
      . $SGE_ROOT/$SGE_CELL/common/settings.sh
      SGE_ARCH=`$SGE_ROOT/util/arch`
      LD_LIBRARY_PATH=$SGE_ROOT/lib/$SGE_ARCH; export LD_LIBRARY_PATH 
      

Running Your Java Application

To run your compiled DRMAA application, verify the following:

  • The $SGE_ROOT/lib/$SGE_ARCH directory must be included in the library search path (LD_LIBRARY_PATH on the Solaris Operating Environment and Linux). The $SGE_ROOT/lib/$SGE_ARCH directory is not included automatically when you set your environment using the settings.sh or settings.csh files.
  • You must be logged into a machine that is a Sun Grid Engine submit host. If the machine is not a Sun Grid Engine submit host, all DRMAA method calls will fail, throwing a DrmCommunicationException.
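
The two conditions above, together with the class path, can be combined in a small launcher. This sh sketch is illustrative only: the function name run_drmaa_java and the JAVA variable, which selects the launcher binary and defaults to java, are hypothetical. The sketch assumes that the Java runtime locates libdrmaa.so through LD_LIBRARY_PATH.

```shell
# Hypothetical helper: run a DRMAA Java application with the DRMAA shared
# library directory in the library search path and drmaa.jar on the class path.
# Arguments: $1 = Sun Grid Engine root, $2 = architecture string,
#            $3 = directory that contains the compiled classes,
#            $4 = fully qualified name of the main class.
run_drmaa_java() {
    sge_root="$1"
    sge_arch="$2"
    class_dir="$3"
    main_class="$4"
    # The assignment prefix applies to this one command only.
    LD_LIBRARY_PATH="$sge_root/lib/$sge_arch${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" \
        "${JAVA:-java}" -classpath "$sge_root/lib/drmaa.jar:$class_dir" "$main_class"
}

# Possible use on a submit host:
#   run_drmaa_java "$SGE_ROOT" "`$SGE_ROOT/util/arch`" classes com.sun.grid.drmaa.howto.Howto1
```

Setting JAVA=echo shows the arguments that would be passed without starting a Java virtual machine.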

Java Application Examples

The following examples illustrate some application interactions that use the Java language bindings. You can find additional examples in the "How To" section of the Grid Engine Community Site.

Example - Starting and Stopping a Session

The following code segment shows the most basic DRMAA Java language binding program.

You must have a Session object to do anything with DRMAA. You get the Session object from a SessionFactory. You get the SessionFactory from the static SessionFactory.getFactory() method. The reason for this chain is that the org.ggf.drmaa.* classes should be considered an immutable package to be used by every DRMAA Java language binding implementation. Because the package is immutable, to load a specific implementation, the SessionFactory uses a system property to find the implementation's session factory, which it then loads. That session factory is then responsible for creating the session in whatever way it sees fit. Note that even though a session factory is used, only one session can exist at a time.

On line 9, SessionFactory.getFactory() gets a session factory instance. On line 10, SessionFactory.getSession() gets the session instance. On line 13, Session.init() initializes the session. An empty string ("") is passed in as the contact string to create a new session because no initialization arguments are needed.

Session.init() creates a session and starts an event client listener thread. The session is used for organizing jobs submitted through DRMAA, and the thread is used to receive updates from the queue master about the state of jobs and the system in general. Once Session.init() has been called successfully, the calling application must also call Session.exit() before terminating. If an application does not call Session.exit() before terminating, the queue master might be left with a dead event client handle, which can decrease queue master performance. Use the Runtime.addShutdownHook() method to make sure Session.exit() gets called.

At the end of the program, on line 14, Session.exit() cleans up the session and stops the event client listener thread. Most other DRMAA methods must be called before Session.exit(). Some methods, like Session.getContact(), can be called after Session.exit(), but these methods only provide general information. Any method that performs an action, such as Session.runJob() or Session.wait(), must be called before Session.exit() is called. If such a method is called after Session.exit() is called, it will throw a NoActiveSessionException.

01: package com.sun.grid.drmaa.howto;
02:
03: import org.ggf.drmaa.DrmaaException;
04: import org.ggf.drmaa.Session;
05: import org.ggf.drmaa.SessionFactory;
06:
07: public class Howto1 {
08:    public static void main(String[] args) {
09:       SessionFactory factory = SessionFactory.getFactory();
10:       Session session = factory.getSession();
11:
12:       try {
13:          session.init("");
14:          session.exit();
15:       } catch (DrmaaException e) {
16:          System.out.println("Error: " + e.getMessage());
17:       }
18:    }
19: }

Example - Running a Job

The following code segment shows how to use the DRMAA Java language binding to submit a job to Sun Grid Engine. The beginning and end of this program are the same as in the preceding example. The differences are on lines 16 through 24.

On line 16, DRMAA allocates a JobTemplate. A JobTemplate is an object that is used to store information about a job to be submitted. The same template can be reused for multiple calls to Session.runJob() or Session.runBulkJobs().

On line 17, the RemoteCommand attribute is set. This attribute tells DRMAA where to find the program to run. Its value is the path to the executable. The path can be relative or absolute. If relative, the path is relative to the WorkingDirectory attribute, which defaults to the user's home directory. For this program to work, the script sleeper.sh must be in your default path.

On line 18, the args attribute is set. This attribute tells DRMAA what arguments to pass to the executable. For more information on DRMAA attributes, see the DRMAA Javadoc or the drmaa_attributes man page.

On line 20, Session.runJob() submits the job. This method returns the ID assigned to the job by the queue master. The job is now running as though submitted by qsub. At this point, calling Session.exit() or terminating the program will have no effect on the job.

To clean things up, the job template is deleted on line 24. This action frees the memory DRMAA set aside for the job template, but has no effect on submitted jobs.

01: package com.sun.grid.drmaa.howto;
02:
03: import java.util.Collections;
04: import org.ggf.drmaa.DrmaaException;
05: import org.ggf.drmaa.JobTemplate;
06: import org.ggf.drmaa.Session;
07: import org.ggf.drmaa.SessionFactory;
08:
09: public class Howto2 {
10:    public static void main(String[] args) {
11:       SessionFactory factory = SessionFactory.getFactory();
12:       Session session = factory.getSession();
13:
14:       try {
15:          session.init("");
16:          JobTemplate jt = session.createJobTemplate();
17:          jt.setRemoteCommand("sleeper.sh");
18:          jt.setArgs(Collections.singletonList("5"));
19:
20:          String id = session.runJob(jt);
21:
22:          System.out.println("Your job has been submitted with id " + id);
23:
24:          session.deleteJobTemplate(jt);
25:          session.exit();
26:       } catch (DrmaaException e) {
27:          System.out.println("Error: " + e.getMessage());
28:       }
29:    }
30: }

Copyright 1994-2009 Sun Microsystems, Inc.