Sun Grid Engine Information Center

Automating Grid Engine Functions Through the Distributed Resource Management Application API

You can automate Sun Grid Engine functions by writing scripts that run Sun Grid Engine commands and parse the results. However, for more consistent and efficient results, you can use the C or Java language and the Distributed Resource Management Application API. This section introduces the DRMAA concept and explains how to use it with the C and Java languages.

Introduction to Distributed Resource Management Application API (DRMAA)

The Distributed Resource Management Application API (DRMAA, which is pronounced like "drama") is an Open Grid Forum specification to standardize job submission, monitoring, and control in Distributed Resource Management Systems (DRMS). The objective of the DRMAA Working Group was to produce an API that would be easy to learn, easy to implement, and that would enable useful application integrations with DRMS in a standard way. The DRMAA specification is language, platform, and DRMS agnostic. A wide variety of systems should be able to implement the DRMAA specification.

To provide additional guidance for DRMAA implementation in specific languages, the DRMAA Working Group also produced several DRMAA language binding specifications. These specifications define what a DRMAA implementation should resemble in a given language. The DRMAA specification is currently at version 1.0. The DRMAA Java Language Binding Specification is also at version 1.0, as is the DRMAA C Language Binding Specification. Sun Grid Engine provides implementations of both the 1.0 Java language binding and the 1.0 C language binding. For more information about the DRMAA 1.0 specification, see the language-specific binding specifications on the Open Grid Forum DRMAA Working Group Web Site.

Developing With the C Language Binding

Important Files for the C Language Binding

To use the DRMAA C language binding implementation included with Sun Grid Engine, you need to know where to find the important files. The most important file is the DRMAA header file, which you include from your C application to make the DRMAA functions available to your application. The DRMAA header file is $SGE_ROOT/include/drmaa.h, where $SGE_ROOT defaults to /usr/SGE. For detailed reference information about the DRMAA functions, see section 3 of the Sun Grid Engine man pages, located in the $SGE_ROOT/man directory.
To compile and link your application, use the DRMAA shared library at $SGE_ROOT/lib/$SGE_ARCH/libdrmaa.so.

Including the DRMAA Header File

Every source file that uses a DRMAA function must include the DRMAA header file. To include the DRMAA header file in your source file, add the following line to your source code:
#include "drmaa.h"

Compiling Your C Application

When you compile your DRMAA application, you need to include some additional compiler directives to direct the compiler and linker to use DRMAA. The following directions apply to the Sun Studio Compiler Collection and to gcc. These instructions might not apply to other compilers and linkers. Consult the documentation for your specific compiler and linker products. You must include the following two directives:
-I$SGE_ROOT/include (adds the directory that contains drmaa.h to the include search path)
-ldrmaa (links the application against the DRMAA shared library)
You also need to verify that the $SGE_ROOT/lib/$SGE_ARCH directory is included in your library search path. On the Solaris Operating Environment and Linux, the library search path is the LD_LIBRARY_PATH environment variable. The $SGE_ROOT/lib/$SGE_ARCH directory is not included automatically when you set your environment using the settings.sh or settings.csh files.

Example - Compiling Your C Application Using Sun Studio Compiler

The following example shows how you would compile your DRMAA application using the Sun Studio Compiler. The following assumptions, which are reflected in the sample commands, apply:
You are using the csh shell.
Sun Grid Engine is installed in /sge.
The cell name is default.
The application source file is app.c.
Sample commands would look like the following:
% source /sge/default/common/settings.csh
% cc -I/sge/include -ldrmaa app.c
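
Because the settings files do not add $SGE_ROOT/lib/$SGE_ARCH to the library search path, it can be useful to check the path from code before relying on the DRMAA shared library. The following is a minimal sketch of such a check, not part of DRMAA or Sun Grid Engine. The function names path_list_contains() and drmaa_lib_dir_in_search_path() are hypothetical, and the sketch assumes that SGE_ROOT and SGE_ARCH are both exported as environment variables and that LD_LIBRARY_PATH is the search path variable on your platform.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return 1 if dir is a complete entry in the
 * colon-separated path_list, 0 otherwise. Entries are compared as plain
 * strings, so "/sge/lib/x" and "/sge/lib/x/" are treated as different. */
int path_list_contains(const char *path_list, const char *dir)
{
    size_t dir_len;
    const char *entry = path_list;

    if (path_list == NULL || dir == NULL || *dir == '\0') {
        return 0;
    }
    dir_len = strlen(dir);

    while (entry != NULL) {
        const char *end = strchr(entry, ':');
        size_t entry_len = (end != NULL) ? (size_t)(end - entry) : strlen(entry);

        if (entry_len == dir_len && strncmp(entry, dir, dir_len) == 0) {
            return 1;
        }
        /* Move to the next entry, or stop after the last one. */
        entry = (end != NULL) ? end + 1 : NULL;
    }
    return 0;
}

/* Hypothetical helper: return 1 if $SGE_ROOT/lib/$SGE_ARCH is listed in
 * LD_LIBRARY_PATH, 0 if it is not, and -1 if SGE_ROOT or SGE_ARCH is not
 * set (assumption: both are exported in the environment). */
int drmaa_lib_dir_in_search_path(void)
{
    const char *root = getenv("SGE_ROOT");
    const char *arch = getenv("SGE_ARCH");
    char lib_dir[4096];
    int n;

    if (root == NULL || arch == NULL) {
        return -1;
    }
    n = snprintf(lib_dir, sizeof(lib_dir), "%s/lib/%s", root, arch);
    if (n < 0 || (size_t)n >= sizeof(lib_dir)) {
        return -1;
    }
    return path_list_contains(getenv("LD_LIBRARY_PATH"), lib_dir);
}
```

A build script or test program could call drmaa_lib_dir_in_search_path() and print a warning when the result is not 1. The check only inspects the environment variable; it does not confirm that libdrmaa.so actually exists in that directory.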

Running Your C Application

To run your compiled DRMAA application, verify the following:

The $SGE_ROOT/lib/$SGE_ARCH directory must be included in the library search path (LD_LIBRARY_PATH on the Solaris Operating Environment and Linux). The $SGE_ROOT/lib/$SGE_ARCH directory is not included automatically when you set your environment using the settings.sh or settings.csh files.

You must be logged into a machine that is a Sun Grid Engine submit host. If the machine is not a Sun Grid Engine submit host, all DRMAA function calls will fail, returning DRMAA_ERRNO_DRM_COMMUNICATION_FAILURE.

C Application Examples

The following examples illustrate some application interactions that use the C language bindings. You can find additional examples in the "How To" section of the Grid Engine Community Site.

Example - Starting and Stopping a Session

Every call to a DRMAA function returns an error code. If everything goes well, that code is DRMAA_ERRNO_SUCCESS. If an error occurs, an appropriate error code is returned. Every DRMAA function also takes at least two parameters, as the calls in the example show:
A character buffer that DRMAA fills with an error message if the call fails.
The size of that buffer.

On line 8, the example calls drmaa_init(). This function sets up the DRMAA session and must be called before most other DRMAA functions. Some functions, like drmaa_get_contact(), can be called before drmaa_init(), but these functions only provide general information. Any function that performs an action, such as drmaa_run_job() or drmaa_wait(), must be called after drmaa_init() returns. If such a function is called before drmaa_init() returns, it will return the error code DRMAA_ERRNO_NO_ACTIVE_SESSION.

The drmaa_init() function creates a session and starts an event client listener thread. The session is used for organizing jobs submitted through DRMAA, and the thread is used to receive updates from the queue master about the state of jobs and the system in general. Once drmaa_init() has been called successfully, the calling application must also call drmaa_exit() before terminating. If an application does not call drmaa_exit() before terminating, the queue master might be left with a dead event client handle, which can decrease queue master performance.

At the end of the program, on line 17, drmaa_exit() cleans up the session and stops the event client listener thread. Most other DRMAA functions must be called before drmaa_exit(). Some functions, like drmaa_get_contact(), can be called after drmaa_exit(), but these functions only provide general information. Any function that performs an action, such as drmaa_run_job() or drmaa_wait(), must be called before drmaa_exit() is called. If such a function is called after drmaa_exit() is called, it will return the error code DRMAA_ERRNO_NO_ACTIVE_SESSION.

01: #include <stdio.h>
02: #include "drmaa.h"
03:
04: int main(int argc, char **argv) {
05:     char error[DRMAA_ERROR_STRING_BUFFER];
06:     int errnum = 0;
07:
08:     errnum = drmaa_init(NULL, error, DRMAA_ERROR_STRING_BUFFER);
09:
10:     if (errnum != DRMAA_ERRNO_SUCCESS) {
11:         fprintf(stderr, "Could not initialize the DRMAA library: %s\n", error);
12:         return 1;
13:     }
14:
15:     printf("DRMAA library was started successfully\n");
16:
17:     errnum = drmaa_exit(error, DRMAA_ERROR_STRING_BUFFER);
18:
19:     if (errnum != DRMAA_ERRNO_SUCCESS) {
20:         fprintf(stderr, "Could not shut down the DRMAA library: %s\n", error);
21:         return 1;
22:     }
23:
24:     return 0;
25: }

Example - Running a Job

The following code segment shows how to use the DRMAA C binding to submit a job to Sun Grid Engine. The beginning and end of this program are the same as in the preceding example. The differences are on lines 16 through 59.

On line 16, DRMAA allocates a job template. A job template is a structure used to store information about a job to be submitted. The same template can be reused for multiple calls to drmaa_run_job() or drmaa_run_bulk_job().

On line 22, the DRMAA_REMOTE_COMMAND attribute is set. This attribute tells DRMAA where to find the program to run. Its value is the path to the executable. The path can be relative or absolute. If relative, the path is relative to the DRMAA_WD attribute, which defaults to the user's home directory. For this program to work, the script sleeper.sh must be in your default path.

On line 32, the DRMAA_V_ARGV attribute is set. This attribute tells DRMAA what arguments to pass to the executable. For more information on DRMAA attributes, see the drmaa_attributes man page.

On line 43, drmaa_run_job() submits the job. DRMAA places the ID assigned to the job into the character array that is passed to drmaa_run_job(). The job is now running as though submitted by qsub. At this point, calling drmaa_exit() or terminating the program will have no effect on the job.

To clean things up, the job template is deleted on line 54. This frees the memory DRMAA set aside for the job template, but has no effect on submitted jobs.

Finally, on line 61, drmaa_exit() is called. The drmaa_exit() call is outside of the if structure started on line 18 because once drmaa_init() has succeeded, drmaa_exit() must be called before the program terminates, regardless of whether the later calls succeed.

01: #include <stdio.h>
02: #include "drmaa.h"
03:
04: int main(int argc, char **argv) {
05:     char error[DRMAA_ERROR_STRING_BUFFER];
06:     int errnum = 0;
07:     drmaa_job_template_t *jt = NULL;
08:
09:     errnum = drmaa_init(NULL, error, DRMAA_ERROR_STRING_BUFFER);
10:
11:     if (errnum != DRMAA_ERRNO_SUCCESS) {
12:         fprintf(stderr, "Could not initialize the DRMAA library: %s\n", error);
13:         return 1;
14:     }
15:
16:     errnum = drmaa_allocate_job_template(&jt, error, DRMAA_ERROR_STRING_BUFFER);
17:
18:     if (errnum != DRMAA_ERRNO_SUCCESS) {
19:         fprintf(stderr, "Could not create job template: %s\n", error);
20:     }
21:     else {
22:         errnum = drmaa_set_attribute(jt, DRMAA_REMOTE_COMMAND, "sleeper.sh",
23:                                      error, DRMAA_ERROR_STRING_BUFFER);
24:
25:         if (errnum != DRMAA_ERRNO_SUCCESS) {
26:             fprintf(stderr, "Could not set attribute \"%s\": %s\n",
27:                     DRMAA_REMOTE_COMMAND, error);
28:         }
29:         else {
30:             const char *args[2] = {"5", NULL};
31:
32:             errnum = drmaa_set_vector_attribute(jt, DRMAA_V_ARGV, args, error,
33:                                                 DRMAA_ERROR_STRING_BUFFER);
34:         }
35:
36:         if (errnum != DRMAA_ERRNO_SUCCESS) {
37:             fprintf(stderr, "Could not set attribute \"%s\": %s\n",
38:                     DRMAA_REMOTE_COMMAND, error);
39:         }
40:         else {
41:             char jobid[DRMAA_JOBNAME_BUFFER];
42:
43:             errnum = drmaa_run_job(jobid, DRMAA_JOBNAME_BUFFER, jt, error,
44:                                    DRMAA_ERROR_STRING_BUFFER);
45:
46:             if (errnum != DRMAA_ERRNO_SUCCESS) {
47:                 fprintf(stderr, "Could not submit job: %s\n", error);
48:             }
49:             else {
50:                 printf("Your job has been submitted with id %s\n", jobid);
51:             }
52:         } /* else */
53:
54:         errnum = drmaa_delete_job_template(jt, error, DRMAA_ERROR_STRING_BUFFER);
55:
56:         if (errnum != DRMAA_ERRNO_SUCCESS) {
57:             fprintf(stderr, "Could not delete job template: %s\n", error);
58:         }
59:     } /* else */
60:
61:     errnum = drmaa_exit(error, DRMAA_ERROR_STRING_BUFFER);
62:
63:     if (errnum != DRMAA_ERRNO_SUCCESS) {
64:         fprintf(stderr, "Could not shut down the DRMAA library: %s\n", error);
65:         return 1;
66:     }
67:
68:     return 0;
69: }

Developing With the Java Language Binding

Important Files for the Java Language Binding

To use the DRMAA Java language binding implementation included with Sun Grid Engine, you need to know where to find the important files. The most important file is the DRMAA JAR file, $SGE_ROOT/lib/drmaa.jar. To compile your DRMAA application, you must include the DRMAA JAR file in your CLASSPATH. The DRMAA classes are documented in the DRMAA Javadoc, located in the $SGE_ROOT/doc/javadocs directory. To access the Javadocs, open the file $SGE_ROOT/doc/javadocs/index.html in your browser. When you are ready to run your application, you also need the DRMAA shared library, $SGE_ROOT/lib/$SGE_ARCH/libdrmaa.so, which provides the required native routines.

Importing the DRMAA Java Classes and Packages

To use the DRMAA classes in your application, your classes should import the DRMAA classes or packages. In most cases, only the classes in the org.ggf.drmaa package will be used. You can import these classes individually or by using a wildcard package import. In some rare cases, you might need to reference the Sun Grid Engine DRMAA implementation classes found in the com.sun.grid.drmaa package. In those cases, you can import the classes individually or you can import all the classes in a given package. The names of the com.sun.grid.drmaa classes do not overlap with the org.ggf.drmaa classes, so you can import both packages without creating a namespace collision.

Compiling Your Java Application

To compile your DRMAA application, you must include the $SGE_ROOT/lib/drmaa.jar file in your CLASSPATH.
The drmaa.jar file is not included automatically when you set your environment using the settings.sh or settings.csh files.

How to Use DRMAA With NetBeans 5.x

To use the DRMAA classes with your NetBeans 5.0 or 5.5 project, follow these steps:

Running Your Java Application

To run your compiled DRMAA application, verify the following:

Java Application Examples

The following examples illustrate some application interactions that use the Java language bindings. You can find additional examples in the "How To" section of the Grid Engine Community Site.

Example - Starting and Stopping a Session

The following code segment shows the most basic DRMAA Java language binding program. You must have a Session object to do anything with DRMAA. You get the Session object from a SessionFactory. You get the SessionFactory from the static SessionFactory.getFactory() method. The reason for this chain is that the org.ggf.drmaa.* classes should be considered an immutable package to be used by every DRMAA Java language binding implementation. Because the package is immutable, to load a specific implementation, the SessionFactory uses a system property to find the implementation's session factory, which it then loads. That session factory is then responsible for creating the session in whatever way it sees fit. Note that even though there is a session factory, only one session can exist at a time.

On line 9, SessionFactory.getFactory() gets a session factory instance. On line 10, SessionFactory.getSession() gets the session instance. On line 13, Session.init() initializes the session. An empty string ("") is passed in as the contact string to create a new session because no initialization arguments are needed.

Session.init() creates a session and starts an event client listener thread. The session is used for organizing jobs submitted through DRMAA, and the thread is used to receive updates from the queue master about the state of jobs and the system in general. Once Session.init() has been called successfully, the calling application must also call Session.exit() before terminating. If an application does not call Session.exit() before terminating, the queue master might be left with a dead event client handle, which can decrease queue master performance.
Use the Runtime.addShutdownHook() method to make sure Session.exit() gets called.

At the end of the program, on line 14, Session.exit() cleans up the session and stops the event client listener thread. Most other DRMAA methods must be called before Session.exit(). Some methods, like Session.getContact(), can be called after Session.exit(), but these methods only provide general information. Any method that performs an action, such as Session.runJob() or Session.wait(), must be called before Session.exit() is called. If such a method is called after Session.exit() is called, it will throw a NoActiveSessionException.

01: package com.sun.grid.drmaa.howto;
02:
03: import org.ggf.drmaa.DrmaaException;
04: import org.ggf.drmaa.Session;
05: import org.ggf.drmaa.SessionFactory;
06:
07: public class Howto1 {
08:     public static void main(String[] args) {
09:         SessionFactory factory = SessionFactory.getFactory();
10:         Session session = factory.getSession();
11:
12:         try {
13:             session.init("");
14:             session.exit();
15:         } catch (DrmaaException e) {
16:             System.out.println("Error: " + e.getMessage());
17:         }
18:     }
19: }

Example - Running a Job

The following code segment shows how to use the DRMAA Java language binding to submit a job to Sun Grid Engine. The beginning and end of this program are the same as in the preceding example. The differences are on lines 16 through 24.

On line 16, DRMAA allocates a JobTemplate. A JobTemplate is an object that is used to store information about a job to be submitted. The same template can be reused for multiple calls to Session.runJob() or Session.runBulkJobs().

On line 17, the RemoteCommand attribute is set. This attribute tells DRMAA where to find the program to run. Its value is the path to the executable. The path can be relative or absolute. If relative, the path is relative to the WorkingDirectory attribute, which defaults to the user's home directory.
For this program to work, the script sleeper.sh must be in your default path.

On line 18, the args attribute is set. This attribute tells DRMAA what arguments to pass to the executable. For more information on DRMAA attributes, see the DRMAA Javadoc or the drmaa_attributes man page.

On line 20, Session.runJob() submits the job. This method returns the ID assigned to the job by the queue master. The job is now running as though submitted by qsub. At this point, calling Session.exit() or terminating the program will have no effect on the job.

To clean things up, the job template is deleted on line 24. This action frees the memory DRMAA set aside for the job template, but has no effect on submitted jobs.

01: package com.sun.grid.drmaa.howto;
02:
03: import java.util.Collections;
04: import org.ggf.drmaa.DrmaaException;
05: import org.ggf.drmaa.JobTemplate;
06: import org.ggf.drmaa.Session;
07: import org.ggf.drmaa.SessionFactory;
08:
09: public class Howto2 {
10:     public static void main(String[] args) {
11:         SessionFactory factory = SessionFactory.getFactory();
12:         Session session = factory.getSession();
13:
14:         try {
15:             session.init("");
16:             JobTemplate jt = session.createJobTemplate();
17:             jt.setRemoteCommand("sleeper.sh");
18:             jt.setArgs(Collections.singletonList("5"));
19:
20:             String id = session.runJob(jt);
21:
22:             System.out.println("Your job has been submitted with id " + id);
23:
24:             session.deleteJobTemplate(jt);
25:             session.exit();
26:         } catch (DrmaaException e) {
27:             System.out.println("Error: " + e.getMessage());
28:         }
29:     }
30: }