|
Sun Grid Engine Information Center

Managing Resource Quotas

This section explains how to use the resource quotas feature of the Grid Engine software to limit resources by user, project, host, cluster queue, or parallel environment. For convenience, you can express these limits using user access lists, departments, or host groups.

Resource Quota Overview

To prevent users from consuming all available resources, the Grid Engine software supports complex attributes that you can configure at the global, queue, or host level. While this layered resource management approach is powerful, it leaves gaps that become particularly important in large installations with many different custom resources, user groups, and projects. The resource quota feature closes these gaps: it lets you control which project or department must yield when a single bottleneck resource runs out.

The resource quota feature enables you to apply limits to several kinds of resources and resource consumers, to all jobs in the cluster, and to combinations of consumers. In this context, a resource is any complex attribute defined in the Sun Grid Engine configuration. For more information about complex attributes, see the complex(5) man page. Resources can be built-in resources such as slots, arch, mem_total, num_proc, and swap_total, or any custom-defined resource such as compiler_license. Resource consumers are users, queues, hosts, projects, and parallel environments, each of which can be limited collectively or individually (per user, per queue, and so on).

The resource quota feature provides a way for you to limit the resources that a consumer can use at any time. This limitation provides an indirect method to prioritize users, departments, and projects.
To define directly the priorities by which a user should obtain a resource, use the resource urgency and share-based policies described in Configuring the Urgency Policy and Configuring the Share-Based Policy.

To limit resources through the Grid Engine software, use the qquota and qconf commands, or the QMON graphical interface. For more information, see the qquota(1) and qconf(1) man pages.

About Resource Quota Sets

Resource quota sets enable you to specify the maximum resource consumption for any job request. Once you define resource quota sets, the scheduler uses them when it selects the next jobs to run and checks that no quota would be exceeded. As a result, only jobs that stay within their resource quotas are scheduled and run.

A resource quota set defines a maximum resource quota for a particular job request. All configured resource quota sets apply at all times. If multiple resource quota sets are defined, the most restrictive set applies.

Every resource quota set consists of one or more resource quota rules. These rules are evaluated in order, and the first rule that matches a specific request is used. A resource quota set therefore yields at most one effective resource quota rule for a specific request.

A resource quota set consists of a name and one or more limit rules, and can also carry a description and an enabled flag.
Example – Sample Resource Quota Set

The following example resource quota set restricts user1 and user2 to 2 Gbytes of free virtual memory (virtual_free) on each host in the host group lx_hosts.
{
name max_virtual_free_on_lx_hosts
description "resource quota for virtual_free restriction"
enabled true
limit users {user1,user2} hosts {@lx_hosts} to virtual_free=2g
}
Static and Dynamic Resource Quotas

Resource quota rules always define a maximum value of a resource that can be used. In most cases, these values are static and equal for all matching filter scopes. Although you could define several different rules to apply to different scopes, you would then have several rules that are nearly identical. Instead of duplicating rules, you can define a dynamic limit. A dynamic limit uses an algebraic expression to derive the rule limit value. The expression can reference a complex attribute whose value is used to calculate the resulting limit.

Example – Dynamic Limit Example

The following example illustrates the use of dynamic limits. Users are allowed to use five slots per CPU on all Linux hosts.
limit hosts {@linux_hosts} to slots=$num_proc*5
The value of num_proc is the number of processors on the host. The limit is calculated by the formula $num_proc*5, and can therefore differ from host to host. For example, a host with 2 processors gets a limit of 10 slots, and a host with 8 processors gets a limit of 40 slots.
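The arithmetic behind a dynamic limit can be sketched in a few lines of Python. This is only an illustration of how `$num_proc*5` scales per host, not how the scheduler evaluates the expression; the host names and processor counts are hypothetical.

```python
# Hypothetical inventory: host name -> num_proc (processor count).
# These hosts and counts are made up for illustration.
NUM_PROC = {"hostA": 1, "hostB": 4, "hostC": 16}

SLOTS_PER_PROC = 5  # the factor from the rule `slots=$num_proc*5`


def dynamic_slot_limit(num_proc: int) -> int:
    """Return the slot limit that `$num_proc*5` yields for one host."""
    return num_proc * SLOTS_PER_PROC


# One limit per host, each derived from that host's own num_proc value.
limits = {host: dynamic_slot_limit(n) for host, n in NUM_PROC.items()}

for host, limit in limits.items():
    print(f"{host}: num_proc={NUM_PROC[host]} -> slots limit {limit}")
```

A single dynamic rule therefore stands in for what would otherwise be one static rule per processor count.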
Instead of num_proc, you could use any other complex attribute known for a host as either a load value or a consumable resource.

Managing Resource Quotas With QMON

The following task explains how to set resource quotas using the QMON graphical interface.

How to Set Resource Quotas Using QMON
Monitoring Resource Quota Utilization From the Command Line

Use the qquota command to view information about the current Sun Grid Engine resource quotas. The qquota command lists each resource quota that is being used at least once or that defines a static limit. For each applicable resource quota, qquota displays the resource quota rule (the rule set name followed by the rule), the limit (the resource name with its used and maximum values, for example slots=5/20), and the filter to which the rule applies.
The qquota command includes several options that you can use to limit the information to a specific host, cluster queue, project, parallel environment, resource, or user. If you use no options, qquota displays information about the resource quota sets that apply to the user who invokes the command. For more information, see the qquota(1) man page.

Example – Sample qquota Command

The following example shows information about the resource quota sets that apply to user user1:

$ qquota -u user1
resource quota rule limit                filter
--------------------------------------------------------------------------------
maxujobs/1          slots=5/20           -
max_linux/1         slots=5/5            hosts @linux
max_per_host/1      slots=1/2            users user1 hosts host2

Configuring Resource Quotas From the Command Line

Use the qconf command to add, modify, or delete resource quota sets and rules.
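For more information about qconf, see the qconf(1) man page.

Resource Quota Command Line Examples

The following example shows how you can use the various commands for resource quotas. The rule sets shown in Example – Rule Set define the following limits: all users together can use at most 20 slots; all users together can use at most 5 slots on hosts in the @linux host group; and on each @linux host, MyUser can use at most 2 slots and every other user at most 1 slot, with no slots available on any other host.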
To configure the rule set, use one of the following forms of the qconf command:
After jobs are submitted for different users, the qstat command shows output similar to the example shown in Example – qstat Output.

Example – Rule Set
{
name maxujobs
limit users * to slots=20
}
{
name max_linux
limit users * hosts @linux to slots=5
}
{
name max_per_host
limit users MyUser hosts {@linux} to slots=2
limit users {*} hosts {@linux} to slots=1
limit users * hosts * to slots=0
}
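The evaluation order described in About Resource Quota Sets (rules are checked in order within a set, the first match wins, and every set applies) can be sketched with a small Python model of the three rule sets above. This is a simplified illustration, not the scheduler's implementation: it only reports which rule each set selects, it ignores the difference between braced and unbraced filters, and the membership of @linux and the host name solaris1 are assumptions.

```python
from fnmatch import fnmatchcase

# Hypothetical membership of the @linux host group.
HOST_GROUPS = {"@linux": {"host1", "host2"}}

# Each rule is (users filter, hosts filter, slots limit), kept in rule order.
RULE_SETS = {
    "maxujobs": [("*", "*", 20)],
    "max_linux": [("*", "@linux", 5)],
    "max_per_host": [("MyUser", "@linux", 2), ("*", "@linux", 1), ("*", "*", 0)],
}


def host_matches(pattern: str, host: str) -> bool:
    """Match a host against a host group (@name) or a wildcard pattern."""
    if pattern.startswith("@"):
        return host in HOST_GROUPS.get(pattern, set())
    return fnmatchcase(host, pattern)


def first_match(rules, user: str, host: str):
    """Return (rule number, limit) for the first matching rule, or None."""
    for number, (users, hosts, limit) in enumerate(rules, start=1):
        if fnmatchcase(user, users) and host_matches(hosts, host):
            return number, limit
    return None


def effective_rules(user: str, host: str) -> dict:
    """Collect the single effective rule of every rule set that matches."""
    result = {}
    for name, rules in RULE_SETS.items():
        match = first_match(rules, user, host)
        if match is not None:
            result[name] = match
    return result


print(effective_rules("MyUser", "host1"))
print(effective_rules("other", "solaris1"))
```

Each rule set contributes at most one rule, and a request has to stay within every rule that is returned, which is why the most restrictive set ends up deciding.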
Example – qstat Output
$ qstat
job-ID prior name user state submit/start at queue slots ja-task-ID
---------------------------------------------------------------------------------------------
27 0.55500 Sleeper MyUser r 02/21/2006 15:53:10 all.q@host1 1
29 0.55500 Sleeper MyUser r 02/21/2006 15:53:10 all.q@host1 1
30 0.55500 Sleeper MyUser r 02/21/2006 15:53:10 all.q@host2 1
26 0.55500 Sleeper MyUser r 02/21/2006 15:53:10 all.q@host2 1
28 0.55500 Sleeper user1 r 02/21/2006 15:53:10 all.q@host2 1
Example – qquota Output

$ qquota # as user MyUser
resource quota rule limit                filter
--------------------------------------------------------------------------------
maxujobs/1          slots=5/20           -
max_linux/1         slots=5/5            hosts @linux
max_per_host/1      slots=2/2            users MyUser hosts host2
max_per_host/1      slots=2/2            users MyUser hosts host1

$ qquota -h host2 # as user MyUser
resource quota rule limit                filter
--------------------------------------------------------------------------------
maxujobs/1          slots=5/20           -
max_linux/1         slots=5/5            hosts @linux
max_per_host/1      slots=2/2            users MyUser hosts host2

$ qquota -u user1
resource quota rule limit                filter
--------------------------------------------------------------------------------
maxujobs/1          slots=5/20           -
max_linux/1         slots=5/5            hosts @linux
max_per_host/1      slots=1/2            users user1 hosts host2

$ qquota -u "*"
resource quota rule limit                filter
--------------------------------------------------------------------------------
maxujobs/1          slots=5/20           -
max_linux/1         slots=5/5            hosts @linux
max_per_host/1      slots=2/2            users MyUser hosts host1
max_per_host/1      slots=2/2            users MyUser hosts host2
max_per_host/1      slots=1/2            users user1 hosts host2

Performance Considerations

Efficient Rule Sets

To provide the most efficient processing of jobs and resources in queues, put the most restrictive rule at the first position of a rule set. Following this convention helps the Sun Grid Engine scheduler narrow down the set of suitable queue instances efficiently, because the first rule is never shadowed by any subsequent rule in the same rule set and thus always stands on its own. To illustrate this rule, consider an environment with four queues, Q001 through Q004, and four managed resources, F001 through F004, where each resource is meant to be used only by jobs in the queue with the matching number.
In such an environment, you might define a single rule set as follows:
{
name 30_for_each_project
description "not more than 30 per project"
enabled TRUE
limit projects {*} queues Q001 to F001=30
limit projects {*} queues Q002 to F002=30
limit projects {*} queues Q003 to F003=30
limit projects {*} queues Q004 to F004=30
limit to F001=0,F002=0,F003=0,F004=0
}
The single rule set limits the utilization of each managed resource to 30 for each project and, at the same time, constrains jobs to the eligible queues. This works, but in a larger cluster with many hosts, the single rule set can cause slow job dispatching. To help the Sun Grid Engine scheduler exclude as many queue instances as possible during matchmaking, use four separate rule sets.
{
name 30_for_each_project_in_Q001
description "not more than 30 per project of F001 in Q001"
enabled TRUE
limit queues !Q001 to F001=0
limit projects {*} queues Q001 to F001=30
}
{
name 30_for_each_project_in_Q002
description "not more than 30 per project of F002 in Q002"
enabled TRUE
limit queues !Q002 to F002=0
limit projects {*} queues Q002 to F002=30
}
{
name 30_for_each_project_in_Q003
description "not more than 30 per project of F003 in Q003"
enabled TRUE
limit queues !Q003 to F003=0
limit projects {*} queues Q003 to F003=30
}
{
name 30_for_each_project_in_Q004
description "not more than 30 per project of F004 in Q004"
enabled TRUE
limit queues !Q004 to F004=0
limit projects {*} queues Q004 to F004=30
}
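The four rule sets above all follow one pattern: a zero-limit rule that shields every other queue comes first, followed by the per-project quota. As a rough sketch, a short Python helper could render such rule sets as text for review before they are loaded. The helper name and the idea of generating the text are illustrative assumptions, not part of the Grid Engine tooling.

```python
def per_queue_rule_set(queue: str, resource: str, limit: int = 30) -> str:
    """Render one rule set: the shielding rule first, then the project quota."""
    lines = [
        "{",
        f"name {limit}_for_each_project_in_{queue}",
        f'description "not more than {limit} per project of {resource} in {queue}"',
        "enabled TRUE",
        # Most restrictive rule first: no use of the resource outside the queue.
        f"limit queues !{queue} to {resource}=0",
        # Per-project quota inside the queue ({{*}} renders as {*}).
        f"limit projects {{*}} queues {queue} to {resource}={limit}",
        "}",
    ]
    return "\n".join(lines)


# Queue-to-resource pairing taken from the example above.
PAIRS = {"Q001": "F001", "Q002": "F002", "Q003": "F003", "Q004": "F004"}

rule_sets = [per_queue_rule_set(queue, resource) for queue, resource in PAIRS.items()]
print("\n".join(rule_sets))
```

Keeping the shielding rule in the first position of every generated set preserves the ordering convention recommended in this section.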
These four rule sets enforce the same per-project resource quotas as the single rule set. However, the four rule sets can be processed much more efficiently because unsuitable queue instances are excluded first. Consolidating these shields into a single resource quota set is not possible in this case.
{
name 30_for_each_project_in_Q001
description "not more than 30 per project of F001/F002 in Q001"
enabled TRUE
limit queues !Q001 to F001=0,F002=0
limit projects {*} queues Q001 to F001=30,F002=30
}
{
name 30_for_each_project_in_Q002
description "not more than 30 per project of F003/F004 in Q002"
enabled TRUE
limit queues !Q002 to F003=0,F004=0
limit projects {*} queues Q002 to F003=30,F004=30
}
In this example, the queues are consolidated from Q001-Q004 down to Q001-Q002. However, this actually increases overall cluster utilization and throughput. |

