Sun Grid Engine Information Center Managing Policies
Grid Engine policies are implemented in conjunction with the Grid Engine scheduler. For information, see Managing the Scheduler.

About Grid Engine Policies

The Grid Engine software orchestrates the delivery of computational power, based on enterprise resource policies that the administrator manages. The software uses these policies to examine available computer resources in the grid. The software gathers these resources, and then it allocates and delivers them automatically, in a way that optimizes usage across the grid. To enable cooperation in the grid, project owners must do the following:
As administrator, you can define high-level usage policies that are customized for your site. Four such policies are available: the urgency policy, the share-based policy, the functional policy, and the override policy.
Policy management automatically controls the use of shared resources in the cluster to achieve your goals. High-priority jobs are dispatched preferentially. These jobs receive greater CPU entitlements when they are competing with other, lower-priority jobs. The Grid Engine software monitors the progress of all jobs. It adjusts their relative priorities correspondingly, and with respect to the goals that you define in the policies. This policy-based resource allocation grants each user, team, department, and project an allocated share of system resources. This allocation of resources extends over a specified period of time, such as a week, a month, or a quarter.

Configuring Policy-Based Resource Management With QMON

On the QMON Main Control window, click the Policy Configuration button. The Policy Configuration dialog box appears. The Policy Configuration dialog box lets you directly edit the following information:
You can also access detailed configuration dialog boxes for the three ticket-based policies.
To refresh the information displayed in the Policy Configuration dialog box, click Refresh. To save any changes that you make to the Policy Configuration, click Apply. To close the dialog box without saving changes, click Done.

Specifying Policy Priority

Before the Grid Engine system dispatches jobs, the jobs are brought into priority order, highest priority first. Without any administrator influence, the order is first-in, first-out (FIFO). On the Policy Configuration dialog box, under Policy Importance Factor, you can specify the relative importance of the three priority types that control the sorting order of jobs. For example, if you specify Priority as 1, Urgency as 0.1, and Ticket as 0.01, job priority that is specified by the qsub -p command is given the most weight, job priority that is specified by the Urgency Policy is considered next, and job priority that is specified by the Ticket Policy is given the least weight.
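As a minimal sketch of how the three importance factors could combine, the following Python snippet computes a weighted sum using the example factors above (Priority 1, Urgency 0.1, Ticket 0.01). The function name, the job names, and the sample values are hypothetical, and the snippet assumes that each input has already been normalized to the range 0 to 1. It illustrates the weighting idea only and is not the scheduler's implementation.

```python
def job_priority(norm_urgency, norm_tickets, norm_priority,
                 weight_urgency=0.1, weight_ticket=0.01, weight_priority=1.0):
    """Weighted sum of normalized values (illustrative; inputs assumed in [0, 1])."""
    return (weight_urgency * norm_urgency
            + weight_ticket * norm_tickets
            + weight_priority * norm_priority)


# Hypothetical pending jobs:
# (normalized urgency, normalized tickets, normalized POSIX priority)
jobs = {
    "job_a": (0.9, 0.9, 0.2),
    "job_b": (0.1, 0.1, 0.6),
}

# Sort highest priority first, as the scheduler does before dispatching.
ranked = sorted(jobs, key=lambda name: job_priority(*jobs[name]), reverse=True)
print(ranked)
```

With these factors, job_b sorts ahead of job_a even though job_a has far higher urgency and ticket values, because the Priority factor dominates the sum.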
For more information about job priorities, see Job Sorting. You can specify a weighting factor for each priority type. This weighting factor determines the degree to which each type of priority affects overall job priority. To make it easier to control the range of values for each priority type, normalized values are used instead of the raw ticket values, urgency values, and POSIX priority values. The following formula expresses how a job's priority values are determined:

    Job priority = Urgency * normalized urgency value + Ticket * normalized ticket value + Priority * normalized priority value

Configuring the Urgency Policy

The Urgency Policy defines an urgency value for each job. This urgency value is determined by the sum of the following three contributing elements:
For details about how the Grid Engine system arrives at the urgency value total, see About the Urgency Policy.

Configuring Ticket-Based Policies

The tickets that are currently assigned to individual policies are listed under Current Active Tickets in the Policy Configuration dialog box. The numbers reflect the relative importance of the policies. The numbers indicate whether a certain policy currently dominates the cluster or whether policies are in balance. Tickets provide a quantitative measure. For example, you might assign twice as many tickets to the share-based policy as you assign to the functional policy. This means that twice as much resource entitlement is allocated to the share-based policy as is allocated to the functional policy. In this sense, tickets behave very much like stock shares. The total number of all tickets has no particular meaning. Only the relations between policies count. Hence, total ticket numbers are usually quite high to allow for fine adjustment of the relative importance of the policies.

Under Edit Tickets, you can modify the number of tickets that are allocated to the share tree policy and the functional policy. For details, see Editing Tickets.

Select the Share Override Tickets check box to control the total ticket amount distributed by the override policy. Deselect the Share Override Tickets check box to control the importance of individual jobs relative to the ticket pools that are available for the other policies and override categories. With this setting, the number of jobs that are under a category member does not matter. The jobs always get the same number of tickets. However, the total number of override tickets in the system increases as the number of jobs with a right to receive override tickets increases. Other policies can lose importance in such cases. For detailed information, see Sharing Override Tickets.
Select the Share Functional Tickets check box to give a category member a constant entitlement level for the sum of all its jobs. Deselect the check box to give each job the same entitlement level, based on its category member's entitlement. For detailed information, see Sharing Functional Ticket Shares.

You can set the maximum number of jobs that can be scheduled in the functional policy. The default value is 200.

You can set the maximum number of pending subtasks that are allowed for each array job. The default value is 50. Use this setting to reduce scheduling overhead.

You can specify the Ticket Policy Hierarchy to resolve certain cases of conflicting policies. The resolving of policy conflicts applies particularly to pending jobs. For detailed information, see Setting the Ticket Policy Hierarchy.

Editing Tickets

You can edit the total number of share-tree tickets and functional tickets. Override tickets are assigned directly through the override policy configuration. The other ticket pools are distributed automatically among jobs that are associated with the policies and with respect to the actual policy configuration.

Sharing Override Tickets

The administrator assigns tickets to the different members of the override categories, that is, to individual users, projects, departments, or jobs. Consequently, the number of tickets that are assigned to a category member determines how many tickets are assigned to jobs under that category member. For example, the number of tickets that are assigned to user A determines how many tickets are assigned to all jobs of user A.
Use the Share Override Tickets check box to set the share_override_tickets parameter of sched_conf(5). This parameter controls how job ticket values are derived from their category member ticket value. When you select the Share Override Tickets check box, the tickets of the category members are distributed evenly among the jobs under this member. If you deselect the Share Override Tickets check box, each job inherits the ticket amount defined for its category member. In other words, the category member tickets are replicated for all jobs underneath.

Select the Share Override Tickets check box to control the total ticket amount distributed by the override policy. With this setting, ticket amounts that are assigned to a job can become negligibly small if many jobs are under one category member. For example, ticket amounts might diminish if many jobs belong to one member of the user category.

Deselect the Share Override Tickets check box to control the importance of individual jobs relative to the ticket pools that are available for the other policies and override categories. With this setting, the number of jobs that are under a category member does not matter. The jobs always get the same number of tickets. However, the total number of override tickets in the system increases as the number of jobs with a right to receive override tickets increases. Other policies can lose importance in such cases.

Sharing Functional Ticket Shares

The functional policy defines entitlement shares for the functional categories. Then the policy defines shares for all members of each of these categories. The functional policy is thus similar to a two-level share tree. The difference is that a job can be associated with several categories at the same time. The job belongs to a particular user, for instance, but the job can also belong to a project or a department.
However, as in the share tree, the entitlement shares that a job receives from a functional category are determined by the following:
Use the Share Functional Tickets check box to set the share_functional_shares parameter of sched_conf(5). This parameter defines how the category member shares are used to determine the shares of a job. The shares assigned to the category members, such as a particular user or project, can be replicated for each job. Alternatively, shares can be distributed among the jobs under the category member.
Those shares are comparable to stock shares. Such shares have no effect for the jobs that belong to the same category member. All jobs under the same category member have the same number of shares in both cases. But the share number has an effect when comparing the share amounts within the same category. Jobs with many siblings that belong to the same category member receive relatively small share portions if you select the Share Functional Tickets check box. On the other hand, if you clear the Share Functional Tickets check box, all sibling jobs receive the same share amount as their category member. Select the Share Functional Tickets check box to give a category member a constant entitlement level for the sum of all its jobs. The entitlement of an individual job can get negligibly small, however, if the job has many siblings. Deselect the Share Functional Tickets check box to give each job the same entitlement level, based on its category member's entitlement. The number of job siblings in the system does not matter.
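The two settings of the Share Functional Tickets check box can be sketched in a few lines of Python. The function and argument names are hypothetical, and the even split among sibling jobs is an assumption made for illustration; the actual distribution is performed by the scheduler according to share_functional_shares in sched_conf(5).

```python
def job_shares(member_shares, job_count, share_among_jobs):
    """Shares that each job receives from one category member (illustrative only)."""
    if job_count <= 0:
        return []
    if share_among_jobs:
        # Check box selected: the member's shares are split across its jobs
        # (an even split is assumed here).
        per_job = member_shares / job_count
    else:
        # Check box cleared: every job receives the member's full share amount.
        per_job = member_shares
    return [per_job] * job_count


# A hypothetical user with 100 shares and four sibling jobs.
shared = job_shares(100, 4, share_among_jobs=True)
replicated = job_shares(100, 4, share_among_jobs=False)
print(sum(shared), sum(replicated))
```

In the shared case the member's total stays constant at 100 no matter how many jobs it has, while in the replicated case the total grows with the number of jobs.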
Be aware that the setting of share_functional_shares does not determine the total number of functional tickets that are distributed. The total number is always as defined by the administrator for the functional policy ticket pool. The share_functional_shares parameter influences only how functional tickets are distributed within the functional policy.

Example – Functional Policy

The following example describes a common scenario where a user wishes to translate the Sun Grid Engine 5.3 Scheduler Option -user_sort true to a Sun Grid Engine 6.2 configuration but does not understand the share override functional policy ticket feature. For a plain user-based equal share, you configure your global configuration sge_conf(5) with
Then you use -weight_tickets_functional 10000 in the scheduler configuration sched_conf(5). This action causes the functional policy to be used for user-based equal share scheduling with 100 shares for each user.

Tuning Scheduling Run Time

Pending jobs are sorted according to the number of tickets that each job has, as described in Job Sorting. The scheduler reports the number of tickets each pending job has to the master daemon sge_qmaster. However, on systems with very large numbers of jobs, you might want to turn off ticket reporting. When you turn off ticket reporting, you disable ticket-based job priority. The sort order of jobs is based only on the time each job is submitted. To turn off the reporting of pending job tickets to sge_qmaster, clear the Report Pending Job Tickets check box on the Policy Configuration dialog box. Doing so sets the report_pjob_tickets parameter of sched_conf(5) to false.

Setting the Ticket Policy Hierarchy

Ticket policy hierarchy provides the means to resolve certain cases of conflicting ticket policies. The resolving of ticket policy conflicts applies particularly to pending jobs. Such cases can occur in combination with the share-based policy and the functional policy. With both policies, assigning priorities to jobs that belong to the same leaf-level entities is done on a first-come, first-served basis. Leaf-level entities include the following:
Members of the job category are not included among leaf-level entities. So, for example, the first job of the same user gets the most, the second gets the next most, the third next, and so on. A conflict can occur if another policy mandates an order that is different. So, for example, the override policy might define the third job as the most important, whereas the first job that is submitted should come last. A policy hierarchy might give the override policy higher priority than the share-tree policy or the functional policy. Such a policy hierarchy ensures that high-priority jobs under the override policy get more entitlements than jobs in the other two policies. Such jobs must belong to the same leaf-level entity (user or project) in the share tree. The Ticket Policy Hierarchy can be a combination of up to three letters. These letters are the first letters of the names of the following three ticket policies:
Use these letters to establish a hierarchy of ticket policies. The first letter defines the top policy. The last letter defines the bottom of the hierarchy. Policies that are not listed in the policy hierarchy do not influence the hierarchy. Such policies can still be a source for tickets of jobs, but those tickets do not influence the ticket calculations in other policies. All tickets of all policies are added up for each job to define its overall entitlement. The following examples describe two settings and how they influence the order of the pending jobs:
All combinations of the three letters are theoretically possible, but only a subset of the combinations is meaningful or has practical relevance. The last letter should always be S or F, because only those two policies can be influenced due to their characteristics described in the examples. The following form is recommended for policy_hierarchy settings:

    [O][S|F]

If the override policy is present, O should occur as the first letter only, because the override policy can only influence. The share-based policy and the functional policy can only be influenced. Therefore S or F should occur as the last letter.

Configuring the Share-Based Policy

Share-based scheduling grants each user and project its allocated share of system resources during an accumulation period such as a week, a month, or a quarter. Share-based scheduling is also called share tree scheduling. It constantly adjusts each user's and project's potential resource share for the near term, until the next scheduling interval. Share-based scheduling is defined for user or for project, or for both.

Share-based scheduling ensures that a defined share is guaranteed to the instances that are configured in the share tree over time. Jobs that are associated with share-tree branches where fewer resources were consumed in the past than anticipated are preferred when the system dispatches jobs. At the same time, full resource usage is guaranteed, because unused share proportions are still available for pending jobs associated with other share-tree branches.

By giving each user or project its targeted share as far as possible, groups of users or projects also get their targeted share. Departments or divisions are examples of such groups. Fair share for all entities is attainable only when every entity that is entitled to resources contends for those resources during the accumulation period.
If a user, a project, or a group does not submit jobs during a given period, the resources are shared among those who do submit jobs. Share-based scheduling is a feedback scheme. The share of the system to which any user or user-group, or project or project-group, is entitled is a configuration parameter. The share of the system to which any job is entitled is based on the following factors:
The Grid Engine software keeps track of how much usage users and projects have already received. At each scheduling interval, the Scheduler adjusts all jobs' share of resources. Doing so ensures that all users, user groups, projects, and project groups get close to their fair share of the system during the accumulation period. In other words, resources are granted or are denied to keep everyone more or less at their targeted share of usage.

The Half-Life Factor

Half-life is how fast the system "forgets" about a user's resource consumption. The administrator decides whether to penalize a user for high resource consumption, be it six months ago or six days ago. The administrator also decides how to apply the penalty. On each node of the share tree, Grid Engine software maintains a record of users' resource consumption. With this record, the system administrator can decide how far to look back to determine a user's under-usage or over-usage when setting up a share-based policy.

The resource usage in this context is the mathematical sum of all the computer resources that are consumed over a "sliding window of time." The length of this window is determined by a "half-life" factor, which in the Grid Engine system is an internal decay function. This decay function reduces the impact of accrued resource consumption over time. A short half-life quickly lessens the impact of resource overconsumption. A longer half-life gradually lessens the impact of resource overconsumption.

The half-life is specified as a unit of time. For example, consider a half-life of seven days that is applied to a resource consumption of 1,000 units. This half-life decay factor results in the following usage "penalty" adjustment over time:
The half-life-based decay diminishes the impact of a user's resource consumption over time, until the effect of the penalty is negligible.
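The seven-day example can be reproduced with a short Python sketch. It assumes a plain exponential decay in which recorded usage is halved once per half-life; the function name is hypothetical, and the scheduler's internal decay function may apply the decay in different steps.

```python
def decayed_usage(initial_usage, half_life_days, elapsed_days):
    """Usage that still counts against a user after elapsed_days (assumed exponential decay)."""
    return initial_usage * 0.5 ** (elapsed_days / half_life_days)


# 1,000 units of consumption with a half-life of seven days.
for days in (0, 7, 14, 21, 28):
    print(days, decayed_usage(1000, 7, days))
```

Under this assumption the 1,000 units count as 500 after 7 days, 250 after 14 days, 125 after 21 days, and 62.5 after 28 days.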
Compensation Factor

Sometimes the comparison shows that actual usage is well below targeted usage. In such a case, the adjusting of a user's share or a project's share of resource can allow a user to dominate the system. Such an adjustment is based on the goal of reaching target share. This domination might not be desirable. The compensation factor enables an administrator to limit how much a user or a project can dominate the resources in the near term.

For example, a compensation factor of two limits a user's or project's current share to twice its targeted share. Assume that a user or a project should get 20 percent of the system resources over the accumulation period. If the user or project currently gets much less, the maximum that it can get in the near term is only 40 percent.

The share-based policy defines long-term resource entitlements of users or projects as per the share tree. When combined with the share-based policy, the compensation factor makes automatic adjustments in entitlements. If a user or project is either under or over the defined target entitlement, the Grid Engine system compensates. The system raises or lowers that user's or project's entitlement for a short term over or under the long-term target. This compensation is calculated by a share tree algorithm.

The compensation factor provides an additional mechanism to control the amount of compensation that the Grid Engine system assigns. The additional compensation factor (CF) calculation is carried out only if the following conditions are true:
If either condition is not true, or if both conditions are not true, the compensation as defined and implemented by the share-tree algorithm is used. The smaller the value of the CF, the greater is its effect.

If the value is greater than 1, the Grid Engine system's compensation is limited. The upper limit for compensation is calculated as long-term-entitlement multiplied by the CF. And as defined earlier, the short-term entitlement must exceed this limit before anything happens based on the compensation factor.

If the CF is 1, the Grid Engine system compensates in the same way as with the raw share-tree algorithm. So a value of one has an effect that is similar to a value of zero. The only difference is an implementation detail. If the CF is one, the CF calculations are carried out without an effect. If the CF is zero, the calculations are suppressed.

If the value is less than 1, the Grid Engine system overcompensates. Jobs receive much more compensation than they are entitled to based on the share-tree algorithm. Jobs also receive this overcompensation earlier, because the criterion for activating the compensation is met at lower short-term entitlement values. The activating criterion is short-term-entitlement > long-term-entitlement * CF.

Hierarchical Share Tree

The share-based policy is implemented through a hierarchical share tree. The share tree specifies, for a moving accumulation period, how system resources are to be shared among all users and projects. The length of the accumulation period is determined by a configurable decay constant. The Grid Engine system bases a job's share entitlement on the degree to which each parent node in the share tree reaches its accumulation limit. A job's share entitlement is based on its leaf node share allocation, which in turn depends on the allocations of its parent nodes. All jobs associated with a leaf node split the associated shares.
The entitlement derived from the share tree is combined with other entitlements, such as entitlements from a functional policy, to determine a job's net entitlement. The share tree is allotted the total number of tickets for share-based scheduling. This number determines the weight of share-based scheduling among the four scheduling policies. The share tree is defined during installation. The share tree can be altered at any time. When the share tree is edited, the new share allocations take effect at the next scheduling interval.

Configuring the Share-Tree Policy With QMON

On the QMON Policy Configuration dialog box, click Share Tree Policy. The Share Tree Policy dialog box appears.

Node Attributes

Under Node Attributes, the attributes of the selected node are displayed:
When a user node or a project node is removed and then added back, the user's or project's usage is retained. A node can be added back either at the same place or at a different place in the share tree. You can zero out that usage before you add the node back to the share tree. To do so, first remove the node from the users or projects configured in the Grid Engine system. Then add the node back to the users or projects there. Users or projects that were not in the share tree but that ran jobs have nonzero usage when added to the share tree. To zero out usage when you add such users or projects to the tree, first remove them from the users or projects configured in the Grid Engine system. Then add them to the tree. To add an interior node under the selected node, click Add Node. A blank Node Info window appears, where you can enter the node's name and number of shares. You can enter any node name or share number. To add a leaf node under the selected node, click Add Leaf. A blank Node Info window appears, where you can enter the node's name and number of shares. The node's name must be an existing Grid Engine user (Configuring User Objects With QMON) or project (Defining Projects). The following rules apply when you are adding a leaf node:
To edit the selected node, click Modify. A Node Info window appears. The window displays the node's name and its number of shares.

To cut or copy the selected node to a buffer, click Cut or Copy. To paste under the selected node the contents of the most recently cut or copied node, click Paste.

To delete the selected node and all its descendants, click Delete.

To clear the entire share-tree hierarchy, click Clear Usage. Clear the hierarchy when the share-based policy is aligned to a budget and needs to start from scratch at the beginning of each budget term. The Clear Usage facility also is handy when setting up or modifying test Grid Engine software environments.

QMON periodically updates the information displayed in the Share Tree Policy dialog box. Click Refresh to force the display to refresh immediately. To save all the node changes that you make, click Apply. To close the dialog box without saving changes, click Done.

To search the share tree for a node name, click Find, and then type a search string. Node names that begin with the case-sensitive search string are indicated. Click Find Next to find the next occurrence of the search string. Click Help to open the online help system.

Share Tree Policy Parameters

To display the Share Tree Policy Parameters, click the arrow at the right of the Node Attributes.
The actual usage of a user or project can be far below its targeted usage. The compensation factor prevents such users or projects from dominating resources when they first get those resources. See Compensation Factor for more information.

About the Special User default

You can use the special user default to reduce the amount of share-tree maintenance for sites with many users. Under the share-tree policy, a job's priority is determined based on the node that the job maps to in the share tree. Users who are not explicitly named in the share tree are mapped to the default node, if it exists. The specification of a single default node allows for a simple share tree to be created. Such a share tree makes user-based fair sharing possible.

You can use the default user also in cases where the same share entitlement is assigned to most users. Same share entitlement is also known as equal share scheduling. The default user configures all user entries under the default node, giving the same share amount to each user. Each user who submits jobs receives the same share entitlement as that configured for the default user. To activate the facility for a particular user, you must add this user to the list of Grid Engine users.

The share tree displays "virtual" nodes for all users who are mapped to the default node. The display of virtual nodes enables you to examine the usage and the fair-share scheduling parameters for users who are mapped to the default node.

You can also use the default user for "hybrid" share trees, where users are subordinated under projects in the share tree. The default user can be a leaf node under a project node. The short-term entitlements of users vary according to differences in the amount of resources that the users consume. However, long-term entitlements of users remain the same. You might want to assign lower or higher entitlements to some users while maintaining the same long-term entitlement for all other users.
To do so, configure a share tree with individual user entries next to the default user for those users with special entitlements. In Example A, all users submitting to Project A get equal long-term entitlements. The users submitting to Project B only contribute to the accumulated resource consumption of Project B. Entitlements of Project B users are not managed. Example A
Compare Example A with Example B: Example B
In Example B, treatment for Project A is the same as for Example A. But all default users who submit jobs to Project B, except users A and B, receive equal long-term resource entitlements. Default users have 20 shares. User A, with 10 shares, receives half the entitlement of the default users. User B, with 40 shares, receives twice the entitlement of the default users.

How to Create Project-Based Share-Tree Scheduling

The objective of this setup is to guarantee a certain share assignment of all the cluster resources to different projects over time.
Configuring the Functional Policy

Functional scheduling is a nonfeedback scheme for determining a job's importance. Functional scheduling associates a job with the submitting user, project, or department. Functional scheduling is sometimes called priority scheduling. The functional policy setup ensures that a defined share is guaranteed to each user, project, job, or department at any time. Jobs of users, projects, or departments that have used fewer resources than anticipated are preferred when the system dispatches jobs to idle resources.

At the same time, full resource usage is guaranteed, because unused share proportions are distributed among those users, projects, departments, and jobs that need the resources. Past resource consumption is not taken into account.

Functional policy entitlement to system resources is combined with other entitlements in determining a job's net entitlement. For example, functional policy entitlement might be combined with override policy entitlement.

The total number of tickets that are allotted to the functional policy determines the weight of functional scheduling among the scheduling policies. During installation, the administrator divides the total number of functional tickets among the functional categories of user, department, project, and job.

Functional Shares

Functional shares are assigned to every member of each functional category: user, department, project, and job. These shares indicate the proportion of the tickets for a category to which each job associated with a member of the category is entitled. For example, user davidson has 200 shares, and user donlee has 100. A job submitted by davidson is entitled to twice as many user-functional-tickets as a job submitted by donlee.

The functional tickets that are allotted to each category are shared among all the jobs that are associated with a particular category.
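The davidson and donlee example can be worked through with a small Python sketch. It assumes that a category's tickets are divided among jobs in proportion to the shares of the member that each job belongs to, with one job per user. The function name, the job names, and the pool of 3000 user-category tickets are hypothetical values chosen for illustration.

```python
def tickets_per_job(category_tickets, job_member_shares):
    """Divide one category's tickets among jobs in proportion to their member's shares."""
    total = sum(job_member_shares.values())
    if total == 0:
        # No shares configured: no job receives tickets from this category.
        return {job: 0.0 for job in job_member_shares}
    return {job: category_tickets * shares / total
            for job, shares in job_member_shares.items()}


user_shares = {"davidson": 200, "donlee": 100}
job_owner = {"job_1": "davidson", "job_2": "donlee"}

allocation = tickets_per_job(
    3000, {job: user_shares[user] for job, user in job_owner.items()})
print(allocation)
```

Here the job owned by davidson receives 2000 of the 3000 tickets and the job owned by donlee receives 1000, which matches the two-to-one ratio of their shares.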
Configuring the Functional Share Policy With QMON

At the bottom of the QMON Policy Configuration dialog box, click Functional Policy. The Functional Policy dialog box appears.

Function Category List

Select the functional category for which you are defining functional shares: user, project, department, or job.

Functional Shares Table

The table under Functional Shares is scrollable. The table displays the following information:
QMON periodically updates the information displayed in the Functional Policy dialog box. Click Refresh to force the display to refresh immediately. To save all node changes that you make, click Apply. To close the dialog box without saving changes, click Done.

Changing Functional Configurations

Click the jagged arrow above the Functional Shares table to open a configuration dialog box.
Ratio Between Sorts of Functional Tickets

To display the Ratio Between Sorts Of Functional Tickets, click the arrow at the right of the Functional Shares table. User [%], Department [%], Project [%], and Job [%] always add up to 100%. When you change any of the sliders, all other unlocked sliders change to compensate for the change. When a lock is open, the slider that it guards can change freely. The slider can change either because it is moved or because the moving of another slider causes this slider to change. When a lock is closed, the slider that it guards cannot change. If all locks but one are closed, no sliders can change, because the one unlocked slider has no other slider to compensate for it.
Creating User-Based, Project-Based, and Department-Based Functional Scheduling

Use this setup to create a certain share assignment of all the resources in the cluster to different users, projects, or departments. First-come, first-served scheduling is used among jobs of the same user, project, or department.
Configuring the Override Policy

Override scheduling enables a Grid Engine system manager or operator to dynamically adjust the relative importance of one job or of all jobs that are associated with a user, a department, or a project. This adjustment adds tickets to the specified job, user, department, or project. By adding override tickets, override scheduling increases the total number of tickets that a user, department, project, or job has. As a result, the overall share of resources is increased.

The addition of override tickets also increases the total number of tickets in the system. These additional tickets deflate the value of every job's tickets. You can use override tickets for the following two purposes:
Override tickets that are assigned directly to a job go away when the job finishes. All other tickets are inflated back to their original value. Override tickets that are assigned to users, departments, projects, and jobs remain in effect until the administrator explicitly removes the tickets. The Policy Configuration dialog box displays the current number of override tickets that are active in the system.
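The deflation effect of override tickets can be illustrated with a minimal Python sketch. It assumes that a job's entitlement is proportional to its fraction of all tickets in the system; the function name, the job names, and the ticket counts are hypothetical.

```python
def resource_fractions(tickets):
    """Each job's fraction of all tickets in the system (assumed proportional entitlement)."""
    total = sum(tickets.values())
    if total == 0:
        return {job: 0.0 for job in tickets}
    return {job: count / total for job, count in tickets.items()}


base = {"job_a": 1000, "job_b": 1000}
before = resource_fractions(base)

# Grant job_a 2000 override tickets; job_b keeps its original tickets.
boosted = dict(base, job_a=base["job_a"] + 2000)
after = resource_fractions(boosted)
print(before, after)
```

Although job_b still holds 1000 tickets, its fraction drops from one half to one quarter once the override tickets are added. When the override tickets go away, the fractions return to their original values.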
Configuring the Override Policy With QMON

At the bottom of the Policy Configuration dialog box, click Override Policy. The Override Policy dialog box appears.

Override Category List

Select the category for which you are defining override tickets: user, project, department, or job.

Override Table

The override table is scrollable. It displays the following information:
QMON periodically updates the information that is displayed in the Override Policy dialog box. Click Refresh to force the display to refresh immediately. To save all override changes that you make, click Apply. To close the dialog box without saving changes, click Done.

Changing Override Configurations

Click the jagged arrow above the override table to open a configuration dialog box.
Configuring Policies From the Command Line
Configuring the Share-Based Policy From the Command Line
To configure the share-based policy from the command line, use the qconf command with appropriate options.
Configuring the Functional Share Policy From the Command Line

To configure the functional share policy from the command line, use the qconf command with the appropriate options.
To assign functional shares to jobs, use the -js job_share option with the qsub, qsh, qrsh, qlogin, and qalter commands. The -js job_share option defines or redefines the job share of the job relative to other jobs. job_share is an unsigned integer value. The default job_share value for jobs is 0.

Configuring the Override Policy From the Command Line

To configure the override policy from the command line, use the qconf command with the appropriate options.
To change the number of override tickets for the specified job, use the qalter -ot override_tickets command.







