|
Sun Grid Engine Information Center Improving Grid Engine PerformanceFor information about troubleshooting, see Troubleshooting and Error Messages. Fine-Tuning Your Grid EnvironmentThe Grid Engine system is a full-function, general-purpose distributed resource management tool. The scheduler component of the system supports a wide range of different compute farm scenarios. To get the maximum performance from your compute environment, you should review the features that are enabled. You should then determine which features you really need to solve your load management problem. Disabling some of these features can improve performance on the throughput of your cluster. Scheduler MonitoringScheduler monitoring can help you to determine why certain jobs are not dispatched. However, providing this information for all jobs at all times can consume resources. You usually do not need this much information. To disable scheduler monitoring, set schedd_job_info to false in the scheduler configuration. See Changing the Scheduler Configuration With QMON, and the sched_conf(5) man page. Finished JobsIn the case of array jobs, the finished job list in qmaster can become quite large. By switching the finished job list off, you save memory and speed up the qstat process, because qstat also fetches the finished jobs list. To turn off the finished job list function, set finished_jobs to zero in the cluster configuration. See Changing the Scheduler Configuration With QMON, and the sge_conf(5) man page. Job ValidationForced validation at job submission time can be a valuable procedure to prevent non-dispatchable jobs from forever remaining in a pending state. However, job validation can also be a time-consuming task. Job validation can be especially time-consuming in heterogeneous environments with different execution nodes and consumable resources, and in which all users have their own job profiles. In homogeneous environments with only a few different jobs, a general job validation usually can be omitted. To disable job verification, add the qsub option -w n in the cluster-wide default requests. For more information, see How to Submit Advanced Jobs With QMON and the sge_request(5) man page. Load Thresholds and Suspend ThresholdsLoad thresholds are needed if you deliberately oversubscribe your machines and you need to prevent excessive system load. Suspend thresholds are also used to prevent overloading the system. Another case where you want to prevent the overloading of a node is when the execution node is still open for interactive load. Interactive load is not under the control of the Grid Engine system. A compute farm might be more single-purpose. For example, each CPU at a compute node might be represented by only one queue slot, and no interactive load might be expected at these nodes. In such cases, you can omit load_thresholds. To disable both thresholds, set load_thresholds to none and suspend_thresholds to none. See Configuring Load and Suspend Thresholds, and the queue_conf(5) man page. Load AdjustmentsLoad adjustments are used to increase the measured load after a job is dispatched. This mechanism prevents over-subscription of machines that is caused by the delay between job dispatching and the corresponding load impact. You can switch off load adjustments if you do not need them. Load adjustments impose on the scheduler some additional work in connection with sorting hosts and load thresholds verification. To disable load adjustments, set job_load_adjustments to none and load_adjustment_decay_time to zero in the scheduler configuration. See Changing the Scheduler Configuration With QMON, and the sched_conf(5) man page. Immediate SchedulingThe default for the Grid Engine system is to start scheduling runs in a fixed schedule interval. A good feature of fixed intervals is that they limit the CPU time consumption of the qmaster and the scheduler. A bad feature is that fixed intervals choke the scheduler, artificially resulting in a limited throughput. Many compute farms have machines specifically dedicated to qmaster and the scheduler, and such setups provide no reason to choke the scheduler. See schedule_interval in sched_conf(5). You can configure immediate scheduling by using the flush_submit_sec and flush_finish_sec parameters of the scheduler configuration. See Changing the Scheduler Configuration With QMON, and the sched_conf(5) man page. If immediate scheduling is activated, the throughput of a compute farm is limited only by the power of the machine that is hosting sge_qmaster and the scheduler. Urgency Policy and Resource ReservationThe urgency policy enables you to customize job priority schemes that are resource-dependent. Such job priority schemes include the following:
The implementing of both objectives is especially valuable if you are using resource reservation. Using DTrace for Performance TuningTroubleshooting in a distributed system that spans potentially thousands of active components can challenge even the most experienced system administrator. In practice, Grid Engine administrators have no explicit mechanism for identifying and reproducing issues that lead to degraded performance in their production environments. In the Solaris TM 10 environment, you can use the DTrace utility to monitor the on-site performance of the Grid Engine master component. DTrace is a comprehensive framework for tracing dynamic events in Solaris 10 environments. For general information about DTrace, see http://www.sun.com/bigadmin/content/dtrace/ and the dtrace man page. For detailed information about using DTrace with Grid Engine software, view the $SGE_ROOT/dtrace/README_dtrace.txt file. Tuning Performance From the Command Line Through DTraceIf you can use Solaris 10 DTrace, you can use the $SGE_ROOT/dtrace/monitor.sh script to monitor a Grid Engine master and look for any bottlenecks. The monitor.sh script supports the following options:
Analyzing Bottlenecks on the Grid Engine MasterTo provide effective performance tuning, you must understand the bottlenecks of distributed systems. The $SGE_ROOT/dtrace/monitor.sh script measures throughput-relevant data of the running Grid Engine master and compiles this data into a few indices that are printed in a single-line view per interval. This view shows four main categories of information:
For more information, see the example below. Sample DTrace Output for Bottleneck AnalysisThe following monitoring output sample illustrates a case where a Grid Engine master bottleneck can be detected. The example shows the following information:
In this example, performance degraded between 17:40:32 and 17:41:05. CPU ID FUNCTION:NAME 0 1 :BEGIN Time | #wrt wrt/ms |#rep #gdi #ack| #dsp dsp/ms #sad| #snd #rcv| #in++ #in-- #out++ #out--| #lck0 #ulck0 #lck1 #ulck1 0 36909 :tick-3sec 2006 Nov 24 17:39:23 | 43 3| 0 8 4| 3 691 121| 4 4| 11 11 15 15| 68 68 289 288 0 36909 :tick-3sec 2006 Nov 24 17:39:26 | 83 16| 0 10 3| 3 699 122| 3 3| 14 13 17 17| 90 90 681 681 0 36909 :tick-3sec 2006 Nov 24 17:39:29 | 117 24| 0 9 4| 4 1092 198| 4 4| 13 13 17 17| 71 71 591 591 0 36909 :tick-3sec 2006 Nov 24 17:39:32 | 19 4| 0 9 3| 3 591 147| 3 3| 12 12 15 15| 44 43 249 249 0 36909 :tick-3sec 2006 Nov 24 17:39:35 | 144 28| 0 9 4| 4 1012 173| 4 4| 13 13 17 17| 61 62 1246 1247 0 36909 :tick-3sec 2006 Nov 24 17:39:38 | 46 5| 0 8 3| 3 705 122| 3 3| 11 11 14 14| 67 67 293 293 0 36909 :tick-3sec 2006 Nov 24 17:39:41 | 154 31| 0 9 3| 4 894 198| 3 3| 13 13 16 16| 73 72 968 969 0 36909 :tick-3sec 2006 Nov 24 17:39:44 | 46 5| 0 10 4| 4 971 162| 4 4| 13 13 17 17| 71 72 304 304 0 36909 :tick-3sec 2006 Nov 24 17:39:47 | 154 29| 0 8 3| 3 739 158| 3 3| 11 11 14 14| 67 67 990 990 0 36909 :tick-3sec 2006 Nov 24 17:39:50 | 46 5| 0 10 4| 4 815 162| 4 4| 14 14 18 18| 76 76 692 693 0 36909 :tick-3sec 2006 Nov 24 17:39:53 | 74 15| 0 8 3| 3 746 136| 3 3| 12 12 15 15| 54 53 571 571 0 36909 :tick-3sec 2006 Nov 24 17:39:56 | 116 20| 0 11 4| 4 992 184| 4 4| 14 14 18 18| 80 81 669 669 0 36909 :tick-3sec 2006 Nov 24 17:39:59 | 87 18| 0 11 4| 4 851 176| 5 4| 15 15 21 21| 77 76 670 670 0 36909 :tick-3sec 2006 Nov 24 17:40:02 | 109 20| 0 12 5| 4 930 184| 4 5| 17 17 20 20| 77 78 624 624 0 36909 :tick-3sec 2006 Nov 24 17:40:05 | 88 15| 0 9 3| 4 995 176| 3 3| 12 12 15 15| 71 71 1026 1026 0 36909 :tick-3sec 2006 Nov 24 17:40:08 | 112 20| 0 12 4| 4 927 184| 5 4| 16 16 22 22| 81 81 652 652 0 36909 :tick-3sec 2006 Nov 24 17:40:11 | 32 6| 0 7 4| 3 618 121| 3 4| 11 11 13 13| 54 53 336 336 0 36909 :tick-3sec 2006 Nov 24 17:40:14 | 145 30| 0 11 4| 4 988 199| 4 4| 15 15 19 19| 64 65 827 827 0 36909 :tick-3sec 2006 Nov 24 17:40:17 | 43 3| 0 7 3| 3 618 121| 3 3| 10 10 13 13| 64 64 286 286 0 36909 :tick-3sec 2006 Nov 24 17:40:20 | 157 31| 0 11 4| 4 977 199| 4 4| 15 15 19 19| 80 80 1406 1408 0 36909 :tick-3sec 2006 Nov 24 17:40:23 | 43 4| 0 7 3| 3 701 121| 3 3| 10 10 13 13| 64 64 285 285 0 36909 :tick-3sec 2006 Nov 24 17:40:26 | 73 18| 0 11 4| 4 948 171| 4 4| 15 15 19 19| 77 77 700 700 0 36909 :tick-3sec 2006 Nov 24 17:40:29 | 127 31| 0 10 4| 4 968 189| 4 4| 14 14 18 18| 74 74 584 584 0 36909 :tick-3sec 2006 Nov 24 17:40:32 | 10 3| 0 6 0| 1 203 41| 0 0| 58 8 62 62| 23 22 106 106 0 36909 :tick-3sec 2006 Nov 24 17:40:35 | 19 5| 0 5 0| 0 0 0| 0 0| 8 5 13 13| 30 30 200 200 0 36909 :tick-3sec 2006 Nov 24 17:40:38 | 16 5| 0 5 1| 0 0 0| 0 0| 5 6 10 10| 27 26 558 559 0 36909 :tick-3sec 2006 Nov 24 17:40:41 | 1 0| 0 4 0| 0 0 0| 0 0| 7 4 11 11| 9 9 34 34 0 36909 :tick-3sec 2006 Nov 24 17:40:44 | 0 0| 0 4 0| 0 0 0| 0 0| 7 4 11 11| 8 8 28 28 0 36909 :tick-3sec 2006 Nov 24 17:40:47 | 0 0| 0 6 0| 1 744 81| 1 1| 10 6 15 15| 14 14 33 33 0 36909 :tick-3sec 2006 Nov 24 17:40:50 | 1 0| 0 5 1| 0 0 0| 0 0| 8 6 14 14| 11 11 49 49 0 36909 :tick-3sec 2006 Nov 24 17:40:53 | 0 0| 0 4 0| 0 0 0| 0 0| 9 4 12 12| 6 7 28 28 0 36909 :tick-3sec 2006 Nov 24 17:40:56 | 0 0| 0 5 0| 0 0 0| 0 0| 8 5 13 13| 12 12 420 420 0 36909 :tick-3sec 2006 Nov 24 17:40:59 | 0 0| 0 4 0| 0 0 0| 0 0| 8 4 12 12| 9 8 30 30 0 36909 :tick-3sec 2006 Nov 24 17:41:02 | 0 0| 0 4 1| 0 0 0| 0 0| 12 5 16 16| 7 8 25 25 0 36909 :tick-3sec 2006 Nov 24 17:41:05 | 165 41| 0 48 60| 0 0 0| 1 1| 23 106 71 71| 96 97 1236 1236 0 36909 :tick-3sec 2006 Nov 24 17:41:08 | 178 28| 0 15 53| 4 965 206| 4 4| 68 68 75 75| 130 130 1336 1336 0 36909 :tick-3sec 2006 Nov 24 17:41:11 | 106 23| 0 27 35| 4 855 166| 4 4| 82 82 91 91| 115 114 1040 1040 0 36909 :tick-3sec 2006 Nov 24 17:41:14 | 198 37| 0 41 70| 4 1189 196| 4 4| 185 185 185 185| 134 135 1327 1327 0 36909 :tick-3sec 2006 Nov 24 17:41:17 | 16 5| 0 9 5| 4 940 161| 3 3| 17 17 20 20| 43 42 234 234 0 36909 :tick-3sec 2006 Nov 24 17:41:20 | 162 35| 0 13 8| 4 958 200| 4 4| 23 23 28 28| 80 81 1018 1018 0 36909 :tick-3sec 2006 Nov 24 17:41:23 | 44 6| 0 6 3| 2 544 81| 3 3| 8 8 11 11| 63 63 747 747 0 36909 :tick-3sec 2006 Nov 24 17:41:26 | 150 34| 0 13 6| 4 921 199| 4 4| 21 21 25 25| 73 72 923 923 0 36909 :tick-3sec 2006 Nov 24 17:41:29 | 43 3| 0 5 2| 2 506 81| 2 2| 7 7 9 9| 57 57 260 260 0 36909 :tick-3sec 2006 Nov 24 17:41:32 | 157 37| 0 9 3| 4 978 199| 3 3| 13 13 16 16| 73 72 970 970 0 36909 :tick-3sec 2006 Nov 24 17:41:35 | 43 3| 0 7 3| 2 512 85| 3 3| 9 9 12 12| 61 62 274 274 0 36909 :tick-3sec 2006 Nov 24 17:41:38 | 127 29| 0 8 3| 4 994 185| 3 3| 11 11 14 14| 68 68 1265 1265 0 36909 :tick-3sec 2006 Nov 24 17:41:41 | 66 11| 0 10 4| 4 973 171| 4 4| 14 14 18 18| 67 67 354 354 0 36909 :tick-3sec 2006 Nov 24 17:41:44 | 48 10| 0 8 3| 3 785 128| 3 3| 11 11 14 14| 52 51 399 399 0 36909 :tick-3sec 2006 Nov 24 17:41:47 | 142 31| 0 12 4| 4 913 192| 5 4| 17 17 23 23| 89 90 830 830 0 36909 :tick-3sec 2006 Nov 24 17:41:50 | 64 13| 0 11 5| 4 853 168| 4 5| 15 15 18 18| 75 75 542 542 |