|
Sun Grid Engine Information Center Derived Values and Deletion RulesDerived ValuesAt dbwriter startup, and in continuous mode once an hour, derived values are calculated. You can configure which values to calculate in an XML file, which is by default in $SGE_ROOT/dbwriter/database/<database_type>/dbwriter.xml. <database_type> defines the type of database being used; currently, Oracle, PostgreSQL and MySQL are supported. The path to the configuration file is passed to dbwriter during installation and is stored in the dbwriter.conf file as the value of the parameter DBWRITER_CALCULATION_FILE. The configuration file uses an XML format, and contains rules for both derived values and deleted values (described in the next section). Derived Values FormatThe rules for derived values have the following format.
Derived Values ExamplesHere is an example of a derivation rule using the <sql> tag. The sge_queue table has a composed primary key comprised of q_qname and q_hostname. For a rule specified for the queue object_type, a query will be made for each entry in the sge_queue table, the placeholders _key_0 will be replaced by the queue name and key_1_ will be replaced by the hostname. <!--average queue utilization per hour--> <derive object="queue" interval="hour" variable="h_utilized"> <sql> SELECT DATE_TRUNC( 'hour', qv_time_start) AS time_start, DATE_TRUNC( 'hour', qv_time_start) + INTERVAL '1 hour' AS time_end, AVG(qv_dvalue * 100 / qv_dconfig) AS value FROM sge_queue_values WHERE qv_variable = 'slots' AND qv_parent = (SELECT q_id FROM sge_queue WHERE q_qname = __key_0__ AND q_hostname = __key_1__) AND qv_time_start <= '__time_end__' AND qv_time_end > '__time_start__' GROUP BY time_start </sql> </derive> Here is an example when the rule above is processed by the dbwriter. A query will be made for each entry in the sge_queue table, the placeholders _key_0 will be replaced by the queue name and key_1_ will be replaced by the hostname. In this example, the results of these queries will be inserted in the sge_queue_values table, because object="queue".
SELECT DATE_TRUNC('hour', qv_time_start) AS time_start,
DATE_TRUNC( 'hour', qv_time_start) + INTERVAL '1 hour' AS time_end,
AVG(qv_dvalue * 100 / qv_dconfig) AS value
FROM sge_queue_values
WHERE qv_variable = 'slots' AND
qv_parent = (SELECT q_id FROM sge_queue WHERE q_qname = 'all.q' AND q_hostname = 'my.hostname') AND
qv_time_start <= '2008-05-21 00:00:00.0' AND
qv_time_end > '1970-01-01 01:00:00.0'
GROUP BY time_start;
Here is an example of a derivation rule using the <auto> tag. <derive object="host" interval="day" variable="d_load"> <auto function="AVG" variable="h_load" /> </derive> Deleting Outdated RecordsAt dbwriter startup, and in continuous mode once an hour, outdated records will be deleted. You can configure how these records are calculated in an XML file, by default in $SGE_ROOT/dbwriter/database/<database_type>/dbwriter.xml. <database_type> is the type of database being used; currently, Oracle, PostgreSQL and MySQL are supported. The path to the configuration file is passed to dbwriter during installation and is stored in the dbwriter.conf file as the value of the parameter DBWRITER_CALCULATION_FILE. Deletion Rules FormatThe configuration file contains rules for both derived values and deleted values. Deletion rules are of the following format.
For certain scopes, a sub-scope can be configured. The sub-scope specifies an additional condition for deletion. A sub-scope can be configured for all *_values scopes and for the share_log scope. The following rules apply:
Here is an example of a delete tag: <?xml version="1.0" encoding="UTF-8"?> <DbWriterConfig> <!-- keep host values for 2 years --> <delete scope="host_values" time_range="year" time_amount="2"/> <!-- keep queue values one month --> <delete scope="queue_values" time_range="month" time_amount="1"> <sub_scope>slots</sub_scope> <sub_scope>state</sub_scope> </delete> </DbWriterConfig> Deletion Rules ExamplesThe following rule indicates that the four variables given in the subscope should be deleted from the table sge_host_values after 7 days. <delete scope="host_values" time_range="day" time_amount="7"> <sub_scope>np_load_avg</sub_scope> <sub_scope>cpu<sub_scope> <sub_scope>mem_free</sub_scope> <sub_scope>virtual_free</sub_scope> </delete> The following rule says to delete all variables from the table sge_host_values after two years: <delete scope="host_values" time_range="year" time_amount="2"/> The following rule says to delete all records for user fred after one month: <delete scope="share_log" time_range="month" time_amount="1"> <sub_scope>fred</sub_scope> </delete> |
Comments (1)
Nov 13
weicfa says:
Hi In our grid environment (about 500 servers , 3000 cpus , 12000 job every da...Hi
In our grid environment (about 500 servers , 3000 cpus , 12000 job every day ), The size of host_values table (mysql database for ARCO )grow up very fast( 350 million row every day ) even I change deletion rule to 1 minute as follow.
<delete scope="host_values" time_range="minute" time_amount="1">
<sub_scope>np_load_avg</sub_scope>
<sub_scope>cpu</sub_scope>
<sub_scope>h_cpu</sub_scope>
<sub_scope>h_load</sub_scope>
<sub_scope>d_load</sub_scope>
<sub_scope>h_jobs_finished</sub_scope>
<sub_scope>h_job_finished</sub_scope>
<sub_scope>mem_free</sub_scope>
<sub_scope>mem_used</sub_scope>
<sub_scope>virtual_free</sub_scope>
</delete>
I need to delete the obsolete data in host_values table manually , otherwise it will become a monster(over 100G for 2 months) and query speed will become very slow.
The speed of deletion can not catch up the speed of data insert into host_values table. May be the root cause
was the limitation of 500 line per deletion action(http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=107536).
It's really bother gridengine user if we need to use accounting function .
If there are detail descriptions document about how to disable data insert into database in host_values table or other tables. It will be very useful when user do not need those accounting data.