DTrace Quick Start Guide

DTrace Quick Start Guide

The DTrace Quick Start Guide: Observing Native and Web Applications in Production [PDF] is a small book (36 pages) to get you started using DTrace to collect information about your system or application. DTrace is especially well suited for collecting data about applications running on a live system under a real workload.

This guide provides example DTrace scripts, including examples that examine an AMP stack and Web 2.0 applications. These examples are included here on this page so that you can copy them more easily. Since these are the examples that come with the book, please do not change these, but of course feel free to add your own.

Many more sample scripts are available in /usr/demo/dtrace on your Solaris or OpenSolaris system and in /opt/DTT (DTraceToolkit) on your OpenSolaris system. See the Preface of this Quick Start Guide for many references to DTrace information.

Thank you to Angelo Rajadurai, Sun ISV Engineer, for providing the content for this guide. See also Angelo's presentation at CommunityOne West 2009: "Observing Applications with DTrace" [PDF] [video]

Developing a DTrace Script

Using Aggregate Functions

In the following example, the $1 is a pid that you enter when you invoke the script. Aggregations collect the output in tables in memory and output a summary. In this example, the output table lists each function only one time, with the number of times that function was entered, instead of listing each function each time it was entered.

#!/usr/sbin/dtrace -s
pid$1:::entry
{
    @count_table[probefunc] = count() ;
}

Other popular aggregation functions include average(), min(), max(), and sum().

Calculating Time Spent in Each Function

The following example gives the function name and the total time spent in that function during the time the script ran.

#!/usr/sbin/dtrace -s
pid$1:::entry
{
    ts[probefunc] = timestamp;
}
pid$1:::return
{
    @func_time[probefunc] = sum(timestamp - ts[probefunc]);
    ts[probefunc] = 0;
}

In the following modification, functions that were already entered before the script started are not counted. The predicate (ts[probefunc]) ignores a function return if there was no enter for that function.

#!/usr/sbin/dtrace -s
pid$1:::entry
{
    ts[probefunc] = timestamp;
}
pid$1:::return
/ts[probefunc]/
{
    @func_time[probefunc] = sum(timestamp - ts[probefunc]);
    ts[probefunc] = 0;
}

The following modification handles multithreaded applications. Anything that you add to the self variable is made thread local.

#!/usr/sbin/dtrace -s
pid$1:::entry
{
    self->ts[probefunc] = timestamp;
}
pid$1:::return
/self->ts[probefunc]/
{
    @func_time[probefunc] = sum(timestamp - self->ts[probefunc]);
    self->ts[probefunc] = 0;
}

Collecting Information About an Application

The following script counts the number of times libc functions are called from a specified command:

#!/usr/sbin/dtrace -s
pid$target:libc::entry
{
    @[probefunc]=count();
}

Use the -c option to specify the target command, as in this example:

# ./myDscript.d -c "man dtrace"

Scripts for Developers

The syscall Provider

The following script displays the stack trace when a program makes a write(2) system call.

#!/usr/sbin/dtrace -s
syscall::write:entry
{
    @[ustack()]=count();
}

The sysinfo Provider

The following script counts the number of times various processes get to run in the CPU.

#!/usr/sbin/dtrace -s
sysinfo:::pswitch
{
    @[execname] = count();
}

The proc Provider

The proc:::exec-success event occurs when a new process is started successfully. The -q option suppresses output such as column headings and number of probes matched.

#!/usr/sbin/dtrace -qs
proc:::exec-success
{
    printf("%s(pid=%d) started by uid - %d\n",execname, pid, uid);
}

Scripts for System Administrators

The syscall Provider

The syscall provider is probably the most important provider to learn and use because system calls are the primary communication channel between user level applications and the kernel.

The following script identifies the process that wrote to a particular process:

#!/usr/sbin/dtrace -s
syscall::write:entry
{
    trace(pid);
    trace(execname);
}

The following script reports how much time your web server is spending at read(2). The BEGIN probe is a place to put actions that you want to perform only once at the beginning of the script.

#!/usr/sbin/dtrace -qs
BEGIN
{
    printf("size\ttime\n");
}
syscall::read:entry
/execname == "httpd"/
{
    self->start = timestamp;
}
syscall::read:return
/self->start/
{
    printf("%d\t%d\n", arg0, timestamp - self->start);
    self->start = 0;
}

The proc Provider

The following script traces all signals sent to all processes currently running on the system. The -w option permits
destructive actions.

#!/usr/sbin/dtrace -wqs
proc:::signal-send
{
    printf("%d was sent to %s by ", args[2], args[1]->pr_fname);
    system("getent passwd %d | cut -d: -f5", uid);
}

Add the conditional statement (/args[2] == SIGKILL/) to the script and send SIGKILL signals to different processes from different users.

#!/usr/sbin/dtrace -wqs
proc:::signal-send
/args[2] == SIGKILL/
{
    printf("SIGKILL was sent to %s by ", args[1]->pr_fname);
    system("getent passwd %d | cut -d: -f5", uid);
}

The sched Provider

The following script determines the amount of time the CPU spends on I/O wait and working. It also breaks down the I/O process and indicates the data that was retrieved during the I/O wait time by StarOffice.

#!/usr/sbin/dtrace -qs
sched:::on-cpu
/execname == "soffice.bin"/
{
    self->on = vtimestamp;
}
sched:::off-cpu
/self->on/
{
    @time["<on cpu>"] = sum(vtimestamp - self->on);
    self->on = 0;
}
io:::wait-start
/execname == "soffice.bin"/
{
    self->wait = timestamp;
}
io:::wait-done
/self->wait/
{
    @io[args[2]->fi_name] = sum(timestamp - self->wait);
    @time["<I/O wait>"] = sum(timestamp - self->wait);
    self->wait = 0;
}
END
{
    printf("Time breakdown (milliseconds):\n");
    normalize(@time, 1000000);
    printa(" %-50s %15@d\n", @time);
    printf("\nI/O wait breakdown (milliseconds):\n");
    normalize(@io, 1000000);
    printa(" %-50s %15@d\n", @io);
}

The io Provider

The following script traces which files are being accessed on which device and determines whether the task being performed is a read or a write. This example script is for UFS file systems, not for ZFS file systems.

#!/usr/sbin/dtrace -qs
BEGIN
{
    printf("%10s %58s %2s\n", "DEVICE", "FILE", "RW");
}
io:::start
{
    printf("%10s %58s %2s\n", args[1]->dev_statname, args[2]->fi_pathname, args[0]->b_flags & B_READ ? "R" : "W");
}

Web 2.0 Applications

Observing Multiple Layers

The following script reports how much time you are spending on the different layers in an AMP stack (Apache/MySQL/PHP). The output is %s of time spent in Apache, Java,MySQL, the browser, and the operating system. This script is like a TOP tool for AMP.

#!/usr/sbin/dtrace -qs
BEGIN
{
    total=mysqlcnt=httpcnt=phpcnt=javacnt=ffxcnt=othercnt=0;
    printf("%10s %10s %10s %10s %10s %10s\n","% MYSQL","% APACHE","% FIREFOX","% PHP","% Java","% OTHER");
}
php*:::request-startup
{
    inphp[pid,tid]=1;
}
php*:::request-shutdown
{
    inphp[pid,tid]=0;
}
profile-1001
{
    total++;
    (execname=="mysqld")?mysqlcnt++:(execname=="httpd")?(inphp[pid,tid]==1?phpcnt++:httpcnt++): \
        (execname=="java")?javacnt++:(execname=="firefox-bin")?ffxcnt++:othercnt++;
}
tick-30s
{
    printf("%10s %10s %10s %10s %10s %10s\n","% MYSQL","% APACHE","% FIREFOX","% PHP","% Java","% OTHER");
}
tick-2s
{
    printf("%10d %10d %10d %10d %10d %10d\n",mysqlcnt*100/total,httpcnt*100/total,ffxcnt*100/total, \
        phpcnt*100/total,javacnt*100/total,othercnt*100/total);
    total=mysqlcnt=httpcnt=phpcnt=ffxcnt=javacnt=othercnt=0;
}

Observing the MySQL Database

The following very simple D script to print out the SQL that are executed in a live running instance of MySQL:

#!/usr/sbin/dtrace -qs
#pragma D option strsize=1024
pid$1::*dispatch_command*:entry
{
    printf("%d::%s\n",tid,copyinstr(arg2));
}

Observing Drupal in Production

The following D script shows the total time spent in each Drupal module and the number of calls to the module:

#!/usr/sbin/dtrace -Zs
#pragma D option quiet
BEGIN
{
    start_time = timestamp;
    printf("Collecting data, press ^c to end...\n");
}
php*:::function-entry
/arg0/
{
    self->pfunc[arg0]=timestamp;
}
php*:::function-return
/arg0 && self->pfunc[arg0]/
{
    @php_times[dirname(copyinstr(arg1))]=sum(timestamp - self->pfunc[arg0]);
    @php_calls[dirname(copyinstr(arg1))]=count();
}
END
{
    printf("\n====================================\n");
    printf("---Elapsed time (usec): %d\n", (timestamp - start_time) / 1000);
    printf("====================================\n\n");
    normalize(@php_times, 1000);
    printf("%-40.40s %12.12s %12.12s\n", "DIR", "TOTAL USEC", "CALLS");
    printa("%-40.40s %@12d %@12d\n", @php_times, @php_calls);
    printf("----------------\n");
}

The following D script shows you how much time is spent on each hook of a given Drupal module.

#!/usr/sbin/dtrace -Zs
#pragma D option quiet
BEGIN
{
    start_time = timestamp;
    printf("Collecting data, press ^c to end...\n");
}
php*:::function-entry
/arg0 && dirname(copyinstr(arg1))==$$1/
{
    self->pfunc[arg0]=timestamp;
}
php*:::function-return
/arg0 && self->pfunc[arg0]/
{
    @php_times[copyinstr(arg0)]=sum(timestamp - self->pfunc[arg0]);
    @php_calls[copyinstr(arg0)]=count();
}
END
{
    printf("\n====================================\n");
    printf("---Elapsed time (usec): %d\n", (timestamp - start_time) / 1000);
    printf("====================================\n\n");
    normalize(@php_times, 1000);
    printf("%-40.40s %12.12s %12.12s\n", "HOOK", "TOTAL USEC", "CALLS");
    printa("%-40.40s %@12d %@12d\n", @php_times, @php_calls);
    printf("----------------\n");
}

The following is a sample call to the above script:

# ./php_hook_times.d /htfs/htdocs/drupal-6-10/modules/user
Enter labels to add to this page:
Please wait 
Looking for a label? Just start typing.

Sign up or Log in to add a comment or watch this page.


The individuals who post here are part of the extended Sun Microsystems community and they might not be employed or in any way formally affiliated with Sun Microsystems. The opinions expressed here are their own, are not necessarily reviewed in advance by anyone but the individual authors, and neither Sun nor any other party necessarily agrees with them.

Copyright 1994-2009 Sun Microsystems, Inc.
Powered by Atlassian Confluence
Sun Guidelines on Public Discourse Privacy Policy Terms of Use Trademarks Site Map Employment Investor Relations Contact