View Source

{anchor:top}
h1. {anchor:CHP-POST} Postmortem Tracing
This chapter describes the DTrace facilities for _postmortem_ extraction and processing of the in-kernel data of DTrace consumers. In the event of a system crash, the information that has been recorded with DTrace may provide the crucial clues to root-cause the system failure. DTrace data may be extracted and processed from the system crash dump to aid you in understanding fatal system failures. By coupling these postmortem capabilities of DTrace with its ring buffering buffer policy (see [Chapter 11, Buffers and Buffering|Buffers and Buffering]), DTrace can be used as an operating system analog to the _black box_ flight data recorder present on commercial aircraft.\\
\\To extract DTrace data from a specific crash dump, you should begin by running the Solaris Modular Debugger, [{{mdb}}(1)|http://docs.sun.com/doc/819-2239/mdb-1?a=view], on the crash dump of interest. The MDB module containing the DTrace functionality will be loaded automatically. To learn more about MDB, refer to the [??Solaris Modular Debugger Guide??|http://docs.sun.com/doc/817-2543].

[Top|#top]
h2. {anchor:CHP-POST-1} Displaying DTrace Consumers
To extract DTrace data from a DTrace consumer, you must first determine the DTrace consumer of interest by running the {{::dtrace_state}} MDB dcmd:
{noformat}
> ::dtrace_state
ADDR MINOR PROC NAME FILE
ccaba400 2 - <anonymous> -
ccab9d80 3 d1d6d7e0 intrstat cda37078
cbfb56c0 4 d71377f0 dtrace ceb51bd0
ccabb100 5 d713b0c0 lockstat ceb51b60
d7ac97c0 6 d713b7e8 dtrace ceb51ab8
{noformat}
This command displays a table of DTrace state structures. Each row of the table consists of the following information:
* The address of the state structure
* The minor number associated with the [{{dtrace}}(7D)|http://docs.sun.com/doc/819-2254/dtrace-7d?a=view] device
* The address of the process structure that corresponds to the DTrace consumer
* The name of the DTrace consumer (or {{&lt;anonymous>}} for anonymous consumers)
* The name of the file structure that corresponds to the open [{{dtrace}}(7D)|http://docs.sun.com/doc/819-2254/dtrace-7d?a=view] device

To obtain further information about a specific DTrace consumer, specify the address of its process structure to the {{::ps}} dcmd:
{noformat}
> d71377f0::ps
S PID PPID PGID SID UID FLAGS ADDR NAME
R 100647 100642 100647 100638 0 0x00004008 d71377f0 dtrace
{noformat}

[Top|#top]
h2. {anchor:CHP-POST-2} Displaying Trace Data
Once you determine the consumer of interest, you can retrieve the data corresponding to any unconsumed buffers by specifying the address of the state structure to the {{::dtrace}} dcmd. The following example shows the output of the {{::dtrace}} dcmd on an anonymous enabling of {{syscall:::entry}} with the action {{trace(execname)}}:
{noformat}
> ::dtrace_state
ADDR MINOR PROC NAME FILE
cbfb7a40 2 - <anonymous> -

> cbfb7a40::dtrace
CPU ID FUNCTION:NAME
0 344 resolvepath:entry init
0 16 close:entry init
0 202 xstat:entry init
0 202 xstat:entry init
0 14 open:entry init
0 206 fxstat:entry init
0 186 mmap:entry init
0 186 mmap:entry init
0 186 mmap:entry init
0 190 munmap:entry init
0 344 resolvepath:entry init
0 216 memcntl:entry init
0 16 close:entry init
0 202 xstat:entry init
0 14 open:entry init
0 206 fxstat:entry init
0 186 mmap:entry init
0 186 mmap:entry init
0 186 mmap:entry init
0 190 munmap:entry init
...
{noformat}
The {{::dtrace}} dcmd handles errors in the same way that [{{dtrace}}(1M)|http://docs.sun.com/doc/819-2240/dtrace-1m?a=view] does: if drops, errors, speculative drops, or the like were encountered while the consumer was executing, {{::dtrace}} will emit a message corresponding to the [{{dtrace}}(1M)|http://docs.sun.com/doc/819-2240/dtrace-1m?a=view]message.\\
\\The order of events as displayed by {{::dtrace}} is always oldest to youngest within a given CPU. The CPU buffers themselves are displayed in numerical order. If an ordering is required for events on different CPUs, trace the {{timestamp}} variable.\\
\\You can display only the data for a specific CPU by specifying the \-c option to {{::dtrace}}:
{noformat}
> cbfb7a40::dtrace -c 1
CPU ID FUNCTION:NAME
1 14 open:entry init
1 206 fxstat:entry init
1 186 mmap:entry init
1 344 resolvepath:entry init
1 16 close:entry init
1 202 xstat:entry init
1 202 xstat:entry init
1 14 open:entry init
1 206 fxstat:entry init
1 186 mmap:entry init
...
{noformat}
Notice that {{::dtrace}} only processes _in-kernel_ DTrace data. Data that has been consumed from the kernel and processed (through [{{dtrace}}(1M)|http://docs.sun.com/doc/819-2240/dtrace-1m?a=view] or other means) will not be available to be processed with {{::dtrace}}. To assure that the most amount of data possible is available at the time of failure, use a ring buffer buffering policy. See [Chapter&nbsp;11, Buffers and Buffering|Buffers and Buffering] for more information on buffer policies.\\
\\The following example creates a very small (16K) ring buffer and records all system calls and the process making them:
{noformat}
# dtrace -P syscall'{trace(curpsinfo->pr_psargs)}' -b 16k -x bufpolicy=ring
dtrace: description 'syscall:::entry' matched 214 probes
{noformat}
Looking at a crash dump taken when the above command was running results in output similar to the following example:
{noformat}
> ::dtrace_state
ADDR MINOR PROC NAME FILE
cdccd400 3 d15e80a0 dtrace ced065f0

> cdccd400::dtrace
CPU ID FUNCTION:NAME
0 139 getmsg:return mibiisa -r -p 25216
0 138 getmsg:entry mibiisa -r -p 25216
0 139 getmsg:return mibiisa -r -p 25216
0 138 getmsg:entry mibiisa -r -p 25216
0 139 getmsg:return mibiisa -r -p 25216
0 138 getmsg:entry mibiisa -r -p 25216
0 139 getmsg:return mibiisa -r -p 25216
0 138 getmsg:entry mibiisa -r -p 25216
0 139 getmsg:return mibiisa -r -p 25216
0 138 getmsg:entry mibiisa -r -p 25216
0 17 close:return mibiisa -r -p 25216
...
0 96 ioctl:entry mibiisa -r -p 25216
0 97 ioctl:return mibiisa -r -p 25216
0 96 ioctl:entry mibiisa -r -p 25216
0 97 ioctl:return mibiisa -r -p 25216
0 96 ioctl:entry mibiisa -r -p 25216
0 97 ioctl:return mibiisa -r -p 25216
0 96 ioctl:entry mibiisa -r -p 25216
0 97 ioctl:return mibiisa -r -p 25216
0 16 close:entry mibiisa -r -p 25216
0 17 close:return mibiisa -r -p 25216
0 124 lwp_park:entry mibiisa -r -p 25216
1 68 access:entry mdb -kw
1 69 access:return mdb -kw
1 202 xstat:entry mdb -kw
1 203 xstat:return mdb -kw
1 14 open:entry mdb -kw
1 15 open:return mdb -kw
1 206 fxstat:entry mdb -kw
1 207 fxstat:return mdb -kw
1 186 mmap:entry mdb -kw
...
1 13 write:return mdb -kw
1 10 read:entry mdb -kw
1 11 read:return mdb -kw
1 12 write:entry mdb -kw
1 13 write:return mdb -kw
1 96 ioctl:entry mdb -kw
1 97 ioctl:return mdb -kw
1 364 pread64:entry mdb -kw
1 365 pread64:return mdb -kw
1 366 pwrite64:entry mdb -kw
1 367 pwrite64:return mdb -kw
1 364 pread64:entry mdb -kw
1 365 pread64:return mdb -kw
1 38 brk:entry mdb -kw
1 39 brk:return mdb -kw
>
{noformat}
Note that CPU 1's youngest records include a series of [{{write}}(2)|http://docs.sun.com/doc/819-2241/write-2?a=view] system calls by an {{mdb -kw}} process. This result is likely related to the reason for the system failure because a user can modify running kernel data or text with [{{mdb}}(1)|http://docs.sun.com/doc/819-2239/mdb-1?a=view] when run with the \-k and \-w options. In this case, the DTrace data provides at least an interesting avenue of investigation, if not the root cause of the failure.
{excerpt:hidden=true}Converted by tech dogg's sgml2wiki on Tue 20 Nov 2007 at 9:26:01 PM{excerpt}

The individuals who post here are part of the extended Sun Microsystems community and they might not be employed or in any way formally affiliated with Sun Microsystems. The opinions expressed here are their own, are not necessarily reviewed in advance by anyone but the individual authors, and neither Sun nor any other party necessarily agrees with them.

Copyright 1994-2009 Sun Microsystems, Inc.
Powered by Atlassian Confluence
Sun Guidelines on Public Discourse Privacy Policy Terms of Use Trademarks Site Map Employment Investor Relations Contact