tcp Provider
The tcp provider provides probes for tracing the TCP protocol.
This provider is under development and is not yet available.
Probes
The tcp probes are described in the table below.
tcp Probes
| Probe | Description |
|---|---|
| accept-established | This traces a successful inbound TCP connection (SYN-ACK sent, ACK received). |
| accept-refused | This traces a failed inbound TCP connection (SYN received, RST sent - port closed). |
| connect-request | This traces outbound TCP connection requests (SYN). |
| connect-established | This traces successful outbound TCP connections (SYN-ACK received, ACK sent). |
| connect-refused | This traces failed outbound TCP connections (SYN sent, RST received - port closed). |
| state-change | Probe that fires whenever a TCP session changes its TCP state. |
| send | Probe that fires whenever TCP sends a packet. |
| receive | Probe that fires whenever TCP receives a packet. |
The send and receive probes trace packets on physical interfaces and also packets on loopback interfaces that are processed by tcp. On Solaris, loopback TCP connections can bypass the TCP layer when transferring data packets - this is a performance feature called tcp fusion; these packets can be traced by the tcpf provider described below.
Arguments
The argument types for the tcp probes are listed in the table below. The arguments are described in the following section.
tcp Probe Arguments
| Probe | args[0] | args[1] | args[2] | args[3] | args[4] |
|---|---|---|---|---|---|
| accept-established | pktinfo_t * | csinfo_t * | ipinfo_t * | tcpsinfo_t * | tcpinfo_t * |
| accept-refused | pktinfo_t * | csinfo_t * | ipinfo_t * | tcpsinfo_t * | tcpinfo_t * |
| connect-request | pktinfo_t * | csinfo_t * | ipinfo_t * | tcpsinfo_t * | tcpinfo_t * |
| connect-established | pktinfo_t * | csinfo_t * | ipinfo_t * | tcpsinfo_t * | tcpinfo_t * |
| connect-refused | pktinfo_t * | csinfo_t * | ipinfo_t * | tcpsinfo_t * | tcpinfo_t * |
| state-change | null | csinfo_t * | tcpsinfo_t * | tcpsninfo_t * | |
| send | pktinfo_t * | csinfo_t * | ipinfo_t * | tcpsinfo_t * | tcpinfo_t * |
| receive | pktinfo_t * | csinfo_t * | ipinfo_t * | tcpsinfo_t * | tcpinfo_t * |
pktinfo_t structure
The pktinfo_t structure is where packet ID info can be made available for deeper analysis if packet IDs become supported by the kernel in the future.
The pkt_addr member is currently always NULL.
typedef struct pktinfo {
    uintptr_t pkt_addr;    /* currently always NULL */
} pktinfo_t;
csinfo_t structure
The csinfo_t structure is where connection state info can be made available if connection IDs become supported by the kernel in the future.
The cs_addr member is currently always NULL.
typedef struct csinfo {
    uintptr_t cs_addr;    /* currently always NULL */
} csinfo_t;
ipinfo_t structure
The ipinfo_t structure contains common IP info for both IPv4 and IPv6.
typedef struct ipinfo {
    uint8_t ip_ver;         /* IP version (4, 6) */
    uint16_t ip_plength;    /* payload length */
    string ip_saddr;        /* source address */
    string ip_daddr;        /* destination address */
} ipinfo_t;
These values are read at the time the probe fired in TCP, and so ip_plength is the expected IP payload length - however the IP layer may add headers (such as AH and ESP) which will increase the actual payload length. To examine this, also trace packets using the ip provider.
ipinfo_t Members
| Member | Description |
|---|---|
| ip_ver | IP version number. Currently either 4 or 6. |
| ip_plength | Payload length in bytes. This is the length of the packet at the time of tracing, excluding the IP header. |
| ip_saddr | Source IP address, as a string. For IPv4 this is a dotted decimal quad; IPv6 follows RFC-1884 convention 2 with lower case hexadecimal digits. |
| ip_daddr | Destination IP address, as a string. For IPv4 this is a dotted decimal quad; IPv6 follows RFC-1884 convention 2 with lower case hexadecimal digits. |
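As a minimal sketch of how the ipinfo_t members might be used, assuming the provider behaves as described in this draft, the following script sums the IP payload bytes of TCP sends by IP version and destination address. The aggregation name and the choice of sum() are illustrative only and are not part of the provider.

```d
#!/usr/sbin/dtrace -s

/* Sketch only: assumes args[2] is an ipinfo_t *, as listed in the argument table. */
tcp:::send
{
    /* Key by IP version (4 or 6) and destination address; sum payload bytes. */
    @bytes[args[2]->ip_ver, args[2]->ip_daddr] = sum(args[2]->ip_plength);
}
```

Because ip_plength is read in TCP, the totals reflect the expected payload length and do not include any headers added later by the IP layer.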
tcpsinfo_t structure
The tcpsinfo_t structure contains tcp state info.
typedef struct tcpsinfo {
    int tcps_local;       /* is delivered locally, boolean */
    int tcps_active;      /* active open (from here), boolean */
    string tcps_state;    /* TCP state, as a string */
} tcpsinfo_t;
tcpsinfo_t Members
| Member | Description |
|---|---|
| tcps_local | Boolean: whether the packet is delivered locally. 0: not delivered locally (uses a physical network interface); 1: delivered locally (including loopback interfaces, e.g. lo0). |
| tcps_active | Boolean: whether this is an active open. 0: the TCP connection was created from a remote host; 1: the TCP connection was created from this host. |
| tcps_state | TCP state, as a string. |
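The two boolean members lend themselves to simple breakdowns. The sketch below, which assumes the argument layout given in the table above, counts received packets by delivery type, open type, and TCP state. The label strings ("local", "network", "active", "passive") are made up for readability.

```d
#!/usr/sbin/dtrace -s

/* Sketch only: assumes args[3] is a tcpsinfo_t * for the receive probe. */
tcp:::receive
{
    /* Translate the boolean members into readable aggregation keys. */
    @[args[3]->tcps_local ? "local" : "network",
        args[3]->tcps_active ? "active" : "passive",
        args[3]->tcps_state] = count();
}
```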
tcpsninfo_t structure
The tcpsninfo_t structure contains the new tcp state during a state change.
typedef struct tcpsninfo {
    string tcps_state;    /* TCP state, as a string */
} tcpsninfo_t;
tcpsninfo_t Members
| Member | Description |
|---|---|
| tcps_state | New TCP state, as a string. |
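For the state-change probe, the old state is carried in args[2] and the new state in args[3]. A minimal sketch, assuming that layout, that counts each observed transition:

```d
#!/usr/sbin/dtrace -s

/* Sketch only: assumes args[2] holds the old state and args[3] the new state. */
tcp:::state-change
{
    /* One aggregation entry per (old state, new state) pair. */
    @transitions[args[2]->tcps_state, args[3]->tcps_state] = count();
}
```

The aggregation name @transitions is illustrative. A per-event trace with timing is shown in tcpstate.d in the Examples section.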
tcpinfo_t structure
The tcpinfo_t structure is a DTrace translated version of the TCP header.
typedef struct tcpinfo {
    uint16_t tcp_sport;       /* source port */
    uint16_t tcp_dport;       /* destination port */
    uint32_t tcp_seq;         /* sequence number */
    uint32_t tcp_ack;         /* acknowledgment number */
    uint8_t tcp_offset;       /* data offset, in bytes */
    uint8_t tcp_flags;        /* flags */
    uint16_t tcp_window;      /* window size */
    uint16_t tcp_checksum;    /* checksum */
    uint16_t tcp_urgent;      /* urgent data pointer */
    tcph_t *tcp_hdr;          /* raw TCP header */
} tcpinfo_t;
tcpinfo_t Members
| Member | Description |
|---|---|
| tcp_sport | TCP source port. |
| tcp_dport | TCP destination port. |
| tcp_seq | TCP sequence number. |
| tcp_ack | TCP acknowledgment number. |
| tcp_offset | Payload data offset, in bytes (not 32-bit words). |
| tcp_flags | TCP flags. See the tcp_flags table below for available macros. |
| tcp_window | TCP window size, bytes. |
| tcp_checksum | Checksum of TCP header and payload. |
| tcp_urgent | TCP urgent data pointer, bytes. |
| tcp_hdr | Pointer to raw TCP header at time of tracing. |
tcp_flags Values
| Macro | Description |
|---|---|
| TH_FIN | No more data from sender (finish). |
| TH_SYN | Synchronize sequence numbers (connect). |
| TH_RST | Reset the connection. |
| TH_PUSH | TCP push function. |
| TH_ACK | Acknowledgment field is set. |
| TH_URG | Urgent pointer field is set. |
| TH_ECE | Explicit congestion notification echo (see RFC-3168). |
| TH_CWR | Congestion window reduction. |
See RFC-793 for a detailed explanation of the standard TCP header fields and flags.
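The flag macros can be used in predicates as well as in actions. The following sketch assumes the TH_ macros listed above are visible to D programs, and counts received segments that have SYN set without ACK (initial connection requests) by source address and destination port.

```d
#!/usr/sbin/dtrace -s

/* Sketch only: assumes the TH_* macros are available to D programs. */
tcp:::receive
/(args[4]->tcp_flags & TH_SYN) && !(args[4]->tcp_flags & TH_ACK)/
{
    /* SYN without ACK: an inbound connection request. */
    @syns[args[2]->ip_saddr, args[4]->tcp_dport] = count();
}
```

Unlike accept-established, this would also count requests that never complete the handshake.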
Examples
Some simple examples of tcp provider usage follow.
Connections by host address
This DTrace one-liner counts inbound TCP connections by source IP address:
# dtrace -n 'tcp:::accept-established { @[args[2]->ip_saddr] = count(); }'
dtrace: description 'tcp:::accept-established ' matched 1 probe
^C
127.0.0.1 1
192.168.2.88 1
fe80::214:4fff:fe8d:59aa 1
192.168.1.109 3
The output above shows there were 3 TCP connections from 192.168.1.109, a single TCP connection from the IPv6 host fe80::214:4fff:fe8d:59aa, etc.
Connections by TCP port
This DTrace one-liner counts inbound TCP connections by local TCP port:
# dtrace -n 'tcp:::accept-established { @[args[4]->tcp_dport] = count(); }'
dtrace: description 'tcp:::accept-established ' matched 1 probe
^C
40648 1
22 3
The output above shows there were 3 TCP connections for port 22 (ssh), and a single TCP connection for port 40648 (an RPC port).
Who is connecting to what
Combining the previous two examples produces a useful one-liner that quickly identifies who is connecting to what:
# dtrace -n 'tcp:::accept-established { @[args[2]->ip_saddr, args[4]->tcp_dport] = count(); }'
dtrace: description 'tcp:::accept-established ' matched 1 probe
^C
192.168.2.88 40648 1
fe80::214:4fff:fe8d:59aa 22 1
192.168.1.109 22 3
The output above shows there were 3 TCP connections from 192.168.1.109 to port 22 (ssh), etc.
Who isn't connecting to what
It may be useful when troubleshooting connection issues to see who is failing to connect to their requested ports:
# dtrace -n 'tcp:::accept-refused { @[args[2]->ip_daddr, args[4]->tcp_sport] = count(); }'
dtrace: description 'tcp:::accept-refused ' matched 1 probe
^C
192.168.1.109 23 2
Here we traced two failed attempts by host 192.168.1.109 to connect to port 23 (telnet). Note that the semantics are a little different for the accept-refused probe - since it traces the TCP RST packet, we use the destination address and source port.
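The outbound counterpart could be sketched in the same way with the connect-refused probe. This assumes, by analogy with accept-refused, that the arguments describe the RST packet, which in this case is received, so the remote host is the source address and the refused port is the source port. The draft does not state this explicitly, so treat it as an assumption.

```d
#!/usr/sbin/dtrace -s

tcp:::connect-refused
{
    /*
     * Assumption: the arguments describe the received RST, so the
     * source address and source port identify the remote end.
     */
    @[args[2]->ip_saddr, args[4]->tcp_sport] = count();
}
```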
Packets by host address
This DTrace one-liner counts TCP received packets by host address:
# dtrace -n 'tcp:::receive { @[args[2]->ip_saddr] = count(); }'
dtrace: description 'tcp:::receive ' matched 5 probes
^C
127.0.0.1 7
fe80::214:4fff:fe8d:59aa 14
192.168.2.30 43
192.168.1.109 44
192.168.2.88 3722
The output above shows that 7 TCP packets were received from 127.0.0.1, 14 TCP packets from the IPv6 host fe80::214:4fff:fe8d:59aa, etc.
Packets by local port
This DTrace one-liner counts TCP received packets by the local TCP port:
# dtrace -n 'tcp:::receive { @[args[4]->tcp_dport] = count(); }'
dtrace: description 'tcp:::receive ' matched 5 probes
^C
42303 3
42634 3
2049 27
40648 36
22 162
The output above shows that 162 packets were received for port 22 (ssh), 36 packets were received for port 40648 (an RPC port), 27 packets for 2049 (NFS), and a few packets to high numbered client ports.
Sent size distribution
This DTrace one-liner prints distribution plots of IP payload size by destination, for TCP sends:
# dtrace -n 'tcp:::send { @[args[2]->ip_daddr] = quantize(args[2]->ip_plength); }'
dtrace: description 'tcp:::send ' matched 3 probes
^C
192.168.1.109
value ------------- Distribution ------------- count
32 | 0
64 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 14
128 |@@@ 1
256 | 0
192.168.2.30
value ------------- Distribution ------------- count
16 | 0
32 |@@@@@@@@@@@@@@@@@@@@ 7
64 |@@@@@@@@@ 3
128 |@@@ 1
256 |@@@@@@ 2
512 |@@@ 1
1024 | 0
tcpstate.d
This DTrace script demonstrates the capability to trace TCP state changes:
#!/usr/sbin/dtrace -s

#pragma D option quiet
#pragma D option switchrate=10hz

dtrace:::BEGIN
{
    printf(" %3s %12s %-20s %-20s\n", "CPU", "DELTA(us)", "OLD", "NEW");
    last = timestamp;
}

tcp:::state-change
{
    /* Microseconds since the previous state change (or since BEGIN). */
    this->elapsed = (timestamp - last) / 1000;
    printf(" %3d %12d %-20s -> %-20s\n", cpu, this->elapsed,
        args[2]->tcps_state, args[3]->tcps_state);
    last = timestamp;
}
This script was run on a system for a couple of minutes:
# ./tcpstate.d
 CPU    DELTA(us) OLD                  NEW
   3       938491 state-syn-received   -> state-syn-received
   3           98 state-syn-received   -> state-established
   3     14052789 state-established    -> state-close-wait
   3           67 state-close-wait     -> state-last-ack
   3           56 state-last-ack       -> state-bound
   2         7783 state-bound          -> state-closed
   2     68797522 state-idle           -> state-bound
   2          172 state-bound          -> state-syn-sent
   3          210 state-syn-sent       -> state-established
   2         5364 state-established    -> state-fin-wait1
   3           79 state-fin-wait1      -> state-fin-wait2
   3           65 state-fin-wait2      -> state-time-wait
In the above example output, an inbound connection is first traced, which lasted for over 14 seconds. About 68 seconds after it was closed, an outbound connection is traced - which only lasted around 5 milliseconds.
The fields printed are:
| field | description |
|---|---|
| CPU | CPU id for the event |
| DELTA(us) | time since previous event, microseconds |
| OLD | old TCP state |
| NEW | new TCP state |
tcpio.d
The following DTrace script traces TCP packets and prints various details:
#!/usr/sbin/dtrace -s

#pragma D option quiet
#pragma D option switchrate=10hz

dtrace:::BEGIN
{
    printf(" %3s %15s:%-5s %15s:%-5s %6s %s\n", "CPU",
        "LADDR", "LPORT", "RADDR", "RPORT", "BYTES", "FLAGS");
}

tcp:::send
{
    /* TCP payload bytes: IP payload length minus TCP header length. */
    this->length = args[2]->ip_plength - args[4]->tcp_offset;
    printf(" %3d %16s:%-5d -> %16s:%-5d %6d (", cpu,
        args[2]->ip_saddr, args[4]->tcp_sport,
        args[2]->ip_daddr, args[4]->tcp_dport, this->length);
}

tcp:::receive
{
    /* On receive the local end is the destination, so print it first. */
    this->length = args[2]->ip_plength - args[4]->tcp_offset;
    printf(" %3d %16s:%-5d <- %16s:%-5d %6d (", cpu,
        args[2]->ip_daddr, args[4]->tcp_dport,
        args[2]->ip_saddr, args[4]->tcp_sport, this->length);
}

tcp:::send,
tcp:::receive
{
    /* Print each set flag followed by a separator. */
    printf("%s", args[4]->tcp_flags & TH_FIN ? "FIN|" : "");
    printf("%s", args[4]->tcp_flags & TH_SYN ? "SYN|" : "");
    printf("%s", args[4]->tcp_flags & TH_RST ? "RST|" : "");
    printf("%s", args[4]->tcp_flags & TH_PUSH ? "PUSH|" : "");
    printf("%s", args[4]->tcp_flags & TH_ACK ? "ACK|" : "");
    printf("%s", args[4]->tcp_flags & TH_URG ? "URG|" : "");
    printf("%s", args[4]->tcp_flags & TH_ECE ? "ECE|" : "");
    printf("%s", args[4]->tcp_flags & TH_CWR ? "CWR|" : "");
    printf("%s", args[4]->tcp_flags == 0 ? "null " : "");
    /* The backspace erases the trailing separator before closing. */
    printf("\b)\n");
}
This example output has captured a TCP handshake:
# ./tcpio.d
 CPU           LADDR:LPORT           RADDR:RPORT  BYTES FLAGS
   1     192.168.2.80:22    ->    192.168.1.109:60337    464 (PUSH|ACK)
   1     192.168.2.80:22    ->    192.168.1.109:60337     48 (PUSH|ACK)
   2     192.168.2.80:22    ->    192.168.1.109:60337     20 (PUSH|ACK)
   3     192.168.2.80:22    <-    192.168.1.109:60337      0 (SYN)
   3     192.168.2.80:22    ->    192.168.1.109:60337      0 (SYN|ACK)
   3     192.168.2.80:22    <-    192.168.1.109:60337      0 (ACK)
   3     192.168.2.80:22    <-    192.168.1.109:60337      0 (ACK)
   3     192.168.2.80:22    <-    192.168.1.109:60337     20 (PUSH|ACK)
   3     192.168.2.80:22    ->    192.168.1.109:60337      0 (ACK)
   3     192.168.2.80:22    <-    192.168.1.109:60337      0 (ACK)
   3     192.168.2.80:22    <-    192.168.1.109:60337    376 (PUSH|ACK)
   3     192.168.2.80:22    ->    192.168.1.109:60337      0 (ACK)
   3     192.168.2.80:22    <-    192.168.1.109:60337     24 (PUSH|ACK)
   2     192.168.2.80:22    ->    192.168.1.109:60337    736 (PUSH|ACK)
   3     192.168.2.80:22    <-    192.168.1.109:60337      0 (ACK)
The fields printed are:
| field | description |
|---|---|
| CPU | CPU id that event occurred on |
| LADDR | local IP address |
| LPORT | local TCP port |
| RADDR | remote IP address |
| RPORT | remote TCP port |
| BYTES | TCP payload bytes |
| FLAGS | TCP flags |
Note: On multi-CPU servers the output may be shuffled slightly because of DTrace per-CPU buffering, so events such as the TCP handshake can be printed out of order. Watch for changes in the CPU column, or add a timestamp column to this script and sort the output afterwards.
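One way to add such a timestamp column is sketched below. It is a cut-down variant of tcpio.d, not a replacement for it: the flags and byte counts are omitted for brevity, and the timestamp is printed first so that a numeric sort can restore event order.

```d
#!/usr/sbin/dtrace -s

#pragma D option quiet

tcp:::send
{
    /* Leading nanosecond timestamp allows sorting after capture. */
    printf("%d %3d %s:%d -> %s:%d\n", timestamp, cpu,
        args[2]->ip_saddr, args[4]->tcp_sport,
        args[2]->ip_daddr, args[4]->tcp_dport);
}

tcp:::receive
{
    printf("%d %3d %s:%d <- %s:%d\n", timestamp, cpu,
        args[2]->ip_daddr, args[4]->tcp_dport,
        args[2]->ip_saddr, args[4]->tcp_sport);
}
```

The captured output could then be ordered with a numeric sort on the first column, for example with sort -n.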
tcp Stability
The tcp provider uses DTrace's stability mechanism to describe its stabilities, as shown in the following table. For more information about the stability mechanism, see Chapter 39, Stability.
| Element | Name stability | Data stability | Dependency class |
|---|---|---|---|
| Provider | Evolving | Evolving | ISA |
| Module | Private | Private | Unknown |
| Function | Private | Private | Unknown |
| Name | Evolving | Evolving | ISA |
| Arguments | Evolving | Evolving | ISA |
tcpf Provider (tcp fusion)
On Solaris, loopback TCP connections can bypass the TCP layer when transferring data - this is a performance feature called tcp fusion. The tcpf provider can trace these events, which are similar to TCP events and the tcp provider, but not identical. On Solaris it is necessary to use this provider when examining all local or loopback traffic.
tcpf Probes
The tcpf probes are described in the table below.
tcpf Probes
| Probe | Description |
|---|---|
| send | Probe that fires whenever TCP fusion sends data. |
| receive | Probe that fires whenever TCP fusion receives data. |
Arguments
The argument types for the tcpf probes are listed in the table below. Most of these arguments are identical to those in the tcp provider and are described above, with the exception of tcpfinfo_t.
tcpf Probe Arguments
| Probe | args[0] | args[1] | args[2] | args[3] | args[4] |
|---|---|---|---|---|---|
| send | pktinfo_t * | csinfo_t * | ipinfo_t * | tcpsinfo_t * | tcpfinfo_t * |
| receive | pktinfo_t * | csinfo_t * | ipinfo_t * | tcpsinfo_t * | tcpfinfo_t * |
Note that the ipinfo_t struct has been provided for consistency with other network providers, however tcp fusion does not use the IP layer, and does not construct IP or TCP headers. Since the ip_plength member contains the payload length, in the case of tcp fusion this is just the data bytes - no TCP header length is included.
tcpfinfo_t structure
The tcpfinfo_t structure contains available tcp fusion info, in lieu of a tcpinfo_t (which tcp fusion avoids creating).
typedef struct tcpfinfo {
    uint16_t tcpf_sport;    /* source port */
    uint16_t tcpf_dport;    /* destination port */
} tcpfinfo_t;
tcpfinfo_t Members
| Member | Description |
|---|---|
| tcpf_sport | TCP source port. |
| tcpf_dport | TCP destination port. |
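As a minimal sketch of tcpf usage, assuming the argument layout given in the tcpf argument table, the following script sums the data bytes sent over fused loopback connections by destination port. With tcp fusion, ip_plength contains only data bytes, so no TCP header length needs to be subtracted.

```d
#!/usr/sbin/dtrace -s

/* Sketch only: assumes args[4] is a tcpfinfo_t * and args[2] an ipinfo_t *. */
tcpf:::send
{
    /* ip_plength is data bytes only for tcp fusion. */
    @bytes[args[4]->tcpf_dport] = sum(args[2]->ip_plength);
}
```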
tcpf Stability
| Element | Name stability | Data stability | Dependency class |
|---|---|---|---|
| Provider | Unstable | Unstable | ISA |
| Module | Private | Private | Unknown |
| Function | Private | Private | Unknown |
| Name | Unstable | Unstable | ISA |
| Arguments | Unstable | Unstable | ISA |