Analyzing SAP Appserver I/O

Introduction / Motivation

SAP provides a rich set of monitoring tools that allow detailed system analysis. However, there are situations that go beyond the capabilities of SAP CCMS.

Background

Oracle disk I/O is fairly easy to trace with SAP transactions ST04N and ST05. However, the database is not the only possible source of I/O. In a real customer situation, high disk I/O that severely degraded overall system performance was observed outside business hours on a specific application server. SAP doesn't provide much help here. In the past, such problems had to be diagnosed online while they were occurring (Sunday morning, 03:00). Solaris 10 and DTrace are a great help, because they make it possible to monitor a system constantly and trigger an action once thresholds are exceeded. Such actions can even be external commands.

Solution

The DTrace io provider allows fairly easy monitoring of disk I/O. The profile provider adds the option of timer-driven probes.
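To make the combination of the two providers concrete, here is a minimal sketch of such a monitor. It is not the original sap_io_moni.d. It assumes that the first argument is the sampling interval in seconds and the second is the number of I/O requests per interval that counts as "too many"; the variable names (secs, epoch, ios, ioepoch, lastfile) are made up for illustration. The SAP-side part of the output (the work process table) is left out.

```d
#!/usr/sbin/dtrace -s
/*
 * Hypothetical sketch, NOT the original sap_io_moni.d.
 * Assumed arguments:
 *   $1 = sampling interval in seconds
 *   $2 = I/O requests per interval and process that trigger the action
 */
#pragma D option quiet
#pragma D option destructive	/* needed for the system() action */

dtrace:::BEGIN
{
	secs = 0;	/* seconds elapsed in the current interval */
	epoch = 1;	/* number of the current interval */
}

/* profile provider: a timer-driven probe that fires once per second */
profile:::tick-1sec
{
	secs++;
}

/* start a new counting interval every $1 seconds */
profile:::tick-1sec
/secs >= $1/
{
	secs = 0;
	epoch++;
}

/* io provider: first I/O of this process in a new interval resets its counter */
io:::start
/ioepoch[pid] != epoch/
{
	ioepoch[pid] = epoch;
	ios[pid] = 0;
}

/* count the request and remember the file it refers to */
io:::start
{
	ios[pid]++;
	lastfile[pid] = args[2]->fi_pathname;
}

/* threshold exceeded: report and run an external command */
io:::start
/ios[pid] > $2/
{
	printf("I/O limit exceeded by process pid: %d name: %s !\n",
	    pid, execname);
	printf("last file being accessed: %s\n", lastfile[pid]);
	system("ps -fZ -p %d", pid);
	exit(0);
}
```

Note that pid and execname in io:::start describe the thread that is on the CPU when the request is issued. For asynchronous or deferred writes this is not necessarily the process that originally wrote the data, so a production script may need additional probes. The global counters are also updated without any locking, which is acceptable for a threshold alarm but not for exact accounting.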
Note: The script assumes that the SAP <sid>adm user is known within the global zone (e.g. by using LDAP or NIS).
root@r3psap:/opt/dtrace # ./sap_io_moni.d 2 100
Workprocess Table (long) Sun May 24 19:02:19 2009
========================
No Ty. Pid Status Cause Start Err Sem CPU Time Program Cl User Action Table
-------------------------------------------------------------------------------------------------------------------------------
0 DIA 8528 Run yes 0 0 15 ZBBC_EXTSORT 000 VWETTER
1 DIA 8827 Wait yes 0 0 0
2 DIA 8024 Wait yes 0 0 0
3 DIA 8025 Wait yes 0 0 0
4 DIA 8026 Wait yes 0 0 0
5 DIA 8029 Wait yes 0 0 0
6 DIA 8030 Wait yes 0 0 0
7 UPD 8033 Wait yes 0 0 0
8 UPD 8038 Wait yes 0 0 0
9 ENQ 8041 Wait yes 0 0 0
10 BTC 8185 Wait yes 0 0 0
11 BTC 8049 Wait yes 0 0 0
12 BTC 8050 Wait yes 0 0 0
13 SPO 8052 Wait yes 0 0 0
14 UP2 8054 Wait yes 0 0 0
s - stop workprocess
k - kill workprocess (with core)
r - enable restart flag (only possible in wp-status "ended")
q - quit
m - menue
-->
Output from ps -fZ:
solman demadm 8528 8016 8 20:09:55 ? 2:59 dw.sapDEM_DVEBMGS14 pf=/usr/sap/DEM/SYS/profile/DEM_DVEBMGS14_solman
I/O limit exceeded by process pid: 8528 name: disp+work !
last file being accessed: /usr/sap/DEM/DVEBMGS14/work/myfile
root@r3psap:/opt/dtrace #
As a result, we can see details about the process causing the high I/O load from an OS perspective as well as from an SAP perspective. In particular we can see:
- the OS username (demadm)
- the name of the zone the process belongs to (solman)
- the name of the executable (disp+work)
- the corresponding SAP Instance (central instance of DEM: DVEBMGS14)
- the name of the last file that was accessed before the I/O limit was exceeded (/usr/sap/DEM/DVEBMGS14/work/myfile)
- the name of the SAP user running the SAP transaction / job causing the I/O (VWETTER)
- and the name of the ABAP program that is the source of the high I/O (ZBBC_EXTSORT)
Looking at the ABAP source code, it is easy to see that extensive external sort operations are the cause of the high I/O. In this particular example, the OS-level sort operation could be eliminated by using a sorted internal table type within SAP.