... !oracle_logo_s.jpg|align=right!
h1. System and Application Performance, Monitoring, and Tuning
*{_}Brought to you by Sun Microsystems and the engineers from{_}* !pt_logo_s.jpg!
{section}
{column:width=20%} {panel:title=Home Page|titleBGColor=#A3B8CB|bgColor=#F0F0F0} [Main Performance Page|http://wikis.sun.com/display/Performance] {panel} {panel:title=Key Sun technologies for running Oracle|titleBGColor=#A3B8CB|bgColor=#F0F0F0} [Chronology of Innovation|#chrono]
{panel} {excerpt-include:Home|nopanel=true} {column} {column:width=80%}
{anchor:chrono1} {panel:title=Early innovations|titleBGColor=#A3B8CB|bgColor=#F0F0F0}
|
... h5. 1993 (ISM arrives in Solaris 2.2)
Intimate shared memory (ISM) is an optimization first introduced in Solaris 2.2. It allows processes to share the translation tables involved in virtual-to-physical address translation for shared memory pages, rather than sharing only the physical memory pages themselves. Typically, non-ISM systems maintain a per-process mapping for the shared memory pages. With many processes attaching to shared memory, this creates a lot of redundant mappings to the same physical pages that the kernel must maintain. Additionally, all modern processors implement some form of translation lookaside buffer (TLB), which is (essentially) a hardware cache of address translation information. SPARC processors are no exception, and, just like an instruction or data cache, the TLB has limits on how many translations it can hold at any one time. As processes are context switched in and out, the effectiveness of the TLB is reduced. If those processes are sharing memory, and the memory mappings can be shared as well, the hardware TLB is used more effectively.
ISM was a critical technology which enabled Oracle to efficiently scale on large SMP systems as well as smaller machines.
Jim Mauro wrote an excellent [writeup|http://sunsite.uakom.sk/sunworldonline/swol-09-1997/swol-09-insidesolaris.html] for Sun World that describes the origin of this enhancement.
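The cost of the redundant per-process mappings described above can be estimated with simple arithmetic. The sketch below is a deliberately simplified model, not a description of the kernel's data structures: it assumes 8K base pages only, and the segment size and process count are hypothetical values chosen for illustration.

```python
PAGE_SIZE = 8 * 1024  # assumed base page size in bytes (8K)


def mapping_entries(segment_bytes, processes, shared_tables):
    """Count the page mappings maintained for one shared memory segment.

    shared_tables=False models per-process mappings (non-ISM);
    shared_tables=True models one set of translations shared by all
    attached processes (the ISM idea).
    """
    pages = -(-segment_bytes // PAGE_SIZE)  # ceiling division
    return pages if shared_tables else pages * processes


if __name__ == "__main__":
    segment = 2 * 1024**3  # hypothetical 2 GB shared segment
    attached = 500         # hypothetical number of attached processes
    print("per-process mappings:", mapping_entries(segment, attached, False))
    print("shared mappings:     ", mapping_entries(segment, attached, True))
```

In this model the number of mappings grows linearly with the number of attached processes unless the translation tables are shared, which is why the saving matters most on systems with many database processes.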
h5. 1997 (no-preempt while holding a latch)
The schedctl API, [schedctl_init(3C)|http://docs.sun.com/app/docs/doc/819-2243/schedctl-init-3c?a=view], was implemented for use in the Oracle latch code to reduce Solaris preemption of Oracle threads while they hold a lock. It allows the application to request that the system hold off "just a little bit longer" while it is holding a mutex. Before this, Oracle processes could go to sleep while holding critical latches, which limited scalability, especially on large SMP servers capable of high levels of concurrency.
h5. 1998 (64-bit Solaris)
Solaris 2.7 was introduced, offering the first 64-bit version of Solaris. This enabled the 64-bit version of Oracle 8i to scale beyond the 4GB memory barrier, which was necessary to make use of the 64GB of memory available on the [E10K|http://en.wikipedia.org/wiki/E10k#Enterprise_10000] (aka Starfire) servers.
h5. 1999 (Record TPC-C performance)
Sun and Oracle broke the 100k tpmC barrier with a result of 115,395.73 tpmC running Oracle 8i with Solaris 2.7 on the E10K (Starfire). This result was made possible by the performance enhancements delivered up to the time of publishing: 64-bit Solaris, ISM, and no-preempt were all critical technologies behind this record TPC-C result. {panel}
{anchor:chrono2} {panel:title=Solaris 9 introduced in 2002|titleBGColor=#A3B8CB|bgColor=#F0F0F0}
h5. Dynamic ISM (DISM)
Dynamic ISM enabled Oracle support for the dynamic SGA feature introduced in Oracle9i. This allowed a DBA to dynamically increase or decrease the size of the SGA (up to a limit defined by sga_max_size) without restarting the Oracle instance. Using the Solaris Reconfiguration Coordination Manager (RCM), it is also possible to write a script that alerts Oracle when CPUs or memory are about to be removed from the domain, so that the SGA can be dynamically scaled back and the board removed without shutting down the database.
h5. NUMA optimizations
Sun collaborated with Oracle to define and use the lgroup API, lgrp_init(3LGRP), enabling Oracle to optimize local versus remote access to the SGA (database buffer cache) on NUMA machines. These optimizations were made the default in Oracle 10g running on Sun NUMA-based machines. They aim to increase the locality of reference for the SGA and PGA, and the performance improvements can be quite drastic depending on the server. These innovations, referred to as memory placement optimization (MPO), are key to scaling on servers with high NUMA ratios. The following references detail the MPO technology as well as its use with Oracle.
* [Technical White paper on MPO|http://www.sun.com/servers/wp/docs/mpo_v7_CUSTOMER.pdf] (This paper focuses on the technology)
* [MPO aware Oracle|Oracle Database Performance^mpo_public.pdf] (This presentation focuses on MPO with Oracle)
h5. Large Page Support
As memory sizes kept increasing, it became apparent that pages larger than 8K would be necessary. In Solaris 9, multiple page size support was introduced. In Solaris 10 with CMT-based servers, large page support was expanded to pages of up to 256M. Larger pages lower the TLB overhead of mapping memory.
The following [entry|http://www.solarisinternals.com/wiki/index.php/Multiple_Page_Size_Support] on the Solaris Internals site gives more details about how to use this technology. In Oracle 10g, the memcntl(xx) API was used to allow Oracle to automatically use large pages for the PGA. This was enabled by setting the "_realfree_heap_pagesize_hint=4M" parameter, which allowed the mmapped heap to use 4M pages instead of the default 8K size.
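To make the TLB argument above concrete, the following back-of-the-envelope sketch counts how many pages (and therefore translations) are needed to map a region at the 8K, 4M, and 256M page sizes mentioned in this section. The 16 GB region and the 512-entry TLB are hypothetical figures used only for illustration; real TLB sizes vary by processor.

```python
# Page sizes named in the text, in bytes.
PAGE_SIZES = {"8K": 8 * 1024, "4M": 4 * 1024**2, "256M": 256 * 1024**2}


def pages_needed(region_bytes, page_bytes):
    """Number of pages (translations) required to map the whole region."""
    return -(-region_bytes // page_bytes)  # ceiling division


def tlb_reach(entries, page_bytes):
    """Bytes of memory covered by a TLB holding `entries` translations."""
    return entries * page_bytes


if __name__ == "__main__":
    region = 16 * 1024**3  # hypothetical 16 GB region
    tlb_entries = 512      # hypothetical TLB size
    for name, size in PAGE_SIZES.items():
        print(name, pages_needed(region, size), tlb_reach(tlb_entries, size))
```

With these assumed numbers, moving from 8K to 4M pages cuts the translations needed for the region by a factor of 512, and the memory a fixed-size TLB can cover grows by the same factor.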
h5. Polling Enhancements
To help Oracle applications scale better, enhancements were made to the poll interface. A nice [writeup|http://developers.sun.com/solaris/articles/polling_efficient.html] of the changes was posted to the Sun Developer Network by Bruce Chapman.
h5. Scheduler enhancements
The Fixed "FX" priority class was introduced to help alleviate priority inversions, which can happen with Oracle instances running in the default TS class. The FX class allows the administrator to select the priority and time-slice for different classes of processes. For instance, you might want to give the Oracle LGWR process a higher priority than a standard user process. Oracle does make use of the Real-Time class for CRS services needed for RAC. Oracle server "Shadow" processes must be adjusted by the administrator via the [priocntl(1)|http://docs.sun.com/app/docs/doc/816-0210/6m6nb7mi6?a=view] interface. Since scheduling classes are inherited, it is easy to ensure new connections get the proper priority by altering the listener process. An experienced administrator will create scripts to assign priorities to the server processes and listeners upon database start-up.
Oracle RAC makes use of the "RT" class for the LMS daemon processes which are used to transfer blocks over the interconnect via cache-fusion.
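A start-up script of the kind described above might assemble priocntl(1) command lines for the processes it wants to move into the FX class. The helper below is a minimal sketch, not a tested administration tool: the function name, the PID, and the priority values are hypothetical, it only builds the argument list without running anything, and the option letters (-s, -c, -m, -p, -t, -i) should be checked against priocntl(1) on your Solaris release.

```python
def fx_priocntl_args(pid, priority, limit=None, tqntm_ms=None):
    """Build an argv list that would place one process in the FX class.

    priority -> FX user priority (-p)
    limit    -> FX user priority limit (-m); defaults to `priority`
    tqntm_ms -> optional time quantum (-t), assumed to be in milliseconds
    """
    if limit is None:
        limit = priority
    if priority > limit:
        raise ValueError("priority must not exceed the priority limit")
    args = ["priocntl", "-s", "-c", "FX", "-m", str(limit), "-p", str(priority)]
    if tqntm_ms is not None:
        args += ["-t", str(tqntm_ms)]
    args += ["-i", "pid", str(pid)]
    return args


if __name__ == "__main__":
    # Hypothetical example: give a log writer process (PID 1234) priority 60.
    print(" ".join(fx_priocntl_args(1234, 60)))
```

Because scheduling classes are inherited, applying the same command to the listener process is usually enough for new server processes to pick up the class.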
h5. UFS direct I/O enhancements
UFS direct I/O with concurrent writes was introduced to abolish the "single-writer" lock that plagued Oracle performance for years. This enhancement allowed UFS performance to be mostly on par with "raw" performance. Richard McDougall, who championed this fix, did a nice [writeup|http://www.solarisinternals.com/si/reading/oracle_fsperf.pdf] on Oracle filesystem performance. This document and other material on [Direct I/O|http://www.solarisinternals.com/wiki/index.php/Direct_I/O] can be found on the [http://www.solarisinternals.com/] site.
{panel}
{anchor:chrono3} {panel:title=Solaris 10 arrives in 2005|titleBGColor=#A3B8CB|bgColor=#F0F0F0}
h5. Expanded MPSS support.
Multiple page size support was expanded to include 256MB pages. This out-of-the-box optimization is useful for increasing the performance of CMT servers running Oracle. Ravi explains this in his blog entry entitled ["Database scaling on Sun Fire T2000"|http://blogs.sun.com/travi/entry/database_scaling_on_sun_fire].
h5. Enhanced IPC-based parameter control.
This enhancement allowed IPC parameters to be modified through Solaris projects without rebooting the machine. This expanded control allows for even greater uptime.
{panel}
{anchor:chrono4} {panel:title=Misc innovations|titleBGColor=#A3B8CB|bgColor=#F0F0F0}
* Improving efficiency of "dbwriters" using the aio_waitn(3RT) API.
* semtimedop(2) was added specifically for Oracle.
* Sun Cluster enhancements for Oracle RAC:
** Striped reliable private interconnect for higher internode bandwidth (clprivnet)
** Advanced HA agent for RAC (QFS+SVM) (see Tim Read's ["Making Oracle Database 10g R2 and 11g RAC Even More Unbreakable"|http://www.sun.com/software/whitepapers/solaris10/solaris_cluster.pdf])
* Container upgrades (fine grained privileges, device management) to support Oracle RAC
{panel}
{column} {section}
{recently-updated-dashboard:types=page} |