Key Sun technologies for running Oracle

System and Application Performance, Monitoring, and Tuning

Brought to you by Sun Microsystems and the engineers from

Early innovations

With the pending acquisition by Oracle fresh in our minds, a recent email thread started to recount the technology that Sun put into making the Oracle database run well on our servers. Please use the Wiki page to help us ensure that all of the important contributions get recognized.

This material has been arranged into a presentation here.

1993 (ISM arrives in Solaris 2.2)

Intimate shared memory (ISM) is an optimization first introduced in Solaris 2.2. It allows processes to share the translation tables involved in the virtual-to-physical address translation for shared memory pages, as opposed to just sharing the actual physical memory pages. Typically, non-ISM systems maintain a per-process mapping for the shared memory pages. With many processes attaching to shared memory, this creates a lot of redundant mappings to the same physical pages that the kernel must maintain. Additionally, all modern processors implement some form of translation lookaside buffer (TLB), which is (essentially) a hardware cache of address translation information. SPARC processors are no exception, and, just like the instruction and data caches, the TLB has limits as to how many translations it can hold at any one time. As processes are context switched in and out, the effectiveness of the TLB is reduced. If those processes are sharing memory, and we can share the memory mappings as well, we can make more effective use of the hardware TLB.
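To give a feel for the redundancy described above, here is a minimal back-of-the-envelope sketch. It only counts page translations with and without shared translation tables; the segment size and process count are illustrative assumptions, not figures from any real system, and the function names are hypothetical.

```python
# Rough count of page translations needed to map one shared memory segment,
# with and without shared translation tables (the idea behind ISM).
# The segment size and process count below are illustrative assumptions.

KB = 1024
GB = 1024 ** 3


def mappings_per_process(segment_bytes, page_bytes):
    """Number of page translations one process needs to map the segment."""
    return -(-segment_bytes // page_bytes)  # ceiling division


def total_mappings(segment_bytes, page_bytes, processes, shared_tables):
    """Total translations maintained across all attached processes."""
    per_process = mappings_per_process(segment_bytes, page_bytes)
    # With shared translation tables, one set of mappings serves every process.
    return per_process if shared_tables else per_process * processes


if __name__ == "__main__":
    segment = 4 * GB   # hypothetical shared memory segment size
    page = 8 * KB      # 8K base page size
    procs = 1000       # hypothetical number of attached processes

    private = total_mappings(segment, page, procs, shared_tables=False)
    shared = total_mappings(segment, page, procs, shared_tables=True)
    print("per-process mappings total:", private)
    print("shared mappings total:     ", shared)
    print("reduction factor:          ", private // shared)
```

Under these assumptions the number of translations the kernel has to maintain drops by a factor equal to the number of attached processes.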

ISM was a critical technology which enabled Oracle to efficiently scale on large SMP systems as well as smaller machines.

Jim Mauro did an excellent writeup for Sun World that describes the origin of this enhancement.

1997 (no-preempt while holding a latch)

The schedctl API, schedctl_init(3C), was implemented for use in the Oracle latch code to reduce Solaris preemption of Oracle threads while they hold a lock. It allows the application to request that the system hold off "just a little bit longer" while it is holding a mutex. Before this time, Oracle processes could be taken off the CPU while holding critical latches, which really limited scalability, especially on large SMP servers that were able to achieve high levels of concurrency.

1998 (64-bit Solaris)

Solaris 2.7 was introduced as the first 64-bit version of Solaris. This enabled the 64-bit version of Oracle 8i to scale beyond the 4GB memory barrier, which was necessary to make use of the 64GB of memory available on the E10K (aka Starfire) servers.

1999 (Record TPC-C performance)

Sun and Oracle broke the 100k tpmC barrier with a result of 115,395.73 tpmC running Oracle 8i with Solaris 2.7 on the E10K (Starfire). This result was made possible by the performance enhancements delivered up until the time of publishing. 64-bit Solaris, ISM, and no-preempt were all critical technologies behind this record TPC-C result.

Solaris 9 introduced in 2002

Dynamic ISM (DISM)

Dynamic ISM enabled Oracle support for the dynamic SGA feature introduced in Oracle9i. This allowed a DBA to dynamically increase or decrease the size of the SGA (up to a limit defined by sga_max_size) without needing to restart the Oracle instance. Using the Solaris Reconfiguration Coordination Manager (RCM), it is also possible to write a script that alerts Oracle when CPUs or memory are about to be removed from the domain, so that the SGA can be dynamically scaled back to allow the board to be removed without shutting down the database.

NUMA optimizations

Sun collaborated with Oracle to define and use the lgroup API, lgrp_init(3LGRP), and to enable Oracle to optimize local vs remote access to the SGA (database buffer cache) on NUMA machines. These optimizations were made the default in Oracle 10g running on Sun NUMA-based machines. They aim to increase the locality of reference for the SGA and PGA, and the performance improvements can be quite drastic depending on the server. These innovations, referred to as memory placement optimization (MPO), are key to scaling on servers with high NUMA ratios. The following references detail the MPO technology as well as its use with Oracle.

Large Page Support

With ever-increasing memory sizes, it became apparent that pages larger than 8K would be necessary. In Solaris 9, multiple page size support was introduced. In Solaris 10 with CMT-based servers, large page support was expanded to include pages of up to 256M. Larger pages lower the TLB overhead of mapping memory.

The following entry in the Solaris internals site shows more details about how to use this technology. In Oracle 10g, the memcntl(2) API was used to allow Oracle to automatically use large pages for the PGA. This was enabled by setting the "_realfree_heap_pagesize_hint=4M" parameter, which allowed the mmapped heap to use 4M pages instead of the default 8K size.
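The effect of page size on TLB overhead can be illustrated with some simple arithmetic. The sketch below computes how many pages are needed to map a region, and how much memory a TLB can cover ("TLB reach"), for the 8K, 4M and 256M page sizes mentioned above. The region size and the TLB entry count are hypothetical values chosen for illustration, not the specifications of any particular processor.

```python
# Pages needed to map a region, and TLB reach, for several page sizes.
# The region size and TLB entry count used in the demo are hypothetical.

KB = 1024
MB = 1024 ** 2
GB = 1024 ** 3

# Page sizes mentioned in the text.
PAGE_SIZES = {"8K": 8 * KB, "4M": 4 * MB, "256M": 256 * MB}


def pages_needed(region_bytes, page_bytes):
    """Number of pages (and so translations) needed to map the region."""
    return -(-region_bytes // page_bytes)  # ceiling division


def tlb_reach(entries, page_bytes):
    """Amount of memory a TLB with this many entries can map at once."""
    return entries * page_bytes


if __name__ == "__main__":
    region = 16 * GB   # hypothetical memory region to map
    entries = 64       # hypothetical number of TLB entries

    for name, size in PAGE_SIZES.items():
        print(
            name,
            "pages needed:", pages_needed(region, size),
            "TLB reach (MB):", tlb_reach(entries, size) / MB,
        )
```

With these assumed numbers, moving from 8K to 4M pages cuts the number of translations for the same region by a factor of 512, which is why fewer TLB misses are expected with larger pages.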

Polling Enhancements

To better scale Oracle applications, enhancements were made to the poll interface. A nice writeup of the changes was posted to the Sun Developer Network by Bruce Chapman.

Scheduler enhancements

The Fixed "FX" priority class was introduced to help alleviate priority inversions, which can happen with Oracle instances running in the default TS class. The FX class allows the administrator to select the priority and time-slice for different classes of processes. For instance, you might want to give the Oracle LGWR process a higher priority than a standard user process. Oracle does make use of the Real-Time class for the CRS services needed for RAC. The priorities of Oracle server "shadow" processes must be adjusted by the administrator via the priocntl(1) interface. Since scheduling classes are inherited, it is easy to ensure that new connections get the proper priority by altering the listener process. An experienced administrator will create scripts to assign priorities to the server processes and listeners upon database start-up.
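As a rough illustration of such a start-up script, the sketch below builds (but does not run) priocntl(1) command lines that would move processes into the FX class. The PIDs and priority values are hypothetical, the 0 to 60 priority range is assumed to be the default FX configuration, and the option letters reflect my reading of the priocntl(1) manual page, so check them against your Solaris release before use.

```python
# Dry-run sketch: build priocntl(1) command lines to place processes in the
# FX scheduling class. Nothing is executed; the commands are only printed.
# PIDs and priorities are hypothetical; the 0..60 range is an assumption
# about the default FX configuration.

FX_MIN_PRI = 0
FX_MAX_PRI = 60


def fx_command(pid, priority):
    """Return the argv for setting `pid` to FX class at `priority`."""
    if not FX_MIN_PRI <= priority <= FX_MAX_PRI:
        raise ValueError("FX priority out of range: %d" % priority)
    return [
        "priocntl", "-s",        # set scheduling parameters
        "-c", "FX",              # fixed-priority class
        "-m", str(priority),     # user priority limit
        "-p", str(priority),     # user priority
        "-i", "pid", str(pid),   # apply to this process ID
    ]


if __name__ == "__main__":
    # Hypothetical plan: process name -> (pid, FX priority).
    plan = {
        "lgwr": (4321, 60),       # log writer gets the highest priority
        "listener": (4400, 50),   # new connections inherit the listener's class
    }
    for name, (pid, pri) in plan.items():
        print(name + ":", " ".join(fx_command(pid, pri)))
```

Because scheduling classes are inherited, only the listener and the long-running background processes need to be listed; a real script would look up the PIDs at database start-up rather than hard-coding them.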

Oracle RAC makes use of the "RT" class for the LMS daemon processes which are used to transfer blocks over the interconnect via cache-fusion.

UFS direct I/O enhancements

UFS direct I/O with concurrent writes was introduced to abolish the "single-writer" lock that plagued Oracle performance for years. This enhancement allowed UFS performance to be mostly on par with "raw" performance. Richard McDougall, who championed this fix, did a nice writeup on Oracle filesystem performance. This document and other material on Direct I/O can be found on the http://www.solarisinternals.com/ site.

Solaris 10 arrives in 2005

Expanded MPSS support

Multiple page size support was expanded to include 256MB pages. This out-of-the-box optimization is useful for increasing the performance of CMT servers running Oracle. Ravi explains this in his blog entry entitled "Database scaling on Sun Fire T2000".

Enhanced IPC-based parameter control

This enhancement allows IPC parameters to be modified through projects without rebooting the machine. This expanded control allows for even greater up-time.

Misc innovations
  • Improving the efficiency of "dbwriters" using the aio_waitn(3RT) API.
  • semtimedop(2) was added specifically for Oracle.
  • Sun Cluster enhancements for Oracle RAC:
  • Container upgrades (fine grained privileges, device management) to support Oracle RAC
 
© 2010, Oracle Corporation and/or its affiliates