... The default page size in Directory Server is 8K (8192 bytes). The name of the attribute is nsslapd-db-page-size, in dn: cn=config,cn=ldbm database,cn=plugins,cn=config.
h1. Background
When a directory is imported, it creates a set of .db3 files in its database directory. An example listing of such a directory is:
||size||file||
| 16,384 | userRoot_aci.db3 |
| 5,742,592 | userRoot_ancestorid.db3 |
| 4,547,936,256 | userRoot_cn.db3 |
| 5,316,575,232 | userRoot_entrydn.db3 |
| 209,932,910,592 | userRoot_id2entry.db3 |
| 4,393,762,816 | userRoot_nsUniqueId.db3 |
| 28,114,944 | userRoot_nscpEntryDN.db3 |
| 16,384 | userRoot_nsdsReplConflict.db3 |
| 16,384 | userRoot_numsubordinates.db3 |
| 58,105,856 | userRoot_objectclass.db3 |
| 2,867,200 | userRoot_parentid.db3 |
| 32,768 | userRoot_tpasubid.db3 |
| 2,346,557,440 | userRoot_uid.db3 |
{noformat}
# of Entries: 42.8 million
Avg entry size: 1K (without replication meta-data)
Avg entry size: 2K (with replication meta-data)
{noformat}
These average entry sizes are based on calculations from LDIF exports (with and without the -r option). Based on these figures, 42.8 million * 2K suggests that userRoot_id2entry.db3 should be only around 81.8 GB. Instead, we see that it is around 210 GB.
Why the big difference?
There are several reasons (replication meta-data, database fragmentation); however, the biggest reason in this database is the large number of overflow pages.
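The back-of-the-envelope estimate above can be reproduced with a few lines of Python. This is only a sketch using the figures quoted in this article; the variable names are illustrative.

```python
# Rough size check using the figures quoted in this article.
ENTRIES = 42_800_000            # "# of Entries: 42.8 million"
AVG_ENTRY_BYTES = 2 * 1024      # "Avg entry size: 2K (with replication meta-data)"
ACTUAL_BYTES = 209_932_910_592  # size of userRoot_id2entry.db3 in the listing

GIB = 1024 ** 3

# Expected size if every entry occupied exactly its average size.
expected_bytes = ENTRIES * AVG_ENTRY_BYTES
ratio = ACTUAL_BYTES / expected_bytes

print(f"expected: {expected_bytes / GIB:.1f} GiB")
print(f"actual:   {ACTUAL_BYTES / GIB:.1f} GiB ({ACTUAL_BYTES / 1e9:.0f} GB decimal)")
print(f"actual / expected: {ratio:.1f}x")
```

Note that the 210 GB figure is in decimal gigabytes while the estimate is in binary units; in either unit the file is more than twice the size the entry data alone would suggest.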
h1. Database Overflow Pages
Berkeley DB uses a B-Tree implementation. When each DB is built, it uses some info to decide how to build that B-Tree. A B-Tree page can hold several entries (each entry consisting of a key and data). In our directory, for the id2entry.db3 database, the key is the entry ID and the data is the entry itself (e.g. the DN uid=jdoe,ou=People,dc=example,dc=com together with all the attributes, values, and replication meta-data CSNs).
If the size of the key or the data exceeds 25% of the page size, Berkeley DB does not place that item in the B-Tree page; instead, it creates an overflow page that contains only that one item. Some of the settings in a default directory implementation are:
{noformat}
Default DB Record/Page Size: 8K (or 8,192 bytes)
Max size of entry before overflow page is used: 25% of 8K = 2K (or 2,048 bytes)
{noformat}
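The 25% rule can be written down as a small Python sketch. It uses the simplified rule as stated in this article, not Berkeley DB's exact internal formula, and the function names are hypothetical.

```python
# Page sizes mentioned in this article: 8K, 16K, 32K, 64K.
PAGE_SIZES = (8192, 16384, 32768, 65536)


def overflow_threshold(page_size: int, fraction: float = 0.25) -> int:
    """Largest item size (bytes) that still fits in a B-Tree page
    under the simplified 25% rule described above."""
    return int(page_size * fraction)


def goes_to_overflow(item_size: int, page_size: int) -> bool:
    """True if an item of item_size bytes would be moved to an overflow page."""
    return item_size > overflow_threshold(page_size)


for size in PAGE_SIZES:
    print(f"page size {size:>6}: overflow above {overflow_threshold(size):>6} bytes")
```

With an average entry size of about 2K including replication meta-data, a large share of entries sits right at or above the 2,048-byte threshold of the default 8K page.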
h1. Running db_stat
To run it, you'll need to:
{noformat}
$ export LD_LIBRARY_PATH=...{ds lib directory}...   (location of the libdb.so file)
$ ### Stop the directory instance
$ db_stat_sparc_64 -N -d {db_location}/...id2entry.db3
{noformat}
Example:
{noformat}
# ./db_stat_sparc_64 -N -d /export01/ds6data/db/userRoot_id2entry.db3
Tue Apr 3 00:18:41 2007 Local time
53162 Btree magic number
8 Btree version number
Big-endian Byte order
record-numbers Flags
2 Minimum keys per-page
8192 Underlying database page size
4 Number of levels in the tree
43M Number of unique keys in the tree (43051538)
43M Number of data items in the tree (43051538)
15977 Number of tree internal pages
640800 Number of bytes free in tree internal pages (99% ff)
7196657 Number of tree leaf pages
2758M Number of bytes free in tree leaf pages (95% ff)
0 Number of tree duplicate pages
0 Number of bytes free in tree duplicate pages (0% ff)
18M Number of tree overflow pages (18004268)
703M Number of bytes free in tree overflow pages (99% ff)
0 Number of empty pages
409672 Number of pages on the free list
{noformat}
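To see how much of the file the overflow pages account for, the page counts from the output above can be added up. The following Python sketch parses a few of those lines; the parser and its names are illustrative and assume the "value description" line layout shown in the example.

```python
import re

# A subset of the db_stat lines from the example above.
DB_STAT_OUTPUT = """\
8192 Underlying database page size
15977 Number of tree internal pages
7196657 Number of tree leaf pages
18M Number of tree overflow pages (18004268)
409672 Number of pages on the free list
"""


def parse_db_stat(text: str) -> dict[str, int]:
    """Map each 'value description' line to an int, preferring the exact
    count in trailing parentheses (e.g. '18M ... (18004268)')."""
    stats = {}
    for line in text.splitlines():
        m = re.match(r"\s*(\S+)\s+(.*?)(?:\s+\((\d+)\))?\s*$", line)
        if not m:
            continue
        value, label, exact = m.groups()
        if exact is not None:
            stats[label] = int(exact)
        elif value.isdigit():
            stats[label] = int(value)
    return stats


stats = parse_db_stat(DB_STAT_OUTPUT)
page_size = stats["Underlying database page size"]
overflow = stats["Number of tree overflow pages"]
total_pages = (
    stats["Number of tree internal pages"]
    + stats["Number of tree leaf pages"]
    + overflow
    + stats["Number of pages on the free list"]
)
overflow_share = overflow / total_pages

print(f"total pages:    {total_pages:,} ({total_pages * page_size:,} bytes)")
print(f"overflow pages: {overflow:,} ({overflow * page_size:,} bytes)")
print(f"overflow share: {overflow_share:.1%}")
```

The counted pages come to within two pages (16,384 bytes) of the 209,932,910,592-byte file size in the listing, and overflow pages make up about 70% of them.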
If the number of overflow pages is large (over 18,000,000 in the example), increase the dbpagesize. The dbpagesize defaults to 8K (8192 bytes) and can be set to 8K, 16K, 32K, or 64K. The database must be re-created (re-imported) when the page size is changed. Note that this is also a good time to make sure an appropriate value is being used for allidsthreshold.
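When choosing a new page size, it can help to estimate what fraction of entries would still overflow at each candidate size. The sketch below applies the article's 25% rule to a list of entry sizes; the sample sizes are made up for illustration, and in practice they would have to come from measuring real entries (for example from an LDIF export).

```python
# Candidate page sizes from the article: 8K, 16K, 32K, 64K.
CANDIDATE_PAGE_SIZES = (8192, 16384, 32768, 65536)


def overflow_fraction(entry_sizes, page_size, fraction=0.25):
    """Fraction of entries larger than fraction * page_size, i.e. entries
    that would go to overflow pages under the simplified 25% rule."""
    if not entry_sizes:
        return 0.0
    threshold = page_size * fraction
    over = sum(1 for size in entry_sizes if size > threshold)
    return over / len(entry_sizes)


# Hypothetical entry sizes in bytes, for illustration only.
sample_sizes = [900, 1200, 1500, 2100, 2600, 3800, 5200, 9000]

for page_size in CANDIDATE_PAGE_SIZES:
    share = overflow_fraction(sample_sizes, page_size)
    print(f"page size {page_size:>6}: {share:.1%} of sampled entries overflow")
```

A larger page size lowers the overflow share but is not free of trade-offs, so the smallest size that brings the overflow count down to an acceptable level is a reasonable starting point.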
h1. Notes
* Ideally, the file system block size should match the dbpagesize.
* Do not use CTRL-C to interrupt the running of *db_stat*.
* As noted above, the version of db_stat used must match the version of the database.
h1. Contributors
{contributors-summary}
*From an article by Terry.Sigle@Sun.COM*