Mining Repository Usage Data

Interactions between pkg(5) package repositories and their clients (pkg(5)-based clients, web browsers, feed readers, and others) can be mined to help repository and distribution owners make informed decisions about new features and package content.

Usage Reports

Usage reports typically inspect the access logs generated by a front-end Apache HTTP server. Although off-the-shelf web analysis tools such as Webalizer can be used to report on HTTP traffic between pkg(5) clients and package repositories, there is additional data included in pkg(5) HTTP requests that cannot be easily reported on via these tools. Consequently, you'll likely end up creating custom usage reporting programs to extract the information that is most important to your projects.

Sun layered products can take advantage of the analytic reports (Sun internal link) managed by the Sun layered repo hosting service (Sun internal link). These reports include charts and listings for specific products as well as for the overall hosting service.

Non-pkg(5) Clients

Web browsers and feed readers will account for some of the traffic to your package repositories. Since these clients are not pkg(5)-aware, requests associated with these clients may not contain all of the data listed below.

Data Elements of Primary Interest

The following data is captured as a result of pkg(5) client interactions with package repositories:

  • Installation Image ID
  • Source IP Address
  • Request Date and Time
  • Client Platform
  • Java Version
  • Client Application
  • Package Information
  • Intent of the Request

Installation Image ID

An Installation Image ID can uniquely represent a pkg(5) installation image.

Since source IP addresses captured in HTTP access logs don't typically represent individual users' systems, the installation image ID provides a more detailed understanding of the image instances that are interacting with repositories.

The installation image ID is represented in the form of a UUID that is associated with each publisher referred to by the image. If your installation images typically have only one repository associated with them, then there will be only one UUID associated with the image.

When an initial installation program either deploys a pre-installed pkg(5) image or creates an image dynamically, the installation program should generate an installation image ID. When the pkg(5) Java Bootstrap facility is used to dynamically complete a pkg(5) client installation, an installation image ID is automatically generated.

An initial installation program or other programs can obtain the installation image ID and use it for application-specific reporting purposes.

The installation image ID is available in the HTTP header X-IPkg-UUID.
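
As a minimal sketch of how this header might be used in a report, the following Python function counts distinct installation images from X-IPkg-UUID values that have already been extracted from an access log. The function name is hypothetical, and treating "-" or malformed values as "no image ID" is an assumption about how your server logs a missing header.

```python
import uuid


def distinct_image_ids(header_values):
    """Return the set of well-formed UUIDs found in logged X-IPkg-UUID values."""
    seen = set()
    for value in header_values:
        try:
            seen.add(uuid.UUID(value))
        except (ValueError, AttributeError, TypeError):
            # "-" (header absent) and malformed values are skipped.
            continue
    return seen


values = [
    "569d1dc7-6770-11de-936a-00144f47da9c",
    "-",
    "569d1dc7-6770-11de-936a-00144f47da9c",
]
print(len(distinct_image_ids(values)))
```

Because each publisher referred to by an image has its own UUID, a count like this is per repository rather than per machine.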

Source IP Address

This data is useful to report on the general source of package repository interactions. For example, country or geography of origin can be gleaned from the source IP address.

Request Date and Time

The time at which the client interacted with the repository, as recorded in the access log.
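
The access log examples on this page record the time in the usual Apache form, for example 20/Jul/2009:06:52:10 -0700. A small sketch of converting such a value for reporting, assuming the default C/English locale for month abbreviations (the function name is illustrative):

```python
from datetime import datetime, timezone


def parse_request_time(stamp):
    """Parse an Apache access-log timestamp such as '20/Jul/2009:06:52:10 -0700'."""
    return datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")


when = parse_request_time("20/Jul/2009:06:52:10 -0700")
# Normalize to UTC so that requests logged by servers in different zones compare correctly.
print(when.astimezone(timezone.utc).isoformat())
```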

Client Platform

Binary distributions targeting multiple OS types, versions and CPU architectures can benefit from knowing which client platforms are interacting with the package repositories.

The HTTP header User-Agent includes the following information:

  • platform
  • processor architecture
  • OS version

pkg(5)-based Clients

pkg(5)-based clients provide the following client platform information in the User-Agent header:

Platform | Processor Architecture | OS Version Approach
windows | Since only x86 systems are supported by pkg(5) at this time, we assume Windows equates to x86 | Underlying numeric release version. See Windows Numeric Versions to Marketing Releases for more information.
linux |  | Underlying kernel version. (May be enhanced with distribution information.)
sunos | sun4u for SPARC-based systems and i86pc for x86-based systems |
darwin | i386 or ppc | Underlying OS version. See Mac OS X Kernel Versions to Marketing Versions for more information.

Other Clients

Web browsers and other clients such as RSS and Atom clients provide their own form of platform, processor architecture and OS version information in the User-Agent header. Refer to their documentation for details on the layout of their User-Agent headers.

Java Version

When the pkg(5) Java-based tools are used, for example the pkg(5) Java Bootstrap and pkg(5) Java API, the version of the Java platform is included on requests to the package repositories.

Client Application

There are two levels of client application information available in the User-Agent header of pkg(5) requests.

  1. pkg(5) API type: Python or Java
  2. Client application name

At the beginning of the User-Agent header there is the overall type of pkg(5) API in use: either pkg/xxxxxxxxxxxx or pkg-java/x.y.z.xxxx. This indicates whether the client is using the Python or Java APIs for pkg(5), respectively.

Following this value is a parenthesized list of additional information about the client. Items in the list are separated by semicolons. The first two values indicate the operating system and hardware type. The third value is the type of pkg(5) image being used. The next value is the name of the client application that is using the pkg(5) API.
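
The layout described above can be picked apart with a few lines of Python. This is only a sketch: it is modeled on the Python-API User-Agent values that appear in the access log examples on this page, where the first item holds both platform and architecture ("darwin i386") and the OS version text can itself contain a semicolon. It therefore treats the last two items as image type and client name and rejoins everything in between. The function and key names are hypothetical, and the layout used by pkg-java clients may differ.

```python
import re

# "pkg/<build>" or "pkg-java/<version>", followed by a parenthesized detail list.
UA_PATTERN = re.compile(
    r"^(?P<api>pkg(?:-java)?)/(?P<api_version>\S+)\s+\((?P<details>.*)\)\s*$"
)


def parse_pkg_user_agent(user_agent):
    """Split a pkg(5) User-Agent value into its parts; return None for other clients."""
    match = UA_PATTERN.match(user_agent)
    if match is None:
        return None
    items = [item.strip() for item in match.group("details").split(";")]
    if len(items) < 4:
        return None
    platform, _, arch = items[0].partition(" ")
    return {
        "api": "java" if match.group("api") == "pkg-java" else "python",
        "api_version": match.group("api_version"),
        "platform": platform,
        "arch": arch,
        # The OS version may contain semicolons, so rejoin the middle items.
        "os_version": "; ".join(items[1:-2]),
        "image_type": items[-2],
        "client": items[-1],
    }
```

Returning None for anything that does not start with pkg/ or pkg-java/ gives a simple way to separate browser and feed-reader traffic from pkg(5) traffic.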

Different pkg(5) client tools can be used by end users and client applications to interact with package repositories. Distinct names are represented for each of the following client applications:

Client App Name | Client App
pkg | pkg(1) CLI
pkg-java-bootstrap | pkg(5) Java Bootstrap
pkg-java | pkg(5) Java API
updatetool | Update Tool GUI
updatetool-software-update | Update Tool Software Update GUI
updatetool-notifier | Update Tool Desktop Notifier

Package Information

When a package is being installed or updated, the URL of the manifest request includes the fully qualified package name and version.
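
In the access log examples on this page, the package name and version appear URL-encoded after /manifest/0/ in the request path. A minimal sketch of recovering them, assuming that path layout (the function name is hypothetical):

```python
from urllib.parse import unquote


def parse_manifest_path(path):
    """Return (package_name, version) from a manifest request path, or None."""
    # Assumption: the package identifier follows "/manifest/0/", as in the sample logs.
    _, found, encoded = path.partition("/manifest/0/")
    if not found or not encoded:
        return None
    # %2C and %3A decode to the "," and ":" used inside the version string.
    name, _, version = unquote(encoded).partition("@")
    return name, version


print(parse_manifest_path(
    "/dev/latest/manifest/0/updatetool-data@2.0%2C0-4.1%3A20080425T022038Z"
))
```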

Download Size Information

Since the POST requests for the "filelist" URLs may result in content for multiple packages being downloaded, download size information is not readily available from the access logs on a per package basis. However, by summing the response sizes of successful filelist requests, you can gain an understanding of the overall number of bytes transferred to clients.
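
The summation described above might be sketched as follows. The regular expression assumes the request, status and size fields appear as they do in the access log examples on this page, and "successful" is taken to mean a 2xx status; both are assumptions to adapt to your own log format.

```python
import re

# Matches: "METHOD /path PROTOCOL" STATUS SIZE
REQUEST_PATTERN = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\d+|-)'
)


def filelist_bytes(log_lines):
    """Sum the response sizes of successful POST requests to filelist URLs."""
    total = 0
    for line in log_lines:
        match = REQUEST_PATTERN.search(line)
        if match is None:
            continue
        if match.group("method") != "POST" or "/filelist/" not in match.group("path"):
            continue
        if not match.group("status").startswith("2") or match.group("size") == "-":
            continue
        total += int(match.group("size"))
    return total
```

Grouping the same sum by day or by client application gives a rough picture of transfer volume, though still not a per-package figure.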

Intent of the Request

pkg(5) client tools convey the intent of their interaction with the repository on each manifest retrieval request. Examples of interactions where intent information is sent include:

  • Browsing catalog of packages
  • Obtaining detailed package metadata for display
  • Update check
  • Download package for update
  • Download new package for installation

The intent information is available in the HTTP header X-IPkg-Intent.

The intent header is a parenthesized list of name=value pairs with items separated by a semicolon. For example:

(operation=notifier;reason=process)
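
A header value in this form can be turned into a dictionary with a short Python function. This is a sketch rather than the parser used by pkg(5) itself; the function name is hypothetical, and returning an empty dictionary for a missing header (logged as "-" in the examples on this page) is a design choice made here for convenience.

```python
def parse_intent(header_value):
    """Turn '(name=value;name=value)' into a dict; return {} when the header is absent."""
    text = header_value.strip()
    if not (text.startswith("(") and text.endswith(")")):
        return {}
    fields = {}
    for item in text[1:-1].split(";"):
        # Split on the first "=" only, so values may contain "=" themselves.
        name, separator, value = item.partition("=")
        if separator:
            fields[name.strip()] = value.strip()
    return fields


print(parse_intent("(operation=notifier;reason=process)"))
```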

The possible names and values include:

Intent Header Field | Description
operation | The type of operation being performed, such as list, install, notifier.
prior_version | For an install, the previous version of the package.
reason | Information about why the operation is being performed.
target | The package that is being requested.
initial_target | For an install, the package that the user asked to have installed that caused the current package to be installed.
needed_by | For an install, the name of the package that led to this package being installed.

The operation field may have the following values:

Operation Value | Purpose | pkg(5) CLI | Update Tool GUI | Software Update GUI | Notifier | pkg(5) Java API
install | A package is being installed or updated | Yes | Yes | Yes (add-ons) | N/A | Yes
uninstall | A package is being uninstalled | Yes | Yes | N/A | N/A | Yes
info | Requesting pkg info (pkg info) | Yes | No | No | N/A | No
image-update | An image is being updated (all packages) | Yes | Yes | Yes | N/A | No
list | Requesting pkg info (pkg list) | Yes | Yes | Yes | Yes | No
license | Fetching a license | No | No | Yes | N/A | No
image-create | A new image is being created | Yes | Yes | N/A | N/A | Yes
image-edit | Changing image information | No | Yes | N/A | N/A | No

Importance of the reason Field

The reason field should be taken into account when inspecting the intent header and interpreting the operation field. For example, operation=install means that an installation of the package named in the request is intended only when reason=process. In some cases you will see reason=info together with operation=install, but this combination should not be interpreted as an imminent installation of the package named in the request.
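
A reporting program might encode this rule as a small classifier over an intent header that has already been parsed into a dictionary of field names and values. The sketch below is illustrative only: the category labels are hypothetical, and applying the same reason check to image-update as to install is an assumption that goes beyond what is stated above.

```python
def classify_intent(intent):
    """Classify a parsed intent header (a dict mapping field name to value)."""
    operation = intent.get("operation")
    if operation not in ("install", "image-update"):
        return "other"
    if intent.get("reason") != "process":
        # For example reason=info: the package is being looked at, not changed.
        return "informational"
    if operation == "image-update":
        return "image-update"
    # An install with a prior_version replaces an already-installed package.
    return "update" if intent.get("prior_version") else "new-install"


print(classify_intent({"operation": "install", "reason": "process"}))
print(classify_intent({"operation": "install", "reason": "info"}))
```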

Intent Header Examples

The following Apache access log records are extracts from actual interactions with a pkg(5) repository. In addition to demonstrating how the intent header, X-IPkg-Intent, is used under various circumstances, these examples also show how:

  • Fully qualified package name and version are represented in the resource portion of the manifest requests.
  • Client application and OS information is represented in the User-Agent header.
  • The installation image ID is represented in the X-IPkg-UUID header. In these examples the UUID is "569d1dc7-6770-11de-936a-00144f47da9c".
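
Records like the ones in this section can be split into named fields with a single regular expression. The sketch below assumes the field order visible in these examples: the standard combined log fields followed by the quoted X-IPkg-UUID and X-IPkg-Intent values. That order would be consistent with an Apache LogFormat ending in "%{X-IPkg-UUID}i" "%{X-IPkg-Intent}i", but the actual server configuration is not shown on this page, so treat the pattern and the field names as assumptions.

```python
import re

LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)" '
    r'"(?P<uuid>[^"]*)" "(?P<intent>[^"]*)"\s*$'
)


def parse_log_line(line):
    """Return a dict of named fields for one access log record, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


SAMPLE = (
    '129.150.32.62 - - [20/Jul/2009:06:53:35 -0700] '
    '"GET /dev/latest/versions/0 HTTP/1.1" 200 132 "-" '
    '"pkg/5bcd97b6fcb1 (darwin i386; 9.7.0 Darwin Kernel Version 9.7.0: '
    'Tue Mar 31 22:52:17 PDT 2009; root:xnu-1228.12.14~1/RELEASE_I386; '
    'user; updatetool)" "569d1dc7-6770-11de-936a-00144f47da9c" "-"'
)
record = parse_log_line(SAMPLE)
print(record["method"], record["path"], record["status"], record["uuid"])
```

Records that were wrapped across lines when copied need to be rejoined before they will match.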

Update Tool GUI: Install New Package

Note the "operation=install" and the lack of a prior_version, which means that this is a new install of the package.

129.150.32.62 - - [20/Jul/2009:06:52:10 -0700] "HEAD /dev/latest/manifest/0/updatetool-data@2.0%2C0-4.1%3A20080425T022038Z HTTP/1.1" 200 - "-" "pkg/5bcd97b6fcb1 (darwin i386; 9.7.0 Darwin Kernel Version 9.7.0: Tue Mar 31 22:52:17 PDT 2009; root:xnu-1228.12.14~1/RELEASE_I386; user; updatetool)" "569d1dc7-6770-11de-936a-00144f47da9c" "(initial_target=updatetool-data@2.0,0-4.1:20080425T022038Z;reason=process;operation=install)" 
 

Update Tool GUI: Apply All Available Updates

Note the "operation=image-update" setting.

129.150.32.62 - - [20/Jul/2009:06:53:35 -0700] "GET /dev/latest/versions/0 HTTP/1.1" 200 132 "-" "pkg/5bcd97b6fcb1 (darwin i386; 9.7.0 Darwin Kernel Version 9.7.0: Tue Mar 31 22:52:17 PDT 2009; root:xnu-1228.12.14~1/RELEASE_I386; user; updatetool)" "569d1dc7-6770-11de-936a-00144f47da9c" "-"

129.150.32.62 - - [20/Jul/2009:06:53:35 -0700] "GET /dev/latest/catalog/0 HTTP/1.1" 200 - "-" "pkg/5bcd97b6fcb1 (darwin i386; 9.7.0 Darwin Kernel Version 9.7.0: Tue Mar 31 22:52:17 PDT 2009; root:xnu-1228.12.14~1/RELEASE_I386; user; updatetool)" "569d1dc7-6770-11de-936a-00144f47da9c" "-"

129.150.32.62 - - [20/Jul/2009:06:53:36 -0700] "HEAD /dev/latest/manifest/0/pkg-toolkit-incorporation@2.3.0%2C0-33.2268%3A20090720T030956Z HTTP/1.1" 200 - "-" "pkg/5bcd97b6fcb1 (darwin i386; 9.7.0 Darwin Kernel Version 9.7.0: Tue Mar 31 22:52:17 PDT 2009; root:xnu-1228.12.14~1/RELEASE_I386; user; updatetool)" "569d1dc7-6770-11de-936a-00144f47da9c" "(initial_target=pkg-toolkit-incorporation@2.3.0,0-33.2268:20090720T030956Z;reason=process;operation=image-update;prior_version=2.3.0,0-33.2258:20090715T072125Z)"

129.150.32.62 - - [20/Jul/2009:06:53:36 -0700] "HEAD /dev/latest/manifest/0/wxpython2.8-minimal@2.8.10.1%2C0-33.2268%3A20090720T031109Z HTTP/1.1" 200 - "-" "pkg/5bcd97b6fcb1 (darwin i386; 9.7.0 Darwin Kernel Version 9.7.0: Tue Mar 31 22:52:17 PDT 2009; root:xnu-1228.12.14~1/RELEASE_I386; user; updatetool)" "569d1dc7-6770-11de-936a-00144f47da9c" "(initial_target=wxpython2.8-minimal@2.8.10.1,0-33.2268:20090720T031109Z;reason=process;operation=image-update;prior_version=2.8.10.1,0-33.2258:20090715T072238Z)"

129.150.32.62 - - [20/Jul/2009:06:53:36 -0700] "HEAD /dev/latest/manifest/0/pkg-java@1.111%2C0-33.2268%3A20090720T025559Z HTTP/1.1" 200 - "-" "pkg/5bcd97b6fcb1 (darwin i386; 9.7.0 Darwin Kernel Version 9.7.0: Tue Mar 31 22:52:17 PDT 2009; root:xnu-1228.12.14~1/RELEASE_I386; user; updatetool)" "569d1dc7-6770-11de-936a-00144f47da9c" "(initial_target=pkg-java@1.111,0-33.2268:20090720T025559Z;reason=process;operation=image-update;prior_version=1.111,0-33.2258:20090715T070738Z)"

129.150.32.62 - - [20/Jul/2009:06:53:36 -0700] "HEAD /dev/latest/manifest/0/python2.4-minimal@2.4.5.0%2C0-33.2268%3A20090720T031018Z HTTP/1.1" 200 - "-" "pkg/5bcd97b6fcb1 (darwin i386; 9.7.0 Darwin Kernel Version 9.7.0: Tue Mar 31 22:52:17 PDT 2009; root:xnu-1228.12.14~1/RELEASE_I386; user; updatetool)" "569d1dc7-6770-11de-936a-00144f47da9c" "(initial_target=python2.4-minimal@2.4.5.0,0-33.2268:20090720T031018Z;reason=process;operation=image-update;prior_version=2.4.5.0,0-33.2258:20090715T072147Z)"

129.150.32.62 - - [20/Jul/2009:06:53:37 -0700] "HEAD /dev/latest/manifest/0/updatetool@2.3.0%2C0-33.2268%3A20090720T031051Z HTTP/1.1" 200 - "-" "pkg/5bcd97b6fcb1 (darwin i386; 9.7.0 Darwin Kernel Version 9.7.0: Tue Mar 31 22:52:17 PDT 2009; root:xnu-1228.12.14~1/RELEASE_I386; user; updatetool)" "569d1dc7-6770-11de-936a-00144f47da9c" "(initial_target=updatetool@2.3.0,0-33.2268:20090720T031051Z;reason=process;operation=image-update;prior_version=2.3.0,0-33.2258:20090715T072221Z)"

129.150.32.62 - - [20/Jul/2009:06:53:37 -0700] "HEAD /dev/latest/manifest/0/pkg-extra-tools@0.2.0%2C0-33.2268%3A20090720T025559Z HTTP/1.1" 200 - "-" "pkg/5bcd97b6fcb1 (darwin i386; 9.7.0 Darwin Kernel Version 9.7.0: Tue Mar 31 22:52:17 PDT 2009; root:xnu-1228.12.14~1/RELEASE_I386; user; updatetool)" "569d1dc7-6770-11de-936a-00144f47da9c" "(initial_target=pkg-extra-tools@0.2.0,0-33.2268:20090720T025559Z;reason=process;operation=image-update;prior_version=0.2.0,0-33.2258:20090715T070737Z)"

129.150.32.62 - - [20/Jul/2009:06:53:37 -0700] "HEAD /dev/latest/manifest/0/pkg@1.111.3%2C0-33.2268%3A20090720T030957Z HTTP/1.1" 200 - "-" "pkg/5bcd97b6fcb1 (darwin i386; 9.7.0 Darwin Kernel Version 9.7.0: Tue Mar 31 22:52:17 PDT 2009; root:xnu-1228.12.14~1/RELEASE_I386; user; updatetool)" "569d1dc7-6770-11de-936a-00144f47da9c" "(initial_target=pkg@1.111.3,0-33.2268:20090720T030957Z;reason=process;operation=image-update;prior_version=1.111.3,0-33.2258:20090715T072126Z)"

Update Tool GUI: Selective Update

Note the "operation=install" together with a prior_version value, which means that an already-installed package is being updated.

129.150.32.62 - - [20/Jul/2009:08:03:24 -0700] "HEAD /dev/latest/manifest/0/updatetool@2.3.0%2C0-33.2268%3A20090720T031051Z HTTP/1.1" 200 - "-" "pkg/5bcd97b6fcb1 (darwin i386; 9.7.0 Darwin Kernel Version 9.7.0: Tue Mar 31 22:52:17 PDT 2009; root:xnu-1228.12.14~1/RELEASE_I386; user; updatetool)" "569d1dc7-6770-11de-936a-00144f47da9c" "(initial_target=updatetool@2.3.0,0-33.2268:20090720T031051Z;reason=process;operation=install;prior_version=2.3.0,0-33.2258:20090715T072221Z)"
 

Software Update GUI: Apply Updates to an Image

Note the "operation=image-update" setting.

129.150.32.62 - - [20/Jul/2009:07:10:48 -0700] "HEAD /dev/latest/manifest/0/pkg-java@1.111%2C0-33.2268%3A20090720T025559Z HTTP/1.1" 200 - "-" "pkg/5bcd97b6fcb1 (darwin i386; 9.7.0 Darwin Kernel Version 9.7.0: Tue Mar 31 22:52:17 PDT 2009; root:xnu-1228.12.14~1/RELEASE_I386; user; updatetool-software-update)" "569d1dc7-6770-11de-936a-00144f47da9c" "(initial_target=pkg-java@1.111,0-33.2268:20090720T025559Z;reason=process;operation=image-update;prior_version=1.111,0-33.2258:20090715T070738Z)"

129.150.32.62 - - [20/Jul/2009:07:10:48 -0700] "HEAD /dev/latest/manifest/0/updatetool@2.3.0%2C0-33.2268%3A20090720T031051Z HTTP/1.1" 200 - "-" "pkg/5bcd97b6fcb1 (darwin i386; 9.7.0 Darwin Kernel Version 9.7.0: Tue Mar 31 22:52:17 PDT 2009; root:xnu-1228.12.14~1/RELEASE_I386; user; updatetool-software-update)" "569d1dc7-6770-11de-936a-00144f47da9c" "(initial_target=updatetool@2.3.0,0-33.2268:20090720T031051Z;reason=process;operation=image-update;prior_version=2.3.0,0-33.2258:20090715T072221Z)"

129.150.32.62 - - [20/Jul/2009:07:10:48 -0700] "HEAD /dev/latest/manifest/0/pkg-extra-tools@0.2.0%2C0-33.2268%3A20090720T025559Z HTTP/1.1" 200 - "-" "pkg/5bcd97b6fcb1 (darwin i386; 9.7.0 Darwin Kernel Version 9.7.0: Tue Mar 31 22:52:17 PDT 2009; root:xnu-1228.12.14~1/RELEASE_I386; user; updatetool-software-update)" "569d1dc7-6770-11de-936a-00144f47da9c" "(initial_target=pkg-extra-tools@0.2.0,0-33.2268:20090720T025559Z;reason=process;operation=image-update;prior_version=0.2.0,0-33.2258:20090715T070737Z)"

129.150.32.62 - - [20/Jul/2009:07:10:49 -0700] "HEAD /dev/latest/manifest/0/mysql-jdbc@5.1.5%2C0-0.7%3A20080425T022229Z HTTP/1.1" 200 - "-" "pkg/5bcd97b6fcb1 (darwin i386; 9.7.0 Darwin Kernel Version 9.7.0: Tue Mar 31 22:52:17 PDT 2009; root:xnu-1228.12.14~1/RELEASE_I386; user; updatetool-software-update)" "569d1dc7-6770-11de-936a-00144f47da9c" "(initial_target=mysql-jdbc@5.1.5,0-0.7:20080425T022229Z;reason=process;operation=image-update;prior_version=5.1.5,0-0.7:20080416T002509Z)"

pkg(1) CLI: pkg image-update

Note the "operation=image-update" setting.

129.150.32.62 - - [20/Jul/2009:07:14:57 -0700] "HEAD /dev/latest/manifest/0/pkg-java@1.111%2C0-33.2268%3A20090720T025559Z HTTP/1.1" 200 - "-" "pkg/5bcd97b6fcb1 (darwin i386; 9.7.0 Darwin Kernel Version 9.7.0: Tue Mar 31 22:52:17 PDT 2009; root:xnu-1228.12.14~1/RELEASE_I386; user; pkg)" "569d1dc7-6770-11de-936a-00144f47da9c" "(initial_target=pkg-java@1.111,0-33.2268:20090720T025559Z;reason=process;operation=image-update;prior_version=1.111,0-33.2258:20090715T070738Z)"

129.150.32.62 - - [20/Jul/2009:07:14:57 -0700] "HEAD /dev/latest/manifest/0/javadb@10.2.2%2C0-0.7%3A20080425T022159Z HTTP/1.1" 200 - "-" "pkg/5bcd97b6fcb1 (darwin i386; 9.7.0 Darwin Kernel Version 9.7.0: Tue Mar 31 22:52:17 PDT 2009; root:xnu-1228.12.14~1/RELEASE_I386; user; pkg)" "569d1dc7-6770-11de-936a-00144f47da9c" "(initial_target=javadb@10.2.2,0-0.7:20080425T022159Z;reason=process;operation=image-update;prior_version=10.2.2,0-0.7:20080416T002439Z)"

129.150.32.62 - - [20/Jul/2009:07:14:58 -0700] "HEAD /dev/latest/manifest/0/updatetool@2.3.0%2C0-33.2268%3A20090720T031051Z HTTP/1.1" 200 - "-" "pkg/5bcd97b6fcb1 (darwin i386; 9.7.0 Darwin Kernel Version 9.7.0: Tue Mar 31 22:52:17 PDT 2009; root:xnu-1228.12.14~1/RELEASE_I386; user; pkg)" "569d1dc7-6770-11de-936a-00144f47da9c" "(initial_target=updatetool@2.3.0,0-33.2268:20090720T031051Z;reason=process;operation=image-update;prior_version=2.3.0,0-33.2258:20090715T072221Z)"

129.150.32.62 - - [20/Jul/2009:07:14:58 -0700] "HEAD /dev/latest/manifest/0/pkg-extra-tools@0.2.0%2C0-33.2268%3A20090720T025559Z HTTP/1.1" 200 - "-" "pkg/5bcd97b6fcb1 (darwin i386; 9.7.0 Darwin Kernel Version 9.7.0: Tue Mar 31 22:52:17 PDT 2009; root:xnu-1228.12.14~1/RELEASE_I386; user; pkg)" "569d1dc7-6770-11de-936a-00144f47da9c" "(initial_target=pkg-extra-tools@0.2.0,0-33.2268:20090720T025559Z;reason=process;operation=image-update;prior_version=0.2.0,0-33.2258:20090715T070737Z)"

129.150.32.62 - - [20/Jul/2009:07:14:58 -0700] "HEAD /dev/latest/manifest/0/ant@1.6.5%2C0-0.15%3A20080425T022038Z HTTP/1.1" 200 - "-" "pkg/5bcd97b6fcb1 (darwin i386; 9.7.0 Darwin Kernel Version 9.7.0: Tue Mar 31 22:52:17 PDT 2009; root:xnu-1228.12.14~1/RELEASE_I386; user; pkg)" "569d1dc7-6770-11de-936a-00144f47da9c" "(initial_target=ant@1.6.5,0-0.15:20080425T022038Z;reason=process;operation=image-update;prior_version=1.6.5,0-0.15:20080416T002318Z)"

129.150.32.62 - - [20/Jul/2009:07:14:58 -0700] "HEAD /dev/latest/manifest/0/mysql-jdbc@5.1.5%2C0-0.7%3A20080425T022229Z HTTP/1.1" 200 - "-" "pkg/5bcd97b6fcb1 (darwin i386; 9.7.0 Darwin Kernel Version 9.7.0: Tue Mar 31 22:52:17 PDT 2009; root:xnu-1228.12.14~1/RELEASE_I386; user; pkg)" "569d1dc7-6770-11de-936a-00144f47da9c" "(initial_target=mysql-jdbc@5.1.5,0-0.7:20080425T022229Z;reason=process;operation=image-update;prior_version=5.1.5,0-0.7:20080416T002509Z)"

129.150.32.62 - - [20/Jul/2009:07:14:59 -0700] "POST /dev/latest/filelist/0 HTTP/1.1" 200 245760 "-" "pkg/5bcd97b6fcb1 (darwin i386; 9.7.0 Darwin Kernel Version 9.7.0: Tue Mar 31 22:52:17 PDT 2009; root:xnu-1228.12.14~1/RELEASE_I386; user; pkg)" "569d1dc7-6770-11de-936a-00144f47da9c" "-"

pkg(1) CLI: Updating Single Package via "pkg install mysql-jdbc"

Note the "operation=install" together with a prior_version value, which means that an already-installed package is being updated.

129.150.32.62 - - [20/Jul/2009:07:57:27 -0700] "HEAD /dev/latest/manifest/0/mysql-jdbc@5.1.5%2C0-0.7%3A20080425T022229Z HTTP/1.1" 200 - "-" "pkg/5bcd97b6fcb1 (darwin i386; 9.7.0 Darwin Kernel Version 9.7.0: Tue Mar 31 22:52:17 PDT 2009; root:xnu-1228.12.14~1/RELEASE_I386; user; pkg)" "569d1dc7-6770-11de-936a-00144f47da9c" "(initial_target=mysql-jdbc@5.1.5,0-0.7:20080425T022229Z;reason=process;operation=install;prior_version=5.1.5,0-0.7:20080416T002509Z)"
 
Copyright 1994-2009 Sun Microsystems, Inc.