
h1. AI transport Functional Specification

h3. Contents

# Overview
## Problem Statement
## Scope of Product
## Other Documentation
## Interdependencies
## Supported Configurations
## Definition of Terms
# Functional Overview
## Major Components
## Assumptions
## Development, Monitoring, and Observability Tools
## Test Environments
## Scalability
## Enhancements
# User Interface
# Deliverables
# Functional Description and Interfaces of Major Components
# Use Cases

h5. 1. Overview
## Problem Statement
\\
To provide a modular mechanism for securely delivering the required data to the AI client. The mechanism is scalable, delivers high performance, and provides a compelling user experience. (For examples of possible transport mechanisms that have been brainstormed, see: http://www.opensolaris.org/os/project/caiman/auto_install/ai_design/transport_options.pdf.)
\\
\\
## Scope of Product
\\
The primary objective of this project is to define and document the modular boundaries around the data transport mechanism and to design and implement two example mechanisms.
\\
The example transport mechanisms will consist of one module implementing HTTP and another implementing a distributed peer-to-peer protocol. Both mechanisms may provide:
*** Encryption, authentication and non-repudiation - data integrity, access control and delivery verification
*** Scaling - from a single system employing no security and rapid setup, to an enterprise deployment using all security mechanisms
\\
* Note: This project will use the security and scalability features that the chosen transport provides rather than implementing them itself, so both are optional. We will choose the mechanism that meets the requirements. This does not preclude using a mechanism that lacks some of the listed features.
\\
Secure data transfer will occur from server to client and from client to server.
\\
In general, a transport mechanism should provide a compelling user experience. The client side should provide information such as current status, performance data, success or failure, and the reason for any failure. The application should also provide an observability tool that allows the user to review logs, analyze data (for both successes and failures), perform maintenance, and archive significant information.
\\
This project will not affect the AI manifest publication process but will interface with the stored manifests to serve them. Likewise, the boot archive design will not be affected by this work, but boot archives will also be served by this mechanism. The wanboot application can be modified to better support alternative transport protocols.
\\
\\
## Other Documentation
\\
Documentation for the original Automated Installer:
http://opensolaris.org/os/project/caiman/auto_install/Documentation/
\\
The data flow between the server and the client is described in the diagram
http://www.opensolaris.org/os/project/caiman/auto_install/ai_design/transport_data_diagram.pdf
\\
The transport research summary is at
http://www.opensolaris.org/os/project/caiman/auto_install/ai_design/transport_options.pdf
\\
The web server cache information is documented at
http://www.opensolaris.org/os/project/caiman/auto_install/ai_design/web_server_cache.txt
\\
Apache web server is documented at
http://www.apache.org
\\
CherryPy web server is documented at
http://www.cherrypy.org
\\
BitTorrent is documented at
http://www.bittorrent.org
\\
\\
## Interdependencies
\\
The transport mechanism has the following interdependencies.
\\
### Network connectivity is required between the device serving the requested data and the client.
\\
#### Network configuration data, such as routers, name servers, and IP addresses, is necessary to set up network connectivity. This data is usually supplied through a network configuration protocol, such as Dynamic Host Configuration Protocol (DHCP), which simplifies network parameter assignment to clients on the network.
\\
### A file transfer protocol for transmitting files over the network between server and client. The transport mechanism implements the file transfer protocol.
\\
### Location of the resources to be transferred.
\\
### Checksums of the files to be sent over the transport mechanism, for client side integrity checking (computed at the time of image creation).
\\
## Definition of Terms
\\
\\
### Apache HTTP Server - A web server which is primarily used to serve both static content and dynamic content. Apache supports a variety of features, many implemented as compiled modules. These can range from server-side programming language support to authentication schemes. Common language interfaces include mod_perl, mod_python, Tcl, and PHP.
\\
\\
### Automated Installer (AI) - an installation method used to perform unattended OpenSolaris installations.
\\
\\
### Web server cache - A Web cache sits between one or more Web servers and one or more clients, and watches requests come by, saving copies of the responses (such as HTML pages, images, and files) for itself. Then, if there is another request for the same URL, it can use the response that it has, instead of asking the origin server for it again. Caching is supported for HTTP, FTP, Gopher, etc.
\\
\\
### BitTorrent - is a peer-to-peer file sharing protocol used for distributing large amounts of data. The protocol works when a file provider initially makes his/her file (or group of files) available to the network. This is called a seed and allows others, named peers, to connect and download the file. Each peer that downloads a part of the data makes it available to other peers to download. After the file is successfully downloaded by a peer, many continue to make the data available, becoming additional seeds. This distributed nature of BitTorrent leads to a viral spreading of a file throughout peers. As more peers join the swarm, the likelihood of a successful download increases. Relative to standard Internet hosting, this provides a significant reduction in the original distributor's hardware and bandwidth resource costs. It also provides redundancy against system problems and reduces dependence on the original distributor.
\\
\\
### CherryPy - an object-oriented web application framework, with a built-in HTTP server, for the Python programming language.
\\
\\
### DHCP - Dynamic Host Configuration Protocol, used by the client to get its IP address and the location of the initial boot program.
\\
\\
### Modular Boundaries - the logical boundaries between components. The modular boundaries defined for the Transport Mechanism Version 2 project include the client, client transport, transport mechanism, server transport, and criteria database.
\\
\\
### Transport Mechanism Module - a pluggable two-part mechanism with a client component and a server component. The module implements the protocol used to transmit data between the client and the server. The current web server implementation uses wget(1) as the transport mechanism on the client and Apache and CherryPy on the server, all implementing the HTTP/1.1 protocol. The modularity refers to the ability to easily use different mechanisms for data transfer. Whatever mechanism is chosen initially, it shouldn't preclude adding other mechanisms.
\\
\\
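A minimal Python sketch of what the client side of such a pluggable boundary might look like. All names here (Transport, fetch, register_transport, get_transport) are hypothetical and are not part of any existing AI interface. The HTTP module uses the Python standard library rather than wget(1) purely to keep the sketch self-contained.

```python
import shutil
import urllib.request
from abc import ABC, abstractmethod


class Transport(ABC):
    """Hypothetical client-side boundary: each module fetches one resource."""

    @abstractmethod
    def fetch(self, location: str, dest: str) -> None:
        """Copy the resource at `location` to the local file `dest`."""


class HttpTransport(Transport):
    """Example module: plain HTTP download via the standard library."""

    def fetch(self, location: str, dest: str) -> None:
        # Stream the response body to disk instead of holding it in memory.
        with urllib.request.urlopen(location) as resp, open(dest, "wb") as out:
            shutil.copyfileobj(resp, out)


# Registry of available modules, keyed by a short name (hypothetical).
TRANSPORTS = {"http": HttpTransport}


def register_transport(name: str, cls: type) -> None:
    """Add another mechanism without touching the existing ones."""
    if not issubclass(cls, Transport):
        raise TypeError("transport modules must subclass Transport")
    TRANSPORTS[name] = cls


def get_transport(name: str) -> Transport:
    """Instantiate the module registered under `name` (KeyError if unknown)."""
    return TRANSPORTS[name]()
```

The caller only depends on fetch(), so a later module (for example, a peer-to-peer one) could be registered without changing client code.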
## Supported Configurations
\\
The content delivery mechanism will be designed to handle the following configurations:
*** Small sets of identical clients (in the range of 10-25 client machines of identical/similar hardware)
*** Large sets of identical clients (in the range of 1000+ client machines of identical/similar hardware)
*** Both large and small sets of heterogeneous clients (clients of dissimilar hardware/platform/architecture)
*** Any number of clients in between large/small, both homogeneous and heterogeneous in nature
\\
** Note: While an AI set-up can serve smaller groups of clients, as small as a single client, the content delivery will not be *optimized* for such a situation, as that is not the target audience for AI.
\\
\\
# Functional Overview
\\
## Major Components of the Transport Mechanism Version 2 Project
\\
The functional components include a client that requests data and a server that sends the requested data. The data flow diagram at http://www.opensolaris.org/os/project/caiman/auto_install/ai_design/transport_data_diagram.pdf illustrates all the client data requests and the server responses. The client uses different transport mechanisms at different times. Some of those mechanisms cannot be changed because of architectural restrictions in the client. These restrictions are outside the scope of this project. For example, X86 systems use PXE for network booting. PXE uses DHCP to get the initial IP address and TFTP to download the initial boot program (GRUB).
\\
Transport mechanisms used later in the process can be changed. This includes any client requests made after the OpenSolaris kernel is booted on the client. The table in the document http://www.opensolaris.org/os/project/caiman/auto_install/ai_design/transport_data_diagram.pdf describes which transport is used for each data request and whether we have any control to change the transport. Based on the restrictions imposed by the client, we can control the transport used to get the install programs and AI manifests. Performing the transport requires a client side component and a server side component (e.g. a transport client and a transport server).
\\
Client side components:
*** The client program(s) that initiate the transfer
*** The client code that handles client side observability
*** The client code that handles providing observability metrics
*** The client code which can secure communication with the server (as a user option)
*** The client code to verify the data received, using server provided integrity data
\\
Server side components:
*** A published location to reach the transport server
*** Startup and shutdown mechanisms for the transport server
*** Transport configuration that allows tuning the transport to meet client needs
*** Provide interfaces for observability tools (download status, client connections, etc.)
*** Provide either a tool or documentation to enable secure data transport
*** Deliver integrity data to the client for data verification
\\
## Assumptions
The assumptions on the transport are as follows:
*** For cases where secure transfer is desired, the security features will be built into the transport mechanism.
\\
## Development, Monitoring, and Observability Tools
\\
Observability data will be provided through interfaces for monitoring tools. Client side data which will be provided includes network observability such as DHCP response data, download performance data, transport protocol specific data (such as HTTP/1.1 PUTs and GETs or BitTorrent peers) and the files requested. Server data which will be provided includes clients (IP addresses and perhaps more descriptive information as provided by the clients) and their states (such as downloads begun, completed, aborted) and which data was sent to the client. Explicitly, we will not be transporting observability data as that task will belong to future observability work.
\\
## Test Environments
\\
To ensure the transport mechanism performs under load as expected, a scalability test environment will be needed. Such an environment can consist of a large number of clients or can be an environment which consists of a number of load generators. Further, to ensure proper error handling and observability reporting a way to inject errors or cause aborted and failed transports will be necessary.
\\
\\
## Scalability and Performance
\\
The transport mechanism provided should be highly scalable. The intended audience ranges from users installing a single machine up to enterprises installing on 1000+ machines. The primary content delivery system needs to perform well at both extremes, as well as at all points in between. The term "scalable" in an ideal sense means that, regardless of the number of machines being served, performance remains the same. At this time, there are no hard performance requirements because it is highly unrealistic to expect a server to maintain identical performance regardless of load. With that in mind, a scalable solution must:
*** Recognize periods of heavy load and adjust performance parameters as necessary (e.g. spawn sufficient additional processes to handle incoming requests, without spawning so many that either the server machine or the content delivery service crashes)
*** Respond to incoming requests and deliver content in a timely manner
*** Avoid letting incoming requests time out or get dropped. Those which do will be handled when possible; this includes (when combined with observability mechanisms) re-initiating transfer to clients with failed downloads
*** In cases of overload, allow the server to recognize missed requests and re-establish the connection, or allow the client to retransmit requests later and be served in a timely manner (2nd and 3rd attempts are handled properly)
*** Do all of this with minimal performance impact on other processes running on the server machine
\\
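The client-side half of the overload requirement (retransmitting a request later) could be sketched as follows. This is an illustration only: the function name, the limit of three attempts, and the doubling delay are assumptions, not requirements stated in this specification.

```python
import time


def fetch_with_retries(fetch, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch() up to `attempts` times, doubling the wait after each failure.

    `fetch` is any zero-argument callable that raises OSError on a failed
    transfer (socket and urllib errors in the standard library derive from it).
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return fetch()
        except OSError as err:
            last_error = err
            # Back off before the 2nd and 3rd attempts, but not after the last.
            if attempt < attempts - 1:
                sleep(base_delay * (2 ** attempt))
    raise last_error
```

Spacing the retries out gives an overloaded server time to recover instead of adding to the load that caused the failure.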
# End-User Interface
\\
User interface interaction with the transport mechanism is minimal. The server user interface is limited to starting and stopping the service. Observability data will be provided to the user through observability tools provided by later external work. The client user interface is limited to the SMF service which controls the start and stop of auto-install.
\\
# Deliverables
** A transport that meets the requirements
** Mechanism to start and stop the transport on the server
** Minimum requirements on the server to run the transport on the server
** Initial configuration file, if needed
** Documentation to explain how to tune the transport
** Documentation of the supported security features and how to enable them
** Expected scalability and performance data
** Observability features
** Tools/Programs used by the client to initiate the transfer.
# Functional Description and Interfaces of Major Components
\\
## _Client side program(s) to initiate data transfer_
For SPARC, the client requests the initial boot program using wanboot, which currently provides HTTP support. For X86, GRUB is used, which currently provides TFTP support. The AI client methods for receiving data will interface with the client program(s) of the transport mechanism to gather data.
## _Client side Observability_
The client program we choose should provide information such as:
*** DHCP response data
*** files requested
*** success/failure
**** reason for failure
*** transfer performance
*** how much data transferred etc.
*** number of retries etc.
*** protocol specific data
**** HTTP/1.1: GETs and PUTs
**** BitTorrent: peers
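One way to picture the per-transfer information listed above is a small record type. This is a hedged sketch: the class name and field names are hypothetical and only mirror the list items; the specification does not define a data format.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TransferRecord:
    """Hypothetical per-file observability record kept by the client."""

    file_requested: str
    bytes_transferred: int = 0
    elapsed_seconds: float = 0.0
    retries: int = 0
    succeeded: bool = False
    failure_reason: Optional[str] = None
    # Protocol specific data, e.g. {"gets": 3} for HTTP or {"peers": 12}
    # for BitTorrent.
    protocol_data: dict = field(default_factory=dict)

    def throughput(self) -> float:
        """Average transfer rate in bytes per second (0.0 if no time elapsed)."""
        if self.elapsed_seconds <= 0:
            return 0.0
        return self.bytes_transferred / self.elapsed_seconds
```

Keeping the protocol specific values in a separate mapping lets the common fields stay the same across transport modules.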
## _Security from the client side_
Transport mechanisms can optionally support secure transfer. To qualify for secure transfer, a transport should:
*** Support secure downloads (protocols like SSL)
*** Support public key exchange
**** Accept certificates, certificate authorities (CAs), and a private key specified to the program.
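As an illustration of these criteria, the sketch below builds a client-side context with Python's standard ssl module, which negotiates TLS (the successor to SSL). The function name and its parameters are hypothetical; the file paths would be supplied by the administrator.

```python
import ssl
from typing import Optional


def build_client_context(ca_file: Optional[str] = None,
                         cert_file: Optional[str] = None,
                         key_file: Optional[str] = None) -> ssl.SSLContext:
    """Return a context that verifies the server and optionally identifies the client."""
    # create_default_context() turns on certificate and host name verification.
    # With ca_file=None the system's default CA certificates are used.
    ctx = ssl.create_default_context(cafile=ca_file)
    if cert_file:
        # Present a client certificate and its private key to the server.
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx
```

A context built this way can be passed to urllib.request.urlopen through its context argument for an HTTPS download.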
## _Server side transport_
The server side transport needs to be running at a published location, so that clients can send their requests to the server. The server needs a way to determine where to serve data from, whether the data is statically stored or dynamically generated.
## _Server side startup and shutdown_
The server should be reliably started and stopped. On Solaris, this should be provided by SMF. Further, any server configuration which lends itself to storage in SMF should be placed in the SMF repository.
## _Server side observability_
The transport software on the server side needs to be instrumented or monitored to provide status on client requests. Examples include:
*** Begun TFTP transactions
*** Data downloaded via TFTP
*** Begun transport protocol transactions
*** Data downloaded via transport protocol
*** Time to download data
*** Metrics on files requested
*** (Maybe outside the scope of this project.) Heuristics to determine when possible failure has occurred based on initial transactions not being followed up by expected final transactions to complete an AI install
*** Transactions being re-requested and other collectible error statistics
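The failure heuristic mentioned above (initial transactions that are never followed by the expected final ones) might be sketched like this. The event format, the "begin" and "complete" labels, and the timeout are all assumptions made for illustration; the specification does not define them.

```python
def find_stalled_clients(events, now, timeout):
    """Return clients whose latest 'begin' has no later 'complete' within `timeout`.

    `events` is a chronologically ordered iterable of
    (client_ip, kind, timestamp) tuples, with kind "begin" or "complete".
    """
    pending = {}
    for client, kind, timestamp in events:
        if kind == "begin":
            # A re-request simply restarts the clock for that client.
            pending[client] = timestamp
        elif kind == "complete":
            pending.pop(client, None)
    return sorted(c for c, started in pending.items() if now - started > timeout)
```

For example, with events for two clients where only the first one completes, the second is reported once the timeout has passed since its initial request.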
## _Data integrity_
The data sent by the server can further be verified by the client by computing the hash (checksum) and comparing with the hash that is embedded in the data.
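A minimal sketch of the client-side check, assuming SHA-256 as the hash algorithm (this specification does not name one) and assuming the expected value has already been extracted from the data. The function names are hypothetical.

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=65536):
    """Hash a file in chunks so large images need not fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_file(path, expected_hex, algorithm="sha256"):
    """Return True if the file's digest matches the server-provided value."""
    return file_digest(path, algorithm) == expected_hex.strip().lower()
```

A mismatch would be reported as a failed transfer so that the client can request the data again.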
# Use Cases
\\
The use cases describe the interaction between the transport mechanism, the client, and the server, during normal activity as well as during failures.
\\
The following use cases support a One-to-One protocol model:
\\
Actors and Associated Goals:
\\
Client Goals -
\\
1. Requests data files
2. Receives data files
3. Executes SMF auto-installer method
4. Verifies data integrity
\\
Transport Mechanism Goals -
\\
1. Transmits requests from client to server
2. Transmits responses from server to client
3. Provides required performance reporting
4. Reports success or failure of transmission.
\\
Server Goals -
\\
1. Receives requests for data files
2. Sends requested data files
3. Determines the AI manifest for each specific client
4. Receives request for AI manifest and processes it to determine the AI manifest to use (either a static or a derived manifest)
\\
Observability -
1. Observes the processes and behavior of client, transport mechanism, and server.
- What the server is serving
- What the client is requesting
- Success and failure of the delivery through the transport mechanism.
2. Monitors network connection status.
3. Monitors system statistics
4. Reports, diagnoses and assists in repairing faults
5. Monitors and reports on performance statistics
6. Views progress of installation on the client
7. Reports successes and failures
8. Provides the manifest derived for a specific client (Transported from the client)
9. Provides the ability to tune the transport mechanism
\\
\\
## Use Case 1 - Client requests boot image. Transport mechanism transmits request from client for boot image file to server. Server receives request for boot image. Server responds to the request, and transport mechanism sends boot image to client.
\\
Description: A client has booted an automated installer boot archive and has initiated an installation, so the client requests the boot image with the intention of downloading and mounting it. The transport mechanism sends the request to the server. When the server receives the request from the client, it responds to the request. The transport mechanism delivers the boot image to the client.
\\
Success Case
##* Client succeeds in placing request.
##* Transport mechanism successfully transmits the request.
##* Server successfully locates the boot image and responds to the request.
##* The transport mechanism successfully delivers the boot image to the client.
\\
Failure Cases
##* Transport mechanism is unable to successfully transmit the request made by the client for the boot image.
##** The URL is not valid.
##** The server is no longer online.
##* Server cannot locate the boot image.
##** The boot image has been deleted.
##** The path to the location of the boot image no longer exists.
##* Transport mechanism is unable to successfully deliver the boot image to the client.
##** The client is no longer online.
\\
\\
## Use Case 2 - Client requests AI manifest file. Transport mechanism transmits request from client for the AI manifest file to server. Server receives request for the AI manifest file and locates the correct manifest based on information sent by the client. Server sends AI manifest.
\\
Description: The auto-installer SMF service executes on the client, so the client requests the AI manifest file with the intention of downloading it and using it to define the installation parameters for the client. The transport mechanism sends the request to the server. When the server receives the request from the client, it responds to the request. The transport mechanism delivers the AI manifest file to the client.
\\
Success Cases
##* Client succeeds in placing request.
##* Transport mechanism successfully transmits the request.
##* Server successfully locates the AI manifest file and responds to the request.
##* The transport mechanism successfully delivers the AI manifest file to the client.
\\
Failure Cases
##* Transport mechanism is unable to successfully transmit the request made by the client for the AI manifest file.
##** The URL is not valid.
##** The server is no longer online.
##* Server cannot locate the AI manifest file.
##** The AI manifest file has been deleted.
##** The path to the location of the AI manifest file no longer exists.
##* Transport mechanism is unable to successfully deliver the AI manifest file to the client.
##** The client is no longer online.
\\
## Use Case 3 - Client transmits, at specified increments, installation progress updates to a log file that is not on the client.
\\
Description: During installation, the AI client records the progress of the installation in a log file on the client. Incrementally, the information in the log file is synchronized into a second log file that is stored in another location.
\\
Note 1: the appropriate location for storing this file needs to be determined. This is dependent on how the observability is implemented.
\\
Note 2: this could be done through the transport mechanism. It's possible that it could be done with FTP, rsync, scp directly.
\\
Success Cases:
\\
Initial transmission:
a. Installation progress step is successfully added to client log on the client.
b. Client succeeds in placing a request to transfer log contents.
c. Transport mechanism successfully transmits the request to the location of second log file.
d. Second log file is created for the client log information since one does not exist.
e. Log information is appended to second client log file.
\\
Incremental Transmission:
a. Installation progress step is successfully added to client log on the client.
b. Client succeeds in placing a request to transfer log contents.
c. Transport mechanism successfully transmits the request to the location of second log file.
d. Log information is appended to second client log file.
\\
Final Transmission:
a. Installation progress step is successfully added to client log on the client.
b. Client succeeds in placing a request to transfer log contents.
c. Transport mechanism successfully transmits the request to the location of second log file.
d. Second log file location determined.
e. Log information is appended to second client log file.
\\
Failure Cases:
\\
Installation fails.
a. Installation failure is successfully added to client log.
b. Client succeeds in placing a request to transfer log contents.
c. Transport mechanism successfully transmits the request to the location of second log file.
d. Second log file location is created or located.
e. Log information is appended into second client log file.
\\
Installation fails.
a. Installation failure information is not successfully added to client log.
\\
Installation fails.
a. Installation failure information is successfully added to client log.
b. Client does not succeed in placing a request to transfer log contents.
\\
Installation fails.
a. Installation failure is successfully added to client log.
b. Client succeeds in placing a request to transfer log contents.
c. Transport mechanism fails to transmit the request to the location of second log.
\\
Installation fails.
a. Installation failure is successfully added to client log.
b. Client succeeds in placing a request to transfer log contents.
c. Transport mechanism successfully transmits the request to location of second log.
d. Second log file does not exist.
\\
Installation fails.
a. Installation failure is successfully added to client log.
b. Client succeeds in placing a request to transfer log contents.
c. Transport mechanism successfully transmits the request to the location of second log file.
d. Second log file located.
e. Log information fails to be appended into second client log file.
\\
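The incremental synchronization in Use Case 3 can be sketched as sending only the bytes added to the client log since the last successful transfer. This is an illustration under assumptions: the function name is hypothetical, and `send` stands in for whatever mechanism (the transport itself, or a tool such as scp) is eventually chosen.

```python
def sync_log(log_path, offset, send):
    """Send bytes appended to `log_path` since `offset`; return the new offset.

    `send` is a callable that takes the new bytes and raises on failure.
    If it raises, the exception propagates and the caller keeps the old
    offset, so the same data is retried at the next increment.
    """
    with open(log_path, "rb") as log:
        log.seek(offset)
        data = log.read()
    if not data:
        return offset  # nothing new to transmit
    send(data)
    return offset + len(data)
```

The caller stores the returned offset between increments, which keeps the second log file free of duplicated entries when a transmission fails and is repeated.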
\\
## Use Case 4 - Client boots and server derives a manifest based on attributes of the client
Note: This will be fleshed out when more information about derived manifests is available.
\\
\\
\\
## Use Case 5 - Modular transport mechanism: Peer to Peer (P2P)
\\
Goal: Implement a P2P Module for the transport mechanism.
\\
Overview: A user, Jill, wants to make use of a future capability of AI which involves the download of an arbitrary large file, ARB-FILE (in excess of 1GB). To reduce load on her internal server, she decides to implement a BitTorrent module as the method of transport for that specific file. This use case is meant to demonstrate the steps a user might go through to implement an additional transport mechanism.
\\
##* *Steps for adding the P2P module and enabling it:*
### Add BitTorrent client to the AI image(s) used. The user adds the BitTorrent package, SUNWtransmission, to a DC manifest and creates the image.
### Jill writes the client startup code, BT-Start, for beginning the transfer on the client. This is generic scripting or code that will be run by the AI client when it is ready to transfer the file. For this specific case, this startup code performs two actions:
#### Downloads a specific .torrent file from a known location. Note: This is a small file containing configuration information specific to ARB-FILE.
####* Alternate Scenario: The .torrent file is known and not likely to change, so Jill includes it in the AI client image.
#### Initiates the BitTorrent transfer protocol using the information in the .torrent file.
####* In addition, BT-Start provides access to "observability data" for the AI Client code, including information on:
####** Status of .torrent transfer
####** Tracker IP Address
####** Active connections to peers (IP addresses, upload/download speeds, etc)
####** Overall status of transfer (Percent complete, "total failure" such as losing connections to all peers with remaining required data, etc)
### Jill associates ARB-FILE with the transport startup script/code. During an AI install, if the AI client finds it needs ARB-FILE, it will execute the transport startup script/code to begin transfer.
### Jill writes the server code, BT-manage, for managing the "server" side of the BitTorrent protocol. For BitTorrent, this could mean the code will instruct the server to distribute the .torrent file via another transport, and to act as a tracker and seed for ARB-FILE when AI services requiring it are active. It also provides access to observability data such as which clients are currently being tracked, whether or not seeds are available, etc.
###* Alternate Scenario: Since the .torrent file is included on the AI client image, Jill does not need to host it anywhere.
###* Note: For transport mechanisms that support secure transfer, both this and the client code may provide hooks into managing the security configuration; or the security could be managed independently by Jill or other admins.
\\
\\
##* *AI Install Steps:*
### Initial boot done by TFTP. Client A requests Pxegrub from the server, which returns the Pxegrub file directly.
### Pxegrub requests menu.lst.
### TFTP boot begins. Requests for archive files occur.
### Client requests AI archive image from AI server (using original transport mechanisms).
### Client begins installation. At some point, installation requires ARB-FILE.
### Client checks hook for ARB-FILE, uses BT-Start code to begin transfer of ARB-FILE. AI Client code is unaware of underlying transport mechanism. However, AI Client code can check on status of transfer by using interfaces hooked into BT-Start.
### Upon receipt of complete ARB-FILE, BT-Start shuts down BitTorrent protocol and returns success.
###* Alternate Scenario: BT-Start returns success, but leaves BitTorrent protocol active, acting as a seed for other AI Clients coming up.
###* Alternate Scenario: At some point, BT-Start loses contact with the tracker. BitTorrent handles this failure internally by maintaining contact with existing peers, and reports a warning, but continues transfer.
###* Alternate Scenario: At some point, BT-Start receives a bad piece of data. BT-Start recognizes the data as corrupt, and contacts a peer to request re-transfer of that piece of data. BT-Start notes this to observability, but since it is recoverable, transfer continues.
###* Failure Scenario: BT-Start loses contact with all peers, seeds, and the tracker. Transfer of ARB-FILE cannot continue, and BT-Start reports failure to AI Client code.
### AI Client continues install normally.
\\
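The alternate and failure scenarios above suggest how BT-Start might summarize transfer state for the AI Client code. The sketch below is one possible reading; the status names and function signature are hypothetical, and `peers_connected` is assumed to count both peers and seeds.

```python
from enum import Enum


class TransferStatus(Enum):
    COMPLETE = "complete"
    IN_PROGRESS = "in progress"
    WARNING = "warning"   # recoverable problem, transfer continues
    FAILED = "failed"     # transfer cannot continue


def classify_transfer(percent_complete, peers_connected, tracker_reachable):
    """Map raw BitTorrent state onto the outcomes described in the scenarios."""
    if percent_complete >= 100:
        return TransferStatus.COMPLETE
    if peers_connected == 0 and not tracker_reachable:
        # No peers, no seeds, and no tracker to discover new ones.
        return TransferStatus.FAILED
    if not tracker_reachable:
        # Tracker lost, but existing peers can still supply data.
        return TransferStatus.WARNING
    return TransferStatus.IN_PROGRESS
```

Because the AI Client code only sees these generic states, it stays unaware of the underlying transport mechanism, as the install steps require.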
\\
## Use Case 6 - Client performs an AI Installation using the Null Transport
\\
Description: The client and the install data files are located on a local system. The client invokes an AI installation, and the transport mechanism interacts with the local system to access the required files.
\\
Success Cases:
### Client boots with the transport channel designated on the local system.
### Client requests and receives data files from the local system.
### Client executes SMF auto-installer method
### Client verifies data integrity
\\
\\
Failure Cases:
##* The boot image has been deleted.
##* The path to the location of the boot image no longer exists.
##* Transport mechanism is unable to successfully deliver the boot image to the client from the local system.
\\
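A null transport could be as small as a local file copy that reports the failure cases listed above. This is a sketch only; the function name is hypothetical and the specification does not define how the local system is accessed.

```python
import os
import shutil


def null_fetch(source_path, dest_path):
    """Hypothetical null transport: 'deliver' a file already on the local system."""
    if not os.path.isfile(source_path):
        # Covers both a deleted file and a path that no longer exists.
        raise FileNotFoundError(
            "resource not found on local system: " + source_path)
    # shutil.copyfile raises OSError if the copy itself cannot be completed.
    shutil.copyfile(source_path, dest_path)
    return dest_path
```

Since no network is involved, the remaining failure case (unable to deliver) reduces to an OSError raised while copying.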
\\
## Use Case 7 - Client installs from an archive. The archive could be local (DVD) or remote (network)
\\
Description: The installation could use either the interactive installer or the automated installer.
\\
A successful boot and start of the installation are assumed (they are covered by other use cases). The path to the archive is provided by the user. With the interactive installer, the user has an opportunity to enter the path to the archive (local or remote). With the automated installer, the archive is specified as part of the install manifest. The replication project is responsible for the design and implementation of the archive specification.
\\
Success Case:
1. The client gets the IP address and initial boot program, and boots successfully
2. The client downloads boot_archive and Solaris is booted successfully
3. The client gets the install archives successfully and starts the installation
4. The client requests the archive
5. The server sends the archive back successfully
6. The client validates the archive (tools provided by the replication project)
7. The client appends the log information to the log file
8. The install continues using the archive
\\
Failure case(s):
\\
If the transport cannot complete downloading the archive, it is a failure. It could be due to transmission error, incorrect file name or bad file content.
\\
Failure case 1 - File not found
1. The client gets the IP address and initial boot program, and boots successfully
2. The client downloads boot_archive and Solaris is booted successfully
3. The client gets the install archives successfully and starts the installation
4. The client requests the archive
5. The server returns an error "File not found"
6. The client logs the error to the log file and returns failure
7. The installation fails
\\
Failure case 2 - Transmission error
1. The client gets the IP address and initial boot program, and boots successfully
2. The client downloads boot_archive and Solaris is booted successfully
3. The client gets the install archives successfully and starts the installation
4. The client requests the archive
5. The client starts receiving the archive, but the transmission fails before the transfer completes
6. The client logs the error messages and other relevant information, and returns failure
7. The installation fails
\\
Failure case 3 - Bad file content
1. The client gets the IP address and initial boot program, and boots successfully
2. The client downloads boot_archive and Solaris is booted successfully
3. The client gets the install archives successfully and starts the installation
4. The client requests the archive
5. The server sends the archive back successfully
6. The client's validation of the archive fails (tools provided by the replication project)
7. The client logs the error messages and other relevant information, and returns failure
8. The installation fails


Copyright 1994-2009 Sun Microsystems, Inc.