Overview of Indexing and Search Service
This information provides an overview of the Indexing and Search Service (ISS), including a description of the product architecture and discussion of the major components.
Indexing and Search Service Software Architecture
The following figure shows the high-level overview of the ISS software architecture.
Sun Java Indexing and Search Service High-Level Architecture

This figure shows that ISS is composed of an indexing service and a search service. The indexing service indexes data repositories in real time. Indexing is provided through web services, enabling you to index arbitrary data. Clients of ISS consume RESTful web services that provide the search capabilities.
The following figure shows a more detailed look at components of the ISS software architecture.
Sun Java Indexing and Search Service Detailed Architecture

| Note Two JMS servers are present, one for Messaging Server notifications and one for internal ISS communication. In a simple configuration, the JMQ broker for Messaging Server notifications runs on the Messaging Server system and the JMQ broker for ISS runs on the ISS system. Indexing services communicate only with the ISS JMQ server. |
How ISS Searches Messages
The Sun Java Indexing and Search Service Architecture figure shows that IMAP clients perform searches on the ISS store by first connecting to the Messaging Server IMAP daemon. The IMAP SEARCH ISS gateway component of Messaging Server (available as of the Messaging Server 7 Update 2 release) diverts appropriate searches to ISS. A search is sent to ISS if it includes the to, cc, or from fields, or the subject or body, and does not use functionality that ISS currently does not support, such as an OR request. If a problem occurs while obtaining a response from ISS, Messaging Server handles the search itself as a fallback.
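The routing rule described above can be modeled with a short Python sketch. This is an illustration only, not the Messaging Server implementation: the function name, the callables `iss_search` and `local_search`, and the contents of `UNSUPPORTED_BY_ISS` (the text names OR only as one example) are assumptions.

```python
# Hypothetical model of the IMAP SEARCH gateway decision; all names are
# illustrative assumptions, not Messaging Server code.
ISS_FIELDS = {"to", "cc", "from", "subject", "body"}
UNSUPPORTED_BY_ISS = {"or"}  # the text gives OR as one example


def route_search(search_keys, iss_search, local_search):
    """Send a qualifying search to ISS, otherwise to Messaging Server."""
    keys = {key.lower() for key in search_keys}
    qualifies = bool(keys & ISS_FIELDS) and not (keys & UNSUPPORTED_BY_ISS)
    if qualifies:
        try:
            return iss_search(search_keys)
        except Exception:
            # A problem occurred while obtaining a response from ISS,
            # so fall through to the Messaging Server search.
            pass
    return local_search(search_keys)
```

In this sketch a search on FROM goes to ISS, while a search that also uses OR, or one for which the ISS call raises an error, is answered by the local fallback.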
From the client perspective, the ability to use ISS search is seamless as the client continues to communicate over the IMAP protocol with Messaging Server. However, by communicating with ISS directly, mail clients have the ability to conduct different and faster search queries. For example, current IMAP searches of an entire mailbox encounter a performance issue, as the search must proceed one folder at a time, then collate the results. In ISS, you do not need to specify a folder argument, thus enabling you to obtain the results faster. ISS also gives you the benefit of conducting "fuzzy" and "near" searches, which are not supported in IMAP SEARCH syntax. Further, ISS enables you to search attachments, for example, return all PDF attachments. Searching attachment types is not possible in IMAP syntax.
Thus, ISS supports two kinds of searches: regular mail search and attachment search. For attachment search, the ISS storeui component (at the Application Server level) returns thumbnail images plus links to the actual images in the ISS store.
The ISS architecture enables two ways of conducting a search. Either the mail client communicates directly with ISS by using the RESTful web service (deployed in an Application Server web container), or Messaging Server communicates with the ISS interface. In a large deployment, you can load-balance the ISS URL by using either a hardware load balancer or DNS-based load balancing. The load balancer distributes requests to the Application Server instances running ISS. Search queries are posted to a search service JMS topic. At the back end, the search is picked up by the search service consumer that handles that user. Only the search store instance responsible for that particular user responds. The search service consumer performs the search request and returns the results to the client.
How ISS Indexes Messages
You must bootstrap user accounts to enable ISS to index the users' email. To bootstrap accounts, run the indexSvcBootstrap.sh script on the message store instance to be indexed. This script triggers the ISS Crawler to connect to the Messaging Server message store by using the IMAP protocol. The Crawler obtains the list of folders for each user, walks through each folder, downloads the email, and adds it to the ISS store.
After initial bootstrapping of accounts, indexing of new messages in the ISS store actually begins when an email message change occurs in the Messaging Server message store. Email events that are significant for ISS include:
- Arrival of a new email message
- Deleting an email message
- Viewing (reading) an email message
- Setting an email message flag
- Creating a new folder
- Moving an email message to a new folder
These events generate JMQ notifications containing the type of change. The JMS Producer (actually the jmqnotifyplugin) posts the notification message to the Event Notification Queue (the imqbroker that you configure on Messaging Server). On the ISS side, the JMQ Consumers (MS Event Consumers) listen to the Event Notification Queue. Each event is tagged with the user who generated it. Thus, the ISS store instance that serves that particular user (that is, the store instance that the user is on) takes the message and processes it.
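The per-user handling of events can be pictured with the following minimal Python sketch. It is a simplified in-memory model, not the ISS or JMQ API: the `StoreInstance` class, the event dictionary keys `user` and `type`, and the `publish` helper are all hypothetical.

```python
# Simplified in-memory model; class, keys, and helper are hypothetical.
class StoreInstance:
    def __init__(self, name, users):
        self.name = name
        self.users = set(users)
        self.processed = []

    def on_event(self, event):
        """Process the event only if this instance serves the tagged user."""
        if event["user"] not in self.users:
            return False
        self.processed.append((event["user"], event["type"]))
        return True


def publish(event, instances):
    """Offer the event to every consumer; return the names that handled it."""
    return [inst.name for inst in instances if inst.on_event(event)]
```

With two instances, one serving a user named alice and the other serving bob, an event tagged with alice is processed only by the first instance.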
When a user receives a new email message in the message store, an event notification is generated, which attempts to fit the entire text of the email message into its payload (event message). ISS can then index the whole new message directly from the event. Because the event message attempts to contain all the text of the email message, ISS usually does not need a separate IMAP fetch. Additionally, Messaging Server does not have to perform extra work on its end, as the email message is in memory when the event happens.
When you configure JMQ, you can set the size of the event message body. Currently, the ISS configuration instructions describe setting the message body at 256 Kbytes. When the message size is larger than the configured size, the original message must be retrieved over IMAP.
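The size rule can be sketched in Python as follows. The event layout (a dictionary with `size`, `body`, and `uid` keys) and the `fetch_over_imap` callable are assumptions made for illustration; the source does not describe how ISS detects an incomplete payload.

```python
# 256 Kbytes, the value described in the ISS configuration instructions.
MAX_EVENT_BODY_BYTES = 256 * 1024


def message_text_for_indexing(event, fetch_over_imap):
    """Return the full text of a new message for indexing.

    `event` is a hypothetical dict with 'size' (full message size in
    bytes), 'body' (text carried in the event payload), and 'uid'.
    """
    if event["size"] <= MAX_EVENT_BODY_BYTES:
        # The whole message fit in the event payload.
        return event["body"]
    # The message exceeds the configured size: retrieve it over IMAP.
    return fetch_over_imap(event["uid"])
```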
If a user copies a message or sets a message flag, the event notification message contains all information that ISS needs to update the ISS store. ISS does not need to download any more information to keep its store in sync with the Messaging Server message store.
If the event notification is for a new email message that has arrived in a user's mailbox, ISS passes it to the Parser/Converter for processing. The message is separated into the fields that ISS indexes. Attachments are separated from the body text for processing by the Converter. As long as ISS has a plugin for the attachment type, it extracts the "meaningful" text. The ISS HTML plugin indexes only text outside of HTML tags. That is, the HTML plugin ignores HTML markup, and indexes only the content.
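To illustrate what "indexing only text outside of HTML tags" means, here is a minimal Python sketch that uses the standard library `html.parser` module. It is not the ISS HTML plugin; the class and function names are hypothetical, and unlike a production converter it does not skip the contents of script or style elements.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect character data and ignore the markup itself."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


def visible_text(html):
    """Return the text content of an HTML fragment, without tags."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)
```

For the fragment `<p>Hello <b>world</b></p>`, this function returns `Hello world`.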
In the case of text, PDF, or OpenOffice attachments, the ISS plugins convert the format to text content. Additionally, ISS discards stop words such as "the." Only some of the attachments that ISS indexes are actually saved to the attachment store. The reason for this restriction is that some attachments do not have thumbnail images and so it does not make sense to store them. For example, ISS does not store thumbnail images for .txt and .xml attachments. ISS does support indexing Microsoft Office documents, including Word, Excel, PowerPoint, and Visio, in the attachment store.
ISS Security and Authentication
The files in the ISS attachment store and index are owned by the ISS user whom you specified during configuration. This is analogous to the Messaging Server user owning the files in the message store. Regular users can read only their own files in the store. They cannot access other users' files.
To search mail in the ISS store, users must authenticate to LDAP to be able to use the RESTful web service. A second means of authentication is for the Messaging Server host itself to authenticate to ISS through the mail.server.ip property, which you specify during configuration and which is defined in the jiss.conf file. This verification grants the Messaging Server host or hosts, identified by host IP address, access to the RESTful web service.
| Note A future release of ISS might include the ability for the root user to use proxy authentication. |
When securing your ISS deployment, be sure to change the passwords for the JMQ guest and admin users, as shown in . If necessary, you can also configure JMQ to use SSL, though this configuration has not officially been tested yet.
Search Query and Sort Criteria
For guidance on generating search queries to Indexing and Search Service (ISS), see Communications Suite 7 Installation Scenario - Indexing and Search Service.
About Search Results and Pattern Matching
The search web service allows for pagination of results through the start and count parameters, but when the queries come in from Messaging Server, the count is always set to the max (count=2147483647).
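As a rough illustration of pagination, the following Python sketch builds a paged request URL. Only the start and count parameter names and the maximum count value come from the text above; the query parameter name `q`, the base URL, and a zero-based start are assumptions, so check the web service documentation for the actual request format.

```python
from urllib.parse import urlencode

# Value that Messaging Server always sends (the largest 32-bit signed int).
MAX_COUNT = 2147483647


def search_url(base, query, start=0, count=MAX_COUNT):
    """Build a search URL; 'q' is a hypothetical query parameter name."""
    params = {"q": query, "start": start, "count": count}
    return f"{base}?{urlencode(params)}"
```

A direct client could request the second page of ten results with `search_url(base, "apple", start=10, count=10)`, whereas a query arriving through Messaging Server corresponds to the default count.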
If you are not seeing all the search results that you think you should be receiving, it could be because ISS is not doing the search the same way as IMAP does. A partial match does not provide a match unless you provide a wildcard character to the search web service. Currently, this capability is not available through the Messaging Server search integration.
That is, searching for "apple" does not match "apples," but searching for "apple*" matches both "apple" and "apples." Currently, you can use the wildcard only if you call the RESTful web service directly and omit the double quotes around the term; it does not work if you search by using IMAP. Messaging Server currently encloses the search terms in double quotes, so if you enter apple* in your IMAP client, ISS receives it as the quoted term "apple*" and does not interpret the * as a wildcard.
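The matching behavior described above can be summarized in a small Python model. This is a sketch of the semantics only, not ISS code: the function name and the `quoted` flag are hypothetical, and it ignores case handling and any other analysis that ISS performs.

```python
def term_matches(query, word, quoted=False):
    """Model of whole-term versus trailing-wildcard matching."""
    if not quoted and query.endswith("*"):
        # Unquoted trailing *: match any word with this prefix.
        return word.startswith(query[:-1])
    # Quoted or plain term: the whole word must match.
    return word == query
```

Here `term_matches("apple", "apples")` is false and `term_matches("apple*", "apples")` is true, while passing `quoted=True` (as happens when the term arrives through Messaging Server) makes the `*` literal again.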
To experiment with the richer search experience from "talking to" the web services directly, go to the following URLs:
- http://iss-host:8080/rest
- http://iss-host:8080/searchui/
ISS provides sample searches at these URLs. These searches talk directly to the web service and utilize the thumbnail images.

