64-bit Indexing Engine Processes Explained
|Article:HOWTO56293|||||Created: 2011-07-28|||||Updated: 2014-02-21|||||Article URL http://www.symantec.com/docs/HOWTO56293|
The purpose of this document is to define and describe the new processes introduced by the 64-bit Indexing Engine. With the move to a new indexing model, many additional indexing-related processes now run alongside the legacy 32-bit processes.
Typically, each index volume being ingested into, whether during live archiving and indexing or during an index rebuild or upgrade, spawns a number of processes, which are described later in this document.
In addition, the 64-bit Indexing Engine will consume as much of a system's CPU resource as it can in order to process as many concurrent index volumes as possible. The number of concurrent volumes being processed can be controlled via the Enterprise Vault Server extended setting "Maximum concurrent indexing capacity". A change to this setting requires a restart of the Indexing Service.
When the default number of concurrent volumes (30) is set, it is not uncommon to see up to 300 indexing processes loaded. To understand more about how to monitor the performance of indexing activities, please refer to HOWTO56264.
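The relationship between the concurrency setting and the process count can be estimated with simple arithmetic. The sketch below assumes roughly 10 processes per active index volume, a ratio inferred only from the figures above (300 processes at 30 volumes). The real count per volume varies, and the function name is illustrative rather than part of any Enterprise Vault tooling.

```python
def estimate_indexing_processes(concurrent_volumes, processes_per_volume=10):
    """Rough upper bound on indexing processes for a given concurrency setting.

    processes_per_volume=10 is an assumption inferred from 300 processes at
    the default of 30 concurrent volumes; it is not a documented constant.
    """
    if concurrent_volumes < 0 or processes_per_volume < 0:
        raise ValueError("counts must be non-negative")
    return concurrent_volumes * processes_per_volume


if __name__ == "__main__":
    for volumes in (10, 20, 30):
        total = estimate_indexing_processes(volumes)
        print(f"{volumes} concurrent volumes: about {total} processes")
```

An estimate like this may help when deciding how far to lower the concurrency setting on a server that also hosts other services.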
What follows is a list of all these new processes along with a description of what they are responsible for.
The collection broker is responsible for loading and unloading index volumes (collections) from memory to best utilise system resources. The collection broker also ensures that if system resources are fully utilised and a request for an end user search is received, an index volume loaded for a less time critical operation, such as ingestion, is unloaded so that the search can be executed in a timely fashion. The maximum number of index volumes that the collection broker will load into memory at any given time is determined intelligently based upon the available system resources.
If the collection broker receives a request to load an Index Volume for data ingest/modification and the maximum number of Index Volumes is already loaded, the request is stored in the respective Index Volume’s offline queue. The offline queue is processed as soon as the Index Volume is next loaded into memory and before any other ingest/modification requests are handled.
Index Volumes are unloaded either when they have been idle for a fixed period of time (if no requests to load additional Index Volumes have been received) or immediately following the completion of an indexing/search operation if there are more Index Volumes awaiting processing.
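The load, queue and unload behaviour described above can be illustrated with a toy model. This is a minimal sketch only: the class name, method names and the fixed `max_loaded` value are hypothetical (the real collection broker derives its limit from available system resources), and the idle-timeout unload is not modelled.

```python
from collections import deque


class CollectionBrokerSketch:
    """Toy model of the load/unload policy; not the real implementation."""

    def __init__(self, max_loaded):
        self.max_loaded = max_loaded  # assumption: fixed limit for the sketch
        self.loaded = {}              # volume -> purpose ("ingest" or "search")
        self.offline = {}             # volume -> deque of deferred ingest requests
        self.processed = []           # (volume, request) in processing order

    def _load(self, volume, purpose):
        self.loaded[volume] = purpose
        # The offline queue is drained before any new request is handled.
        queue = self.offline.pop(volume, deque())
        while queue:
            self.processed.append((volume, queue.popleft()))

    def request_ingest(self, volume, request):
        if volume not in self.loaded:
            if len(self.loaded) >= self.max_loaded:
                # At capacity: defer the request to the volume's offline queue.
                self.offline.setdefault(volume, deque()).append(request)
                return "queued"
            self._load(volume, "ingest")
        self.processed.append((volume, request))
        return "processed"

    def request_search(self, volume):
        if volume not in self.loaded and len(self.loaded) >= self.max_loaded:
            # Unload a volume that was loaded for less time-critical ingestion.
            victim = next((v for v, p in self.loaded.items() if p == "ingest"), None)
            if victim is None:
                return "busy"
            del self.loaded[victim]
        if volume not in self.loaded:
            self._load(volume, "search")
        else:
            self.loaded[volume] = "search"
        return "searched"

    def operation_complete(self, volume):
        # Unload immediately only when other volumes are awaiting processing.
        if self.offline and volume in self.loaded:
            del self.loaded[volume]
            waiting = next(iter(self.offline))
            self._load(waiting, "ingest")
```

With `max_loaded=2`, a third ingest request returns `"queued"`. Calling `operation_complete` on one of the loaded volumes then unloads it, loads the waiting volume, and processes its queued request first.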
The Collection Service manages the Indexer and Crawler services for a given Index Volume, starting and stopping them as required. The service also handles locking when the Indexer and Crawler services are active. One instance of the Collection Service is run for each individual active Index Volume.
Collection Dispatch Service
The Collection Dispatch Service has the same responsibilities as the generic Dispatch Service, but is responsible for activities on an Index Volume level.
The crawler service validates and prepares the index data to be added to the index volume. There will be one Crawler Service for each index volume currently active in memory.
The Dispatch service is responsible for starting other services utilised by the Indexing Engine.
The Index Admin Service (IAS) is the new Enterprise Vault Indexing Service. IAS is responsible for the overall administration of all indexing operations, including managing EVIndexVolumeProcessor and the legacy Index Broker, launching and shutting down the 64-bit Search Engine, and managing its configuration and error handling. Requests for indexing operations are relayed by IAS to either the 32-bit or 64-bit Indexing Engine as appropriate. As such, IAS is the bridge between the Indexing Engines and allows Enterprise Vault to forward all indexing work through one service.
IAS has a dedicated thread that directly checks the appropriate Enterprise Vault databases to determine whether there is any pending indexing work to be carried out on the Index Volumes associated with the Indexing Server on which it is running. These checks occur when the Indexing Service starts up and at predefined intervals during operation. The delays between checks are configurable via the extended Indexing settings on individual Indexing Services, with the default values as follows:
- Frequency of checks for index volumes to process: 1 hour
- Frequency of checks for failed volumes: 6 hours
- Frequency of full checks for index volumes to process: 10 hours
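To show how the three default intervals interact, the sketch below works out which checks fall due at a given time after the Indexing Service starts. The dictionary keys and the function name are illustrative and are not the real extended-setting identifiers. It also assumes that each check runs at start-up and then at every whole multiple of its interval, which may not match the actual scheduling.

```python
from datetime import timedelta

# Default delays taken from the list above; key names are hypothetical.
DEFAULT_CHECK_INTERVALS = {
    "volumes_to_process": timedelta(hours=1),
    "failed_volumes": timedelta(hours=6),
    "full_volumes_to_process": timedelta(hours=10),
}


def checks_due(elapsed, intervals=DEFAULT_CHECK_INTERVALS):
    """Return the names of the checks that fall due `elapsed` after start-up.

    Assumption: a check is due at start-up (elapsed of zero) and at each
    whole multiple of its interval thereafter.
    """
    return sorted(
        name
        for name, interval in intervals.items()
        if elapsed % interval == timedelta(0)
    )


if __name__ == "__main__":
    for hours in (0, 1, 6, 10, 30):
        print(hours, "h:", checks_due(timedelta(hours=hours)))
```

Under these assumptions all three checks coincide at start-up and again every 30 hours, the least common multiple of the default intervals.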
Index Query Server (IQS) encapsulates all the new 64-bit Search Engine functionality exposed by the 64-bit Indexing Service. This service interfaces directly with the 64-bit Search Engine and transforms Enterprise Vault and other accepted query formats into the native form used by the 64-bit Search Engine. Upon receiving results from the Search Engine, IQS transforms the result set into Enterprise Vault formatted search results. IQS is not involved in the handling of search queries targeting legacy index volumes; these are processed by IndexServer (32-bit) as before.
Index Volume Processor (IVP) is a single, fully 64-bit process that interacts with the local 64-bit indexing engine to push data and manage index volumes. By default, the IVP processes a maximum of 30 index volumes at any one time. This limit is configured using the “Maximum Concurrent Indexing Capacity” extended setting on each Enterprise Vault server hosting an Indexing Service. For each volume, there will be 1 or 2 threads pushing item additions, deletions or updates to the indexing engine, and a single thread retrieving item additions from Storage Crawler. IVP is launched and managed by the Index Admin Service (IAS). The process periodically scans the engine for asynchronous item errors and retries failed additions, deletions and updates where appropriate. Failed actions for 64-bit index volumes are stored in one of three new tables in the vault store database: ItemAdditionStatusLog, ItemDeletionStatusLog and ItemUpdateStatusLog.
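The effect of a concurrency cap such as the one above can be sketched with a bounded thread pool. This is an illustration of the general technique, not of how the IVP is implemented: the function name, the `handler` callback and the use of one worker per volume are all assumptions.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

MAX_CONCURRENT_VOLUMES = 30  # default value stated in the text


def process_volumes(volumes, handler, capacity=MAX_CONCURRENT_VOLUMES):
    """Run handler(volume) for every volume with at most `capacity` in flight.

    Returns the peak number of volumes that were being processed at once.
    """
    lock = threading.Lock()
    state = {"active": 0, "peak": 0}

    def run(volume):
        with lock:
            state["active"] += 1
            state["peak"] = max(state["peak"], state["active"])
        try:
            handler(volume)
        finally:
            with lock:
                state["active"] -= 1

    # The pool size is what enforces the cap: extra volumes wait for a free worker.
    with ThreadPoolExecutor(max_workers=capacity) as pool:
        list(pool.map(run, volumes))
    return state["peak"]
```

However many volumes are submitted, the returned peak never exceeds `capacity`. Volumes beyond the cap wait their turn rather than adding load.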
Execute-Worker processes are spawned by the Crawler Service to carry out tasks such as document conversion.
The Indexer Service is responsible for storing the documents converted by the crawler service into the actual high performance index volumes on disk. There will be one Indexer Service for each index volume currently active in memory.
The Query Server accepts formatted velocity queries from the EVIndexQueryServer and then contacts the Collection Service associated with the Index Volume to be searched. The Query Server then forwards the query to the Indexer Service to perform the search on the appropriate index volume(s). The Query Server also aggregates the results returned by the index volumes being searched and serves the results back up to the EVIndexQueryServer. Only one Query Server Service runs on each Enterprise Vault Indexing Server.
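The fan-out and aggregation step can be sketched as follows. The function name, the `search_volume` callback (standing in for the Indexer Service) and the `(score, item_id)` result shape are hypothetical. Ranking the merged hits by score is also an assumption, since the description above says only that results are aggregated.

```python
def aggregate_search(query, volumes, search_volume, limit=10):
    """Run `query` against each volume and merge the hits into one list.

    `search_volume(volume, query)` is expected to return an iterable of
    (score, item_id) tuples for that volume.
    """
    hits = []
    for volume in volumes:
        for score, item_id in search_volume(volume, query):
            hits.append((score, volume, item_id))
    # Assumption: highest score first; ties keep their original order.
    hits.sort(key=lambda hit: hit[0], reverse=True)
    return hits[:limit]


if __name__ == "__main__":
    # Fake per-volume results used only to exercise the sketch.
    fake_index = {
        "vol1": [(0.9, "item-a"), (0.2, "item-b")],
        "vol2": [(0.5, "item-c")],
    }
    results = aggregate_search(
        "budget", ["vol1", "vol2"], lambda volume, query: fake_index[volume]
    )
    for score, volume, item_id in results:
        print(score, volume, item_id)
```

The caller receives a single result list regardless of how many index volumes were searched, which mirrors the role the Query Server plays for the EVIndexQueryServer.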