Enterprise Vault Tool - Identify Indexes with Missing Attachment Information

Article:TECH203995  |  Created: 2013-03-18  |  Updated: 2013-10-22  |  Article URL http://www.symantec.com/docs/TECH203995
Article Type
Technical Solution


Issue



The IndexingAttachmentAnalyser tool should be used to identify and subsequently re-index any items archived on the current Enterprise Vault (EV) Server that are impacted by the issue described within TECH203789.  
 
Dependencies:
Before running the tool, ensure that the following prerequisites are met. The tool does not check them:
 
  • If running EV 10.0.2 or EV 10.0.3, the attachment property hotfix must be applied.
  • Any on-going index subtasks (rebuilds or upgrades, verifies and move-locations) must be cancelled (i.e. stopped and deleted).
    • No new sub-tasks are to be created whilst the tool is running
  • No index locations are to be put into or taken out of backup mode whilst the tool is running
  • The tool must be deployed to run on each EV server running a Storage service
 
Note: The Directory Service and Indexing Service must be running on every EV server associated with the current Storage Server. Live indexing and searching will continue whilst the tool is running, so Index Locations do not need to be in backup mode.

Solution



Additional Notes:

Installation Instructions and Tool Workflow:
Extract the zip file to a suitable location on each Enterprise Vault Storage Server. The location must NOT be the Enterprise Vault installation path.

To run the tool, open a command prompt whilst logged in as the Vault Service account and navigate to the location of the tool.

The tool accepts three command-line arguments:

“-l <logging folder path>” (mandatory on the first run; subsequent runs pick up the path from the IndexingAttachmentAnalyser.exe.config file if it is not entered)
“-d <installation date override>” (optional, YYYY-MM-DD): overrides the EV 10.0.2/10.0.3 installation date. If the system has had both 10.0.2 and 10.0.3 installed, provide the installation date of 10.0.2.
“-f” (optional): enables automatic re-indexing of impacted items (if any)
 

Example: IndexingAttachmentAnalyser.exe -l c:\indexing-attachments-run
 

In this example, the log results are placed in the c:\indexing-attachments-run folder, and the tool will not automatically fix any impacted index volumes because -f is not specified. Upon loading, the tool will ask for the EV 10.0.2/10.0.3 installation date. The log files record which volumes are clear, impacted or require review, as well as the impacted items within each impacted volume.
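The argument combinations described above can also be assembled from a script, for example when the tool has to be launched on several Storage servers. The following Python sketch is illustrative only: the build_command helper is hypothetical and is not part of the tool. It uses only the three documented switches (-l, -d, -f).

```python
from datetime import date
from typing import List, Optional

TOOL = "IndexingAttachmentAnalyser.exe"


def build_command(log_dir: Optional[str] = None,
                  install_date: Optional[date] = None,
                  fix: bool = False) -> List[str]:
    """Return the argument list for one run of the tool (hypothetical helper)."""
    args = [TOOL]
    if log_dir is not None:
        # -l is mandatory on the first run; later runs may omit it.
        args += ["-l", log_dir]
    if install_date is not None:
        # -d expects the date as YYYY-MM-DD, which isoformat() produces.
        args += ["-d", install_date.isoformat()]
    if fix:
        # -f enables automatic re-indexing of impacted items.
        args.append("-f")
    return args


if __name__ == "__main__":
    # Report-only run, followed by a fix run against the same logging folder.
    print(" ".join(build_command(r"c:\indexing-attachments-run")))
    print(" ".join(build_command(r"c:\indexing-attachments-run", fix=True)))
```

The resulting list can be passed to subprocess.run from a command prompt session running as the Vault Service account.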
 

When run, the tool will perform a series of checks on the current server, and will stop if it determines that there are no impacted index volumes associated:

  • These conditions cause the tool to terminate:
    • EV version is less than 10.0.2
    • Storage server does not host saveset files archived before and since EV 8 was released.
       
  • For each index volume associated with archives on this EV server:
    • If no items have been indexed to the volume since 10.0.2 was installed, the volume is marked as not impacted
    • If the volume contains items with missing attachment properties, then it IS impacted. 
        
  • If the tool is in fix mode (-f) and there are impacted items, the items listed in ImpactedVolumes.xml will be re-inserted into the index volumes.
     
  • The tool can be re-run after the fix to validate that all impacted items have been successfully re-indexed
     
  • If the same log path is used in the command line, the tool will only re-verify the archives that had been identified as containing impacted items or could not be verified previously.
     

CTRL+C will stop the tool gracefully at any point. When the tool is re-run with the same logging folder path, or with no -l specified at all, previously verified and cleared archives (see the ClearArchives.xml file below for more detail) are not re-verified. Only unverified archives are picked up in the next run.
 

Note: The tool needs to be run on every EV Server with a Storage Service configured. The index volumes verified may not necessarily be located on the same EV Server.

Configuration Settings:
IndexingAttachmentAnalyser.exe.config contains some configurable settings that can be modified if necessary:

    <add key="NumberOfVerifierThreads" value="10" /> - Specifies the maximum number of concurrent index volume verifications (searches) to run
    <add key="NumberOfFixerThreads" value="5" /> - Specifies the maximum number of concurrent fix threads to run
    <add key="SqlCommandTimeoutSecs" value="600"/> - Specifies the maximum time, in seconds, allowed for any query made directly to SQL before it times out
    <add key="SearchTimeoutSecs" value="600"/> - Specifies the search timeout, in seconds, against an individual index volume
    <add key="RecordedLoggingFolderPath" value="<loggingfolderpath>" /> - This setting is only added after the first run of the tool. If the key exists and the -l parameter is not set, the value here is used
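To check the values currently in effect before a run, the settings can be read with a short script. The Python sketch below is an illustration under one assumption: that the <add> entries sit inside the usual .NET <configuration>/<appSettings> wrapper, which this article does not show. The read_settings helper and the sample text are hypothetical.

```python
import xml.etree.ElementTree as ET

# Sample content only. The <configuration>/<appSettings> wrapper is an
# assumption; the four keys and values are the defaults listed above.
SAMPLE_CONFIG = """<configuration>
  <appSettings>
    <add key="NumberOfVerifierThreads" value="10" />
    <add key="NumberOfFixerThreads" value="5" />
    <add key="SqlCommandTimeoutSecs" value="600"/>
    <add key="SearchTimeoutSecs" value="600"/>
  </appSettings>
</configuration>"""


def read_settings(config_xml: str) -> dict:
    """Return every <add key=... value=...> pair as a dictionary of strings."""
    root = ET.fromstring(config_xml)
    return {
        el.get("key"): el.get("value")
        for el in root.iter("add")
        if el.get("key") is not None
    }


settings = read_settings(SAMPLE_CONFIG)
# RecordedLoggingFolderPath only exists after the first run of the tool,
# so a missing key here means -l must be supplied on the command line.
needs_log_path = "RecordedLoggingFolderPath" not in settings
print(settings, needs_log_path)
```

For the real file, the text would be read from IndexingAttachmentAnalyser.exe.config in the folder where the tool was extracted.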


 




Outputs:
All output files are stored under the LoggingFolderPath, and are used to report and track progress. The ImpactedVolumes.xml file is then also used to determine which, if any, items need to be re-inserted.

Note: Do not modify the contents of any of the files created by the tool.

 

ClearArchives.xml

  • An xml list of all archives associated with the current EV Storage Server that have been verified as being not-impacted
     
<Archive Id="109A207B4554B424AB531FCF61BAB84FC1110000EVServer" Name="Paul Jones" APIdentity="-1" VaultStoreId="1B852BFF639BF024A81469D4046D3A1261210000jorasses-ev" />
 
  • This file is also used to checkpoint the progress of the tool, in case of being stopped mid-run. For this reason, this file is stored at the root of the LoggingFolderPath location.
     
  • If the tool is restarted using a LoggingFolderPath that contains a ClearArchives.xml file, the tool uses the contents of that file to determine which archives have already been verified as not-impacted, and does not check them again.

     

At the start of the run, a folder named "YYYY-MM-DD_hh_mm_ss" is created, which will contain the following files:

Report.log

  • Text report file associated with the latest tool run.
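Because each run writes into its own "YYYY-MM-DD_hh_mm_ss" folder, the Report.log of the most recent run can be located by parsing the folder names. The Python sketch below is illustrative only; it assumes that hh is a 24-hour value, and the helper names are hypothetical.

```python
from datetime import datetime
from typing import Iterable, Optional

# Assumption: "hh" in the folder name is a 24-hour value.
RUN_FOLDER_FORMAT = "%Y-%m-%d_%H_%M_%S"


def parse_run_folder(name: str) -> Optional[datetime]:
    """Return the run timestamp, or None if the name is not a run folder."""
    try:
        return datetime.strptime(name, RUN_FOLDER_FORMAT)
    except ValueError:
        return None


def latest_run_folder(names: Iterable[str]) -> Optional[str]:
    """Return the name of the most recent run folder, if there is one."""
    runs = [(parse_run_folder(n), n) for n in names]
    runs = [(ts, n) for ts, n in runs if ts is not None]
    return max(runs)[1] if runs else None


# ClearArchives.xml sits at the root of the logging folder and is ignored.
print(latest_run_folder(
    ["ClearArchives.xml", "2013-03-26_18_06_23", "2013-03-27_09_00_00"]
))
```

In practice the names would come from os.listdir on the logging folder path.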

 

ClearVolumes.xml

  • An xml list of all index volumes associated with the current EV Storage Server that have been verified as being not-impacted: 

<Archive Id="175AE04184A031945ABC49E53F1D516481110000EVServer" Name="shared" APIdentity="1" VaultStoreId="1EFD7A4C486A2B04F874D9E473A31DEF41210000EVServer">
 <IndexVolume Identity="1872" FirstISN="1" HighestISN="1587" />
 <IndexVolume Identity="1761" FirstISN="1588" HighestISN="2147483647" />
</Archive>

 

Note: The last Index Volume in the Archive will always have HighestISN="2147483647"  - this is to ensure that any items pending indexing are also included in the verification process.
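The value 2147483647 is the largest signed 32-bit integer (2^31 - 1), so it acts as an open-ended upper bound. The read-only Python sketch below lists the ISN range of each cleared volume and flags the open-ended one. The <Root> wrapper element and the volume_ranges helper are hypothetical, since this article does not show the root element of the file.

```python
import xml.etree.ElementTree as ET

OPEN_ENDED_ISN = 2**31 - 1  # 2147483647, used for the last volume of an archive

# The <Root> wrapper is an assumption; the Archive element is the sample above.
SAMPLE = """<Root>
<Archive Id="175AE04184A031945ABC49E53F1D516481110000EVServer" Name="shared" APIdentity="1" VaultStoreId="1EFD7A4C486A2B04F874D9E473A31DEF41210000EVServer">
 <IndexVolume Identity="1872" FirstISN="1" HighestISN="1587" />
 <IndexVolume Identity="1761" FirstISN="1588" HighestISN="2147483647" />
</Archive>
</Root>"""


def volume_ranges(xml_text: str) -> list:
    """Return (archive name, volume identity, first ISN, highest ISN, open-ended)."""
    root = ET.fromstring(xml_text)
    rows = []
    for archive in root.iter("Archive"):
        for vol in archive.iter("IndexVolume"):
            highest = int(vol.get("HighestISN"))
            rows.append((
                archive.get("Name"),
                vol.get("Identity"),
                int(vol.get("FirstISN")),
                highest,
                highest == OPEN_ENDED_ISN,
            ))
    return rows


for row in volume_ranges(SAMPLE):
    print(row)
```

The script only parses a copy of the text and never writes to the file, in line with the note that the tool's output files must not be modified.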
 

Review.xml

  • An xml list of index volumes that could not be fully verified. Reasons include:
    • The index volume failed to be searched
    • Hidden/offline index volumes
    • Volumes with long-pending items to process
       
  • If there are no index volumes that could not be verified, this file will not be created.
     
  • These index volumes will not be marked as clear, and subsequently will be re-verified on the next run of the tool.

 
ImpactedVolumes.xml

  • An xml list of impacted index volumes, along with the details of the individual impacted items.
     
  • If there are no impacted index volumes, this file will not be created.
     
  • Entering -f in the tool parameters will automatically reinsert items identified in this file.
     
  • The <Item SeqNum… /> identifies the items that are reinserted into the impacted index volumes:

<Archive Id="146B14BD8E81F4945A05469FB2BFFE1A31110000EVServer" Name="Stephen Smith" APIdentity="147" VaultStoreId="1ED574E4EF65735479862F611735D172D1210000EVServer">
 <IndexVolume Identity="1870" FirstISN="1" HighestISN="1986" ImpactedItems="59" TotalItems="1986">
  <Item SeqNum="1278" Id="201303264324331~201303261806230000~Z~C0EAD1E14CD3DC969A0EDC7872867F71" /> ...
 </IndexVolume>
</Archive>

 

  • If the tool is run without fix mode set, and subsequently identifies impacted volumes, the tool can be re-run to address the impacted items. The tool will re-verify the index volumes *not* recorded in ClearArchives.xml, and all impacted items will then be automatically re-indexed.
     
  • To perform this operation, use the same command line parameters as in the previous run, with the single addition of -f:

 IndexingAttachmentAnalyser.exe -l c:\indexing-attachments-run -f
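Before re-running with -f, it can be useful to see how many items each impacted volume reports. The read-only Python sketch below summarises the ImpactedVolumes.xml layout shown above. The <Root> wrapper and the impacted_summary helper are hypothetical, and the sample contains only the single Item that this article lists; the remaining items are elided in the article.

```python
import xml.etree.ElementTree as ET

# The <Root> wrapper is an assumption. Only one Item is included because
# the article elides the remaining ones.
SAMPLE = """<Root>
<Archive Id="146B14BD8E81F4945A05469FB2BFFE1A31110000EVServer" Name="Stephen Smith" APIdentity="147" VaultStoreId="1ED574E4EF65735479862F611735D172D1210000EVServer">
 <IndexVolume Identity="1870" FirstISN="1" HighestISN="1986" ImpactedItems="59" TotalItems="1986">
  <Item SeqNum="1278" Id="201303264324331~201303261806230000~Z~C0EAD1E14CD3DC969A0EDC7872867F71" />
 </IndexVolume>
</Archive>
</Root>"""


def impacted_summary(xml_text: str) -> list:
    """Return one dictionary per impacted index volume."""
    root = ET.fromstring(xml_text)
    summary = []
    for archive in root.iter("Archive"):
        for vol in archive.iter("IndexVolume"):
            summary.append({
                "archive": archive.get("Name"),
                "volume": vol.get("Identity"),
                "impacted": int(vol.get("ImpactedItems")),
                "total": int(vol.get("TotalItems")),
                # Sequence numbers of the Item elements present in the text.
                "seq_nums": [int(i.get("SeqNum")) for i in vol.iter("Item")],
            })
    return summary


for entry in impacted_summary(SAMPLE):
    print(entry)
```

As with the other output files, the script should only read ImpactedVolumes.xml; the tool itself uses that file to decide which items to re-insert.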

 

NOTE: The ImpactedVolumes.xml file may continue to list items that contain embedded attachments, such as tables or images, even after the update and fix have been performed. To validate the items that remain, an EVSVR operation can be performed to extract the full item from storage. Once the item has been extracted, it can be opened from the Recombined folder to verify that it contains an embedded attachment. For more information about this EVSVR operation, please refer to the Related Articles section.


Attachments

Generic_EV_Hotfix_Etrack_3119508.zip (872 kBytes)

Supplemental Materials

Source: ETrack
Value: 3119508
Description: Tool to identify indexes affected by TECH203789





