Data Loss Prevention

The SQL Preindexer 

Jun 19, 2014 03:24 AM

This article describes how to use the SQL Preindexer utility, which generates an index directly from a SQL database. The SQL Preindexer is always used together with the Remote EDM Indexer utility: it runs the database query and then pipes the results to the Remote EDM Indexer. It is installed in the \Vontu\Protect\bin directory during installation of the Remote EDM Indexer.

Before running the SQL Preindexer, read the following article about the Remote EDM Indexer:

https://www-secure.symantec.com/connect/articles/remote-edm-indexer

The SQL Preindexer runs from the command line. If you are on Linux, switch to the "protect" user before running the SQL Preindexer. (The installation program creates the "protect" user.) The SQL Preindexer supports only Oracle databases.

An example of a command to run the SQL Preindexer follows. The SQL Preindexer runs a SQL query to capture the name and the salary data from the employee data table in the Oracle database. This example shows how to pipe the output of the SQL query to the Remote EDM Indexer. The Remote EDM Indexer indexes the results using the ExportEDMProfile.edm profile. The generated index files are stored in the EDMIndexDirectory folder.

SqlPreindexer -alias=@//myhost:1521/orcl -username=scott -password=tiger -query="SELECT name, salary FROM employee" | RemoteEDMIndexer -profile=C:\ExportEDMProfile.edm -result=C:\EDMIndexDirectory\

Because you pipe the output from the SQL Preindexer to the Remote EDM Indexer, review the section about Remote EDM Indexer command functions and options.
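
When the pipe has to be launched from a script rather than typed at a prompt, the two utilities can be chained with Python's standard subprocess module. The following is a minimal sketch, not a tool shipped with the product. The function name run_pipeline is hypothetical, and the sketch assumes that both utilities can be started directly under the names shown (for example, when the script runs from the \Vontu\Protect\bin directory). The argument values are the ones from the example command above.

```python
import subprocess


def run_pipeline(preindexer_cmd, indexer_cmd):
    """Run preindexer_cmd and pipe its stdout into indexer_cmd.

    Both arguments are argv lists. Returns the exit codes as a tuple
    (preindexer, indexer). This mirrors `preindexer_cmd | indexer_cmd`
    at a command prompt.
    """
    producer = subprocess.Popen(preindexer_cmd, stdout=subprocess.PIPE)
    consumer = subprocess.Popen(indexer_cmd, stdin=producer.stdout)
    # Close our copy of the pipe so the producer is notified
    # if the consumer exits early.
    producer.stdout.close()
    consumer_rc = consumer.wait()
    producer_rc = producer.wait()
    return producer_rc, consumer_rc


# Argument lists taken from the example command in this article.
# Each list element is one argument, so the query needs no extra quoting.
PREINDEXER_CMD = [
    "SqlPreindexer",
    "-alias=@//myhost:1521/orcl",
    "-username=scott",
    "-password=tiger",
    "-query=SELECT name, salary FROM employee",
]
INDEXER_CMD = [
    "RemoteEDMIndexer",
    r"-profile=C:\ExportEDMProfile.edm",
    "-result=C:\\EDMIndexDirectory\\",
]
```

Calling run_pipeline(PREINDEXER_CMD, INDEXER_CMD) starts both processes. Treating a nonzero exit code from either utility as a failure is a reasonable convention, although this article does not document the utilities' exit codes.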

SQL Preindexer command options:

The SQL Preindexer requires the -alias option and the -username option. If you omit the -query option, the utility indexes the entire database.

The SQL Preindexer command has the following options:

-alias
 Specifies the database alias used to connect to the database in the following format: @//localhost:port/sid

For example: @//myhost:1521/orcl

This option is required.
 
-driver
 Specifies the JDBC driver class (for example, oracle.jdbc.driver.OracleDriver).
 
-encoding
 Specifies the character encoding of the data to index. The default is iso-8859-1, but data with non-English characters should use UTF-8 or UTF-16.
 
-password
 Specifies the password to the database. If this option is not specified, the password is read from stdin.
 
-query
 Specifies the SQL query to run.
 
-query_path
 Specifies the file path that contains a SQL query to run. This option can be used as an alternative to -query when the query is a long SQL statement.
 
-separator
 Specifies whether the output column separator is a comma, pipe, or tab. The default separator is a tab. To specify a comma separator or pipe separator, enclose the character in quotation marks as in "," or "|".
 
-subprotocol
 Specifies the JDBC connect string subprotocol (for example, oracle:thin).
 
-username
 Specifies the name of the database user. This option is required.
 
-verbose
 Displays a statistical summation of the indexing operation when the index is complete.
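
The options above can be assembled programmatically, which helps catch a missing required option before the utility is started. The following is a minimal sketch under stated assumptions: the function build_preindexer_args is hypothetical, -verbose is assumed to be a bare flag that takes no value, and -query and -query_path are assumed to be mutually exclusive because the article describes one as an alternative to the other.

```python
def build_preindexer_args(alias, username, password=None, query=None,
                          query_path=None, separator=None, encoding=None,
                          verbose=False):
    """Return an argv list for SqlPreindexer from the options listed above."""
    if not alias or not username:
        raise ValueError("-alias and -username are required")
    if query and query_path:
        # Assumption: the two options are alternatives, not combined.
        raise ValueError("use either -query or -query_path, not both")
    if separator not in (None, ",", "|", "\t"):
        raise ValueError("separator must be a comma, pipe, or tab")

    args = ["SqlPreindexer", f"-alias={alias}", f"-username={username}"]
    if password is not None:
        # When -password is omitted, the utility reads it from stdin.
        args.append(f"-password={password}")
    if query:
        args.append(f"-query={query}")
    if query_path:
        args.append(f"-query_path={query_path}")
    if separator in (",", "|"):
        # Tab is the default separator, so it needs no option.
        args.append(f"-separator={separator}")
    if encoding:
        args.append(f"-encoding={encoding}")
    if verbose:
        # Assumption: -verbose is a bare flag without a value.
        args.append("-verbose")
    return args
```

For example, build_preindexer_args("@//myhost:1521/orcl", "scott", query_path="employee_query.sql", separator=",", encoding="UTF-8", verbose=True) produces an argument list that reads a long query from the hypothetical file employee_query.sql. Because the result is a list rather than a shell string, the comma is passed without the quotation marks that a command prompt requires.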
 

Troubleshooting preindexing errors:

You may encounter errors when you index large amounts of data. Often the data set contains records that are incomplete, inconsistent, or inaccurate. Rows that contain more columns than expected, or columns with the wrong data type, often cannot be indexed properly and are not recognized.

The SQL Preindexer can provide a summary of the indexing operation when it completes. To enable the summary, specify the -verbose option when you run the SQL Preindexer.

To see the rows of data that the Remote EDM Indexer did not index, adjust the configuration in the Indexer.properties file using the following procedure.

To record the data rows that were not indexed:

1] Locate the Indexer.properties file at \Program Files\Vontu\Protect\config\Indexer.properties (Windows) or /opt/Vontu/Protect/config/Indexer.properties (Linux).
2] Open the file in a text editor.
3] Locate the create_error_file property and change its value from false to true.
4] Save and close the Indexer.properties file.

The Remote EDM Indexer logs errors in a file that has the same name as the data file being indexed, with the .err suffix.
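
The manual procedure above can also be scripted when several indexer hosts need the same change. The following is a minimal sketch: the function enable_error_file is hypothetical, and it assumes that the property is stored as an uncommented key/value line such as create_error_file = false, as is usual for Java properties files.

```python
import re

# Matches an uncommented "create_error_file = false" line and captures the
# text before the value and any trailing whitespace or carriage return.
_PATTERN = re.compile(
    r"^([ \t]*create_error_file[ \t]*[=:][ \t]*)false([ \t]*\r?)$",
    re.MULTILINE | re.IGNORECASE,
)


def enable_error_file(properties_path):
    """Set create_error_file to true in the given Indexer.properties file.

    Returns True if the file was changed, and False if the property was
    not found with the value false (for example, it is already true).
    """
    # newline="" keeps the file's original line endings untouched.
    with open(properties_path, encoding="iso-8859-1", newline="") as handle:
        text = handle.read()
    new_text, count = _PATTERN.subn(r"\g<1>true\g<2>", text)
    if count == 0:
        return False
    with open(properties_path, "w", encoding="iso-8859-1", newline="") as handle:
        handle.write(new_text)
    return True
```

Back up Indexer.properties before running a script like this, and pass the path for your platform as listed in step 1.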

The rows of data that are listed in the error file are not encrypted. Safeguard the error file to minimize any security risk from data exposure.
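
One way to safeguard the error file on Linux is to restrict its permissions as soon as indexing finishes. The following is a minimal sketch: the function restrict_error_file is hypothetical, the location of the .err file must be supplied by you, and the row count assumes that the file holds one rejected row per line, which this article does not confirm.

```python
import os
import stat


def restrict_error_file(err_path):
    """Limit an .err file to owner read/write and return its line count."""
    # On POSIX systems this sets mode 0600. On Windows, os.chmod can only
    # toggle the read-only flag, so use file ACLs there instead.
    os.chmod(err_path, stat.S_IRUSR | stat.S_IWUSR)
    with open(err_path, "rb") as handle:
        # Assumption: each line of the error file is one rejected row.
        return sum(1 for _ in handle)
```

Delete the error file once the rejected rows have been reviewed, because its contents are stored in clear text.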

 
