The Remote EDM Indexer 

May 02, 2014 01:36 AM

The Remote EDM Indexer is a utility that converts a comma-separated values (CSV) or tab-delimited data file into an Exact Data Matching (EDM) index. The utility is similar to the local EDM Indexer used by the Enforce Server. However, the Remote EDM Indexer is designed for use on a computer that is not part of the Symantec Data Loss Prevention server configuration.

Using the Remote EDM Indexer to index a data source on a remote machine has the following advantages over using the EDM Indexer on the Enforce Server:

It enables the owner of the data, rather than the Symantec Data Loss Prevention administrator, to index the data.

It shifts the system load that is required for indexing onto another computer. The CPU and RAM on the Enforce Server remain available for other tasks.

The SQL Preindexer is often used with the Remote EDM Indexer. The SQL Preindexer is used to run SQL queries against SQL databases and pass the resulting data to the Remote EDM Indexer.

About the SQL Preindexer

This section describes how to use the SQL Preindexer. The SQL Preindexer utility is always used with the Remote EDM Indexer utility. It is installed in the \Vontu\Protect\bin directory during installation of the Remote EDM Indexer. The SQL Preindexer utility generates an index directly from a SQL database: it runs the database query and then pipes the query results to the Remote EDM Indexer utility.

Read the sections about the Remote EDM Indexer in this article before running the SQL Preindexer.

The SQL Preindexer runs from the command line. If you are on Linux, change users to the "protect" user before running the SQL Preindexer. (The installation program creates the "protect" user.) The SQL Preindexer only supports Oracle databases.

An example of a command to run the SQL Preindexer follows. The SQL Preindexer runs a SQL query to capture the name and the salary data from the employee data table in the Oracle database. This example shows how to pipe the output of the SQL query to the Remote EDM Indexer. The Remote EDM Indexer indexes the results using the ExportEDMProfile.edm profile. The generated index files are stored in the EDMIndexDirectory folder.

SqlPreindexer -alias=@//myhost:1521/orcl -username=scott -password=tiger -query="SELECT name, salary FROM employee" | RemoteEDMIndexer -profile=C:\ExportEDMProfile.edm -result=C:\EDMIndexDirectory\

Because you pipe the output from the SQL Preindexer to the Remote EDM Indexer, review the section about Remote EDM Indexer command functions and options.
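For a scripted run, the same pipe can be set up from Python. The following is a minimal sketch, not part of the Symantec tooling: the helper name run_pipeline is hypothetical, and the two argument lists simply repeat the placeholder host, credentials, and paths from the example command. It assumes both utilities are on the PATH of the account that runs the script.

```python
import subprocess
import sys


def run_pipeline(producer_cmd, consumer_cmd):
    """Run producer_cmd and feed its stdout to consumer_cmd's stdin.

    Returns (producer_returncode, consumer_returncode, consumer_stdout).
    """
    producer = subprocess.Popen(producer_cmd, stdout=subprocess.PIPE)
    consumer = subprocess.Popen(
        consumer_cmd, stdin=producer.stdout, stdout=subprocess.PIPE
    )
    # Close our copy of the pipe so the producer is notified
    # if the consumer exits early.
    producer.stdout.close()
    output, _ = consumer.communicate()
    producer.wait()
    return producer.returncode, consumer.returncode, output


# Argument lists that mirror the example command (placeholder values).
SQL_PREINDEXER_CMD = [
    "SqlPreindexer",
    "-alias=@//myhost:1521/orcl",
    "-username=scott",
    "-password=tiger",
    "-query=SELECT name, salary FROM employee",
]
REMOTE_INDEXER_CMD = [
    "RemoteEDMIndexer",
    "-profile=C:\\ExportEDMProfile.edm",
    "-result=C:\\EDMIndexDirectory\\",
]
```

Calling run_pipeline(SQL_PREINDEXER_CMD, REMOTE_INDEXER_CMD) on the remote machine would then start both utilities; a nonzero return code from either one is worth checking before the index files are copied anywhere.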

The following procedure steps describe how to index a data file on a remote machine with the Remote EDM Indexer and then use the index in Symantec Data Loss Prevention.
 

Procedure Step 1 : Install the Remote EDM Indexer on a computer that is not part of the Symantec Data Loss Prevention system :

Installing the Remote EDM Indexer :

The Remote EDM Indexer is installed from the same installation program as the other Symantec Data Loss Prevention components. Copy the ProtectInstaller_11.1.exe file to the remote machine where the data that needs to be indexed resides. The Linux version of Symantec Data Loss Prevention has a text-based command console option in the installation program that can be used.

Installing from the command line (for Linux):

The following procedure describes how to install from the command line for Linux.

To install a Remote EDM Indexer

1] Log on as root and copy the ProtectInstaller_11.1.sh file to the /tmp directory on the computer.
2] Change the directory to /tmp by typing:
cd /tmp

3] You may need to change permissions on the file before you can run the file. If so, type:
chmod 775 ProtectInstaller_11.1.sh

4] Once the file permissions have been changed you can run the ProtectInstaller_11.1.sh file, by typing:
./ProtectInstaller_11.1.sh -i console

Once the console mode installation launches, the Introduction step is displayed. In most circumstances, accept the installation defaults. Press Enter to proceed to the next step.

5] In the Choose Install Set step, specify the component to install. To install the Remote EDM Indexer, type the number beside the option and press Enter.
6] In the Install Folder step, type the absolute path to the directory where you want to install the files. The default location can be selected by pressing Enter.
7] In the Pre-Installation Summary step, review the installation configuration that you have selected. If you are satisfied with the selections, press Enter to begin the installation. Or, type back and press Enter until you reach the step you want to change.
8] When the installation completes, press Enter to close the installer.

Uninstalling Remote Indexer on a Linux platform :

The files to uninstall the Remote EDM Indexer are located in the root level of the Symantec Data Loss Prevention installation directory. Follow this procedure to uninstall the utility on Linux.

To remove a Remote EDM Indexer from the command line

1] Log on as root and change to the Uninstall directory by typing:
cd /opt/Vontu/Uninstall

2] Run the Uninstall program by typing:
./Uninstall -i console

3] Follow any on-screen instructions.

Installing the Remote EDM Indexer on a Windows platform :

Note:  Symantec recommends that you disable any antivirus, pop-up blocker, and registry protection software before beginning the installation process.
 

To navigate through the installation process:

Click Next to display the next installation screen.

Click Back to return to the previous installation screen.

Click Cancel to terminate the installation process.

To install the Remote EDM Indexer on a Windows Platform

1] Go to the directory where you copied the ProtectInstaller_11.1.exe (Windows) or ProtectInstaller_11.1.sh (Linux) file.
In some circumstances, you may need to change the file permissions to access the file.

2] Run the installation program (either ProtectInstaller_11.1.exe or ProtectInstaller_11.1.sh).
The installer files unpack and the Welcome screen displays.

3] Click Next and then accept the Symantec Software License Agreement to continue.
4] Select Indexer from the list of components that appears and click Next.
5] On the Select Destination Directory screen, click Next to accept the default installation location (recommended). Alternately, click Browse to navigate to a different installation location, and then click Next.
6] For Windows, choose a Start Menu folder and then click Next.
7] The Installing screen appears and displays an installation progress bar. When you are prompted, click Finish to complete the installation.

Uninstalling Remote Indexer on a Windows platform :

The files to uninstall the Remote EDM Indexer are located in the root level of the Symantec Data Loss Prevention installation directory. Follow this procedure to uninstall the utility on Windows.

To uninstall Remote EDM Indexer from a Windows system

1] On the computer where the Indexer is installed, locate and run (double-click) the \Vontu\uninstall.exe program.
The uninstallation program begins and the Uninstall screen is displayed.

2] Click Next. When the uninstallation process is complete, the Uninstall Complete screen is displayed.
3] Click Finish to close the program.

Procedure Step 2 : Create an Exact Data Profile on the Enforce Server to use with the Remote EDM Indexer.
 

Creating an EDM profile for remote indexing :

The EDM Indexer uses an Exact Data Profile when it runs to ensure that the data is correctly formatted. You must create the Exact Data Profile before you use the Remote EDM Indexer. The profile is a template that describes the columns that are used to organize the data. The profile does not need to contain any data. After creating the profile, copy it to the computer that runs the Remote EDM Indexer.

To create an EDM profile for remote indexing

1] From the Enforce Server administration console, navigate to the Manage > Data Profiles > Exact Data screen.
2] Click Add Exact Data Profile.
3] In the Name field, enter a name for the profile.
4] In the Data Source field, select Use This File Name, and enter the name of the index file to create.
5] In the Number of Columns text box, specify the number of columns in the data source to be indexed.
6] If the first row of the data source contains the column names, select the option Read first row as column names.
7] In the Error Threshold text box, enter the maximum percentage of rows that can contain errors.
If, during indexing of the data source, the number of rows with errors exceeds the percentage that you specify here, the indexing operation fails.

8] In the Column Separator Char field, select the type of character that is used in your data source to separate the columns of data.
9] In the File Encoding field, select the character encoding that is used in your data source.
If Latin characters are used, select the ISO-8859-1 option. For East Asian languages, use either the UTF-8 or UTF-16 options.

10] Click Next to map the column headings from the data source to the profile.
11] In the Field Mappings section, map the Data Source Field to the System Field for each column by selecting the column name from the System Field drop-down list.
The Data Source Field lists the number of columns you specified at the previous screen. The System Field contains a list of standard column headings. If any of the column headings in your data source match the choices available in the System Field list, map each accordingly. Be sure that you match the selection in the System Field column to its corresponding numbered column in the Data Source Field.

For example, for a data source that you have specified in the profile as having three columns, the mapping configuration may be:

Data Source Field    System Field
Col 1                First Name
Col 2                Last Name
Col 3                Social Security Number

12] If a Data Source Field does not map to a heading value in the options available from the System Field column, click the Advanced View link.
In the Advanced View the system displays a Custom Name column beside the System Field column.

Enter the correct column name in the text box that corresponds to the appropriate column in the data source.

Optionally, you can specify the data type for the Custom Name you entered by selecting the data type from the Type drop-down list. These data types are system-defined. Click the description link beside the Type name for details on each system-defined data type.

13] If you intend to use the Exact Data Profile to implement a policy template that contains one or more EDM rules, you can validate your profile mappings for the template. To do this, select the template from the Check mappings against policy template drop-down list and click Check now. The system indicates any unmapped fields that the template requires.
14] Do not select any Indexing option available at this screen, since you intend to index remotely.
15] Click Finish to complete the profile creation process.
16] Once you have finished the configuration of the Exact Data Profile, click the download profile link at the Manage > Data Profiles > Exact Data screen.
The system prompts you to save the EDM profile as a file. The file extension is *.edm. Save the file to the remote machine where you intend to run the Remote EDM Indexer utility.
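Before running the indexer, the data owner may want a rough idea of whether the data file fits the profile. The sketch below is an illustrative pre-check only. It assumes that a row counts as an error when its number of columns differs from the Number of Columns value in the profile, which is just one of the conditions the real indexer may check. The function name precheck_rows and its parameters are hypothetical, and Python's csv module may treat quoted fields differently from the indexer.

```python
import csv
import io


def precheck_rows(lines, expected_columns, delimiter=",", error_threshold_pct=5.0):
    """Count rows whose column count differs from the profile's column count.

    Returns (total_rows, bad_rows, error_pct, within_threshold).
    """
    total = 0
    bad = 0
    for row in csv.reader(lines, delimiter=delimiter):
        if not row:
            continue  # skip blank lines
        total += 1
        if len(row) != expected_columns:
            bad += 1
    error_pct = (100.0 * bad / total) if total else 0.0
    # Indexing fails only when the error percentage exceeds the threshold.
    return total, bad, error_pct, error_pct <= error_threshold_pct
```

To check a real file, open it with the encoding chosen in the File Encoding field, for example open(path, newline="", encoding="iso-8859-1"), and pass the file object as lines. A decoding error at this point suggests that the encoding selected in the profile does not match the data.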

Procedure Step 3 : Copy the Exact Data Profile file to the computer where the Remote EDM Indexer resides.

Procedure Step 4 : Run the Remote EDM Indexer and create the index files :
 

Remote EDM Indexer command options:

The Indexer runs from the command line. If you are on Linux, change users to the "protect" user before running the Indexer. (The installation program creates the "protect" user.)

The profile and result options are required with the Remote EDM Indexer. The data option names the file to index; if it is not specified, the utility reads the data from stdin. Often the data is piped from the SQL Preindexer utility.

Option          Description

-data           Specifies the file with the data to be indexed. If this option is not specified, the utility reads data from stdin.
-encoding       Specifies the character encoding of the data to index. The default is ISO-8859-1; for data with non-English characters, use UTF-8 or UTF-16. Optional.
-ignore_date    Overrides the expiration date of the Exact Data Profile if the profile has expired. (By default, an Exact Data Profile expires after 30 days.) Optional.
-profile        Specifies the Exact Data Profile to be used. (This profile is the file that you save by clicking the download profile link on the Exact Data screen in the Enforce Server administration console.) Required.
-result         Specifies the directory where the index files are generated. Required.
-verbose        Displays a statistical summary of the indexing operation when indexing is complete. Optional.

For example, to specify the profile file named ExportEDMProfile.edm and place the generated indexes in the EDMIndexDirectory directory, type:

RemoteEDMIndexer -profile=C:\ExportEDMProfile.edm -result=C:\EDMIndexDirectory\
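When the command is assembled by a script, it can help to enforce the required options before anything is launched. The sketch below is one possible approach with a hypothetical helper name, build_indexer_command. The -profile=value and -result=value forms come from the example above. Writing -data and -encoding in the same form, and -verbose as a bare flag, is an assumption that should be checked against the utility's own usage output.

```python
def build_indexer_command(profile, result, data=None, encoding=None, verbose=False):
    """Return a RemoteEDMIndexer argument list, or raise if a required option is missing."""
    if not profile or not result:
        raise ValueError("-profile and -result are required")
    cmd = ["RemoteEDMIndexer"]
    if data:
        # Assumed form; when omitted, the utility reads the data from stdin.
        cmd.append(f"-data={data}")
    if encoding:
        # Assumed form; the documented default is ISO-8859-1.
        cmd.append(f"-encoding={encoding}")
    cmd.append(f"-profile={profile}")
    cmd.append(f"-result={result}")
    if verbose:
        # Assumed to be a bare flag.
        cmd.append("-verbose")
    return cmd
```

The returned list can be passed to subprocess.run, which avoids shell quoting problems with paths that contain spaces or trailing backslashes.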

When the indexing process completes, the Remote EDM Indexer generates several files in the specified result directory. These files are named after the data file that was indexed, with one file having the .pdx extension and another file with the .rdx extension. Note that indexing a large data file may generate multiple .rdx files with numbered extensions. For example: my_edm.rdx.1, my_edm.rdx.2 and so forth.

Procedure Step 5 : Copy the index files from the remote machine to the Enforce Server.
 

Copying generated index files
After you create the index files on a remote machine, the files must be copied to the Enforce Server and loaded.

To copy the files to the Enforce Server

1] Go to the directory where the index files were generated. (This directory is the one specified in the result option.)
2] Copy all of the index files with .pdx and .rdx extensions to the index directory on the Enforce Server. This directory is located at \Vontu\Protect\Index (Windows) or /var/Vontu/index (Linux).

Procedure Step 6 : Load the index files into the Enforce Server :

Loading generated index files

To load the files on the Enforce Server

1] From the Enforce Server administration console, navigate to the Manage > Data Profiles > Exact Data screen. This screen lists all the Exact Data Profiles in the system.
2] Click the name of the Exact Data Profile you used with the Remote EDM Indexer.
3] To load the new index files, go to the Data Source section of the Exact Data Profile and select Load Externally Generated Index.
4] In the Indexing section, select Submit Indexing Job on Save.
5] Click Save.

Consider scheduling a job on the remote machine to run the Remote EDM Indexer on a regular basis. The job should also copy the generated files to the index directory on the Enforce Server. You can then schedule loading the updated index files on the Enforce Server from the profile by selecting Load Externally Generated Index and Submit Indexing Job on Schedule and configuring an indexing schedule.
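The copy part of such a scheduled job could look like the following sketch. It assumes the Enforce Server index directory is reachable from the remote machine as an ordinary path (for example a mounted share), which the article does not state. The helper name copy_index_files and the file name pattern are illustrative; the pattern follows the .pdx, .rdx, and numbered .rdx names described under Procedure Step 4.

```python
import re
import shutil
from pathlib import Path

# Matches names ending in .pdx, .rdx, or .rdx.<number> (for example my_edm.rdx.1).
INDEX_FILE_PATTERN = re.compile(r".+\.(pdx|rdx(\.\d+)?)$")


def copy_index_files(result_dir, index_dir):
    """Copy generated index files from result_dir to index_dir.

    Returns the sorted list of file names that were copied.
    """
    copied = []
    for path in sorted(Path(result_dir).iterdir()):
        if path.is_file() and INDEX_FILE_PATTERN.match(path.name):
            # copy2 also preserves file timestamps where the platform allows it.
            shutil.copy2(path, Path(index_dir) / path.name)
            copied.append(path.name)
    return copied
```

A scheduler such as cron or Windows Task Scheduler could run the indexer first and then call this function with the result directory and the index directory. Files that do not look like index files are left where they are.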

Procedure Step 7 : Troubleshoot any problems that occur during the indexing process.
 

Troubleshooting index jobs

You may encounter errors when you index large amounts of data. Often the data set contains records that are incomplete, inconsistent, or incorrectly formatted. Data rows that contain more columns than expected, or incorrect data types, often cannot be properly indexed and are not recognized during indexing. The rows of data with errors cannot be indexed until those errors are corrected and the Remote EDM Indexer is rerun. Symantec provides two ways to get information about any errors and the ultimate success of the indexing operation.

The Remote EDM Indexer generally displays a message that indicates whether the indexing operation was successful. The result depends on the error threshold that you specify in the profile: the operation completes successfully as long as the percentage of rows with errors does not exceed the threshold. More detailed information about the indexing operation is available with the verbose option.

Specifying the verbose option when running the Remote EDM Indexer provides a statistical summary of information about the indexing operation after it completes. This information includes the number of errors and where the errors occurred.

See Remote EDM Indexer command options.

To see the actual rows of data that the Remote EDM Indexer failed to index, modify the Indexer.properties file.

To modify the Indexer.properties file

1] Locate the Indexer.properties file at \Program Files\Vontu\Protect\config\Indexer.properties (Windows) or /opt/Vontu/Protect/config/Indexer.properties (Linux).
2] To edit the file, open it in a text editor.
3] Locate the create_error_file property and change its value from "false" to "true".
4] Save and close the Indexer.properties file.
The Remote EDM Indexer logs errors in a file with the same name as the indexed data file and with an .err extension. This error file is created in the logs directory.

The rows of data that are listed in the error file are not encrypted. Encrypt the error file to minimize any security risk from data exposure.
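If the property change has to be applied on several machines, it can be scripted. The sketch below assumes that Indexer.properties uses the usual Java properties layout, with a line such as create_error_file = false; the article does not show the actual file contents, so verify the layout first. The function name enable_error_file is hypothetical, and the function works on text so that the caller decides how the file is read, backed up, and written.

```python
import re

# Matches a "create_error_file = false" line (separator "=" or ":"),
# leaving any Windows line ending untouched.
_ERROR_FILE_LINE = re.compile(
    r"^([ \t]*create_error_file[ \t]*[=:][ \t]*)false[ \t]*(?=\r?$)",
    re.MULTILINE,
)


def enable_error_file(properties_text):
    """Return (new_text, replacements) with create_error_file switched to true."""
    return _ERROR_FILE_LINE.subn(r"\g<1>true", properties_text)
```

A replacement count of 0 means the property was not found with the value false, in which case the file should be inspected by hand rather than modified blindly.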

 

 

Comments

Aug 08, 2014 05:06 AM

Informative..but too lengthy 
