
Modify an existing EDM

Created: 16 May 2012 • Updated: 05 Jun 2012 | 14 comments

As part of protecting PII, I have in production a single-column EDM that looks for the US SSNs my company cares about. I have built policies and associated response rules around this particular EDM profile. Taking a baby step forward, I now need to safeguard Canadian Social Insurance Numbers as well. Since the SIN is part of our PII definition, I think it makes sense to modify my existing single-column EDM into a two-column EDM: one column for US SSN, the other for CA SIN. Unfortunately, I do not see a way to make such a modification. It appears that I must create a new EDM with two columns, map the fields appropriately, and then go back and change my existing policies to use the new EDM.

Can anyone point me in the right direction in this regard? Must I create a new EDM and modify my policies? Can I not modify an existing EDM to add a column and avoid all the associated policy changes?

Thank you in advance!


stephane.fichet's picture

Hello,

As far as I know, if you change the EDM structure (adding a column, for example), you have to create a new EDM and update your policy to use it.

If you don't want to change the previous one, you can add a new EDM dedicated to your Canadian SINs and add it to your policy, keeping the previous one unchanged. (It depends on how you manage your EDM source files and how you generate them.)

 

regards

don_berlin's picture

Thank you Stephane. I believe that you are correct. I finally heard back from my IIP SE and his answer was very similar. Are you aware of any way to export/backup/save the existing related policies, for rollback purposes, before I make any changes to them? I certainly do not see any options for this.

Thanks again.

Keith Reynolds - ExchangeTek's picture

Something about this doesn't make sense to me. Yes, in order to add a column to that EDM, you would have to deploy an entirely new EDM profile and adjust your policies accordingly. But what you say you're doing is adding Canadian SINs to the scope of what you're monitoring. This wouldn't be a new "column" in that profile, unless you're saying that everyone whose SSN you're profiling also has a SIN (which I doubt is the case).

There's no real reason in my estimation that you couldn't simply include those numbers into the existing 1 column in your datafile, unless you care about definitively identifying what is a SSN vs what is an SIN.  Your data file then would just be a 1 column file of all SSNs and SINs you want to protect.
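For anyone weighing the single-column idea, here is a minimal Python sketch of what building that combined data file could look like. The function names and the sample values are made up for illustration, and nothing here is part of DLP itself. Note that once the values are merged, nothing in the file records whether a row was an SSN or a SIN, which is exactly the trade-off described above.

```python
def merge_single_column(ssns, sins):
    """Merge two lists of raw values into one list, dropping duplicates.

    Order is preserved. After merging, there is no way to tell whether a
    given row was originally an SSN or a SIN.
    """
    seen = set()
    merged = []
    for value in [*ssns, *sins]:
        if value not in seen:
            seen.add(value)
            merged.append(value)
    return merged


def to_data_file(values):
    """Render the values as a one-column text file, one value per line."""
    return "\n".join(values) + "\n"


# Hypothetical sample values, not real identifiers.
combined = merge_single_column(
    ["123456789", "987654321"],
    ["046454286", "123456789"],
)
print(to_data_file(combined), end="")
```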

Otherwise, you're going to want to create an entirely new EDM profile for just the SINs, and modify your policies to include rules/responses as necessary to accommodate the SINs. I suspect this is probably what you'll end up doing. You could then leave the existing rules related to the SSN data intact, and either add SIN rules to the policies you have in place, or add entirely new policies related to how you will handle SIN data. You don't need to have all data in one profile.

~Keith

Keith Reynolds - ExchangeTek's picture

To answer your last post, yes, you can use the Export feature on the policy, which will create an XML export of the policy. You could use that to subsequently import the original policy back into DLP (although it would now be a new policy that is a copy of the original, which may impact some of your reporting). Also, response rules assigned to the policy are not included in that export, so you would want to manually document which response rules were applied at the time of the export so you can reassign them.

There is no "roll back" feature in the system.  Put that on the DLP wish list.

~Keith

don_berlin's picture

Keith,

Thank you for your thoughtful responses.

I had considered having the SSNs and SINs in the same single column. My concern with the single column is the mapping of the Data Source Field to the System Field of type Social Security Number that I currently have in place. I would want to have the SIN Data Source Field mapped to a System Field of type Social Insurance Number. I think that is what you were referring to when you stated, "...unless you care about definitively identifying what is a SSN vs what is an SIN."

It seems that your suggestion to "...create an entirely new EDM profile for just the SINs, and modify your policies to include rules/responses as necessary to accommodate the SINs" is the better solution for me. My initial resistance to creating an entirely new EDM was based on the amount of work needed to modify my scripts that handle the RemoteIndexer process. Thank you for helping push me on this.

As for the "Export this policy as a template" feature, the missing response rules are what made me consider it a non-option. Ideally, the tool would provide a mechanism that allows administrators to export specific objects for change management purposes. If there is such a thing as a DLP "wish list" that the developers (or at least Product Managers) of Vontu read, then please direct me to that list. I have a few more wishes that I would like to add!

Once I implement the above changes (indicating that they are the solution), I'll circle back and mark your response appropriately as the solution.

Thanks again, and have a great weekend!

Don

don_berlin's picture

To satisfy the solution suggested, I created a new EDM profile for the Canadian SIN. In doing so, I mapped the Data Source Field: Col 1 to the System Field: Social Insurance Number. I expected that this mapping would tell DLP to look for the SIN in xxx-xxx-xxx, xxx xxx xxx, and xxxxxxxxx format. Unfortunately, it detects only the format in which the SIN is stored in the Reference Data Source. In looking through the DLP documentation, I find no reference to what the System Field: Social Insurance Number tells DLP to expect.

Conversely, the Data Identifier: Canadian Social Insurance Number tells DLP very specifically (depending on Rule Breadth) in what format to detect a SIN.

Can anyone point me to documentation that explains what I should expect DLP to detect when I map the Data Source Field to the System Field Social Insurance Number? Should I have to provide the SIN in various formats within the Reference Data Source?

Thank you in advance!

 

Don Berlin

Daniel K.'s picture

You should consider using Data Identifiers of your own creation.  The use of data identifiers provides rulesets with the ability to include modifications of detection rules. Patterns detected by the DI are highlighted but the modifiers are not. 

Full regex is not supported in DIs. However, you can enter as many patterns as you like as long as there are no pipes (|) involved.

\d{3}-\d{3}-\d{4}

\d{3}\s\d{3}\s\d{4}

444-444-4444 would be highlighted.
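To try the two patterns outside of DLP, here is a small sketch using Python's `re` module. This is only an approximation: Python's regex engine is not the DI pattern engine, and `find_matches` is a made-up helper name. It keeps the patterns as separate entries instead of joining them with a pipe, which mirrors the restriction mentioned above.

```python
import re

# The two patterns from the post, kept as separate entries (no "|").
PATTERNS = [
    re.compile(r"\d{3}-\d{3}-\d{4}"),
    re.compile(r"\d{3}\s\d{3}\s\d{4}"),
]


def find_matches(text):
    """Return every substring matched by any of the patterns, pattern by pattern."""
    hits = []
    for pattern in PATTERNS:
        hits.extend(match.group(0) for match in pattern.finditer(text))
    return hits


print(find_matches("Call 444-444-4444 or 444 444 4444 today"))
```

An unseparated run such as 4444444444 matches neither pattern, so it would need its own entry.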

Then use modifiers to validate the existence of "customer data". 

CustomerID, CustID, Customer Data

None of the terms would be highlighted if placed as a DI modifier.

There are many potential modifiers that can be used to validate PII.

The most obvious method to maintain accuracy against SIN/SSN is to fingerprint customer records. As long as your data is well formed, EDM is the perfect method for an organization to protect its assets. You can get a nice sample of data from fakenamegenerator.com so that you can test the effectiveness.

When \s and - are not enough because these assets are modified during transit, consider using regex to look for derivations. I usually combine rules into various combinations and cross-compare the results.

don_berlin's picture

Daniel,

Thanks for the reply.

My concern with DI and regex is the number of false positives. The EDM affords me the opportunity to safeguard only the numbers I care about. We have many other 9-digit numbers that would certainly cause (and, in testing, have caused) a high number of false positive detections. I need to find a way to get the EDM to work for my situation.
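To make the false-positive concern concrete, here is a rough Python sketch that contrasts a generic 9-digit pattern with a lookup against a known set of values. It only illustrates the idea and is not how DLP implements either feature; the function names and sample numbers are hypothetical.

```python
import re

# A generic pattern: any standalone run of exactly nine digits.
NINE_DIGITS = re.compile(r"\b\d{9}\b")


def pattern_hits(text):
    """Pattern-only detection: every 9-digit number is flagged."""
    return NINE_DIGITS.findall(text)


def exact_hits(text, protected):
    """Exact-match detection: only numbers present in the protected set are flagged."""
    return [candidate for candidate in NINE_DIGITS.findall(text) if candidate in protected]


# Hypothetical sample data: one order number and one protected value.
message = "Order 555000111 shipped; SIN on file 046454286."
protected_values = {"046454286"}

print(pattern_hits(message))
print(exact_hits(message, protected_values))
```

The pattern flags the order number as well, while the lookup flags only the value that was indexed.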

Thanks again!

Don

Keith Reynolds - ExchangeTek's picture

Hey Don -

As with any data field that you map within your EDM profile, the data itself in the column should be loaded in raw form, without any formatting. For instance, a phone number should be '2155551212', a SSN should be '187543210', etc.  I do this/recommend this as best practice.  Not sure what exactly you mean by:
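As an illustration of "raw form", here is a minimal Python sketch that strips formatting from a value before it goes into the data file. The helper name `to_raw` is made up, and this assumes the fields in question are purely numeric.

```python
import re


def to_raw(value):
    """Remove everything except ASCII digits from a formatted value."""
    return re.sub(r"[^0-9]", "", value)


print(to_raw("(215) 555-1212"))
print(to_raw("187-54-3210"))
```

Running each column through a step like this before indexing keeps the data file consistent regardless of how the source system formats the values.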

     "Unfortunately, it detects only the format in which the SIN is stored in the Reference Data Source" 

If you can explain that further, I might be able to provide more input.

Regards,

~Keith

don_berlin's picture

Hey Keith,

I appreciate your input and apologize for being less than clear with my description of my issue. To help clarify what I meant by "...it detects only the format in which the SIN is stored in the Reference Data Source", I will use SSN as an example. I hope I do not instead create additional confusion.

In creating my EDM for SSN, I was told by support (and I am pretty certain that I read it, as well) that selecting the System Field: Social Security Number would detect SSN in the following patterns: xxxxxxxxx, xxx xx xxxx, and xxx-xx-xxxx. I believe (but, would have to revisit my testing notes to confirm) that testing verified this to be true. If I recall correctly, my source data is, to borrow your phrase, in raw form - xxxxxxxxx and, I am able to detect the variations as listed previously. Additionally, the Field Mappings Advanced View offers the ability to select a Custom Name and Type of Social Security Number. According to the information provided when clicking on the Description link, the Social Security Number custom type will detect "... 3 digits, optionally followed by spaces or dashes, followed by 2 digits, optionally followed by spaces or dashes, followed by 4 digits."
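For reference, the quoted description of the Social Security Number custom type can be written out as a regular expression. The Python sketch below is my reading of that description only. It is an assumption that the product behaves this way, and `fits_description` is a made-up name.

```python
import re

# 3 digits, optional spaces or dashes, 2 digits, optional spaces or dashes, 4 digits.
SSN_DESCRIPTION = re.compile(r"\d{3}[ -]*\d{2}[ -]*\d{4}")


def fits_description(value):
    """Return True if the whole value fits the quoted 3-2-4 description."""
    return SSN_DESCRIPTION.fullmatch(value) is not None


for candidate in ["123456789", "123 45 6789", "123-45-6789", "123-456-789"]:
    print(candidate, fits_description(candidate))
```

The first three candidates fit the description. The last one, grouped 3-3-3 with dashes, does not, because the separators fall in different positions.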

Unfortunately, this seems not to be the case with the Canadian SIN. Configuring the System Field: Social Insurance Number and testing source data in raw form xxxxxxxxx yields detection of only xxxxxxxxx and not xxx xxx xxx or xxx-xxx-xxx. Moreover, the SIN does not appear in the drop down list for Custom Type in the Field Mappings Advanced View.

All of this leaves me wondering if I am overlooking something or whether there is a solution other than to list the SIN in the various formats in my source file?

I hope I did not further confuse this issue.

 

Don

Keith Reynolds - ExchangeTek's picture

Yeah, that makes sense to me now.  Sorry I don't have additional advice for you on this one...I'd have to play with it a little to figure it out.  I did a quick look to see if there's anywhere those field validators can be modified (or one added), but I didn't see anything.  Think this might be better left to Support/Engineering to get some clarity on it.

don_berlin's picture

Keith,

Thanks for your effort and response. I have created a support ticket. I will update this post with the official word once the ticket has been resolved.

Don

don_berlin's picture

Just to update anyone searching on this same topic: the fix for this issue is an official enhancement request. A timeline for the enhancement has not been made public. In my case, the PSE recommended putting the Canadian SINs in a single column in an EDM, in each of the formats I care to detect.
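For anyone scripting this workaround, here is a minimal Python sketch that expands raw SINs into the three formats discussed in this thread (xxxxxxxxx, xxx xxx xxx, and xxx-xxx-xxx). The function names and sample values are hypothetical, and how the resulting file is fed to the indexer is left out.

```python
def sin_variants(raw):
    """Return the raw, space-separated, and dash-separated forms of a 9-digit SIN."""
    if len(raw) != 9 or not (raw.isascii() and raw.isdigit()):
        raise ValueError(f"expected 9 digits, got {raw!r}")
    first, second, third = raw[0:3], raw[3:6], raw[6:9]
    return [raw, f"{first} {second} {third}", f"{first}-{second}-{third}"]


def expand_column(raw_sins):
    """Build the single-column row list: every SIN in every format."""
    rows = []
    for raw in raw_sins:
        rows.extend(sin_variants(raw))
    return rows


# Hypothetical sample values, not real identifiers.
for row in expand_column(["046454286", "130692544"]):
    print(row)
```

The data file grows to three rows per SIN, so the source file and any indexing scripts need to account for the larger row count.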

I will update this post when/if the enhancement materializes.

 

Don

Keith Reynolds - ExchangeTek's picture

Nice of you to update on this...very helpful so that everyone knows.  Thanks.

~Keith