Data Loss Prevention

DLP EndPoint Regular Expression Parser

1. DLP EndPoint Regular Expression Parser

Migration User
Posted Jun 23, 2011 09:55 AM

I recently ran into a problem with a policy whose definition includes a regular expression for inspecting content being “sent” from a PC to a USB-attached media device. The same RegEx is used in policies deployed to all of the other available Monitor and Prevent capabilities without issue. With help from Symantec Support, we discovered that the EndPoint parser does not support the (?i) inline flag for ignoring case; it requires a “simple match” that spells out every case variant, e.g. [aA]. The existing documentation on DLP and RegEx is fairly light, and given this apparent difference in parsing, there is a real need for richer content covering RegEx support and best practices.
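For anyone who has to do the same conversion by hand, here is a minimal Python sketch of the [aA] workaround. It only handles a plain literal keyword (not an arbitrary RegEx), the function name `expand_case` is made up for illustration, and Python's `re` module is used just to sanity-check the result; it is neither the Java nor the Boost engine, so the expanded pattern should still be verified on the agent itself.

```python
import re


def expand_case(literal: str) -> str:
    """Spell out both cases of each ASCII letter in a plain literal.

    Hypothetical helper: 'abc' becomes '[aA][bB][cC]'. Non-letters are
    escaped so they match themselves. Input is treated as literal text,
    not as a regular expression.
    """
    parts = []
    for ch in literal:
        if ch.isascii() and ch.isalpha():
            parts.append(f"[{ch.lower()}{ch.upper()}]")
        else:
            parts.append(re.escape(ch))
    return "".join(parts)


if __name__ == "__main__":
    pattern = expand_case("confidential")
    print(pattern)
    # Sanity check with Python's engine only; the DLP engines may differ.
    print(bool(re.fullmatch(pattern, "ConFiDenTial")))
```

The expanded pattern avoids inline flags entirely, so it relies only on character classes, which is the form Support said the EndPoint parser needs.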

Is there any documentation that outlines differences in RegEx parsers for each of the DLP agents (Network, EndPoint, File Discover, SharePoint Discover, etc.)?

Is there any documentation that addresses best practices and gives examples for building proper RegExes for each agent?

Does Symantec recommend any RegEx validation utilities when these types of complex expressions are needed for policy definition?

This information would be very valuable to the community as a whole.
2. RE: DLP EndPoint Regular Expression Parser

Migration User
Posted Jun 23, 2011 10:56 AM

Hi John,

The regex guidelines can be found in the DLP Administration Guide that comes with the software. The guide doesn't provide a tutorial or anything, but there is a syntax list.

The online training for DLP suggests RegexBuddy, but I've found that this site works pretty well for me:
http://www.regextester.com/

I believe it uses the JavaScript engine, but I'm really not 100% sure about that. Hopefully someone else can speak to that.

Regards
~Xavier
3. RE: DLP EndPoint Regular Expression Parser

Migration User
Posted Jun 23, 2011 10:58 AM

Look at this post in the Connect forum for more info. It seems I was right about the Java engine for all sections except Endpoint, which apparently uses the "Boost" engine. I've never heard of it, and I'm not sure what tools can be used to test it. It may even have changed since that version, for all I know. The post is a good read in terms of resources, though. Check it out.

https://www-secure.symantec.com/connect/forums/expression-prefixsuffix
4. RE: DLP EndPoint Regular Expression Parser
Best Answer

Migration User
Posted Jun 23, 2011 03:20 PM

Xavier is dead on about that. Endpoint DOES use the Boost regex engine, whereas detection servers use the Java engine. In MOST cases, these are pretty similar and that regextester.com site gives a pretty good representation of the results you'll see with both.

I recall running into differences between the two regex implementations with a customer of mine some time ago. It had to do with Boost's support for positive lookaheads (or maybe it was negative lookbehinds); one or the other didn't seem to be supported, so be on the lookout for that if you're using those constructs in your regex. To date I have never found a good online tester for the Boost implementation of regular expressions.
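Since there's no handy Boost tester, one low-tech option is to flag the constructs mentioned in this thread before a pattern goes into an Endpoint policy. The sketch below is only that: a plain substring check in Python, with a made-up helper name (`flag_constructs`) and a list limited to the constructs discussed here. It does not know what either engine actually supports, and it will also flag escaped sequences such as `\(?=`.

```python
from typing import List

# Constructs raised in this thread as possible Java/Boost differences.
# This list is an assumption for illustration, not a vendor support matrix.
CONSTRUCTS = {
    "(?i)": "inline ignore-case flag",
    "(?=": "positive lookahead",
    "(?!": "negative lookahead",
    "(?<=": "positive lookbehind",
    "(?<!": "negative lookbehind",
}


def flag_constructs(pattern: str) -> List[str]:
    """Return labels of constructs in `pattern` that deserve a manual test."""
    return [label for token, label in CONSTRUCTS.items() if token in pattern]


if __name__ == "__main__":
    for name in flag_constructs(r"(?i)secret(?=\d{4})"):
        print("check on the Endpoint agent:", name)
```

Anything it flags still needs a real test on an agent; the point is just to know which patterns to test first.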

A hint, and something I've been playing with lately: you might be able to accomplish what you want better with a Custom Data Identifier with a Custom Script Validator (available in V11 now). The scripting language (basically a very limited implementation of Perl, described in the Custom Detection Guide) allows additional validation of the match beyond what is built into the Custom DI, and it would work consistently across Endpoint and Detection Servers. It may even be more efficient than a regex.

~Keith
5. RE: DLP EndPoint Regular Expression Parser

Migration User
Posted Jun 24, 2011 12:44 AM

This is invaluable information. Does anyone have any idea which version of Boost or Java is compiled into the specific agents? The information at http://www.boost.org is much more detailed than what Symantec provides, but there are multiple releases with variations between them. Keith, I will start looking at the "Custom Data Identifier" and "Custom Script Validator", since a consistent implementation of rule interrogation would benefit environments with rich deployments of monitor and prevent rules across all the available agents.
