Hi,
Policy fine-tuning is always a long task, but it is mandatory if you want to use DLP over the long term. I am sure people here will give you plenty of advice, but here is my philosophy:
- The default DLP policies are a good starting point, but they will not match your company's expectations exactly. Use them as templates, and update the keyword lists, thresholds, and so on to fit exactly what you need.
- Since you already have false positives in your DLP (maybe too many), use them to find a pattern that matches most of them, and then use that pattern as an exclusion rule in your policy. Once you have excluded the most common pattern, analyze what remains, define another pattern, and so on.
- Then you can fine-tune the policy rules using keyword proximity matching and thresholds, and by making sure each rule matches only on the relevant components. When you have a set of keywords, try splitting the list in two and defining a compound rule; that reduces the number of events that match several times on the same keyword. With data identifiers, use "uniqueness" and a narrow breadth. (There are so many possibilities with DLP that it is difficult to cover them all here.)
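To illustrate the "exclude the most common pattern, then analyze again" loop from the second point, here is a minimal sketch in Python. It is not tied to any DLP product: the sample messages are invented, the helper names (`tokenize`, `exclusion_candidates`) are hypothetical, and "pattern" is simplified to "the single word that appears in the most false positives". In practice you would export the false-positive incidents from your own system and look at richer patterns (senders, subjects, templates).

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase a message and return its set of distinct words."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))

def exclusion_candidates(false_positives, max_rounds=3, min_hits=2):
    """Repeatedly pick the word found in the most false positives,
    record it as an exclusion candidate, and drop the messages it covers."""
    remaining = list(false_positives)
    candidates = []
    for _ in range(max_rounds):
        counts = Counter()
        for message in remaining:
            counts.update(tokenize(message))  # each word counted once per message
        if not counts:
            break
        token, hits = counts.most_common(1)[0]
        if hits < min_hits:  # nothing left that explains several incidents
            break
        candidates.append((token, hits))
        remaining = [m for m in remaining if token not in tokenize(m)]
    return candidates, remaining

# Invented sample data standing in for exported false-positive incidents.
false_positives = [
    "Newsletter: unsubscribe here, card offers inside",
    "Weekly newsletter: new card designs",
    "Newsletter digest about payment trends",
    "Monthly newsletter archive",
    "Invoice template with sample account number",
    "Training invoice for new staff",
    "Invoice reminder",
    "Quarterly budget forecast",
]

candidates, remaining = exclusion_candidates(false_positives)
print(candidates)  # words to review as possible exclusion rules
print(remaining)   # false positives not yet explained by any candidate
```

Each candidate should still be reviewed by hand before it becomes an exclusion rule, since a word that is common in false positives may also appear in real incidents.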
For what you describe in your message, I think one approach would be to define a regular expression that looks for one set of words first and then another set after it. Be careful, though: regular expressions can consume a lot of resources on your detection server.
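As a rough sketch of that idea, the Python snippet below builds one regular expression that requires a word from a first set, followed within a limited number of words by a word from a second set. The word lists, the `MAX_GAP_WORDS` limit, and the `build_ordered_pattern` helper are all assumptions for illustration, and your DLP product's regex engine may not support exactly the same syntax as Python's `re` module. The gap is bounded on purpose: an open-ended `.*` between the two sets is the kind of pattern that tends to cost the most on long messages.

```python
import re

# Hypothetical word sets; replace with the terms from your own policy.
FIRST_SET = ["confidential", "internal only"]
SECOND_SET = ["salary", "payroll"]
MAX_GAP_WORDS = 10  # assumed limit on the words allowed between the two sets

def build_ordered_pattern(first, second, max_gap):
    """Match a term from `first`, then at most `max_gap` words, then a term from `second`."""
    first_alt = "|".join(re.escape(term) for term in first)
    second_alt = "|".join(re.escape(term) for term in second)
    return re.compile(
        rf"\b(?:{first_alt})\b"         # a term from the first set
        rf"(?:\W+\w+){{0,{max_gap}}}?"  # up to max_gap intervening words
        rf"\W+(?:{second_alt})\b",      # then a term from the second set
        re.IGNORECASE,
    )

pattern = build_ordered_pattern(FIRST_SET, SECOND_SET, MAX_GAP_WORDS)

samples = [
    "This confidential report lists salary bands",  # first set, then second set
    "salary data is confidential",                   # wrong order
    "INTERNAL ONLY: payroll export",                 # case-insensitive match
]
for text in samples:
    print(bool(pattern.search(text)), "-", text)
```

Because the order is part of the expression, a message that mentions the second set before the first is not flagged, which is the behaviour described above.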
But as always, the more you tune a policy to cut false positives, the more real incidents it risks missing. The best policy for catching data leakage is the one that matches every email :)
If you need more help, contact me by PM here and we can share more detailed information on how to tune your policy.