Data Loss Prevention

  • 1.  filter.exe output doesn't match incident

    Posted Jul 21, 2017 12:01 PM

    We have an incident involving a PDF that matches on sensitive information; however, when viewing the PDF directly, none of the sensitive information is visible. We think it might be due to hidden layers.

    To verify this we've used the filter.exe located within SymantecDLP\Protect\plugins\contentextraction\Verity\x64

    filter.exe incident.pdf incident.txt

    However, the output within incident.txt does not contain any of the matched incident data.

    Are any additional flags required to output the data or are we missing a crucial step?
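
    One way to confirm that the matched data really is absent from the extracted text, rather than just wrapped or cased differently, is to search incident.txt for the strings shown in the incident. Below is a minimal sketch in Python; the function names `normalize` and `find_terms` are hypothetical helpers, not part of Symantec DLP or filter.exe, and the sample terms are placeholders.

```python
import re


def normalize(text):
    """Lower-case and collapse whitespace so line wraps do not hide a match."""
    return re.sub(r"\s+", " ", text).strip().lower()


def find_terms(extracted_text, terms):
    """Map each term to True if it occurs in the extracted text, else False.

    Hypothetical helper: 'terms' would be the values listed in the incident.
    """
    haystack = normalize(extracted_text)
    return {term: normalize(term) in haystack for term in terms}


# Example usage (file name taken from the post; terms are placeholders):
#   with open("incident.txt", encoding="utf-8", errors="replace") as fh:
#       print(find_terms(fh.read(), ["Jane Example", "1 Sample Street"]))
```

    If every term comes back False, the extractor run by hand is producing different output from whatever the detection server saw, which narrows the question to how the two extractions differ.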

     



  • 2.  RE: filter.exe output doesn't match incident

    Trusted Advisor
    Posted Jul 23, 2017 11:00 PM

    Jonathan,

    That is why you use filter.exe: it will output ALL of the text that it can read. Keep in mind that in some PDF files the text is embedded as an image, because the PDF was printed as a picture, so that text cannot be extracted.

    Are you getting any text as the output?

     

    Ronak

     

     



  • 3.  RE: filter.exe output doesn't match incident

    Posted Jul 24, 2017 04:15 AM

    Hi Ronak,

    Thanks for your reply.

    To clarify: when we use filter.exe, it outputs all of the visible PDF text, but it does not output any of the text flagged in the incident.

    The PDF is basically a form sent to clients to populate with details: names, addresses, etc. It was sent to a prospective client via email and was flagged as containing names and addresses. However, when viewed directly it just appears to be a blank PDF form. I suspect the PDF has been recycled several times and contains previous entries from old clients.

    If I use filter.exe against the PDF it outputs the blank form text, but none of the names and addresses flagged in the incident.

     

    Regards

    Jonathan
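
    The "recycled form" theory can be checked roughly without any DLP tooling. A PDF saved with incremental updates keeps earlier revisions in the file, each ending in its own %%EOF marker, and form fields store their values in /V entries. The sketch below is only an assumption-laden heuristic: `pdf_revision_hints` is a hypothetical name, it scans raw bytes only, and it will miss anything held inside compressed object streams.

```python
import re


def pdf_revision_hints(data):
    """Return rough indicators from raw PDF bytes (heuristic, not a parser).

    eof_markers > 1 suggests incremental saves, so older content may remain.
    field_value_entries counts '/V (' or '/V <' patterns, which usually mark
    stored form-field values but can also match other dictionaries.
    """
    return {
        "eof_markers": len(re.findall(rb"%%EOF", data)),
        "has_acroform": b"/AcroForm" in data,
        "has_xfa": b"/XFA" in data,
        "field_value_entries": len(re.findall(rb"/V\s*[(<]", data)),
    }


# Example usage (file name taken from the thread):
#   with open("incident.pdf", "rb") as fh:
#       print(pdf_revision_hints(fh.read()))
```

    More than one %%EOF marker, or field values in a form that looks blank on screen, would be consistent with old client entries still being present in the file.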


