Hi Neil,
Whether OCR is required depends on how the PDF was created. With PDFs there are two possible types. The first is a native PDF, for example one created from a Word document, which contains actual text that DLP can read and check for policy violations.
The second, and the reason for OCR, is a scanned PDF. For example, if you scan a paper document to your email, it arrives as a .pdf, but in reality it is purely an image of the physical document and has no identifiable text, so OCR is needed to read it.
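If you want a rough way to check which type a given PDF is, here is a minimal sketch in Python. It assumes the third-party pypdf package is installed; the function names and the 20-character threshold are my own illustration and have nothing to do with how the DLP product itself decides.

```python
from typing import Iterable, List


def needs_ocr(page_texts: Iterable[str], min_chars: int = 20) -> bool:
    """Return True when the extracted text is too sparse to be a real text layer.

    min_chars is an arbitrary illustrative threshold, not a product setting.
    """
    total = sum(len(text.strip()) for text in page_texts)
    return total < min_chars


def pdf_page_texts(path: str) -> List[str]:
    """Extract the text layer of each page using the third-party pypdf package."""
    from pypdf import PdfReader  # imported here so needs_ocr works without pypdf

    reader = PdfReader(path)
    # extract_text() can come back empty for image-only pages
    return [page.extract_text() or "" for page in reader.pages]
```

For example, `needs_ocr(pdf_page_texts("scan.pdf"))` (with "scan.pdf" as a placeholder file name) should come back True for a document scanned straight to email, and False for a PDF exported from Word.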
I hope this helps,