Last year Jane started as a new employee in your organization. As with all employees, she had forms to fill out and supporting documentation to provide. To prove her citizenship status, she scanned her passport and emailed a copy.
If you’ve heard anything about GDPR, you know it has a lot to do with identifying and protecting personal information. A passport is a great example. “No problem,” you may be thinking. “We have a process that HR follows: all scanned email attachments go into System X, where they’re secure and we know exactly where they are.” It’s all well and good that a process exists, but can you genuinely say it was followed correctly in every case? Maybe Bob from HR was stepping out for a meeting when the email with the passport came through. Perhaps it was a high priority, and he decided to forward the email to a colleague so it could be addressed right away. Well, guess what: Bob just created a copy. Now there are two copies of that passport. Will Bob remember to delete that email when he returns from his meeting? Will it be buried under 30 more emails by the time he gets back? You can see where this is going.
You can’t protect personal information unless you can identify it and identify every location where it exists. Sure, System X, where all of those “secure” files are meant to be stored, may in fact hold 100% of them, but what about the copies? What about much older files that predate System X? Are they sitting in some archive?
No matter how large a company is or how organized it is, it has the same problem. I’ll repeat: you can’t protect personal information unless you can identify it and identify every location where it exists. That task of identification can quickly become infeasible for people to handle manually. This is where Docxonomy comes to the rescue.
Our system supports numerous pre-built connectors for various systems. These connectors allow Docxonomy to index files from any number of cloud and on-premises repositories. They let you search in one place for data held in multiple storage locations, and they let Docxonomy automatically identify sensitive personal information across all of them. You can, of course, choose to store all of that data in our cloud repository, but you don’t have to. Just set up a connector and Docxonomy will index it all for you in our cloud, leaving the native files where they already live.
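To make the idea of "index in one place, leave the files where they live" concrete, here is a minimal sketch in Python. It is not Docxonomy's connector API or implementation; the repository names, file paths, and the functions `build_index` and `search` are all hypothetical, and each "repository" is simply a dictionary standing in for a real storage system.

```python
import re
from collections import defaultdict

def build_index(repositories):
    """Map each lowercase word to the set of (repository, path) locations containing it.

    Only locations are stored; the file contents stay in their repositories.
    """
    index = defaultdict(set)
    for repo_name, files in repositories.items():
        for path, text in files.items():
            for word in set(re.findall(r"[a-z0-9]+", text.lower())):
                index[word].add((repo_name, path))
    return index

def search(index, term):
    """Return every known location of a term, sorted for stable output."""
    return sorted(index.get(term.lower(), set()))

# Hypothetical repositories: an HR system, Bob's mailbox, and an old archive.
repositories = {
    "hr_system": {"onboarding/jane.txt": "Passport copy for Jane"},
    "mailbox_bob": {"sent/fwd_jane.txt": "FW: Jane passport, please handle"},
    "archive": {"2009/policy.txt": "Travel policy for employees"},
}

index = build_index(repositories)
for repo, path in search(index, "passport"):
    print(repo, path)
```

In this toy setup, a single search for "passport" surfaces both the copy in the HR system and the forwarded copy in Bob's mailbox, which is the kind of duplicate a documented process alone would miss.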
Docxonomy eats text for breakfast. Our text analysis capabilities are at the core of the system. We break files down to the word level, building context and understanding from how words are used within sentences and identifying parts of speech, people, places, dates, quantities, and more. This breakdown allows our system to understand the text at a deep level, which drives its ability to identify the bits of information that are truly sensitive.
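As a rough illustration of flagging sensitive items in text, the sketch below uses plain regular expressions in Python. This is a deliberately simplified stand-in, not Docxonomy's text engine: the analysis described above relies on context and parts of speech, which pattern matching does not capture. The pattern names, the `find_sensitive` function, and the passport-number format are assumptions made up for this example.

```python
import re

# Hypothetical patterns; real identifier formats vary by country and document type.
PATTERNS = {
    "passport_number": re.compile(r"\b[A-Z]{1,2}\d{6,8}\b"),
    "date": re.compile(
        r"\b\d{1,2} (?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) \d{4}\b"
    ),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}

def find_sensitive(text):
    """Return (label, matched_text) pairs for every pattern hit in the text."""
    hits = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((label, match.group()))
    return hits

sample = "Passport No. X1234567 issued 12 Mar 2015 to Jane, jane@example.com"
for label, value in find_sensitive(sample):
    print(label, value)
```

Patterns like these catch well-formed identifiers but miss anything that depends on meaning, such as a person's name in running text, which is why entity recognition built on sentence context matters.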
Back to that passport issue. “But wait,” you’re thinking, “that was a scanned image of a passport.” That’s right, but in Docxonomy, optical character recognition (OCR) is built right in, and you won’t pay a penny extra. Once those documents and images have been run through OCR, it’s just a matter of our text engine doing its magic.