I spoke with a customer who is researching technologies that can help them secure sensitive documents and emails within their organization. I went to see them with our Information Rights Management product manager, Andy Peet, and IRM was of course the main topic of discussion. However, they were also researching Data Loss Prevention (DLP) and wondered how the two technologies fit together. So the following is an overview of DLP, its benefits and limitations, and its fit with Information Rights Management.
First some definitions:
- Information Rights Management (IRM) refers to technologies that use encryption to persistently protect information contained in documents and emails from unauthorized access inside and outside the organization.
- Data Loss Prevention (DLP) refers to technologies designed to detect and prevent the unauthorized transmission of information from the computer systems of an organization to outsiders.
The definitions sound quite similar, but under the hood the two technologies represent quite different approaches to two closely related problems.
DLP overview
DLP products are content- and context-aware filtering products that monitor outbound information flows from the network, servers and endpoints in order to detect and prevent the unauthorized transmission of information to outsiders. The core intellectual property in DLP is the natural language filtering used to classify information into categories such as PCI, PII, ITAR, GLBA, SOX, etc. Information categories can then be associated with policies, and policy violations logged and automatically remediated.
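To make the classify-then-apply-policy flow concrete, here is a minimal sketch in Python. It is not how any particular DLP product works: the two regular expressions, the category-to-action table and the severity ordering are all assumptions made up for illustration, and real classification engines are far more elaborate.

```python
import re
from dataclasses import dataclass

# Hypothetical category patterns (assumptions, not taken from any product).
CATEGORY_PATTERNS = {
    "PCI": re.compile(r"\b(?:\d[ -]?){13,16}\b"),  # shaped like a payment card number
    "PII": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),   # shaped like a US Social Security number
}

# Hypothetical policy table: information category -> remediation action.
POLICIES = {"PCI": "block", "PII": "quarantine"}

# Assumed ordering used to pick the strictest action when several categories match.
SEVERITY = {"allow": 0, "quarantine": 1, "block": 2}


@dataclass
class Verdict:
    categories: list
    action: str


def classify(text):
    """Return the sorted names of all categories whose pattern matches the text."""
    return sorted(name for name, pattern in CATEGORY_PATTERNS.items() if pattern.search(text))


def evaluate(text):
    """Classify outbound text and choose the strictest remediation action."""
    categories = classify(text)
    actions = [POLICIES[name] for name in categories]
    action = max(actions, key=SEVERITY.__getitem__, default="allow")
    return Verdict(categories, action)


if __name__ == "__main__":
    for message in ("Lunch at noon?", "Card: 4111 1111 1111 1111"):
        print(message, "->", evaluate(message))
```

Anything that merely looks like a card number would be blocked by this sketch, which is exactly the kind of crude matching that leads to false positives.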
DLP systems are typically made up of the following components:
- MONITOR – Passive network monitoring and reporting (“data-in-motion”), typically operating at the Internet gateway in an appliance form factor.
- PREVENT – Active remediation by the network component. Remediation actions include alerting, warning, blocking, quarantining, encrypting, self-remediation, etc.
- CAPTURE – Stores reconstructed network sessions for later analysis and rule tuning (only supported by a few DLP vendors).
- DISCOVER – Discovers and classifies information (“data-at-rest”) in repositories and on endpoints.
- ENDPOINT – DLP capabilities extended to desktop application-operating system interfaces such as local file systems, removable media, wireless, etc.
Benefits of DLP
The network monitoring and discovery components of DLP can be relatively easy to deploy, without IRM’s requirement for an endpoint agent. They do tend to generate a bewildering number of policy violations straight away, so it is important that (a) the DLP reporting engine can be tuned to exclude most violations and focus on high-priority applications, e.g. PCI, and (b) the DLP classification engine does not generate too many business-disruptive false positives (we are still far from Terminator-style artificial intelligences, fortunately ;).
The reports from DLP network monitoring and discovery provide a useful information security feedback loop: identifying compliance “hot spots” and poor working practices, mapping the proliferation of sensitive content throughout (and beyond) your enterprise and enabling organizations to tune their existing access control systems.
Limitations of DLP
With the best will in the world, DLP is only ever going to be a partial solution. There are simply too many information flows to monitor and too many violations to process. For all the claims of the vendors, true natural language “understanding” remains a pipe dream, and some classification engines are little more than regular expression pattern matching. DLP cannot monitor encrypted information, or information once it has left the corporate network for partners, customers or suppliers.
Most DLP customers would agree that moving from passive detection to active prevention is a massive leap. The shortcomings in the classification algorithms result in too many false positives (non-sensitive information misclassified as sensitive) and false negatives (sensitive information not classified as such), which, combined with crude blocking techniques such as cryptic network drops, wreak havoc on business productivity. Most of the real-world value of DLP is in monitoring and feedback, not active prevention. DLP tells you that you forgot to close the stable door, which horses bolted and in what direction.
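Some back-of-the-envelope arithmetic shows why even a fairly accurate classifier struggles in blocking mode. The sketch below is illustrative only: the message counts, detection rate and false positive rate are invented numbers, not measurements from any DLP product.

```python
def expected_alerts(sensitive, benign, detection_rate, false_positive_rate):
    """Estimate alert counts for a classifier applied to a batch of messages.

    sensitive / benign: number of messages that truly are / are not sensitive.
    detection_rate: fraction of sensitive messages the classifier flags.
    false_positive_rate: fraction of benign messages the classifier flags.
    """
    true_positives = sensitive * detection_rate
    false_negatives = sensitive - true_positives
    false_positives = benign * false_positive_rate
    flagged = true_positives + false_positives
    # Precision: the share of flagged messages that really were sensitive.
    precision = true_positives / flagged if flagged else 0.0
    return {
        "true_positives": true_positives,
        "false_negatives": false_negatives,
        "false_positives": false_positives,
        "precision": precision,
    }


if __name__ == "__main__":
    # Hypothetical day of traffic: 1,000 sensitive messages among 1,000,000.
    result = expected_alerts(sensitive=1_000, benign=999_000,
                             detection_rate=0.99, false_positive_rate=0.01)
    for name, value in result.items():
        print(f"{name}: {value:,.2f}")
```

With these assumed figures roughly ten legitimate messages would be flagged for every genuine leak, because sensitive traffic is such a small share of the total. That is tolerable for a report, but painful if every flagged message is blocked.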
DLP classification filters are complex and in a global enterprise will require localization into all the languages in which data may be leaked. This makes maintaining and extending these filters difficult, slow and expensive.
DLP vendors have been forced to add endpoint components because of the numerous channels for data leaks from the endpoint, invisible to network DLP components. These components are for the most part very rudimentary, for example only scanning information sent to removable disks, but not to file shares, DVDs, printers, etc.
There can be widespread employee antipathy towards what is perceived as “big brother” monitoring or enterprise spyware, and some corporations may believe that in terms of policy violations “ignorance is bliss”, i.e. if they detect a million policy violations someone is going to expect them to fix a million policy violations, which is going to be expensive.
DLP and IRM compared
From the above discussion it should be clear that DLP and IRM address similar problems, but not the same problem.
DLP is more useful when an organization wants to protect itself from data leaks but doesn’t really know what information it needs to protect, or where that information resides. It can then use DLP network monitoring and discovery to map the proliferation of its sensitive information and use that map to improve its existing access control systems or apply new systems, such as IRM.
IRM is more useful when the enterprise already knows which information it needs to protect, and wants it secured and tracked both inside and outside the enterprise.
IRM’s value proposition is more towards providing higher assurance security for an enterprise’s most sensitive IP, for example trade secrets or draft financials. Once that information is encrypted, all copies of it are secured and tracked, regardless of location or distribution mechanism.
DLP’s value proposition is more as a feedback/tuning mechanism for other more proactive access control mechanisms, than as an access control system in its own right. Having a means of observing the information actually flowing out of your existing applications and repositories is nevertheless tremendously useful.
IRM and DLP overlap in terms of cost of deployment. Network-based DLP monitoring and discovery are easier to deploy, since they do not require an endpoint agent, but have a huge blind spot in terms of endpoint activity. Introducing endpoint agents can make DLP more costly to deploy, since it now needs to manage gateway, server and endpoint agents compared to IRM’s endpoint-only agent.
Bottom line
The bottom line is that IRM and DLP are more complementary than competitive.
Standalone, they address similar but different problems. DLP and IRM vendors have long talked about integrating the two technologies to provide a solution greater than the sum of its parts. This would mean a DLP solution automatically applying IRM encryption to content discovered “at rest” or “in motion”, so that it remains secure and tracked “in use”, inside and outside the firewall. The link between DLP discovery and IRM is particularly attractive, since if content were IRM-encrypted at source then all subsequent copies would automatically remain secure “at rest”, “in motion” and “in use”, even on unmanaged systems.
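As a rough illustration of the discovery-to-IRM link, the sketch below scans a folder for files that match a sensitive-content pattern and hands each hit to a sealing callback. Everything here is an assumption for illustration: `discover_and_seal`, the `seal` callback and the single SSN-shaped pattern are hypothetical names, and no real DLP or IRM product API is being called. In a real integration the callback would be whatever sealing call the IRM vendor exposes.

```python
import re
from pathlib import Path

# Hypothetical stand-in for a DLP classification rule (SSN-shaped numbers).
SENSITIVE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def discover_and_seal(root, seal):
    """Scan *.txt files under root and pass each sensitive one to seal().

    seal is a caller-supplied callable taking a Path. It stands in for an
    IRM sealing API, which this sketch deliberately does not implement.
    Returns the names of the files that were handed to seal().
    """
    sealed = []
    for path in sorted(Path(root).rglob("*.txt")):
        text = path.read_text(encoding="utf-8", errors="ignore")
        if SENSITIVE.search(text):
            seal(path)
            sealed.append(path.name)
    return sealed


if __name__ == "__main__":
    # Placeholder callback: only reports what a real integration would seal.
    print(discover_and_seal(".", lambda p: print("would seal:", p)))
```

Passing the sealing step in as a callback keeps the discovery logic independent of any one IRM product, which is how I would expect a real integration to be layered.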
Both technologies are highly extensible and offer comprehensive APIs, making their integration straightforward. I am not aware of many real-world integrations to date, but I’m sure this will change.