If you are an SAP customer, chances are that some of your company’s most critical business data is securely stored inside SAP applications and their underlying databases. Access to such data is carefully restricted through a complex permissions system. By relying on roles and authorizations, a company can very selectively grant access to certain data or subsets thereof, depending on users’ requirements and business roles. SAP provides all the tools necessary to prevent or at least minimize unauthorized access to sensitive data stored in SAP applications. However, using these tools can be time-consuming and maintenance-intensive.
Problems arise when data leaves the secure boundaries of the SAP environment and suddenly ends up on users’ computers, emails, mobile devices, or unauthorized cloud storage (all part of today’s shadow IT environment), thus significantly increasing the risk of data loss and theft. Just remember the last time you or someone you know accidentally sent an email to the wrong recipient. Hopefully, it didn’t contain any sensitive (or embarrassing) information.
The issue is that most companies do not know how much data is extracted from an SAP system every day, or how much of that extracted data is sensitive or subject to regulations. Most companies also don’t know what users do with the data once it has been extracted from an SAP system.
These issues pose a huge security risk, as companies typically have no control over extracted information, leaving it exposed to malicious or accidental wrongdoing. Gartner estimates that two-thirds of all corporate data lives outside the data center, and the majority is unprotected by IT. According to Gartner’s predictions at its 2011 US Data Center conference, that number was expected to grow by 50 percent over the following few years.
That means all the controls you have put in place to secure your SAP applications and the network perimeter around your data center can protect only about one-third of your corporate data. The majority is potentially exposed to loss and theft or it is at the mercy of other mechanisms and controls over which the SAP IT team has at best only limited control.
Protecting the intellectual property that drives your business is crucial. Protecting the information of your employees, customers, and partners is not only a moral question; it is often required by laws and regulations. Examples include the Federal Information Security Management Act (FISMA), the Health Insurance Portability and Accountability Act (HIPAA), the Safe Harbor framework, the Payment Card Industry Data Security Standard (PCI-DSS), the EU General Data Protection Regulation (GDPR), and many more.
Gaps in Yesterday’s Data Protection Solutions
For the longest time there has been a friendly competition between SAP IT and enterprise IT. Enterprise IT manages everything but the SAP system, and I see this separation in every large company I talk to. While competition is usually a good thing, it can lead to a gap in security controls and processes. This gap is especially dangerous when it comes to protecting business-critical data that originates in an SAP system but is then consumed outside it.
The SAP security folks see it as their responsibility to make sure all data is secure and properly protected while stored inside an SAP system. They may also govern the transferring of data between SAP and non-SAP applications through defined interfaces, such as a Remote Function Call (RFC). However, the same resources usually have very limited control over how the data extracted from SAP applications is protected outside the SAP system, such as on end users’ computers, cloud storage, email, or mobile devices.
That’s where the regular IT security team takes over. This team deploys solutions such as Data Loss Prevention (DLP), Data Classification (DC), or file or folder encryption. Most of these so-called downstream solutions (from the SAP application perspective) either rely on end-user input (e.g., storing a file in a certain encrypted folder) or apply pattern matching or other content-scanning techniques to guess how sensitive the data is and to decide if it needs to be protected.
Relying on end users to make the right decisions about when and how to protect sensitive information is not a good idea. Data protection applications often get in the way of end users, who may try to bypass or avoid them whenever possible. Plus, end users often don’t have enough training to determine how certain data types need to be classified and protected.
An example of an end-user-driven data protection solution is having special folders that encrypt all files that are placed inside them. The action of encrypting the file placed inside this folder is often completely transparent (handled by a driver running in the background), but it’s triggered by the user’s decision to place files containing sensitive data into such a special folder.
Forcing users to save files containing sensitive data, such as a Flexible Employee Data report exported from the SAP system, only into an encrypted folder is very difficult. Therefore, it’s not uncommon, even in organizations that have such tools in place, for sensitive data to be found unencrypted in other storage locations, emails, or mobile devices.
Shifting the decision making from the end user to software appears to be a reasonable approach. The question is, however, how can software determine what data needs to be protected and how? Content analysis and pattern matching are common strategies followed by solutions such as DLP and e-discovery. Certain data types, such as Social Security or credit card numbers, follow a pattern that solutions such as DLP can look for. If they find a number in a file that matches, for instance, the predefined pattern of a Social Security number, DLP solutions assume the file needs to be protected.
Such an approach sounds good on paper, but unfortunately, it is relatively error prone. First, the solution is only as good as its pattern definition. Second, not all sensitive data follows a pattern, and even data types that do may occasionally deviate from their usual format. For example, a Social Security number could be written as 123-45-6789, 1-2-3-4-5-6-7-8-9, or in many other ways, making it very difficult to develop a complete and reliable pattern definition.
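The fragility of pattern definitions can be illustrated with a minimal Python sketch. Both regular expressions below are assumptions made for illustration and are not taken from any particular DLP product. The strict pattern misses common variations, while the tolerant one catches them at the price of flagging an unrelated nine-digit number.

```python
import re

# Hypothetical strict pattern: exactly NNN-NN-NNNN with ASCII hyphens.
STRICT_SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

# Hypothetical tolerant pattern: nine digits, each pair optionally separated
# by a single hyphen, en dash, space, or dot.
TOLERANT_SSN = re.compile(r"(?<!\d)\d(?:[-\u2013 .]?\d){8}(?!\d)")


def contains_ssn(text, pattern):
    """Return True if the pattern finds something that looks like an SSN."""
    return pattern.search(text) is not None


samples = [
    "123-45-6789",              # canonical form
    "123\u201345\u20136789",    # en dashes, as inserted by some word processors
    "1-2-3-4-5-6-7-8-9",        # a separator between every digit
    "123456789",                # no separators at all
    "Order 987654321 shipped",  # a nine-digit order number, not an SSN
]

for sample in samples:
    print(
        repr(sample),
        "strict:", contains_ssn(sample, STRICT_SSN),
        "tolerant:", contains_ssn(sample, TOLERANT_SSN),
    )
```

In this sketch, the strict pattern recognizes only the first sample, whereas the tolerant pattern flags all five, including the order number. Tightening or loosening the pattern merely trades missed detections for false alarms.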
Context Versus Content
One solution to this problem is to deploy data security applications that are directly embedded in SAP’s core technology frameworks, such as SAP NetWeaver or SAP HANA Extended Application Services (XS).
The advantage of this approach is the availability of contextual attributes that can help accurately determine the sensitivity of data, without involving the end user or error-prone content scanning. Contextual attributes describe where the data in question originated, such as the application server, application component, package name, transaction code, table name, user (including roles and authorizations), type of front end (e.g., SAPGUI), and IP address (to determine the geographic location).
Let’s look at an example of an HR user who has elevated privileges and downloads a Flexible Employee Data report, including executive pay information, from transaction code PAR1 of a core SAP ERP Human Capital Management (HCM) system. In a second example, a significantly less privileged HR user downloads the same report (PAR1), but only has access to basic employee information, such as names, job titles, and phone numbers. Looking at both downloads from the perspective of a data protection solution operating outside an SAP system, they appear very similar, and only error-prone content inspection may reveal subtle differences.
Data protection solutions that are deeply integrated into the SAP system, however, can tell the two downloads apart easily by determining who the user is, what roles and authorizations the user has, and what data the user is actually accessing (that is, which tables or which fields within a table). No content inspection or end-user involvement is necessary.
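The contrast between the two HR users can be sketched in a few lines of Python. This is an illustration only, not ABAP and not any vendor's implementation. The field names, user names, and the `REPORT_TCODE` placeholder are all hypothetical, and the sketch assumes that authorizations can be reduced to a set of readable fields.

```python
from dataclasses import dataclass

# Hypothetical placeholder for the reporting transaction both users run.
REPORT_TCODE = "ZHR_REPORT"

# Hypothetical fields the report can contain, and the subset treated as sensitive.
REPORT_FIELDS = frozenset({"NAME", "JOB_TITLE", "PHONE", "ANNUAL_SALARY"})
SENSITIVE_FIELDS = frozenset({"ANNUAL_SALARY"})


@dataclass(frozen=True)
class DownloadContext:
    user: str
    transaction: str
    authorized_fields: frozenset  # fields the user's authorizations allow


def fields_accessed(ctx):
    """Fields that actually end up in the download for this user."""
    return REPORT_FIELDS & ctx.authorized_fields


def classify(ctx):
    """Label a download from context alone; file content is never inspected."""
    if fields_accessed(ctx) & SENSITIVE_FIELDS:
        return "confidential"
    return "internal"


privileged = DownloadContext("HR_MANAGER", REPORT_TCODE, REPORT_FIELDS)
basic = DownloadContext(
    "HR_CLERK", REPORT_TCODE, frozenset({"NAME", "JOB_TITLE", "PHONE"})
)

print(privileged.user, classify(privileged))
print(basic.user, classify(basic))
```

Both users run the same transaction, yet the two downloads receive different labels because the decision rests on what each user is authorized to read rather than on a scan of the exported file.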
Unfortunately, solutions that harness the power of such contextual attributes to protect data extracted from an SAP system are still rare, although they do exist. Aside from SAP NetWeaver’s powerful roles and authorizations concept, SAP has so far focused on leveraging context for the purpose of auditing read and change access. Examples include Read Access Log (RAL), SAP Access Control’s Emergency Access Management (formerly Firefighter), SAP Fraud Management, and the new Enterprise Threat Detection (ETD).
The task of classifying and protecting data extracted from SAP applications through contextual awareness is a gap SAP has left for its trusted partners to fill.
Meanwhile, many data protection vendors operating outside SAP have recognized the value of contextual awareness and have adapted their solutions accordingly. Enterprise Data Classification and DLP solutions, for example, make increased use of deep application integration as well as behavioral analysis to augment and improve their legacy content-inspection engines. This trend, in combination with SAP’s own data protection and analytics effort, can further increase awareness of the importance of context-aware data protection solutions that are tightly integrated into SAP.
Context Awareness in an SAP System
Contextual awareness requires applications to operate inside the SAP application platform, such as SAP NetWeaver. For third-party applications or custom programs, such an integration can be achieved through add-ons, enhancements, Business Add-Ins (BAdIs), and code modifications.
For instance, SAP NetWeaver ABAP offers all the above capabilities, thus enabling third-party applications to gain access to transactional attributes that can be used to evaluate what the user does inside the SAP system.
Example: Intercepting Downloads from SAP NetWeaver ABAP
When end users extract data from SAP NetWeaver ABAP through reporting transactions (such as transaction codes PAR1 or CS11) as spreadsheets or text files using the SAPGUI, those downloads are routed through a core function module of SAP NetWeaver called GUI_DOWNLOAD. By intercepting this function module through an enhancement or modification, a custom program gains immediate access to contextual attributes, such as:
- Users and their roles and authorizations
- Application components (such as HR)
- Transaction codes (such as PAR1)
- Tables downloaded (in case of transactions, such as transaction code SE16)
- IP addresses (of users)
- Types of front ends
- Operating systems of the front end
The above attributes are part of the run-time environment, to which any custom application running inside SAP NetWeaver has access. It’s then up to the custom application to evaluate those attributes and to put them into context in order to make a data classification, blocking, or protection decision. An application can, for example, determine that data extracted from an HR reporting transaction by a user with elevated privileges (based on roles and authorizations) is automatically classified as confidential. Or an application can automatically prevent an otherwise authorized user from extracting table USR02, which contains all user password hashes, through transaction code SE16 if such behavior lacks a proper business case.
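The decision logic described above might look roughly like the following Python sketch. A real interception would be written in ABAP as an enhancement around GUI_DOWNLOAD; here `guarded_download`, the `Context` record, the role name `Z_HR_ELEVATED`, and the user names are all hypothetical stand-ins, and the two rules simply mirror the two examples in the text.

```python
from dataclasses import dataclass


class DownloadBlocked(Exception):
    """Raised when policy forbids the requested download."""


@dataclass(frozen=True)
class Context:
    user: str
    roles: frozenset
    component: str      # application component, such as "HR"
    transaction: str    # transaction code, such as "PAR1" or "SE16"
    table: str = ""     # only set for table browsers such as SE16
    ip_address: str = ""
    frontend: str = ""


ELEVATED_ROLES = frozenset({"Z_HR_ELEVATED"})  # hypothetical role name
BLOCKED_TABLES = frozenset({"USR02"})          # table holding password hashes


def decide(ctx):
    """Return 'block', 'confidential', or 'unclassified' from context alone."""
    if ctx.transaction == "SE16" and ctx.table in BLOCKED_TABLES:
        return "block"
    if ctx.component == "HR" and ctx.roles & ELEVATED_ROLES:
        return "confidential"
    return "unclassified"


def guarded_download(ctx, download, *args, **kwargs):
    """Evaluate the context first, then delegate to the real download routine."""
    decision = decide(ctx)
    if decision == "block":
        raise DownloadBlocked(
            f"{ctx.user} may not download {ctx.table} via {ctx.transaction}"
        )
    return decision, download(*args, **kwargs)


hr = Context("HR_MANAGER", frozenset({"Z_HR_ELEVATED"}), "HR", "PAR1")
admin = Context("BASIS_ADMIN", frozenset(), "BC", "SE16", table="USR02")

print(guarded_download(hr, lambda: "report rows"))
try:
    guarded_download(admin, lambda: "hashes")
except DownloadBlocked as exc:
    print("blocked:", exc)
```

Keeping `decide` separate from `guarded_download` is a deliberate choice in this sketch: the policy can then be tested and changed without touching the code path that performs the actual download.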