Data volume management and data archiving may seem like simple enough concepts, but putting these concepts into practice can be tricky. Starting with a smart data volume management strategy built from a clear understanding of both the business needs and the value of the data will ensure data is stored and managed efficiently and cost-effectively. How do you formulate the best approach for using SAP technology to store your company’s important documents and protect proprietary information?
Dolphin President and CEO Dr. Werner Hopf answered readers' questions on data volume and Information Lifecycle Management best practices. Questions included:
- What are our options for data archival in SAP?
- Can you provide some best practices as well as some rules of thumb regarding data archiving?
- Is there a general rule for the time frame of archiving or does it depend on the data volume?
- We are migrating to SAP HANA in the next 6 months. Do we need to continue archiving currently or wait for the HANA migration?
- My company’s data is highly regulated. What do I need to consider before embarking on a data archiving strategy in terms of legal requirements?
- What would be the storage mechanism for big data and HANA?
Check out the chat replay and the full, edited transcript below.
Meet the panelist:
Dr. Werner Hopf, CEO and President, as well as the Archiving Principal at Dolphin
Dr. Hopf is responsible for setting the company’s strategic corporate direction. With more than 20 years of experience in the information technology industry (14 years focused on SAP), Dr. Hopf specializes in SAP Information Lifecycle Management initiatives including data and document archiving, SAP ArchiveLink storage solutions, and business process solutions. His experience spans both large and mid-sized companies across all major SAP modules. Having worked on SAP projects across North America and Europe, he has extensive experience in global markets and is well known for his expertise. Dr. Hopf earned a master’s degree in computer science and a PhD in business administration from Regensburg University, Germany.
Natalie Miller, Moderator: Hello everyone, and welcome to today’s Q&A on SAP archiving and data volume management best practices. I’m Natalie Miller, features editor of SAPinsider and insiderPROFILES, and I’m thrilled to introduce today’s panelist, Dr. Werner Hopf, CEO and President of Dolphin.
Hi, Dr. Hopf, thank you so much for being here today to answer readers’ questions on building a smart data management strategy!
Dr. Werner Hopf, Dolphin CEO and President: Hello, Natalie, glad to be here today.
Natalie Miller, Moderator: As readers get their questions in, can you describe the data volume challenge for businesses today? Where do you feel most companies are right now in the adoption of SAP archiving and data volume management strategies? Are there certain industries that are currently ahead of the curve?
Dr. Werner Hopf: A lot of companies have started with archiving initiatives, but from what we hear from our customers, only a few have what we would consider an efficient strategy for archiving and information lifecycle management. Best-in-class companies typically have more than 80% of their historical transaction data stored in archive. So for most companies, there is still a lot of potential to reduce their system size and, correspondingly, total cost of ownership. Verticals that generate very high data volume — retail and consumer goods, for example — typically have more advanced approaches to archiving compared to others.
Comment from Narasimha: We are migrating to HANA in the next 6 months. Do we need to continue archiving currently or wait for the HANA migration?
Dr. Werner Hopf: We recommend archiving prior to the HANA migration to ensure that you minimize your HANA hardware footprint as much as possible. The size of the in-memory HANA database has a big impact on your total cost of ownership.
Comment from Bhalchandra: Can you provide some best practices as well as some rules of thumb regarding data archiving?
Dr. Werner Hopf: Some best-practice tips are:
- Start early.
- Involve data owners and users — archiving is never a technical initiative.
- Make sure you analyze your systems’ data volume and growth rates so you address the most important areas first.
- Don’t underestimate the change management effort. Add-on solutions for transparent retrieval can greatly simplify the required changes and training for the user community.
A rule of thumb: With an efficient archiving strategy, more than 80% of your transactional data should reside in archive storage.
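The 80% rule of thumb above can be checked with a quick back-of-the-envelope calculation from table-size statistics. The sketch below is illustrative only; the document categories and sizes are hypothetical examples, not figures from the transcript:

```python
# Estimate what share of historical transactional data resides in archive
# storage. All names and sizes below are hypothetical, for illustration only.

# Sizes in GB: data still in the online database vs. data already archived
online_gb = {"FI documents": 120, "Sales documents": 80, "Deliveries": 50}
archived_gb = {"FI documents": 700, "Sales documents": 450, "Deliveries": 300}

total_online = sum(online_gb.values())       # 250 GB still online
total_archived = sum(archived_gb.values())   # 1,450 GB in archive
archive_ratio = total_archived / (total_online + total_archived)

# Best-in-class target per the rule of thumb: more than 80% archived
print(f"Archived share: {archive_ratio:.0%}")  # prints "Archived share: 85%"
```

A system at or above the 80% mark is in best-in-class territory; a much lower ratio signals remaining potential to shrink the online database.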
Comment from Guest: My company’s data is highly regulated. What do I need to consider before embarking on a data archiving strategy in terms of legal requirements?
Dr. Werner Hopf: When planning for archiving in regulated systems, you might want to consider the following:
- Storage architecture — regulated systems typically require usage of WORM (write once read many) storage. Also, it is important to use an SAP-compliant archive storage software that can set and manage retention periods for individual stored objects, not just for the complete repository.
- Retention requirements are typically country specific. You will have to set up your residence and retention rules by group of organizational units for each country. Sometimes it is even necessary to apply different rules by document type. This will lead to a fairly large number of archiving sessions for each object, so you want to automate the process as much as possible. This will not only reduce the work effort, but also ensure consistency with your residence and retention rules.
- Fast, transparent access to archived data is critical to being able to respond to regulatory audits quickly and effectively. You might consider investing in add-on tools for transparent retrieval and on-demand audit extraction.
- Managing end of life — archived information should be deleted once the end of the defined retention is reached.
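The country-specific residence and retention rules described above lend themselves to automation as simple lookup data. The sketch below shows one possible way to derive archiving and deletion dates from such rules; all countries, archiving objects, and periods are hypothetical assumptions for illustration, not actual legal requirements:

```python
from datetime import date

# Hypothetical rules per (country, archiving object).
# Residence: years a business-complete document stays in the online database
# before it may be archived. Retention: total years it must be kept.
RULES = {
    ("DE", "FI_DOCUMNT"): {"residence_years": 2, "retention_years": 10},
    ("US", "FI_DOCUMNT"): {"residence_years": 2, "retention_years": 7},
    ("US", "MM_EKKO"):    {"residence_years": 1, "retention_years": 7},
}

def add_years(d: date, years: int) -> date:
    """Shift a date by whole years (Feb 29 falls back to Feb 28)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)

def lifecycle_dates(country: str, obj: str, business_complete: date):
    """Return (earliest archiving date, end-of-retention deletion date)."""
    rule = RULES[(country, obj)]
    archive_from = add_years(business_complete, rule["residence_years"])
    delete_from = add_years(business_complete, rule["retention_years"])
    return archive_from, delete_from

archive_from, delete_from = lifecycle_dates("US", "FI_DOCUMNT", date(2015, 3, 31))
print(archive_from, delete_from)  # prints "2017-03-31 2022-03-31"
```

Encoding the rules as data rather than scattering them across job variants is one way to keep a large number of archiving sessions consistent with the legal requirements, as recommended above.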
Comment from Manuel S.: We are getting close to starting to archive our purchase orders and purchase requisitions, but it will be the first time we undertake such an activity in the company. Where do we start to make sure we succeed?
Dr. Werner Hopf: Make sure to look at the complete procure-to-pay process holistically, and gather requirements from all end-user departments that need access to purchasing documents. Use a prototype setup in a sandbox or test system to review and validate requirements.
Comment from Guest: How can a data management strategy help mitigate the risks associated with data storage?
Dr. Werner Hopf: An efficient data management strategy allows you to store and protect information according to its value and associated risk of loss. It allows you to separate high-value/high-risk information from data and documents that are less important for the organization. You can then select the appropriate security and protection levels based on your classification.
Comment from Frank: Is there a general rule for the time frame of archiving or does it depend on the data volume?
Dr. Werner Hopf: Residence periods (the time from creation to archiving) and retention periods (the overall lifetime of the data) typically depend on the specific type of information and might also vary between verticals and even individual companies. Also keep in mind that a prerequisite for archiving is that the transactions to be archived have to be business complete (i.e., no more changes needed).
Comment from Jeff LeBlanc: What are our options for data archival in SAP? We have a requirement to have seven years of data in production, then after that we would like to offload to a separate database for archival access. Does SAP have any options to accomplish this?
Dr. Werner Hopf: Requiring seven years of data in your production system doesn’t necessarily mean that you need seven years of data online in the live database. Most organizations with a successful archiving strategy archive shortly after data is business complete. You just have to make sure that archived information can be easily accessed and retrieved according to your users’ requirements.
SAP systems (both Business Suite systems and BW) have built-in archiving technology that includes the basic functionality of moving online data to archive storage. Most companies implementing data archiving augment the built-in functionality with both secure and scalable archive storage for low-cost, long-term retention of archived information as well as add-on tools for transparent retrieval and extraction to meet users’ access requirements.
Comment from Guest: What would be the storage mechanism for big data and HANA? Is your recommendation to have an add-on tool to store the archived data, rather than in SAP BI landscape?
Dr. Werner Hopf: For high-volume data, it is typically better to store less frequently used, older data in a near-line database (for example SAP Sybase IQ). Cost of ownership is significantly lower compared to keeping data in memory. You can query near-line data transparently, so there is no change management impact for end users.
Comment from Narasimha: During an analysis, we found FI_DOCUMNT secondary index tables are at the top of the list in size. What would be the reason? Are they not being archived?
Dr. Werner Hopf: Secondary index tables (BSIS, BSAS, etc.) contain redundant information, for example, the same fields that are in the FI document header (BKPF) or line items (BSEG), just organized by account instead of by document number.
The archiving process does not copy data from secondary index tables to archive storage. In order to reduce the size of these tables, you have to run the post-processing step for FI_DOCUMNT after you’ve completed archiving write and deletion jobs.
Keep in mind that line item reports in finance (FBL1N, FBL3N, FBL5N) will no longer show archived data once post-processing is complete. It is important to keep this information accessible since it is also used by account balance transactions when drilling down into individual periods.
Comment from Guest: Can you discuss security concerns with data archiving? What should we look out for?
Dr. Werner Hopf: Can you please be a bit more specific about security concerns? Data archiving just moves information from the online database to archive storage. When users access archived information with archive-enabled transactions or reports, the same authorization checks apply that are in place for online data.
Comment from Brenda: Would the country-specific setup apply similarly to multiple company codes in a shared service?
Dr. Werner Hopf: Multiple company codes for a single country typically share the same archiving and retention rules. The same applies to other organizational units like sales orgs, purchasing orgs, etc. There are of course exceptions to this general rule: One or more organizational units from one country could have different retention requirements than others if required by the business process. For example, think about a manufacturing organization where one division produces regulated goods, but other divisions don’t.
Comment from Ken: What kind of ROI can my company expect from a data volume management strategy?
Dr. Werner Hopf: Most companies look at different metrics to measure ROI. Reduced infrastructure cost is typically the easiest to quantify. Once you determine how much of your transactional data can be archived, you can project the cost for system components (storage, CPU, memory, backup, etc.) for both scenarios (with and without archiving).
Other metrics are more difficult to quantify. The online database for a system using efficient archiving is much smaller compared to the same system without archiving, so disaster recovery times are shorter. But it is difficult to quantify the cost of downtime in a disaster recovery situation. Also difficult to quantify is the compliance aspect: What is the value of protecting information in a read-only format?
Comment from Guest: What do I have to do in terms of change management and user education? How do I support an archive environment after implementation is complete?
Dr. Werner Hopf: Make sure users clearly understand the lifecycle of transactional data: When will it be archived and when will it be deleted? Provide user training on how to access information once it is archived.
In terms of support, archiving is a process, not a project. Automate the archiving process as much as possible to minimize ongoing support effort.
Comment from Pradeep Gopalakrishnan: I have a question on dynamic tiering which is an add-on component to the HANA database. Can the disk used for dynamic tiering be the same as the disk used to persist HANA in-memory data?
Dr. Werner Hopf: Dynamic tiering is different from archiving. It’s primarily a strategy to reduce the in-memory space requirements, but doesn’t address data lifecycle at all.
Comment from Narasimha: Is there a way to load all the archived data back to the HANA platform so we have the data for analytics and queries?
Dr. Werner Hopf: For BW on HANA, it’s fairly easy to reload archived data back. ECC is more difficult since not all archiving objects provide an option to reload. Also, reloading is generally not recommended and has to be very thoroughly tested in order to not impact ongoing transactions.
But keep in mind that you don’t need to reload to be able to access archived data. With a columnar near-line system for archive storage, query performance for archived information is very good.
Comment from Narasimha: Is there a way to bring back the ECC archived data (data archived before HANA migration) into HANA and keep it available in the cold store so the data is available for analytics/queries?
Dr. Werner Hopf: Cold storage is part of the data aging framework. Data archived in ECC cannot be loaded into a cold partition created by the data aging framework. However, it is possible to access and retrieve archived data from ECC on HANA or S/4HANA directly from ADK (Archive Development Kit) storage.
Comment from Pradeep Gopalakrishnan: Is it possible to archive data from advanced DSOs using the standard DAP process?
Dr. Werner Hopf: Yes, advanced DSOs use the standard DAP (data archiving process). However, please note that an existing DSO has to be empty before it can be converted to an advanced DSO. Advanced DSOs will also support new functionality like “straggler management” planned for future support packages for BW on HANA.
Natalie Miller, Moderator: As we come to the end of today’s Q&A, I’d like to thank you all again for joining us and for your great questions. And a big thank-you to Dr. Hopf for these insightful answers!
Dr. Werner Hopf: Thank you, Natalie!
For more information or to ask additional questions, please contact Dolphin (www.dolphin-corp.com). You can also access the Gartner Magic Quadrant on structured data archiving.