Information Lifecycle Management in an SAP Environment

Author: Gertrude Taylor
SAP White Paper

Information Lifecycle Management in an SAP Environment Version 3.0 February 2008

© Copyright 2008 SAP AG. All rights reserved. No part of this publication may be reproduced or transmitted in any form or for any purpose without the express permission of SAP AG. The information contained herein may be changed without prior notice. Some software products marketed by SAP AG and its distributors contain proprietary software components of other software vendors. Microsoft®, WINDOWS®, NT®, EXCEL®, Word®, PowerPoint® and SQL Server® are registered trademarks of Microsoft Corporation. IBM®, DB2®, OS/2®, DB2/6000®, Parallel Sysplex®, MVS/ESA®, RS/6000®, AIX®, S/390®, AS/400®, OS/390®, and OS/400® are registered trademarks of IBM Corporation. ORACLE® is a registered trademark of ORACLE Corporation. INFORMIX®-OnLine for SAP and INFORMIX® Dynamic ServerTM are registered trademarks of Informix Software Incorporated. UNIX®, X/Open®, OSF/1®, and Motif® are registered trademarks of the Open Group. Citrix®, the Citrix logo, ICA®, Program Neighborhood®, MetaFrame®, WinFrame®, VideoFrame®, MultiWin® and other Citrix product names referenced herein are trademarks of Citrix Systems, Inc. HTML, DHTML, XML, XHTML are trademarks or registered trademarks of W3C®, World Wide Web Consortium, Massachusetts Institute of Technology. JAVA® is a registered trademark of Sun Microsystems, Inc. JAVASCRIPT® is a registered trademark of Sun Microsystems, Inc., used under license for technology invented and implemented by Netscape. SAP, SAP Logo, R/2, RIVA, R/3, SAP ArchiveLink, SAP Business Workflow, WebFlow, SAP EarlyWatch, BAPI, SAPPHIRE, Management Cockpit, SAP Business Suite

Logo and SAP Business Suite are trademarks or registered trademarks of SAP AG in Germany and in several other countries all over the world. All other products mentioned are trademarks or registered trademarks of their respective companies.

Disclaimer

This white paper outlines our general product direction and should not be relied on in making a purchase decision. This white paper is not subject to your license agreement or any other agreement with SAP. SAP has no obligation to pursue any course of business outlined in this white paper or to develop or release any functionality mentioned in this white paper. This white paper and SAP's strategy and possible future developments are subject to change and may be changed by SAP at any time for any reason without notice. This document is provided without a warranty of any kind, either express or implied, including but not limited to, the implied warranties of merchantability, fitness for a particular purpose, or non-infringement. SAP assumes no responsibility for errors or omissions in this document, except if such damages were caused by SAP intentionally or through gross negligence.

CONTENTS

DISCLAIMER
INTRODUCTION
BACKGROUND TO ILM
WHY ILM?
ILM IN PRACTICE
THE LIFE CYCLE OF DATA
SAP’S STANDARD TOOLS AND FUNCTIONS TO SUPPORT YOUR ILM STRATEGY
MAIN CONCEPTS AND TECHNOLOGIES IN DATA ARCHIVING AT SAP
ADDITIONAL STANDARD FUNCTIONS AND CONCEPTS TO SUPPORT ILM
ILM SOLUTION FROM SAP
NON-DISRUPTIVE INNOVATION AND OPEN INFRASTRUCTURE
MAIN CONCEPTS, PROCESSES, AND TECHNOLOGIES IN RETENTION MANAGEMENT
MAIN CONCEPTS AND TECHNOLOGIES IN RETENTION WAREHOUSE
ILM FROM SAP VS. ENTERPRISE CONTENT MANAGEMENT (ECM)
AVAILABILITY
CONCLUSION AND OUTLOOK
ADDITIONAL INFORMATION
GLOSSARY

Introduction

Information Lifecycle Management (ILM) from SAP is built upon three main cornerstones. The first is data archiving/data management. The second is retention management, which comprises a range of new functions, such as retention policy management and legal hold, that help you control the retention of your data along its entire life cycle. The third is the retention warehouse, a standardized solution for decommissioning legacy systems that includes methods for tax and other types of reporting on the data from the decommissioned system.

Version 3.0 of this white paper presents a close-up view of the three cornerstones of the new ILM solution from SAP. The functions of the two new cornerstones are a major step towards more automation and completeness in ILM. They address retention management and end-of-life data in a legal compliance setting, as well as retention management and legal compliance issues during system decommissioning.

The new functions of the solution all build upon tried and tested functions that customers have been using for years and can therefore be called “non-disruptive innovation”. They were partly designed in conjunction with an ILM Influence Council at the Americas’ SAP User Group (ASUG). Through this close cooperation with our customers, SAP was able to obtain specific feedback on the pain points companies are facing in their ILM and legal compliance strategies. As a result, the new retention management and retention warehouse functions are aimed directly at solving these pain points and have been developed with a special focus on enhancing or building on existing capabilities.

ILM from SAP is currently being rolled out, and a first version is scheduled to be released to ramp-up customers in mid-2008. Further developments and enhancements to the ILM solution from SAP will be released in the future.

Note! Some of the concepts and functions described in this white paper are not yet part of the first version of ILM from SAP scheduled for ramp-up in mid-2008. Some concepts and functions will be developed in the second version of ILM from SAP or in other future versions.

Background to ILM

Over the last few years, information lifecycle management has gone from being a mere buzzword to representing a solid strategy and concept with a clear definition that most agree upon. At the beginning, ILM was strongly driven by the storage industry and was often used as a synonym for tiered storage solutions, Hierarchical Storage Management (HSM), or other storage-related concepts. Most have since recognized that a more holistic approach to information management is needed to meet the demands of today’s business world and to get a better handle on the increasing complexity of IT system landscapes.

The most widely accepted definition of ILM is that issued by the Storage Networking Industry Association (SNIA), a vendor-neutral association of storage companies, application software producers, and analysts (among others) who have joined forces to deal with information management questions and related topics. SNIA defines ILM as follows:

Information Lifecycle Management is comprised of the policies, processes, practices, and tools used to align the business value of information with the most appropriate and cost effective IT infrastructure from the time information is conceived through its final disposition. Information is aligned with business processes through management of policies and service levels associated with applications, metadata, information, and data.

This is the definition we have also adopted at SAP. Due in large part to SAP’s efforts towards a shift in thinking regarding ILM, the industry is increasingly recognizing the importance of a more holistic approach to information management and is moving away from a purely technological approach. It is impossible to manage data and information from “the time it is conceived through its final disposition” by focusing only on storage hardware and software. A large amount of information is born in application systems and lives there for some time before it can be moved to storage systems. And even when it has been moved to storage, the business application still “owns” and controls the data. Therefore, if ILM is to evolve, it is essential for applications and storage to communicate and be compatible.

The misconception that ILM is an off-the-shelf product still exists in the market. Companies have to recognize that ILM represents a change in thinking: a new mindset regarding information management, which needs to be present at the level of each individual employee. This is the direction in which the entire information management industry is currently evolving.

Our advice to companies who are struggling with data management issues, or who are thinking about centralizing their system landscape, is to begin thinking about a complete ILM strategy and to use the technology that is available today to lay the groundwork for a full-fledged ILM strategy in the near future. Implementing ILM is a gradual process, and not something that can be done overnight.

In this white paper we present how companies can use the technology available from SAP today to implement a solid ILM strategy and pave the way towards seamlessly integrating future SAP offerings for ILM. We also describe technological advances that will support our customers’ ILM and retention management strategy through more automation and completeness and by helping to reduce complexity. Finally, we cover an important aspect of ILM that is often overlooked: system decommissioning.

Why ILM?

ILM is not something that was simply invented without reason. Rather, ILM came about because different factors and events worked together and forced companies to shift the way they do business. Disasters such as 9/11 and events such as the Enron scandal have triggered a virtual explosion of legal requirements that cover digital information. It is estimated that there are currently many thousands of regulations worldwide that mandate the maintenance of electronic data, and this number is continuously growing. The most recent of these are the reworked e-Discovery rules in the United States, which officially took effect in December 2006 (see Amendments to the Federal Rules of Civil Procedure for a summary of the actual amendments).

As a result of these changes and new laws, the value of information in an enterprise has risen considerably. In fact, for some companies, information is the single most important asset. A further result is that the complexity of managing this information based on its value has increased to such an extent that traditional data management strategies are no longer sufficient to meet this challenge. Laws are covering more and more data types, across a growing number of industries and countries, and in some cases for longer periods of time. It is not unheard of, in the German public sector, for example, to have legal retention periods of 100 years. This is expected to lead to an even greater increase in data volumes, to such an extent that the savings attained through the falling cost of storage are nullified, or even reversed.

A further reality that enterprises are dealing with is the decommissioning of legacy systems. It is not uncommon among large SAP customers to have over 200 legacy systems they need to decommission. So far, there has not been a standardized solution to deal with this daunting task. One of the major questions a company has to resolve in such an endeavor is how to ensure legal compliance for the decommissioned data and continue to garner the value of the information without having to keep the original system alive and incur the resulting costs. What happens, for example, during a tax audit involving data from the decommissioned system?

ILM has emerged in answer to these pressing questions, largely also because of technological innovation. Without such innovation, ILM would not be possible. On the one hand, as we have already explained, what differentiates ILM from traditional data management practices is that it is a more holistic approach. On the other hand, it involves more automated processes, which help reduce the complexity of managing information and retention. As technology and automation evolve, ILM will become more and more grounded as a strategy.

As a result of these changes in the last four to five years, data management has developed into information lifecycle management. While traditional data management aspects, such as data volume control, Total Cost of Ownership (TCO), and performance, are still as important as they have always been, the two additional aspects that have come into play and are driving ILM are technological innovation and legal compliance.

ILM in Practice

To show how ILM might look in practice, SAP has developed a four-phase model to depict the processes and tasks involved. The goal of this model is to provide you with guidelines on how to gain a firm understanding of your data and how to manage it. Simply put, ILM means understanding your data and managing it in terms of location and time. This entails knowing what kind of data your company deals with and how it is affected, where it resides, how long it must be retained, and when it must be destroyed at the latest. Once you know your data, you need adequate tools and solutions to manage it: to move it to the optimal storage location, store it, access it again (for example, during an audit), and destroy it once the data retention period has been completed. Understanding your data and your processes is no small feat. Our four-phase model can help you break down these tasks into clearly defined steps.

Figure 1, ILM at SAP

In the first phase, you analyze and categorize your data. For example, you analyze your database tables to determine specific areas that are prone to rapid growth, such as financial documents. You also identify the different data types you have, such as structured and unstructured data. Then you group the identified data into appropriate areas depending on your needs.

In the second phase you define your policies, based on your company’s legal requirements, Service Level Agreements, retention periods, and so on. A typical rule or policy might concern your financial documents. They have a residence time of two years, which means that they must be kept in the database for two years before they can be archived, due to the different requirements of internal user departments. But they have a retention time of 7 years in the US and 10 years in Germany, which means that they have to stay in the archive until the retention time has expired for each country (ideally separated according to the relevant regulations).

In the third phase you apply the defined policies to your data and processes. In an SAP environment this means preparing your SAP system and all supporting partner systems (for example, storage) for the fourth phase. Here you map generic terms like financial data and countries to system-specific terms like the archiving object FI_DOCUMNT and company code. You also set up Customizing and get the system ready for archiving. This phase often involves test runs of your archiving sessions in a test system, to ensure that everything has been customized and set up correctly.

Finally, in the fourth phase, you physically realize or implement the strategy you have developed in the first three phases.

The phases break down the different activities that are involved as part of an ongoing ILM strategy. They do not have a set beginning or end. Rather, they should be viewed as an ongoing and recurring process or, as we have stated, an actual change in a company’s mindset.

ILM is nothing new in the SAP environment. Many of the tools offered to support the different phases are tried and tested functions that have been used by SAP customers for many years. Other newer functions have been added recently in order to provide even better support to customers in their ILM strategy through more automation and completeness.
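The financial-documents policy used as an example in the second phase can be written down as a small data structure. The following Python sketch is purely illustrative and is not SAP code: the names `Policy`, `find_policy`, and `key_dates` are hypothetical, and it assumes that both the residence time and the retention time are counted from the same start date, which the policy example does not state explicitly.

```python
from dataclasses import dataclass
from datetime import date
from typing import Tuple


@dataclass(frozen=True)
class Policy:
    """One illustrative policy row (all names here are hypothetical)."""
    data_category: str    # generic term, for example "financial documents"
    country: str          # country the rule applies to
    residence_years: int  # time in the database before archiving is allowed
    retention_years: int  # time before the data may be destroyed


def add_years(d: date, years: int) -> date:
    """Shift a date by whole years; 29 February falls back to 28 February."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


# The example from the text: residence time 2 years, retention 7 (US) / 10 (DE).
POLICIES = [
    Policy("financial documents", "US", residence_years=2, retention_years=7),
    Policy("financial documents", "DE", residence_years=2, retention_years=10),
]


def find_policy(data_category: str, country: str) -> Policy:
    """Look up the policy for a data category and country."""
    for policy in POLICIES:
        if policy.data_category == data_category and policy.country == country:
            return policy
    raise LookupError(f"no policy for {data_category!r} in {country!r}")


def key_dates(policy: Policy, start: date) -> Tuple[date, date]:
    """Return (earliest archiving date, earliest destruction date).

    Assumption: both periods start on the same date (for example, the
    posting date of the document).
    """
    return (add_years(start, policy.residence_years),
            add_years(start, policy.retention_years))
```

In the third phase, the generic terms in such a table (data category, country) would be mapped to system-specific terms such as an archiving object and a company code.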

The Life Cycle of Data

A basic concept in any ILM strategy, of course, is the life cycle of data. Figure 2 shows this life cycle and the frequency with which SAP data is typically accessed over its lifetime. Logically, when data is created, it is accessed frequently. As time goes by and business processes are completed, end users need to access it less frequently, or perhaps never again. At this point it makes sense, from a database space and hence a cost point of view, to move this data out of the database and into a file system or third-party storage system. Here, it can still be accessed if necessary, for internal or external audits, for example. Finally, after the time has passed that you are legally required to keep certain data, it can be, and in some cases even must be, destroyed.

Figure 2, The Life Cycle of SAP Data

In the context of the entire life cycle, data is only modifiable and relevant for everyday business for a short period of time. The largest portion of the life cycle of data is the retention period, during which the data has to be kept, but is accessed very infrequently.

Note! In the SAP data archiving environment, residence time is the length of time that must be exceeded before application data can be archived. Depending on the application, the basis for calculating the residence time can be the creation date, the posting period, the goods issue date, and so on. If the residence time is 18 months, for example, all of the business-complete data that has been in the database for more than 18 months will be archived during the next archiving phase.

Retention period is defined as the length of time up to the point at which the data/documents can be destroyed. In ILM, we also speak of minimum and maximum retention periods. The minimum retention period is the length of time for which data must at least be kept due to legal requirements before it can be destroyed. The maximum retention period is the length of time after which data MUST be destroyed due to legal or other requirements. For the retention time it does not matter where the data resides, that is, in the database or in a storage system. Therefore, the residence time always falls within the retention period.
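The relationship between residence time and minimum and maximum retention period can be illustrated with a short Python sketch. It is a simplified model under stated assumptions, not SAP functionality: the names `Stage` and `stage` are hypothetical, all periods are given in months and counted from one reference date, and the check that the data is business-complete is left out.

```python
import calendar
from datetime import date
from enum import Enum
from typing import Optional


class Stage(Enum):
    IN_DATABASE = "residence time not yet exceeded"
    ARCHIVABLE = "may be archived, must still be retained"
    DESTRUCTIBLE = "minimum retention fulfilled, may be destroyed"
    MUST_DESTROY = "maximum retention reached, must be destroyed"


def add_months(d: date, months: int) -> date:
    """Shift a date by whole months, clamping the day to the month's end."""
    index = d.year * 12 + (d.month - 1) + months
    year, month = divmod(index, 12)
    month += 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def stage(reference: date, today: date, residence_months: int,
          min_retention_months: int,
          max_retention_months: Optional[int] = None) -> Stage:
    """Classify data by where it stands in its life cycle.

    Assumption: every period starts at `reference` (for example, the
    creation date). The residence time must be exceeded, so that
    comparison is strict.
    """
    if (max_retention_months is not None
            and today >= add_months(reference, max_retention_months)):
        return Stage.MUST_DESTROY
    if today >= add_months(reference, min_retention_months):
        return Stage.DESTRUCTIBLE
    if today > add_months(reference, residence_months):
        return Stage.ARCHIVABLE
    return Stage.IN_DATABASE
```

Because the stage depends only on dates and not on the storage location, the sketch reflects the point made above that the retention time is independent of whether the data resides in the database or in a storage system.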

In the following sections we provide a detailed view of the standard tools SAP has been providing to support you in your ILM strategy. Our aim is to show you what basic ILM means on a day-to-day basis, by offering guidelines, helpful hints, and tips for a sound and efficient ILM project.

SAP’s Standard Tools and Functions to Support Your ILM Strategy

As mentioned in the introduction, ILM in an SAP environment is supported by three main cornerstones: data archiving/management, retention management, and retention warehouse. Data archiving/management is a standard SAP function that has been available as part of SAP software since the early SAP R/3 releases. Most large SAP customers use data archiving as part of their data management strategy to keep data volume growth in check. You can use the standard data archiving concepts and technologies for basic support in your ILM strategy. However, as you will see later on, data archiving is also the core of the new ILM solution from SAP, on which ILM retention management and retention warehouse are built. This means that if you have already been archiving for some time, you have a perfect foundation for a full-fledged ILM solution from SAP, and can implement it with minimal disruption. In this section we introduce the first cornerstone of ILM: data archiving/management.

Main Concepts and Technologies in Data Archiving at SAP

Data archiving means writing business-complete data from the database into archives, and then deleting this data from the database. As we show in Figure 2, the term business complete refers to data that is unchangeable and no longer needed in everyday business; in other words, data that you no longer need to access on a regular basis. In this case we are only talking about structured data, meaning transactional or business data, not unstructured data such as attachments, scanned invoices, or incoming and outgoing documents. Therefore, data archiving should not be confused with the term optical archiving, which refers to the storage of unstructured data.

Figure 3, Optical Archiving vs. Data Archiving (figure labels: incoming documents, outgoing documents, print lists, archive files, hyperlink)

Figure 4 shows the steps involved in data archiving at SAP. Note that archiving data does not imply that data is removed from the system. Rather, it implies that data is removed from the database. Therefore, even after it has been archived, it is still considered to be part of the system and can be accessed if necessary.

Figure 4, Data Archiving at SAP

The following concepts are the main building blocks of SAP data archiving. They play an important role in the four phases of ILM and, as you can see in Figure 1, support you in the management of your information throughout the life cycle of the data.

Archiving Object: The archiving object is one of the most valuable data archiving concepts in the context of ILM. It relieves you of a good part of the tasks involved in the analyze and categorize and the apply policy phases.

Figure 5, Archiving Object

The archiving object is a logical object of related business data. In other words, it contains the definition of the logical units in business processes. It also contains the programs necessary for archiving its particular data (primarily write and delete programs), as well as the definition of the required Customizing settings for archiving that data. When you archive the related business data that has been predefined in the archiving object, the data is written from the database to archive files using the corresponding write program, and is then deleted from the database using the corresponding delete program, which reads the safely archived data.
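The write-then-delete sequence described for the archiving object can be sketched in a few lines of Python. This is a conceptual illustration only and not how ADK or any SAP program is implemented: the functions `write_phase`, `delete_phase`, and `run_session` are hypothetical, the "database" is a dictionary, and the archive file is a JSON file in a temporary folder.

```python
import json
import tempfile
from pathlib import Path


def write_phase(database: dict, keys: list, archive_file: Path) -> None:
    """Copy the selected records to an archive file; the database is unchanged."""
    records = {key: database[key] for key in keys}
    archive_file.write_text(json.dumps(records))


def delete_phase(database: dict, archive_file: Path) -> int:
    """Read the archive file back and delete only the records found in it."""
    archived = json.loads(archive_file.read_text())
    deleted = 0
    for key, value in archived.items():
        # Delete a record only if the archived copy matches the database copy.
        if database.get(key) == value:
            del database[key]
            deleted += 1
    return deleted


def run_session(database: dict, keys: list) -> int:
    """Run one illustrative archiving session: write first, then delete."""
    with tempfile.TemporaryDirectory() as folder:
        archive_file = Path(folder) / "session_0001.json"
        write_phase(database, keys, archive_file)
        return delete_phase(database, archive_file)
```

The design point of the sketch is that deletion is driven by what can be read back from the archive file rather than by the original selection, so a record that was not written successfully is never removed from the database.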

Tip! With SAP ERP 6.0 and current industry solutions, SAP offers more than 600 archiving objects and classes (see Categorizing). They cover business processes and data across all components and industry solutions. The top 30 archiving objects, such as FI_DOCUMNT, cover the most important and most commonly implemented business processes at customer sites. Experience at several customer sites has shown that you can often decrease your data growth rate by about 50% using only around 20 of the top archiving objects.

Archive Development Kit (ADK): ADK is the fundamental technology on which data archiving at SAP is built (for the ABAP environment). It provides the necessary technological basis for all archiving processes and programs, and is also used by both SAP development and customers to develop new archiving functions, including reporting (see Figure 6). In addition, it is responsible for ensuring, for example, that data is retrieved correctly when accessed in the archive, through the adjustment of code pages, formats, or structural changes. It can handle data archived before and after a Unicode conversion. Other technical issues, such as the compression of the data that is archived and file handling, also fall under the responsibility of ADK.

Figure 6, Archive Development Kit (ADK)

Archive Administration (transaction SARA): Archive Administration is the central user interface from which you can perform most of the activities involved in data archiving. These activities include Customizing, starting the different programs (write, delete, and so on), checking archiving statistics and logs, managing archiving sessions, and maintaining Archive Routing, among others.

Archive Information System (AS): The Archive Information System is the main indexing tool in the data archiving environment and is fully integrated into data archiving. You can use AS to define indexes called archive information structures for searching and accessing your archived data. This tool plays an important role in the life cycle of SAP data, since it is one of the main ways to support access to archived data, in case of an audit, for example.

Tip! The Archive Information System is an extremely flexible tool, which allows you to tailor your search capabilities to match your specific needs. For example, you can create very slim indexes, either for very specific searches or for high-volume searches (large amount of data with few selection criteria).

Figure 7, Archive Information System (AS)
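The idea behind the archive information structures of the Archive Information System, an index that points from a few search fields to the location of archived data, can be illustrated generically. The Python sketch below is not the AS data model: `ArchivedDocument`, `build_index`, and the file names are hypothetical, and it only shows why an index with few fields stays slim.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Sequence, Tuple


class ArchivedDocument(NamedTuple):
    """Minimal stand-in for one archived business document (hypothetical)."""
    archive_file: str     # file in which the document was archived
    offset: int           # position of the document inside that file
    company_code: str
    fiscal_year: int
    document_number: str


def build_index(documents: Iterable[ArchivedDocument],
                fields: Sequence[str]) -> Dict[Tuple, List[Tuple[str, int]]]:
    """Map the chosen field values to (archive file, offset) locations.

    The fewer fields are chosen, the slimmer the index, at the price of
    less selective searches.
    """
    index = defaultdict(list)
    for doc in documents:
        key = tuple(getattr(doc, field) for field in fields)
        index[key].append((doc.archive_file, doc.offset))
    return dict(index)


DOCUMENTS = [
    ArchivedDocument("file_001", 0, "0010", 2005, "1900000001"),
    ArchivedDocument("file_001", 1, "0010", 2005, "1900000002"),
    ArchivedDocument("file_002", 0, "0020", 2006, "1900000003"),
]

# A slim index for high-volume searches by company code and fiscal year.
slim_index = build_index(DOCUMENTS, ("company_code", "fiscal_year"))
# A specific index for direct access by document number.
specific_index = build_index(DOCUMENTS, ("document_number",))
```

A lookup such as `slim_index.get(("0010", 2005), [])` then returns the locations to read, without scanning the archive files themselves.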

Document Relationship Browser (DRB): This tool can be especially helpful during both the analyze and categorize and the realize phases. It helps you better understand your processes and the relationships between the different data in your system, even after some of the data has already been archived. With it you can display the business objects along a business process and show how they are related. The benefit is that the relationships shown by DRB include the data in the database as well as the data that is already archived, thus providing a full picture of a business process and its data.

Figure 8, Document Relationship Browser (DRB)

Archive Routing: With Archive Routing you can create rules and conditions based on which archive files are automatically routed to specific locations in the file system and/or content repositories at the time of archiving. It can support you in your ILM strategy, because it allows you to segregate data based on different criteria, such as company code or date.

Figure 9, Archive Routing (the figure shows archiving sessions for company codes 0010 and 0020 being routed by Rule 1 and Rule 2, according to the selection criteria for archiving, either to file system folders, labeled Germany and USA, for storage systems connected through the file system such as HSM or magnetic disc systems, or to content repositories for storage systems connected through ArchiveLink such as DVD, WORM/MO, or CD)

For a detailed view of how Archive Routing works, see the Archive Routing Close Up section below.

Archive Routing Close Up

Archive Routing is a newer data archiving function, which was released with SAP NetWeaver 7.0. It was developed in close cooperation with an Americas’ SAP Users Group (ASUG) influence council to address a common question among customers: how best to segregate archived data in order to facilitate the final destruction of data and to support legally compliant data retention in terms of data placement. These issues have become a focal point for many customers, in part because of the growing demands of legal compliance and also because of ongoing TCO concerns.

Some laws mandate that certain data be kept in the country in which it is created. Other data needs to be destroyed as soon as the required retention period for this data has passed. In other cases, companies may prefer to destroy their data as soon as the legal retention period has been fulfilled, even though this destruction may not be required by law. This strategy is often implemented to free up storage space and reduce administration costs, or so that this data does not represent a liability for the enterprise, should any legal situation come up. Although Archive Routing does not offer any functions to deal with retention time, you can at least use it to segregate the archived data so that the manual enforcement of retention times is easier to implement.

The length of the required data retention period varies greatly and depends on many factors, such as the industry and its specific legal requirements, the type of data, the country where the data is produced, and many more. If your company has several international locations and operates in different industries, for example, managing the storage and destruction of data can get quite complicated. Having a clear concept for organizing, or “segregating”, archived data can make this task much easier and more productive. Archive Routing supports you in this endeavor.

Before Archive Routing, you could specify one file system folder or content repository per archiving object in data archiving Customizing, into which the archive files for that archiving object would be written. This helped you segregate your data per archiving object, regardless of other criteria, such as company code or fiscal year. Until recently this was sufficient. However, with increasingly complex data management scenarios and the growing pressure to manage data better and more efficiently along its life cycle, ASUG members required more automation and flexibility in the area of data segregation.

With Archive Routing you can create rules and conditions based upon which archive files are automatically routed to specific locations in the file system and/or content repositories at the time of storage. It provides support along the life cycle of your data and requires that you plan out how you manage your archived data until its destruction. To be able to enter valid rules, conditions, file system folders, or content repositories, you should have a clear picture of which data needs to be kept and for how long, and whether or not it must be destroyed immediately after fulfilling its legal retention period.

Note! During data archiving, data is first written to the file system and can then optionally be moved to a third-party storage system. These are storage systems that are either directly connected to the file system, such as magnetic disc-based systems or HSM, or are accessible through the ArchiveLink interface, such as DVD, CD, or WORM systems. Archive Routing covers both options, because it allows you to set up rules for routing to a folder in the file system and/or to content repositories.

Entering the Rules and Conditions Using the Example of Content Repositories

Figure 9 shows an overview of the concept of Archive Routing. You can enter rules for routing only to specific file system folders, which serves those storage systems, such as HSM or magnetic disc systems, that are connected directly to the file system. It is also possible to set up rules and conditions for routing only to a content repository. This approach serves those storage systems, such as WORM or jukeboxes, that are connected via the ArchiveLink interface. It is also possible to use file system and content repository routing concurrently.

The lower part of Figure 9 shows the actual routing process after you have set up the rules and conditions. These are set up in data archiving customizing, where you can create rules for each archiving object that determine which content repository is to be chosen at the time of storing. For each rule you enter conditions, which contain selection criteria and a corresponding value or interval. The criteria used in the rules and conditions can be on the level of organizational unit (such as company code) or time-based (such as fiscal year), or both. It is possible to create one or more rules per archiving object and one or more conditions per rule. The complexity of the rules depends on how specific your criteria are for separating the archived data into different content repositories. Tip! Handle Archive Routing with care and keep it small and simple! If rules and criteria become too numerous and too complicated, it may be difficult to trace any problems, should they occur.
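The structure of rules and conditions described above (one or more rules per archiving object, one or more conditions per rule, each condition holding a selection criterion with a value or interval) can be modeled as follows. This Python sketch is an illustration, not the Customizing data model: the class names, the repository names `REPO_A` and `REPO_B`, and the "first matching rule wins" behavior are assumptions, and for simplicity the selection consists of single values rather than intervals.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Condition:
    """A selection criterion with a single value (low == high) or an interval."""
    criterion: str  # for example "company_code" or "fiscal_year"
    low: Any
    high: Any

    def holds_for(self, value: Any) -> bool:
        return self.low <= value <= self.high


@dataclass(frozen=True)
class Rule:
    archiving_object: str
    conditions: Tuple[Condition, ...]  # all conditions of a rule must hold
    content_repository: str            # hypothetical repository name


RULES: List[Rule] = [
    Rule("FI_DOCUMNT",
         (Condition("company_code", "0010", "0010"),
          Condition("fiscal_year", 2000, 2005)),
         "REPO_A"),
    Rule("FI_DOCUMNT",
         (Condition("company_code", "0020", "0020"),),
         "REPO_B"),
]


def find_repository(rules: List[Rule], archiving_object: str,
                    selection: Dict[str, Any]) -> Optional[str]:
    """Return the repository of the first rule whose conditions all hold.

    Assumption: rules are evaluated in order and the first match wins.
    """
    for rule in rules:
        if rule.archiving_object != archiving_object:
            continue
        if all(cond.criterion in selection
               and cond.holds_for(selection[cond.criterion])
               for cond in rule.conditions):
            return rule.content_repository
    return None
```

Keeping the rule list this short is in line with the tip above: with only a few rules and criteria, it remains easy to trace why a given selection was routed to a given repository.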

The smallest unit for which a content repository can be determined using Archive Routing is an archiving session. This means that the individual archive files, and therefore all the archived data in an archiving session, are routed to one and the same location. If you want to route data for the same archiving object to different locations, you must start a separate session for each separate location, using different selections matching another routing rule. The Archiving Process During the archiving process the rules you entered in Customizing are checked twice by the system: once during the write session and again during the storage phase. Note that Archive Routing does not use the actual contents of the archive file to determine the content repository, but the selection criteria entered in the variant for the write session. The set of data covered by the selection criteria in the variant does not have to be exactly the same as the range of data covered by the routing rules and conditions - rather the variant selection must fall inside the range provided by them. If this is not the case, the archiving session is terminated. If it is the case, the archiving session is carried out and the archive files are routed to the appropriate location during the storage phase.
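The rule check described above can be sketched in a few lines of Python. This is only an illustrative model, not SAP code: the names `Condition`, `Rule`, `route`, the repository IDs `REPO_A` and `REPO_B`, and the numeric company codes are all assumptions made for the example. It shows the one point that is easy to get wrong: the selection entered in the write variant must fall entirely inside the interval of every condition of a rule, otherwise no repository is determined and the session would be terminated.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Sequence, Tuple

Interval = Tuple[int, int]  # inclusive (low, high)


@dataclass(frozen=True)
class Condition:
    """One selection criterion with a value interval (hypothetical model)."""
    criterion: str
    low: int
    high: int

    def contains(self, selected: Interval) -> bool:
        # The variant selection must lie completely inside the condition range.
        sel_low, sel_high = selected
        return self.low <= sel_low and sel_high <= self.high


@dataclass(frozen=True)
class Rule:
    """A routing rule: all conditions must hold for the repository to be chosen."""
    repository: str
    conditions: Tuple[Condition, ...]


def route(rules: Sequence[Rule], variant: Dict[str, Interval]) -> Optional[str]:
    """Return the repository of the first matching rule, or None.

    None models the case in which the archiving session is terminated.
    Only the variant's selection criteria are inspected, never file contents.
    """
    for rule in rules:
        if all(
            c.criterion in variant and c.contains(variant[c.criterion])
            for c in rule.conditions
        ):
            return rule.repository
    return None


# Hypothetical rules for one archiving object.
RULES = (
    Rule("REPO_A", (Condition("company_code", 1000, 1999),
                    Condition("fiscal_year", 2000, 2005))),
    Rule("REPO_B", (Condition("company_code", 2000, 2999),)),
)

inside = route(RULES, {"company_code": (1000, 1000), "fiscal_year": (2003, 2004)})
straddling = route(RULES, {"company_code": (1500, 2500), "fiscal_year": (2003, 2004)})
```

In this sketch `inside` resolves to a repository because the variant lies within the first rule, while `straddling` spans two rules and resolves to nothing. That corresponds to the advice to start a separate session for each location.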


From Archive Routing to Information Retention Manager (IRM) As mentioned, Archive Routing was released with SAP NetWeaver 7.0, as the answer to some of the pain points our customers were facing. It was also a first step towards ILM, because of the support it provided for legal compliance and data destruction strategies. It is a very good basic data management function, and can help you get started on your ILM strategy.

Additional Standard Functions and Concepts to Support ILM ILM is about knowing your data and your business processes so well that you can control them in the most efficient and cost-effective way possible. Broadly speaking, this covers all the data and information produced within your company, from e-mails and in- and outgoing documents to structured or transactional data. In this white paper, we will continue to concentrate on structured data to illustrate how you can use SAP programs and functions to support your ILM strategy (however, you can apply the model to all information in your environment). Before you can begin to manage anything, you must first know what you need to manage and how much of it there is. This is done in the analyzing and categorizing phase. There are many standard tools and functions within your SAP system that help you obtain a close to complete picture of your system and the data in it, how much of it there is, how much it has grown, and where the pain points are. In this chapter, we want to show you some of the most important ones:

Table Analysis (Transaction TAANA): Once you have an overview of the state of the database and know, for example, which tables are growing the fastest, you can use the Table Analysis function to further analyze the contents of the tables. This function enables you to find out how many entries a table contains for a specific field such as company code or date. This can help you determine more specifically, for example, where in your system the pain points are in terms of data growth. Based on this information you can better plot your archiving strategy during the Realize phase. If you want to reduce the size of your database quickly, for example, you may want to archive those years with the most entries in a particular area. Figure 10, Table Analysis (transaction TAANA) shows the components of the Table Analysis function. To run an analysis you need to use an analysis variant. You can either use one provided by SAP or create your own. You can also use ad-hoc variants, which serve only once and cannot be saved in the system, although the results of the ad-hoc analysis are saved for later review.
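Conceptually, a table analysis counts the entries of a table per value combination of the chosen fields. The following Python sketch illustrates that idea on in-memory sample rows. It does not call TAANA or any SAP interface, and the function name `analyze`, the field names, and the sample data are assumptions for illustration only.

```python
from collections import Counter
from typing import Dict, Iterable, Sequence


def analyze(rows: Iterable[Dict[str, object]], fields: Sequence[str]) -> Counter:
    """Count rows per combination of the given analysis fields."""
    return Counter(tuple(row[f] for f in fields) for row in rows)


# Hypothetical sample of table entries.
SAMPLE_ROWS = [
    {"company_code": "1000", "fiscal_year": 2003},
    {"company_code": "1000", "fiscal_year": 2003},
    {"company_code": "1000", "fiscal_year": 2004},
    {"company_code": "2000", "fiscal_year": 2003},
]

by_year = analyze(SAMPLE_ROWS, ["fiscal_year"])
by_code_and_year = analyze(SAMPLE_ROWS, ["company_code", "fiscal_year"])

# The year with the most entries is a natural first archiving candidate.
busiest_year = by_year.most_common(1)[0]
```

The list of fields plays the role of the analysis variant here: changing it changes the granularity of the counts without touching the data.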


Analyzing Database Monitor (Transaction DB02): A good place to begin checking the state of your data and system is with the database monitor. With this transaction you can find out the size and number of your database tables and indexes and display their growth history. You can also view a variety of other important database-specific indicators. The display of this transaction varies depending on the database you are using. Tip! You can also use other supporting tools to run further analyses of your database activities, including ST03 and BRSPACE for Oracle databases. BRSPACE is part of BR*Tools (for more information see SAP Note 12741). Transaction DB15, which is used to determine which archiving objects correspond to which tables and vice versa, for some databases, can also be used to see how large a specific table is (Space Statistics push button).


Figure 10, Table Analysis (transaction TAANA) [figure; labeled components: Variant Definition, Header, Analysis, Details]

Archive Administration Statistics (Transaction SARA): Knowing your data also means knowing what you have already achieved through archiving—if you have archived before—and how much space has been freed through these processes. The Statistics function in Archive Administration provides you with comprehensive information about

archiving sessions, the data objects written, and the duration of the different jobs involved in data archiving (write, delete). Moreover, you can see a close estimation of the database space that was freed through data archiving and the amount of space occupied by the archive files. ADK collects these statistics during the write and delete phases of data archiving. You can display the statistics by entering specific selection criteria, such as archiving session, client, or archiving object. You can also display the statistics for all archiving objects. Categorizing Categorizing structured data can be difficult, due to the complex relationships and dependencies between the different data in a system. In this area SAP has already done a good part of the work for you, especially with respect to data archiving. Business Object: SAP business objects represent actual business objects from the real world, such as sales orders or invoices. This type of encapsulation reduces complexity because the inner structure of the business object remains concealed. From a technical point of view, a business object is an instance of a business object type that has concrete values. Business objects are managed in the Business Object Repository (BOR), where their properties are modeled and their relationships to one another stored. Archiving Object: The archiving objects we explained under SAP’s Standard Tools and Functions to Support Your ILM Strategy, are directly related to SAP business objects. They cover a majority of the business processes and data across all components and industry solutions. The archiving object already covers the complex dependencies and relationships between the data, so that you do not have to determine these on your own. The following example gives you a better idea of what this means. Without the archiving object, you would only be able to archive this data if you manually determined the relationships between the data from the list of tables shown. 
Considering that SAP offers over 600 archiving objects and classes, you can imagine the amount of time and effort it would take to do this for all the data you need to archive.


Example: Archiving Object SD_VBAK – Sales Orders When you use the archiving object SD_VBAK, the system archives data from the following tables:

Table    Table Name
AUSP     Characteristic Values
CMFK     Storage Structure for the Error Log Header
CMFP     Storage Structure for Errors Collected
FMSU     FI-FM Totals Records
FPLA     Billing Plan
FPLT     Billing Plan: Dates
INOB     Link between Internal Number and Object
JCDO     Change Documents for Status Object (Table JSTO)
JCDS     Change Documents for System/User Statuses (Table JEST)
JEST     Object Status
JSTO     Status Object Information
KANZ     Assignment of Sales Order Items – Costing Objects
KEKO     Product Costing – Header
KEPH     Product Costing: Cost Components for Cost of Goods Mfd
KNKO     Assignment of a Cost. Est. Number to Config. Object
KOCLU    Cluster for Conditions in Purchasing and Sales
KSSK     Allocation Table: Object to Class
NAST     Message Status
SADR     Address Management: Company Data
VBAK     Sales Documents: Header Data
VBAP     Sales Documents: Item Data
VBEH     Schedule Line History
VBEP     Sales Document: Schedule Line Data
VBEX     SD Document: Export Control: Data at Item Level
VBFCL    Sales Document Flow Cluster
VBLB     Sales Document: Release Order Data
VBSN     Change Status Relating to Scheduling Agreements
VBUK     Sales Document: Header Status and Administrative Data
VBUP     Sales Document: Item Status
VBUV     Sales Document: Incompleteness Log
VEDA     Contract Data

Furthermore, the following archiving classes are involved to archive data from additional tables: K_TOTAL, TEXT, CHANGEDOCU, K_UNITCOST, CU_CONFIG Note! Archiving classes are mechanisms used to archive data that cannot stand alone from a business point of view and is archived together with application data objects. Examples of data archived using archiving classes are SAPscript texts, change documents, and classification data.

Tables and Archiving Objects (Transaction DB15): This function helps you get a better idea about which data is covered by a particular archiving object or which archiving object covers particular data. You can use this function to determine from which tables a specific archiving object archives data. You can also enter a specific table to help you choose which archiving object you should use, if you need to reduce the size of this table through archiving. Defining and Applying Policies Phases two and three of our ILM model deal with defining policies based on your company’s internal and external requirements, and then applying those policies to the data you have identified in the first phase. In other words, you map generic terms, such as financial data, or country, to specific terms in your system, like FI_DOCUMNT or company code. This phase also means setting up your system and taking the necessary steps to pave the way towards the implementation phase.
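The two-way lookup that transaction DB15 offers (archiving object to tables, and table to archiving objects) can be pictured as a mapping and its inverse. The sketch below is a toy model only: it lists just four of the SD_VBAK tables named in the example above, and `Z_DEMO` is a made-up archiving object added purely to show a table covered by more than one object.

```python
from collections import defaultdict
from typing import Dict, List

# Excerpt only; Z_DEMO is hypothetical and exists just for illustration.
OBJECT_TO_TABLES: Dict[str, List[str]] = {
    "SD_VBAK": ["VBAK", "VBAP", "VBEP", "NAST"],
    "Z_DEMO": ["NAST"],
}


def invert(mapping: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Build the table -> archiving objects view from object -> tables."""
    inverse: Dict[str, List[str]] = defaultdict(list)
    for archiving_object, tables in mapping.items():
        for table in tables:
            inverse[table].append(archiving_object)
    return dict(inverse)


TABLE_TO_OBJECTS = invert(OBJECT_TO_TABLES)

# Which archiving objects could reduce the size of table NAST?
candidates_for_nast = TABLE_TO_OBJECTS["NAST"]
```

Starting from a fast-growing table and looking up its candidate archiving objects is the direction you would typically use when planning which objects to include in a project.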


To put it simply, first you find out what kind of data and how much of it you have. Next, you must find out what your internal and external requirements are and see what needs to be done in terms of preparing your SAP system as well as partner storage or other complementary systems, so that theory can be put into practice. Internal requirements Your internal requirements are largely dictated by the business and the user departments and how they use the data in the system. These requirements are often formalized in Service Level Agreements (SLAs). For example, based on its business processes the controlling department may have the specific need for the data to be kept online for a certain number of months. The residency of data and documents in an SAP system is highly dependent on the balance between system health and user community demands to view and modify that data. Communication between IT and the different user departments is of utmost importance here, because often users will request longer residence times for data, based on the unfounded fear that data is no longer accessible once it has been archived. Tip! The Data Management Guide (DMG) is available to SAP customers and is an important document to support you in the Define and Apply Policy phases. It currently covers about 70 database tables across different components and solutions, which are often among the top 20 database tables in our customers’ systems. The DMG provides tips and tricks for the best strategy for the data in these tables. For each table it covers data prevention, aggregation, deletion, and archiving. It is updated on a quarterly basis and is available in the SAP Service Marketplace under quick link /ilm.

External Requirements Closely related are the external requirements, which are usually dictated by laws and regulations, such as Sarbanes Oxley or HIPAA, and which in turn influence business processes and hence internal requirements. Most companies have made headway on building a legal compliance strategy, but as the number of requirements keeps increasing, more and more effort has to be exerted in this area. Focus is particularly on automating legal compliance processes. As we have seen, here SAP is

making important advances to support its customers. In addition to the Retention Management and ILM functions we have described, SAP also offers Solutions for Governance, Risk, and Compliance (GRC). For more information see www.sap.com/grc. Based on the internal and external requirements you have collected, you must set up some kind of guidelines or rules for the data in your system, as well as for the data outside of your system. A key concept that comes into play here is the life cycle of your data (see Section 1) and the frequency with which it must be accessed, or whether or not it should or must be destroyed after a certain number of years. This will greatly influence how long certain data has to be kept online, what data can be archived and when, and which data can be destroyed. Tip! The Data Retention Tool (DART) is a tool that is widely used in the US and in recent years also in Germany, for tax relevant data. It takes extracts of data, which can later be used in case of a financial audit, even if the data has been archived. SAP highly recommends running DART extracts before data archiving.

Data Retention Data retention policies lay out exactly which data must be kept for how long, and where, either online, or in some kind of a storage system. Although the existence of a “Corporate Document Retention Policy” may not be advertised, most large U.S. and global companies already have some type of policy regarding the retention of documents (paper) and data. Many recent Sarbanes-Oxley Act initiatives may have identified and/or updated these policies. For example, one of the most common general retention rules in the U.S. is seven years for IRS or tax relevant data and documents. In Germany the general retention rule for tax relevant data is ten years. Many organizations have additional legal policies beyond their “Corporate Document Retention Policies”. These legal policies were usually developed as a result of legal experiences. One common policy includes legal hold orders. Retention policies may vary for a single business object by company code, plant, sales organization, or some other unique qualifier. This has to do with the fact that different


countries have different laws and legal requirements that have to be taken into consideration. The groundwork for every successful ILM strategy begins not only with the technical identification of business objects in the SAP system, but also with a comprehensive inventory of the policies and regulations that are specific to each organization. End-of-Life of Data Another important point to consider when defining your policies is the end-of-life of your data. Many SAP customers in the US, for example, are approaching their seven-year anniversary of going live with SAP. For some data many companies use seven years as the rule for keeping original transactions in ERP systems. After seven years, and in some cases earlier, the IRS does not require you to keep some of the underlying original business data. That does not mean that the data no longer possesses value for your organization, only that the original document can be purged or destroyed without risk of violating IRS procedures and incurring penalties. Keeping original documents longer than seven years could not only violate your document retention policy, but also expose your company to unnecessary liability. In addition, in general, it pays to keep the total amount of retained data as small as possible, to avoid unnecessary effort and costs related to the management of that data. Tip! In close cooperation with ASUG, SAP developed the Guide to Final Deletion of Data. This guide deals with the end-of-life of data in an SAP system, and discusses the complex relationships between data that have to be taken into account if you want to delete certain data from your system for good. It is available in the SAP Service Marketplace (service.sap.com) under quick link /ilm.

As part of its new ILM solution, SAP is beginning to offer new features that will support you in the data destruction process (see Main Concepts, Processes, and Technologies in Retention Management). Deciding When and What to Archive Another consideration in prioritizing the business objects on your retention schedule is gathering business requirements for each archiving object—if the data were to be archived

and destroyed tomorrow what would the impacts be on the business? One consideration in this regard would be whether or not you should build and populate SAP NetWeaver Business Intelligence (BI) info cubes with the information before it is destroyed. There are different issues to contemplate in this respect, concerning legal compliance. In some cases, for example, the same data available in BI may not have any legal repercussions for your business, even after the legal retention period for your data is over and the original data has been deleted. This may be due to the fact that data in BI is often highly aggregated and can no longer be identified as belonging to a specific object. However, this should be decided on a case by case basis, and should be carefully analyzed before the final deletion of the original data. Another question would be whether or not DART extracts need to be taken for tax-relevant data, before data archiving, and which DART field catalogs would be best to use. Based upon your above findings, you will need to develop a complete retention schedule. Applying the Policies Now that you have defined the policies for your data, you must begin to transform your rules into action. This means deciding how your policies can best be supported by technology, what settings need to be made in your SAP system, which archiving objects you should use, what customizing needs to be performed, how best to support end-of-life of data, what partner solutions and storage systems you need to use, and how these should be set up. It may also entail choosing the most important archiving objects for archiving, deciding on the selection criteria for archiving, or scheduling and running DART extracts. Tip! If you are using data archiving as part of an ILM strategy, it is impractical to include all 600+ SAP standard archiving objects in your project. 
Chances are good that your organization is not using all 600 business objects, but many of the motivations for archiving are greatly diminished after the top 20 or so archiving objects. Therefore, every ILM strategy must prioritize business objects based on criteria such as data value and liability or risk of not destroying data in accordance with your retention schedule.


Tip! Use transaction DB15 to determine which archiving objects to choose in your particular strategy. See Categorizing above.

Realize: The “Doing” Phase The last phase deals with the actual doing part of your ILM strategy. Someone once said that ILM is 90% policy and 10% technology. We tend to agree with the general idea of this statement and in a standard functions scenario see most of the technology, as part of this last phase. However, automated support for the other phases is growing and becoming more sophisticated in ILM from SAP described in the next sections. Nonetheless, without a proper and detailed plan, technology cannot be used in the most optimal way possible. So let’s return to our ILM example to illustrate how this might look in practice. A typical data archiving project Based on your data retention policies and the analyses you have made, you have set up an archiving project and determined that you need to first archive financial documents, before you can archive the related transaction figures and customer master data. You know that the archiving object to use for financial documents is FI_DOCUMNT. Your archiving strategy is company wide and includes all your branches, which are located in the U.S. and Germany. You have determined that you want to destroy the archived documents after a certain number of years, when they are no longer subject to legal retention requirements, to free up storage space. Based on your internal requirements and the differing legal requirements for each location, you have determined that in the U.S. you need to keep the archived financial documents for seven years and in Germany for ten. You have also designated the appropriate storage location for each. Your Document Retention Policy also demands that data from the previous and current fiscal year be kept in your database. Since we are in the year 2007, you know that you can only archive the financial documents from 2005 and earlier. 
This being your first archiving project, you have a lot of data to archive in the initial phase and decide to archive intensively on three consecutive weekends at first. After that you can reduce

your regular archiving sessions for financial documents to once a quarter. You know that it is highly probable that your company will be audited by the IRS in the next five years. Because you may need to access your archived data at that point, you decide to keep the archive files of the most recent data from up to 6 years ago in the file system. You are also required to provide the financial authorities with financial data on external media. Therefore, you have taken DART extracts of the data before archiving and have saved them on discs. The other years and the data archived in Germany are moved to a tape library, because it is highly unlikely that you need to access this data again, before its final destruction. For the data that you expect to have to access you have filled archive information structures. Note! Archive information structures are indexes on archived data. They are a central element of the Archive Information System and are necessary if you want to access archived data. Sometimes these infostructures can get too large. SAP offers a new function called partitioning, where you can control the size of the tables by using creation date criteria.
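The planning arithmetic in the example project above is simple enough to write down. The sketch below encodes only the decisions stated in the example (current and previous fiscal year stay in the database; U.S. archive files from up to six years ago stay in the file system; everything else goes to the tape library). The function names and the country codes "US" and "DE" are assumptions for illustration, and the logic is not a general rule.

```python
def latest_archivable_year(current_year: int, years_kept_in_database: int = 2) -> int:
    """Newest fiscal year that may be archived.

    The example policy keeps the current and the previous fiscal year
    in the database, i.e. two years in total.
    """
    return current_year - years_kept_in_database


def storage_location(country: str, fiscal_year: int, current_year: int) -> str:
    """Where the archive files of the example project are kept."""
    if country == "US" and current_year - fiscal_year <= 6:
        # Likely to be needed for an audit, so kept readily accessible.
        return "file system"
    # Older U.S. data and all data archived in Germany.
    return "tape library"


cutoff = latest_archivable_year(2007)
```

For the year 2007 the cutoff evaluates to 2005, which matches the statement that only financial documents from 2005 and earlier can be archived.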

In the last sections you have seen how the standard data archiving/management functions can provide basic support for your ILM strategy. More automation and technological support of ILM helps reduce complexity, effort, and the costs of implementing an ILM strategy. With its new ILM solution, described in the next sections, SAP provides the necessary technology and functions for a more automated ILM strategy.

ILM Solution from SAP As mentioned, the ILM solution from SAP consists of three cornerstones: 1. Data archiving/management: focuses on data volume management 2. Retention management: focuses on end-of-life data 3. Retention warehouse: focuses on end-of-life system


The first cornerstone, described so far in this white paper, is an integral and necessary part of the overall ILM solution from SAP; it is the core of ILM from SAP. In this section we describe the other two cornerstones and show how, combined with data archiving/management, they provide end-to-end support for your ILM strategy.

Non-Disruptive Innovation and Open Infrastructure The new ILM solution from SAP offers new functions and capabilities for more automated and comprehensive ILM support. While the traditional tools and functions involved in standard data management provide basic support in your ILM strategy, the new ILM solution from SAP builds upon the traditional functions. Because the new solution is so tightly integrated with the traditional data management tools, implementing ILM from SAP will not disrupt your current data archiving practices. In the first version of ILM from SAP, ILM support is available only for SAP systems. However, the new ILM solution has been designed in such a way that later on it can also be used as an infrastructure for ILM strategies for non-SAP systems. Through open interfaces and the use of standards the new ILM solution is open to technology partners and independent software vendors (ISVs) and in the future can be extended to also serve non-SAP systems.

Main Concepts, Processes, and Technologies in Retention Management The second cornerstone, retention management, involves different tools and methods for managing all aspects of retaining data based on legal requirements. This includes retention policy management, ILM-aware storage integration, managing the destruction of data, as well as methods and tools for dealing with e-discovery and legal hold.

Retention Management: Information Retention Manager (IRM) IRM is the central tool for managing internal and external policies covering your information. It is highly flexible, allowing you to enter rules and policies based on different criteria including how long data should be kept, when it can be destroyed, and the storage location where it should be stored. These policies can express (external) legal requirements or (internal) Service Level Agreements (SLAs). They can cover structured and unstructured data, and even paper documents. And the rules can cover SAP and non-SAP objects. Objects can either have their own policies and rules, or certain objects can also inherit the policies from other superordinate objects. For example, a scanned check can inherit the policies of the financial document it belongs to. To allow for maximum flexibility in the creation of policies, objects are grouped into object types and policies are grouped into policy types. Object types can be, for example, sales orders, financial documents, etc. Policy types can be tax or product liability, for instance. For each object type in IRM, you can enter policy types, and you can decide which policy types have to be considered for each object type before the set of rules is considered to be complete. For example, for object type financial documents, you would most likely enter policy type tax. You also have the flexibility to determine who is allowed to enter or maintain which policy types. You can establish, for example, that only one person from the finance department may enter or change rules for policy type tax. For each policy type you select the parameters that determine the retention policy and decide how long and where the data is to be stored.

Figure 11, IRM and Retention Rules


Figure 12, Example of Retention Rules in IRM shows an example of rules that may be entered to reflect the retention requirements derived from tax laws (policy type “tax”). Because these laws differ from country to country, it is not surprising that one of the major parameters that determines how long a document has to be maintained is the country. Note also that tax laws may not only require a certain retention period, but also the location where the data is to be stored. This means that in your retention policy you can already designate an area in the storage system, where specific data is to be archived, much as you would do in Archive Routing.

                      Country   Min. Retention Time   Max. Retention Time   Unit for Retention Time   Archive Store
Financial Documents   USA       6 7                   -                     Years                     US_01
Financial Documents   DE        10                    -                     Years                     DE_01
Sales Documents       USA

Figure 12, Example of Retention Rules in IRM
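A retention rule set of this kind is essentially a lookup keyed by object type and country. The following Python sketch models that lookup. It is not the IRM data model: the class `RetentionRule` and function `find_rule` are hypothetical, and the U.S. minimum of seven years is an assumption taken from the project example earlier in this paper, since the corresponding cell in the figure is not clearly legible.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class RetentionRule:
    """One row of a retention rule table (hypothetical model)."""
    object_type: str
    country: str
    min_years: int
    max_years: Optional[int]  # None means no maximum is specified
    archive_store: str


# Values follow the example rows; the U.S. minimum is an assumption (see lead-in).
RULES = (
    RetentionRule("Financial Documents", "USA", 7, None, "US_01"),
    RetentionRule("Financial Documents", "DE", 10, None, "DE_01"),
)


def find_rule(rules: Sequence[RetentionRule], object_type: str,
              country: str) -> Optional[RetentionRule]:
    """Return the rule for the given object type and country, if any."""
    for rule in rules:
        if rule.object_type == object_type and rule.country == country:
            return rule
    return None


german_rule = find_rule(RULES, "Financial Documents", "DE")
missing_rule = find_rule(RULES, "Sales Documents", "USA")
```

Returning nothing for an object type without a complete rule mirrors the idea that a rule set must be complete before it can be applied. How IRM actually reacts to a missing rule is not described here.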

Retention Management: Legal Hold Adhering to data retention policies is a prerequisite for any ILM strategy. In addition, it is necessary to be prepared for exceptional prolongations of the retention period by legal holds. As mentioned, legal hold and e-discovery are very current topics companies are dealing with, especially in the United States, due to the recent changes to the e-discovery law, which went into effect at the end of 2006 (see Background to ILM, under Why ILM?). Legal holds are a kind of “freeze” placed on information and data that is or may be involved in a legal case. This means that even though the retention period for certain data may have already expired and the data could be destroyed, if a legal hold is placed on it, it must be retained until the legal proceedings have been completed. These issues also need to be addressed by ILM. The legal hold management function was developed using SAP Case Management, and is therefore also tightly integrated with existing SAP functions. It allows you to set up legal cases and manage all information related to that case. Once a legal hold has been set on a specific document or group of documents, all other policies and rules for that document are overridden until the legal hold has been lifted. This ensures that no document is destroyed until the legal proceedings or case has been closed. It is also possible to have several cases per object. Legal holds are handled as exceptions in IRM. To ensure that both data in the database and archived data in the storage system are covered by the legal hold indicator, metadata is passed in the form of WebDAV properties to the storage system (see Figure 13, Legal Hold).

In addition, legal holds are not only passed to the store for archived data. They are also considered when the option “data destruction” is chosen to select data for destruction in the database.

Figure 13, Legal Hold

Retention Management: Automated e-Discovery Support Whereas legal hold deals with preventing the destruction of information, e-discovery deals with collecting all information that is relevant to a specific case. In a large application system environment this is not always easy. The new retention management functions of SAP allow you to automatically or manually collect the information related to a specific legal case. E-discovery for SAP data is facilitated by special reports that are connected to a legal case. SAP delivers some example reports that can be used directly or serve as a kind of “template” for special reports that have to be adapted to the specific requirements of the case under consideration. For all objects associated with a case and discovered either manually or via a discovery report, SAP offers the option of extending the legal hold to objects belonging to the same business process. This is done by using the knowledge of the business relations that is also utilized in the Document Relationship Browser (DRB). The discovered objects are not copied into the legal case, so as to avoid redundancy, but key fields of the data and documents relevant to a specific case are stored persistently. The function also collects and keeps corresponding linked documents.

Figure 14, Automated e-Discovery


Retention Management: Archiving and Destruction Due to its flexibility, the IRM is the ideal tool to store and manage all retention rules in your company. However, the true value of the IRM becomes evident for those object categories for which the rules are enforced automatically. To this end, we have enhanced archiving objects for applying retention to business objects within the SAP Business Suite. Figure 15, ILM Actions of an ILM-Enhanced Archiving Object shows a write program selection screen of an ILM-enhanced archiving object delivered with ILM from SAP. It offers three different ILM actions: archiving, selection for destruction, and writing of a snapshot.

Figure 15, ILM Actions of an ILM-Enhanced Archiving Object

The archiving ILM action allows archiving of business-complete objects and corresponds directly to the standard archiving function described in the first sections of this white paper. The system runs an archivability check to ensure that the object is really business-complete. With the ILM enhancements, during the write phase the system checks the IRM rules for each object instance to determine how long and where the object information should be stored. Based on the rules, the system writes the data to a different archive file; in other words, it “organizes” the archived data based on the rules you have entered in IRM. Compared to archive routing in the SAP standard you gain the flexibility that the parameters for the rules do not have to be identical to the parameters of the selection screen. You also do not have to start a special archiving run for each combination of parameters relevant for the determination of the rules. In the case that you have already archived data and have archive files from previous archiving runs, ILM from SAP provides an ILM file conversion function. This function enables you to adapt your old archive files to fit into the new ILM environment. When you convert these files, you obtain new files containing one-to-one copies of the old instances, but sorted according to the rules you defined in IRM. Once converted, you can easily apply your ILM policies also to data you archived years ago. When you choose the data destruction instead of the archiving ILM action, the system does not run an archivability check. Instead, it checks the retention rules you entered for the archiving object in IRM and makes sure

21

that only the data that can be destroyed immediately is written to an archive file. In this process it also takes legal holds into account. The archive file creation is not just an intermediary step. It also helps to clarify what data is actually going to be destroyed. With this option you can finally remove obsolete data from your database, even though you do not want to use archiving. It allows you to remove data from the database, even if it is not business complete. This could be data whose business process was never formally closed, but that is still old enough to be destroyed without any problems. Because the system makes sure that only the data is selected that is destructible according to the rules in IRM, you are ensured transparency in case of an audit and can prove or document which data was finally disposed.
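The selection logic of the data destruction action described above can be illustrated with a minimal sketch. It assumes a simplified model that is not part of ILM from SAP: the class `ObjectInstance`, its fields, and the function `select_for_destruction` are hypothetical names, and the comparison "retention end lies before today" is an assumption about how "can be destroyed immediately" is evaluated.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class ObjectInstance:
    """Hypothetical stand-in for one business object instance."""
    object_id: str
    retention_end: Optional[date]  # None: end of retention not yet known
    legal_hold: bool = False


def select_for_destruction(instances, today):
    """Return only the instances that may be destroyed immediately.

    Assumed rule: the retention period has expired and no legal hold
    is placed on the instance. No archivability check is performed.
    """
    return [
        inst for inst in instances
        if inst.retention_end is not None
        and inst.retention_end < today
        and not inst.legal_hold
    ]


instances = [
    ObjectInstance("DOC-1", date(2007, 12, 31)),                   # expired
    ObjectInstance("DOC-2", date(2007, 12, 31), legal_hold=True),  # on hold
    ObjectInstance("DOC-3", date(2010, 12, 31)),                   # still retained
    ObjectInstance("DOC-4", None),                                 # unknown end
]
destructible = select_for_destruction(instances, today=date(2008, 2, 1))
print([inst.object_id for inst in destructible])
```

In this sketch only `DOC-1` is selected: the legal hold protects `DOC-2`, and the other two instances have not reached a known end of retention.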

Figure 16, Destruction of Data

Files that have been created with the data destruction ILM action cannot be stored. Instead, after the data to be destroyed is removed from the database, the delete program also deletes the file written into the file system by the write program. The “detour” via the archiving program ensures consistent treatment of data destruction for archived and online data.

The snapshot ILM action is important in conjunction with the retention warehouse (see Main Concepts and Technologies in Retention Warehouse). At this point we just want to mention that when you choose this option, the system performs no other checks when writing the archive file besides the restrictions given in the selection variant. In other words, this option allows you to write recent data from open business transactions (not business-complete data) to the archive file. To make sure that you have no data loss, the files written with this option cannot be selected by the delete programs. As a result, with the “snapshot” option you always create redundant copies of data from the database. Here, too, the system calls the IRM for each object to make sure that each file contains only those objects for which the same IRM rule applies. However, the retention period for snapshots always remains unknown: first, because in most cases the date of the beginning of the retention period cannot be calculated yet, and second, because there is no requirement to retain the snapshot data as long as the data is still available in the original system.

Retention Management: ILM-Aware Storage Integration

An ILM strategy can only be complete if both the application side and the storage side are considered. Therefore, an essential part of retention management at SAP is ILM-aware storage integration, based on secure, preferably WORM-like, storage technology.

Figure 17, ILM-Aware Storage Integration

In an ILM strategy, the life cycle actions that occur with the data from your SAP system are based on the rules and policies entered in IRM. Through the enhanced WebDAV interface (see glossary for a definition), not only is data (ADK files) transferred to final, ideally WORM-like, storage, but metadata is also passed on as WebDAV properties. In order for the storage system to understand and allow actions on the stored data based on the rules, it must be able to receive this metadata and act accordingly. If it does, it is considered to be ILM-aware. Storage vendors can obtain an ILM certification from SAP to show that their storage systems are ILM-aware.

To enable you to use ILM-aware storage for your archived data, you now have the option of also storing your archive files using the new WebDAV interface. When you choose this option, the archive files are stored in a hierarchy that reflects the parameters which determined the retention rule in the IRM (see Figure 19, Archive Hierarchy: Structured & Unstructured Data). Snapshots (copied data) and truly archived data (moved data) are stored in separate hierarchy paths, whose nodes are formed using the parameters for the determination of the IRM rules as the origin. For example, in Figure 19, Archive Hierarchy: Structured & Unstructured Data, the origin of node USA is country. The node 2000 is derived from the value entered in IRM for beginning of retention period. Note that the hierarchy is set up in such a way that one node reflects the beginning of the retention period for archived data or the time that the snapshot was taken.

The new WebDAV interface allows for a clear structure for storing data, and also facilitates the passing of properties to the storage system for each node in the hierarchy and for the file itself. Properties with well-defined semantics are, for instance, the origin mentioned above for the creation of the hierarchy and, even more importantly, the beginning and the end of the retention period as calculated for each archive file with the help of IRM. For example, if you display the node USA (Figure 19, Archive Hierarchy: Structured & Unstructured Data), the origin property country is attached to the field USA in the hierarchy. On the node 2000, the property beginning of retention period equals 2000 and the property end of retention period equals 2007, because a 7-year retention period was entered in the IRM rules. The storage partners that have been certified for the WebDAV interface understand the semantics of these properties. Thus they can map the retention requirements determined for the data and protect the data accordingly against destruction.

Tip! SAP has a special certification program through which storage partners can certify that their interfaces and storage offerings are ILM-aware. In Version 1.1 of the ILM certification (available for partners from August 2007), storage partners can certify that their storage interface can receive and store the WebDAV properties and metadata from the SAP application system. In Version 2.0 (BC-ILM 2.0), to be available in April 2008, storage partners can certify that they can also interpret the semantics of these properties. For more information on partner certification see https://www.sdn.sap.com/irj/sdn/icc > Integration and Certification > Integration Scenarios.
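The USA/2000 example above can be worked through in a short sketch. It is illustrative only and makes several assumptions: the path layout, the property names (`origin`, `retention_start`, `retention_end`), the rule table `RETENTION_YEARS`, and all function names are hypothetical and do not reproduce the actual WebDAV properties defined by SAP.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ArchivedObject:
    """Hypothetical archived object with the parameters that drive the rule."""
    object_id: str
    country: str
    retention_start_year: int


# Hypothetical rule table: retention period in years per country,
# matching the 7-year example from the text.
RETENTION_YEARS = {"USA": 7}


def hierarchy_path(obj, kind="archive"):
    """Build a hierarchy path from the rule parameters.

    `kind` separates truly archived data ("archive") from snapshots.
    """
    return f"/{kind}/{obj.country}/{obj.retention_start_year}"


def group_by_path(objects):
    """Group objects so that each path holds only objects under one rule."""
    groups = defaultdict(list)
    for obj in objects:
        groups[hierarchy_path(obj)].append(obj.object_id)
    return dict(groups)


def country_node_properties():
    """Properties attached to a country node such as USA."""
    return {"origin": "country"}


def year_node_properties(country, start_year):
    """Properties attached to a year node such as 2000."""
    return {
        "retention_start": start_year,
        "retention_end": start_year + RETENTION_YEARS[country],
    }


objects = [
    ArchivedObject("A", "USA", 2000),
    ArchivedObject("B", "USA", 2000),
    ArchivedObject("C", "USA", 2001),
]
groups = group_by_path(objects)
props_2000 = year_node_properties("USA", 2000)
print(groups)
print(props_2000)
```

For the node 2000 below USA, the sketch yields a retention start of 2000 and a retention end of 2007, which corresponds to the 7-year rule in the example.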

Figure 18, ILM-Aware Storage Integration and Documents

It is also possible to manage the retention of the unstructured data that is attached to the structured data. As always, you store the attached unstructured data in a storage system using the ArchiveLink interface. The references to the locations of the unstructured data are then stored in a second hierarchy that corresponds to that of the structured data (if this exists already). The location of each document (incoming or outgoing documents, scanned invoices, etc.) is stored in the corresponding collection (see glossary for a definition) in the mirrored hierarchy as its hosting archived business object. This way, the unstructured data appears close to its corresponding structured data and inherits the same life cycle metadata.

When you are choosing a storage system to support you in your ILM strategy, it is important to check whether it meets certain criteria. For example, it should support both non-deletability and the destruction of data on request. This will ensure that your data is kept for the entire retention period required by law, and then destroyed at the correct point in time. Another important factor to consider is whether the storage system helps you reduce redundancy as much as possible. Likewise, a storage system should be able to guarantee that stored data cannot be changed or modified in any way. From our experience, WORM-like magnetic storage technology is best suited for the purpose of ILM from SAP.

So far we have introduced the basic concepts, terms, and technologies involved in both data management and retention management, and indicated the roles they play in ILM. Now we want to focus on the third cornerstone of ILM.

Main Concepts and Technologies in Retention Warehouse

An ILM strategy is not complete without taking into consideration that a lot of data has to be retained longer than the system in which it was created is available. System decommissioning is becoming increasingly important, especially for large companies. With mergers and acquisitions, the trend towards system landscape harmonization and consolidation, and constant efforts for cost-cutting, some large SAP customers have many systems they are trying to decommission.

Figure 19, Archive Hierarchy: Structured & Unstructured Data

Note that even though unstructured data is stored using a different interface, it is entirely possible and even intended that both the structured and unstructured data is stored in one and the same enterprise archive, depending on the interface openness of the storage vendor.


So far there has not been a standardized method for approaching such endeavors, which bring with them a whole new layer of complexity. Most data can only be interpreted in the context of its original system and data stored in SAP archive files is no exception. However, data from decommissioned systems is not exempt from legal compliance issues or from tax audits. To meet these challenges up to now customers have had to create their own strategies and solutions.

With the retention warehouse, SAP now offers a standardized method for system decommissioning in answer to these pressing issues. It allows you to reuse the archived data outside the original system in a central retention warehouse. The retention warehouse consists at a minimum of a stand-alone SAP NetWeaver system or a reused SAP NW BI system (plus SAP ERP for DDIC information in the first version of ILM from SAP).

Retention Warehouse: The Process

With ILM from SAP, a system that is to be shut down is first emptied of its data using standard and ILM-enhanced archiving objects and programs. Here it is important to archive not only the data objects themselves, but also context and customizing information, such as currency information and country codes. In a first step you archive all the business-complete data. Next you use the snapshot functions to archive data from objects that are not business-complete yet and the context data we just mentioned. The retention warehouse is tightly integrated with the other cornerstones of ILM, to ensure, for example, that data from decommissioned systems is also covered by the retention policies in IRM. This means that retention times or even legal holds apply to the data from the decommissioned system in the same way as they do to data from other systems. Once the original system has been shut down and all the necessary information has been moved to the retention warehouse, it may be some years before it is needed again. If, for example, an audit is pending, you can then select specific data and load it into the SAP NW BI for reporting purposes. This way, auditors can view the data even from decommissioned systems. In the next sections, we describe in more detail how the retention warehouse processes work.

Retention Warehouse: Transfer of Context Data to the Archive

The snapshot option introduced in the retention management section is a first step towards helping to retain all data even after a system has been decommissioned. In addition to the business-complete data that can be moved to the archive with the standard archiving option, it is possible to copy data from open business processes to the archive to be retained as well. This data is kept separately from the archived data. The snapshots belonging to potentially open transactions can therefore be used as a good base for migration projects that take over the open business transactions into follow-on systems.

Having moved all business-complete and business-incomplete transactional data out of the system to be decommissioned, it is also necessary to enrich the archive with context information. For this, ILM from SAP offers special programs to move complete customizing or master data, or other information not covered by the archiving objects, to the archive. You can then decide on a table-by-table basis which of the information you want to retain. To facilitate the selection of relevant tables, SAP already delivers preconfigured programs to cover information defined as retention-relevant for tax reasons. Note that besides additional data from database tables, knowledge about object relations and domain values can also be moved to the archive in a standardized way. After you have “emptied” out the system, it can be shut down. The data is now located in the retention warehouse.

Retention Warehouse: Building Work Packages for BI

Retaining all necessary information for an audit in the archive is an essential prerequisite for fulfilling retention rules. However, it is equally important to provide sufficient reporting capabilities for the archived information. This has to be done without increasing the cost of storage for the archived data. Assuming that the archived data is accessed rather seldom, it is not economically feasible to permanently provide all indices on the data necessary for fast and comfortable analyses during an audit. Therefore we chose a two-step approach to combine cost-efficient storage with comfortable reporting: In a first step, a work package is defined using the archive hierarchy built from the rules maintained in the IRM to identify the relevant data in the archive; in a second step, the data defined in the work package is transferred temporarily from the archive into SAP NW BI for detailed analysis.

The starting point for each work package is a list of archive files created with one of the components included in ILM from SAP, called ILM Store Browser. It allows you to navigate along the hierarchy and pick the branches of the archive that contain data needed for the desired analysis. The ILM Store Browser can make use of the origin property that was passed to the store when the nodes of the hierarchy were formed and it allows you to apply filters when browsing the hierarchy based on this origin. For example, it is possible to restrict the browser to certain countries or fiscal years assuming that those were used in defining the IRM rules.
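As an illustration of browsing the hierarchy with origin-based filters, the following sketch selects archive files by country and fiscal year. It is a simplified model under stated assumptions, not the ILM Store Browser itself: the class `ArchiveFile`, the function `build_work_package`, and the path layout are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArchiveFile:
    """Hypothetical archive file with the origin values of its hierarchy nodes."""
    path: str
    country: str
    fiscal_year: int


def build_work_package(files, countries=None, fiscal_years=None):
    """Return the paths of all files that pass the origin-based filters.

    A filter set to None is not applied, so calling the function without
    filters selects every file.
    """
    selected = []
    for f in files:
        if countries is not None and f.country not in countries:
            continue
        if fiscal_years is not None and f.fiscal_year not in fiscal_years:
            continue
        selected.append(f.path)
    return selected


store = [
    ArchiveFile("/archive/USA/2000/file_001", "USA", 2000),
    ArchiveFile("/archive/USA/2001/file_002", "USA", 2001),
    ArchiveFile("/archive/DE/2000/file_003", "DE", 2000),
]
work_package = build_work_package(store, countries={"USA"}, fiscal_years={2000})
print(work_package)
```

Such filters only work for parameters that were actually used when the IRM rules were defined, because only those parameters appear as origins in the hierarchy.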

In the future, the retention warehouse will provide predefined queries for frequently used reports and analyses. The queries are based on the generated data, so that analyses that have been defined once can be reused. It is also possible to define individual reports on the created infoproviders. The retention warehouse keeps track of the composition and transfer of a work package, so that the data can be completely and automatically removed from the BI, once the analysis on the transferred archive data has been completed. Several work packages can be used in parallel in the Retention Warehouse and are considered to be completely independent data sets, which are protected through authorizations and namespaces. In this way, it is possible for user departments to run their own analyses, while an audit can be run in parallel. The work packages for both areas can partially or completely use the same archived data set. Since Retention Warehouse is tightly integrated with SAP NW BI, you have the entire spectrum of analysis tools and reporting capabilities available.
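The bookkeeping idea behind independent work packages can be sketched as follows. This is a toy model, assuming that each work package is identified by a namespace; the class `BIStagingArea`, its methods, and the namespace names are hypothetical and do not describe the actual implementation in SAP NW BI.

```python
class BIStagingArea:
    """Toy registry of work packages that were transferred to BI."""

    def __init__(self):
        # Maps a namespace to the archive files loaded for that work package.
        self._loaded = {}

    def transfer(self, namespace, files):
        """Record the temporary transfer of a work package."""
        if namespace in self._loaded:
            raise ValueError(f"work package {namespace!r} is already loaded")
        self._loaded[namespace] = list(files)

    def remove(self, namespace):
        """Remove a work package completely and return what was removed."""
        return self._loaded.pop(namespace)

    def loaded_namespaces(self):
        """Return the namespaces of all work packages currently in BI."""
        return sorted(self._loaded)


area = BIStagingArea()
shared_file = "/archive/USA/2000/file_001"
area.transfer("AUDIT_2008", [shared_file])
area.transfer("DEPT_FI", [shared_file])   # same archived data, independent package
removed = area.remove("AUDIT_2008")       # the audit is finished
print(removed, area.loaded_namespaces())
```

Because every package keeps its own record of transferred files, removing the audit package leaves the department package untouched even though both were built from the same archive file.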

Figure 20, Selecting Files for Export in ILM Store Browser

The outcome of the work with the ILM Store Browser is a list of archive files (including format descriptions and business content information) containing data that may be relevant for the audit, called a work package. Because you can further filter the data before it is transferred to BI, you can make sure that the end user (e.g. auditor) only sees the information he or she is authorized to see.

Retention Warehouse: Flexible Reporting and Audits in BI

In preparation for reporting, you can set up the infrastructure required for queries in BI, based on the format description and business content information you also exported into the work package, and on the requirements of the audit. The infrastructure also includes infoproviders. Once you have created these infoproviders, you can temporarily fill them with data from the archive.


Retention Warehouse: Additional Use Case

While the retention warehouse was primarily built to be used for system decommissioning, it could also be used to offer a comfortable audit environment that has little impact on the OLTP system. Assuming that you archive data on a regular basis from your production system, you could use both the snapshot option of the archiving objects and the program to transfer context information to the archive on a regular basis. This snapshot data is not deleted from the production system, and, equally important, it is not indexed, so that it is not included involuntarily in reports run in the production system. When you set up a retention warehouse with access to the store used in the production system, you can use the complete functionality described for the retention warehouse already during the lifetime of the production system. The only thing necessary for this is a copy of the archive administration data from the production system to the retention warehouse. With such a setup it is possible to obtain functionality similar to that of DART, which would eliminate a major part of the redundant data storage that was done in the past.

ILM from SAP vs. Enterprise Content Management (ECM)

According to AIIM, the definition of ECM is: “…the technologies used to capture, manage, store, preserve, and deliver content and documents related to organizational processes. ECM tools and strategies allow the management of an organization's unstructured information, wherever that information exists.” Thus, ECM deals with document management, input management, output management, records management, Web content management, digital asset management, email management, forms management, collaboration, case management, business process management, and search. It also includes the storage of unstructured data, often referred to in this context as “archiving”.

In contrast, ILM from SAP manages the retention of primarily structured information during the different stages of its lifetime, from the time data is first created in the application until it is destroyed, either in the application or, usually, in the storage system. Different aspects are involved along this process, such as retention policy management, legal holds, and ensuring that the same rules continue to apply to the data even after it has been moved to storage (ILM-aware storage). ILM from SAP also provides functions for legacy system decommissioning. Within this context, the retention of unstructured data that is attached to the structured information also needs to be considered. For example, scanned invoices that are attached to a financial document in the SAP application from a US system receive the same retention constraints as the financial document (e.g. retain 7 years). Thus ILM also covers the retention of the unstructured information attached to structured information (see also Retention Management: ILM-Aware Storage Integration).


Availability

ILM Solution from SAP

The first version of the ILM solution from SAP will be available for ramp-up customers mid 2008 with SAP NetWeaver 7.0 Enhancement Pack 1. A limited number of ramp-up customers will be selected through a special readiness check process. Unrestricted shipment is planned for the beginning of 2009, after the ramp-up process has been completed. ILM solution from SAP will be on the global SAP price list.

ILM-Aware Storage Certification

The WebDAV Storage Interface for the ILM Solution from SAP Certification (BC-ILM 2.0) for storage partners will be officially available in April 2008.

Conclusion and Outlook

With the three cornerstones of ILM, data archiving/data management, retention management, and retention warehouse, SAP provides a comprehensive product offering to support you in your ILM strategy. Many of the tools we presented in this white paper are tried and tested and some have existed for many years. The new functions were developed in close cooperation with customers, and are being specifically designed to meet real-world pain points and concerns. They are tightly integrated with the standard data archiving/management tools and care was taken to focus on non-disruptive innovation, openness, and completeness.

ILM is currently in its maturation phase, both in terms of products that are offered in the market to support ILM and in terms of companies that are actually implementing ILM strategies. The main goal is to reach a greater degree of automation and completeness, which also includes better integration with and interfaces to storage partner and other partner products. Most new developments, especially at SAP, are aimed towards this goal.

At SAP we believe that a holistic approach is indispensable when trying to implement an ILM strategy. This also means that application vendors must form closer ties with storage vendors. SAP has been extremely active in this area, and has, among other things, set up additional storage partner certification programs for data archiving and ILM-aware storage.

ILM is not something that can be implemented overnight. It is, however, something that can begin right now, especially within an SAP environment. How can you start? Begin by using the standard tools and functions SAP offers. Because ILM from SAP has been developed to be as non-disruptive as possible, you can lay the groundwork now for easy integration of ILM from SAP later. You can further prepare by following some important recommendations:
o Avoid storing archive files on non-magnetic media, to facilitate the destruction of data in the future.
o Use archive routing, which can also help in this respect, as it automatically segregates your archived data into specific areas in the storage system.
o When you archive, make sure you do so in correlation with your retention rules and avoid mixing the data in your archive files. This makes managing and finding information much easier in the future.
o Upgrade to SAP ERP 6.0, which will allow you to obtain new ILM functions as they are being rolled out.
o If you are thinking about centralizing your storage system or implementing a new storage system as part of your ILM strategy, make sure your storage partner is certified for ILM-aware storage.

Additional Information

For more information on the following topics, visit the provided links:

Information Lifecycle Management (ILM)
o http://service.sap.com/ilm

Governance, Risk, and Compliance (GRC)
o http://www.sap.com/grc

Partner Certification
o https://www.sdn.sap.com/irj/sdn/icc

SAP Insider Articles (http://www.sapinsideronline.com/)
o From Data Management to Information Lifecycle Management (July 2007)
o Time is Ripe for ILM (Special Feature) (April 2007)
o SAP’s Strategy for End-to-End Success (January 2008)

GLOSSARY

Archive Administration (transaction SARA) – Central data archiving interface for most user activities in data archiving, such as scheduling write and delete jobs, setting up and deleting the archive index, storing and retrieving archive files.

Archive Development Kit (ADK) – Technical framework and basis of SAP’s data archiving solution. ADK is a software layer between the SAP applications and the archive, and provides the runtime and administration environment for all SAP data archiving functions. ADK also provides a programming interface for developing archiving programs at SAP or at customer’s site.

Archive Information Structures – Central element of the Archive Information System (AS). Represents a type of index based on a field catalog and can be used to find archived data in the archive.

Archive Information System – A generic tool for searching archives, which is fully integrated into data archiving. The search and display of data is based on archive information structures, which the user can define and fill with data from the archive.

ArchiveLink – Interface for controlling communication between an SAP application and an external storage system. It enables data and documents from the SAP application to be stored and accessed.

Archive Routing – Central tool for setting up rules and conditions based on which data is automatically routed to specific areas in the storage system. Automatically segregating archived data in this way supports legal compliance strategies, by facilitating the destruction of data, and providing companies with more automation and control over the storage location of specific groups of data.

archiving class – Mechanism used to archive data that cannot stand alone from a business point of view and is archived together with application data objects.

archiving object – Logical object of related business data in the database that is written from the database to an archive file and then deleted from the database once it has been archived successfully. An archiving object also includes the corresponding archiving programs as well as Customizing.

business complete – Data that is unchangeable and no longer needed in everyday business; in other words, data that you no longer need to access on a regular basis.

business object – Represents a central business object from the real world, such as an order or an invoice. From a technical point of view, a business object is an instance of a business object type that has concrete values. Business objects are managed in the Business Object Repository (BOR).

collection – A node in an archive hierarchy, which is used to store archived documents, called resources, or other collections. The concept of a collection is similar to that of a folder in a file system. Collections are physically stored in a storage system. There are three collection types: 'S' = System, 'H' = Home and 'A' = Application.

data – The physical representation of information in any form.

data retention policy – Set of rules governing the storage (archiving) and destruction of data.

deletion of data – Process that removes information from matter or energy, or prevents it from being accessed. Data can be deleted without any loss of information if it is ensured that the information contained therein is stored redundantly in other data.

destruction of data – Process that deletes the data while either taking the loss of information deliberately into account, or explicitly having this in mind.

electronic discovery – Also called e-discovery, refers to any process in which electronic data is sought, located, secured, and searched with the intent of using it as evidence in a civil or criminal legal case.

end-of-life of data – The point in time when the end of the retention period has been reached and the data can be destroyed.

end-of-life of system – The point in time when a system is permanently shut down. This concept is relevant for ILM, because often times due to legal requirements, data lives longer than the system in which it was produced.

Hierarchical Storage Management System (HSM) – A storage solution that automatically distributes data according to individually configured rules (such as the frequency of access) within a hierarchy with different storage media (such as hard disks, magneto-optical disks, and magnetic tapes). An HSM system is represented to the accessing system as a file system that stores the files saved there in a file path that cannot be changed logically.

information – “Information is information, not matter or energy” (Norbert Wiener, 1961). “Information is the single decision between … equally plausible alternatives” (H. Gardner, 1987). As such, information is the smallest unit of knowledge.

Information Lifecycle Management (ILM) – Information Lifecycle Management is comprised of the policies, processes, practices, and tools used to align the business value of information with the most appropriate and cost effective IT infrastructure from the time information is conceived through its final destruction. Information is aligned with business processes through management of policies and service levels associated with applications, metadata, information, and data.

Information Retention Manager (IRM) – Tool used in the context of Information Lifecycle Management for storing ILM-relevant policies, e.g. data retention policies, residence time policies. It can be extended to include other ILM-relevant policy categories for data stored, for example, in TREX or BI and non-SAP data.

legal hold – A type of “freeze” placed on data records, if legal authorities have decided that an organization must preserve certain data records when litigation is anticipated or confirmed. Records on which a legal hold has been placed must be retained (e.g. they cannot be destroyed) until the legal hold has been removed.

object type – Concept used in the information retention manager (IRM) for a grouping of objects within a particular object category. Object types can be sales order, invoice, financial document, which all belong to the object category "business object".

optical archiving – Widely-used but inaccurate term for document storage. The storage system used is generally based on optical media (CDs, WORMs, etc.), which is why the term “optical” archiving is used.

policy type – Concept used within the IRM for the grouping of rules. Policy types can pertain to taxes (TAX), product liability (PROD), risk management (RISK).

residence time – The length of time that must be exceeded before application data can be archived. Depending on the application, the basis for calculating the residence time can be the creation date, the posting period, the goods issue date, and so on. The residence time is usually given in days.

retention management – Methods and processes to manage the retention of data based on legal requirements. With the help of technology these methods involve managing retention policies, ILM-aware storage integration, destruction, legal hold management, and e-Discovery.

retention period – Length of time that indicates, based on legal requirements, the maximum period of time in which a document, such as a financial document or an invoice, must be retained before it can be destroyed. Retention periods may vary from one country to another and for different types of records (e.g. financial data of an enterprise, receipts for tax-deductible purchases etc.).

WebDAV Interface – Interface used for communication between the SAP system and ILM-aware storage systems, to support ILM processes.
