
It Takes the Right Tools and Metrics to Monitor SLA Compliance

by Excerpted from SAP Professional Journal | SAPinsider

January 1, 2001

Excerpted from SAP Professional Journal, November/December 2000. SAPinsider, 2001 (Volume 2), January (Issue 1).
 

A Service Level Agreement (SLA) details which services end users expect, and how quickly service outages will be restored. But how do you measure SLA compliance? Richard C. DeAngelis, Jr., in his recent article “Defining SAP Service Level Agreements: An IT Manager’s Survival Guide,” published in the November/December 2000 issue of the SAP Professional Journal, explains that it comes down to using the right tools to gather the right metrics.

     The metrics you need to gather to measure SLA compliance depend on the level of service that needs to be supported. These levels of IT support generally fall under the groups listed in Figure 1.

  • The level-one support group: This group is often the call center or help desk. It should primarily monitor for application and system status and service level compliance.

  • The level-two support group: The responsibility of this group is to handle basic problem resolution at the infrastructure level. The level-two support group requires more detailed information on the systems and applications than the level-one group does, but avoid giving the level-two support group too much data.

  • The level-three support group: This group is charged with detailed problem resolution or capacity analysis, and needs as many metrics as are necessary to capture pertinent data. Good level-three tools employ artificial intelligence (AI) and sophisticated filtering technologies to help the troubleshooter get to the data that really matters.
Figure 1 Ascending Levels of IT Support

The Tools for Capturing Useful Metrics

According to DeAngelis, R/3 provides a control protocol called CCMS (Computing Center Management System) that supports tracking of, and alarms on, a variety of known errors and performance conditions. However, there is much to track in a sizable R/3 environment, and the causes of errors are not always easy to determine. You may have to use third-party tools, like those from Envive, Luminate, realTech, and syskoplan.¹ Other companies like Tivoli, Hewlett-Packard, or BMC also provide products that might help in this area, but the products’ abilities should be checked thoroughly.

     DeAngelis suggests five different sources of service level data for R/3 that allow the capture of useful metrics:

  • SNMP Management Consoles
  • Intelligent Systems Agents
  • Application Performance Monitoring Systems
  • End-User Transaction Performance Monitoring Systems
  • The ARM API

Details about each are provided in Figure 2.

     While each of the tools mentioned here can help with SLA maintenance, you should always first check the upcoming features of the SAP system itself. Starting with Release 4.6C, the System_Administrator_Workplace is available in the CCMS, which provides a view of your systems’ status at a single glance. The new reporting functionality (RZ23) for the values stored in the SAP Performance Database was also delivered with 4.6C. SAP is working on these products to make them even more reliable in the next releases, and the key transactions RZ20 (Monitoring Architecture) and RZ21 will also be redesigned.

SNMP Management Consoles

SNMP management consoles make good tools for the level-one support group because they are pertinent to health and status monitoring. SNMP managers, such as HP OpenView Network Node Manager or IBM NetView, use a simple but effective polling architecture with a management agent based on MIB-II to collect and trend information on the availability of any device. Cabletron Systems and Computer Associates offer similar SNMP managers in their management tool suites. The data you extract from MIB-II predominantly concerns real-time connectivity and proper operational status of the MIB-II managed devices. The SNMP manager can also define and receive alerts based on MIB variables exceeding predefined thresholds.
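The polling-and-trending idea can be illustrated with a small sketch. It is not an SNMP implementation: the `probe` callables stand in for a real SNMP GET against a MIB-II object, and the device names and canned answers are hypothetical, chosen only so the example runs on its own.

```python
from collections import defaultdict
from typing import Callable, Dict, List


def poll_once(probes: Dict[str, Callable[[], bool]],
              history: Dict[str, List[bool]]) -> None:
    """Run one polling cycle and record whether each device answered."""
    for device, probe in probes.items():
        try:
            reachable = bool(probe())
        except Exception:
            # A probe that raises (for example, on a timeout) counts as "down".
            reachable = False
        history[device].append(reachable)


def availability(history: Dict[str, List[bool]]) -> Dict[str, float]:
    """Return the percentage of successful polls per device."""
    return {
        device: 100.0 * sum(results) / len(results)
        for device, results in history.items()
        if results
    }


def make_scripted_probe(answers):
    """Hypothetical stand-in for an SNMP GET: replays canned answers."""
    remaining = iter(answers)
    return lambda: next(remaining)


history = defaultdict(list)
probes = {
    "r3-app-01": make_scripted_probe([True, True, True, True]),
    "r3-db-01": make_scripted_probe([True, False, True, True]),
}
for _ in range(4):          # four polling cycles
    poll_once(probes, history)

report = availability(history)
print(report)
```

In a real console the polling interval and the list of managed devices would come from the manager's own configuration; the sketch only shows how repeated polls turn into an availability trend.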

Intelligent Systems Agents

Intelligent systems agents, useful to all three levels of the IT support staff, run in the background on servers and PCs. The agents used by Tivoli, HP OpenView VantagePoint, or BMC Patrol are capable of tracking thousands of system health, performance, and utilization variables. They derive this critical health data from system log files, operating system processes, and other system-specific resources. These metrics are used to set up alarm thresholds that, when exceeded, generate alerts to the level-one support group, or initiate preconfigured automatic administrative actions like rebooting a dial-in server, which can greatly minimize delays.
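As a rough illustration of threshold-based alarms, the sketch below checks one sample of metrics against a set of limits and either raises an alert for the level-one group or runs a preconfigured action. The metric names, limits, and the restart action are all assumptions made for the example; none of them comes from Tivoli, HP OpenView VantagePoint, or BMC Patrol.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Threshold:
    metric: str
    limit: float
    # Optional automatic action; when absent, an alert is raised instead.
    action: Optional[Callable[[], str]] = None


def evaluate(sample: Dict[str, float], thresholds: List[Threshold]) -> List[str]:
    """Compare one metric sample with the thresholds and return the events."""
    events = []
    for t in thresholds:
        value = sample.get(t.metric)
        if value is None or value <= t.limit:
            continue  # metric missing or within limits
        if t.action is not None:
            events.append(f"ACTION {t.metric}: {t.action()}")
        else:
            events.append(f"ALERT level-one: {t.metric}={value} exceeds {t.limit}")
    return events


# Hypothetical sample and thresholds.
sample = {"cpu_pct": 97.0, "dialin_failed_logins": 12.0, "disk_pct": 40.0}
thresholds = [
    Threshold("cpu_pct", 90.0),
    Threshold("dialin_failed_logins", 10.0,
              action=lambda: "dial-in server restart requested"),
    Threshold("disk_pct", 85.0),
]

events = evaluate(sample, thresholds)
for event in events:
    print(event)
```

Keeping the action attached to the threshold, rather than to the alert handler, mirrors the idea that some conditions are handled automatically and never need to reach the help desk.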

Application Performance Monitoring Systems

With the preponderance of mission-critical applications in recent years, many of the systems management vendors have been investing in add-on extensions to their OS and systems monitoring agents that add monitoring and threshold-based alarm management for service metrics specific to R/3 application modules. HP OpenView Manager for SAP R/3 features an API that links to CCMS, and allows for R/3 programs to be monitored for trouble signs. Alerts generated from R/3 events can be escalated through pagers, cell phones, and e-mail, as well as tracked and graphed from the Service Level Reporting module. HP OpenView Manager’s “Smart Plug-in” is used in conjunction with HP’s VantagePoint Intelligent Agent to supplement the existing OS and system monitoring capabilities with additional knowledge on R/3. Envive’s Inspector monitor deploys a multitiered management collection architecture on Windows NT or 2000. Inspector keeps a datamart updated with SAP R/3 vital signs that are collected by the Inspector Server, which communicates with the managed environment of R/3 servers. Envive has also been evolving its R/3-focused management tools into a portal-based, “pay as you go” strategy. Other tools in this space include Tivoli, with its R/3 Manager, and BMC, with its solution based on Patrol.

The ARM API

Management via the ARM API is well suited to the level-three staff because it emphasizes overall health and performance, and filters out extraneous events and data, presenting only relevant events with root-cause analysis of system faults. In combination with application-related data derived from intelligent agents, the ARM API can quantify the types of delays that really affect the end user. Other tools are also coming to market from Computer Associates, Candle, Hewlett-Packard, and other systems management vendors that address the need for AI in determining the root cause of R/3 performance degradation.
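ARM-style instrumentation works by bracketing each unit of work with a start call and a stop call so that elapsed time can be recorded per transaction. The sketch below imitates that pattern in Python. It does not use the ARM bindings; `measured`, `measurements`, and `post_order` are hypothetical names introduced only for illustration.

```python
import time
from contextlib import contextmanager
from typing import Dict, List

# Elapsed times in seconds, keyed by transaction name (hypothetical store).
measurements: Dict[str, List[float]] = {}


@contextmanager
def measured(transaction_name: str):
    """Time the enclosed block, much as a start/stop call pair would."""
    start = time.perf_counter()                  # plays the role of "start"
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start    # plays the role of "stop"
        measurements.setdefault(transaction_name, []).append(elapsed)


def post_order() -> None:
    """Hypothetical unit of work standing in for a business transaction."""
    time.sleep(0.01)


with measured("post_order"):
    post_order()

print(measurements)
```

Because the timing is recorded in a `finally` clause, a transaction that fails partway through is still measured, which matters when the delays of interest are the ones that end in errors.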

End-User Transaction Performance Monitoring Systems

Transaction (end-to-end) monitoring tools monitor or simulate actual end-user transactions. Monitoring tools like HP Web Transaction Observer or Vital Signs’ VitalSuite place an agent on the client. For Web clients, either ActiveX or JavaScript can be used. In either case, the end users must accept the request for monitoring in order for their transactions to be instrumented. Synthetic transaction tools like S3 from NextPoint Networks (now owned by NetScout) and load-testing tools by Mercury Interactive may be more appropriate for applications where users are reluctant to be tracked. Companies like Vital Signs (now owned by Lucent) have agents for non-Web applications like R/3. These agents can be installed on the corporation’s computers, where users do not have a choice about being monitored. Another good approach for R/3 is Envive’s StopWatch, which uses a “blackbox” approach to R/3 end-user transaction monitoring that tags and retrieves information from the R/3 data packets transmitted from the clients to measure end-user performance of the SAP GUI screens. With this information, StopWatch can provide statistics such as bytes transferred per SAP GUI screen session and average throughput for the different R/3 network segments. Other tools in this area include Tivoli Application Performance Management together with Tivoli TDS.
Figure 2 The Five Sources of Service Level Data for R/3
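To make the end-user statistics mentioned in Figure 2 concrete, the sketch below aggregates bytes transferred per screen and average throughput per network segment from a handful of observations. The record layout, segment names, and figures are invented for the example and do not reflect how StopWatch or any other product stores its data.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple


class Observation(NamedTuple):
    segment: str      # hypothetical network segment label
    screen: str       # hypothetical SAP GUI screen label
    bytes_sent: int   # bytes observed for this screen exchange
    seconds: float    # time the exchange took on the wire


def bytes_per_screen(observations: Iterable[Observation]) -> Dict[str, int]:
    """Total bytes transferred for each screen."""
    totals: Dict[str, int] = defaultdict(int)
    for o in observations:
        totals[o.screen] += o.bytes_sent
    return dict(totals)


def throughput_per_segment(observations: Iterable[Observation]) -> Dict[str, float]:
    """Average throughput in bytes per second for each segment."""
    total_bytes: Dict[str, int] = defaultdict(int)
    total_seconds: Dict[str, float] = defaultdict(float)
    for o in observations:
        total_bytes[o.segment] += o.bytes_sent
        total_seconds[o.segment] += o.seconds
    return {
        segment: total_bytes[segment] / total_seconds[segment]
        for segment in total_bytes
        if total_seconds[segment] > 0
    }


# Invented sample data.
observations = [
    Observation("lan-hq", "VA01", 4000, 0.5),
    Observation("lan-hq", "VA01", 6000, 0.5),
    Observation("wan-branch", "VA01", 5000, 2.0),
    Observation("wan-branch", "ME21", 3000, 2.0),
]

per_screen = bytes_per_screen(observations)
per_segment = throughput_per_segment(observations)
print(per_screen)
print(per_segment)
```

Throughput is computed from summed bytes and summed time rather than by averaging per-observation rates, so that short exchanges do not skew the segment figure.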

      You might also find necessary information, such as end-user response time values, in the new transaction ST03N (since 4.6C), or in the Monitoring Architecture, where the values for transactions are also available. You can also create alerts or, for example, send an e-mail if a threshold is exceeded. This is all customizable in the R/3 system itself.
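Once response time values are available, checking them against an SLA target is simple arithmetic. The following is a minimal sketch that runs outside R/3: it assumes the values have already been exported as a list of milliseconds, and the target of 1,000 ms and the required 95 percent are invented for the example rather than taken from the article.

```python
from typing import Dict, List, Union


def compliance_pct(response_times_ms: List[float], target_ms: float) -> float:
    """Percentage of measurements at or below the response-time target."""
    if not response_times_ms:
        raise ValueError("no measurements to evaluate")
    within = sum(1 for t in response_times_ms if t <= target_ms)
    return 100.0 * within / len(response_times_ms)


def check_sla(response_times_ms: List[float],
              target_ms: float,
              required_pct: float) -> Dict[str, Union[float, bool]]:
    """Report the compliance percentage and whether the SLA was met."""
    pct = compliance_pct(response_times_ms, target_ms)
    return {"compliance_pct": pct, "met": pct >= required_pct}


# Invented response times in milliseconds.
samples = [420, 610, 980, 1500, 730, 2200, 890, 640, 1100, 560]

result = check_sla(samples, target_ms=1000, required_pct=95.0)
print(result)
```

Expressing the SLA as "a given percentage of transactions within a given time" rather than as an average keeps a few very slow transactions from being hidden by many fast ones.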

SLA Resources & References

BMC Software, Inc.
www.bmc.com

Cabletron Systems, Inc.
www.cabletron.com

Candle Corporation
www.candle.com

Computer Associates International, Inc.
www.ca.com

Envive Corporation
www.envive.com

Hewlett-Packard Company
www.hp.com

IBM Corporation
www.ibm.com

Luminate Software Corp.
www.luminate.com

Mercury Interactive Corporation
www.mercuryinteractive.com

NextPoint Networks²
www.nextpoint.com

realTech AG
www.realtech.de

syskoplan AG
www.syskoplan.de

Tivoli Systems, Inc.
www.tivoli.com

Vital Signs³
www.vitalsigns.com


¹ Check www.sap.com/solutions/compsoft/cspdirectory for SAP-validated products. To obtain more information on vendors discussed here, please see the “SLA Resources & References” section at the end of the article.
² Now owned by NetScout Systems, Inc.
³ Now owned by Lucent NPS.

The majority of this article is excerpted from an article published in the November/December 2000 issue of the SAP Professional Journal — “Defining SAP Service Level Agreements: An IT Manager’s Survival Guide,” by Richard C. DeAngelis, Jr. To receive a complimentary trial copy of this issue, contact SAP Professional Journal at sheila@SAPpro.com.
