Demystifying Java-Based Load Tests and Their Results

by Xiaoqing Cheng, SAP AG | SAPinsider

October 1, 2006 (Volume 7, Issue 4)

Implementation teams often conduct custom load tests in large and mission-critical SAP projects to ensure that the newly installed system landscape will fulfill performance needs. A customer planning to install a new SAP system for 10,000 users, for example, could simulate the user load and determine if the system can meet its capacity requirements under various operating conditions.

Any custom load test presents implementation teams with several challenges. Simulating the right user load is a precise science, since teams need to test exactly the functionality that those 10,000 users will use, gather business requirements to derive an accurate system load profile, and properly simulate user behaviors with a load test tool.

But the most critical part of a load test is how to analyze system behavior when certain functional and performance issues occur during the actual load test, and to understand why they occurred. This is especially challenging for SAP project teams experienced with ABAP applications, but not as familiar with Java-based applications.

Project teams need an approach that can reduce the complexity of load tests. And once load tests are complete, teams need to understand their results and explain how Java-specific key performance indicators (KPIs) can help evaluate and optimize performance (Hint: looking only at response time will leave you shortchanged). Both the load test approach and the Java KPIs introduced in this article have been used extensively in SAP performance-tuning projects, in benchmark tests for several SAP NetWeaver releases, and on a variety of operating system platforms.

Why Conduct a Load Test?

A successful custom load test can ensure that the performance requirements of a complex software application, including the hardware landscape and the communication network infrastructure, have been met. Load tests are a best practice for large, highly customized, and mission-critical SAP system

How to Simplify Load Tests for Complex Java Applications

A complex Java-based application usually consists of many interacting system components. To identify performance issues, you need to drill down and pinpoint the particular component(s) causing the bottleneck. This is an extensive process, especially for Java applications. To identify the bottleneck components, you need an accurate architectural overview of the Java application, and specialized monitoring tools to analyze the complex interaction patterns between system components.

An alternative to testing (and stressing!) the entire application at once is to perform load tests on individual system components and then combine these results to obtain a performance overview of the complete application. This approach can significantly reduce the complexity of load tests and improve the efficiency of analysis efforts. This best-practice method involves three steps:

  • Derive component-specific load profiles from the overall load profile

  • Test one component at a time, while simulating the other components

  • Combine the individual component results to form the overall performance result

I'll explain these steps in more detail using the examples of two new SAP NetWeaver Portal benchmarks: the SAP Employee Self-Service (EP-ESS) Portal Benchmark and the SAP People-Centric Customer Relationship Management (EP-PCC) Portal Benchmark (see sidebar). You'll then learn how to measure your load-testing success with essential Java-specific KPIs that you might not typically think of when gauging system performance.

Step 1: Derive the Load Profile

A load profile refers to the number of user interaction steps per unit of time. The load profile can be calculated easily when you know, for example, the number of concurrent users and a user's average think time between interaction steps. Consider a user performing three SAP Employee Self-Service (ESS) tasks in succession, such as recording working time, changing personal address data, and displaying a pay slip. Each ESS scenario is executed in an ESS session, which begins with a portal navigation step: The user clicks a link in the browser, and several interactions take place between the portal layer and the backend system.

We assume an average think time of t seconds between interaction steps and that the user performs five interaction steps within each ESS scenario. The overall system load profile is the total number of concurrent users divided by t.

But as you can also see in Figure 1, ESS user interaction steps that occur after session startup do not generate requests to the portal — they bypass the portal and go straight to the backend server. Therefore, the load profile differs depending on the user interaction pattern for the component involved: The load profile of the portal component is the number of concurrent users divided by 6t (one portal navigation step plus five interaction steps), whereas the load profile of the backend system is the total number of concurrent users divided by t.

Figure 1
A high-level look at the two different types of interaction patterns in EP-ESS scenarios
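The load profile arithmetic from Step 1 can be sketched in a few lines of Python. This is only an illustration of the formulas in the text: the function name, the `steps_per_request` parameter, and the input values (1,200 users, 10 seconds think time) are assumptions made for the example and are not taken from the benchmark.

```python
def load_profile(concurrent_users: int, think_time_s: float,
                 steps_per_request: int = 1) -> float:
    """Requests per second that reach one component.

    steps_per_request is the number of user interaction steps that pass,
    on average, for each step that actually hits this component.
    """
    if think_time_s <= 0 or steps_per_request <= 0:
        raise ValueError("think time and steps_per_request must be positive")
    return concurrent_users / (steps_per_request * think_time_s)


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
users = 1200
think_time_s = 10.0

# Every interaction step reaches the back end: users / t.
backend_load = load_profile(users, think_time_s)

# Only the navigation step of each ESS scenario reaches the portal,
# that is, one step in six (1 navigation + 5 interaction steps): users / 6t.
portal_load = load_profile(users, think_time_s, steps_per_request=6)

print(f"back end: {backend_load:.1f} steps/s, portal: {portal_load:.1f} steps/s")
```

With these inputs the back end sees six times the request rate of the portal, which is why the two components need separate load profiles.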

SAP NetWeaver Portal Benchmarks: A Key Reference for Java-Specific Load Tests

With the growing role of Java in the SAP environment, the need for a new SAP Standard Application Benchmark running on the Java stack of SAP NetWeaver became evident, as previous benchmarks have typically been ABAP-based.1 As a result, SAP now provides SAP NetWeaver Portal benchmarks, since the portal is currently the largest Java-based application SAP offers. Benchmarks represent typical business scenarios and enable customers to conduct comprehensive performance analyses and load tests, as well as compare the KPIs of SAP applications on different software and hardware platforms. Two SAP NetWeaver Portal benchmarks are available:

  • SAP Employee Self-Service (EP-ESS) Portal Benchmark: The EP-ESS benchmark focuses on concurrent ESS users and their top-level navigation behavior in the portal. From a technical point of view, this benchmark tests the performance of the portal platform while launching typical stateful (multistep) business transactions.

  • SAP People-Centric Customer Relationship Management (EP-PCC) Portal Benchmark: The EP-PCC benchmark simulates the CRM business scenario of sales representatives carrying out portal navigations to display overview information of business activities, account management, and acquisitions. The technical background of this benchmark is to use SAP NetWeaver Portal as a rendering platform that retrieves unformatted business data from the CRM backend system and represents it as HTML pages in iViews.

Sample benchmark results

For reference in your own performance projects, you can view a detailed description of the benchmarks as well as published benchmark results. To get a sense of what benchmarking results can look like, the figure above shows an example of a certified EP-ESS benchmark with 1,440 concurrent users on a two-processor, four-core, four-thread machine.

Step 2: Simulate the Interacting Components

One of the most important design goals of SAP NetWeaver Portal benchmarks is to focus the load only on the portal component. The backend system should not have any influence on performance. This design goal has been achieved in the EP-ESS benchmark by simply omitting the HTTP requests sent to the backend system from the benchmark script, as shown in Figure 1. This simulation concept applies generally to URL iViews that retrieve iView content directly from the backend system.

To simulate the CRM back end in EP-PCC scenarios (see Figure 2), we included a switch parameter in the CRM Java iViews used in the EP-PCC benchmark. In performance tests, this switch parameter causes a simulation function module in CRM to be called instead of the original one. A simulation function module returns precompiled business content without database access or any expensive processing, with a response time of less than 30 ms. A package of simulation modules is available as of mySAP CRM 4.0 SP8. Detailed configuration instructions are described in the documentation.

Figure 2
The two different interaction patterns of EP-PCC scenarios

Step 3: Combine the Results

After successfully completing the load tests for each component, you can combine their individual performance results (depending on their specific interaction patterns) to determine the overall KPIs. Individual resource consumption figures, such as CPU and memory, likewise must be summed to obtain the total system resource consumption.

When we consider the average response time of user interaction steps, the response times of the individual components generally have to be added together to give the total response time, provided that the interactions between the components occur sequentially. For parallel component interactions, the maximum component response time represents the total response time.

In the case of the EP-ESS scenarios, only one iView is displayed in each portal page. Therefore, the total system response time is the sum of the response times of the portal and the ESS component. In EP-PCC scenarios, the portal pages display between two and four CRM Java iViews, which are processed in parallel. This parallel processing is already taken into consideration in the average response time of the portal component when the CRM back end is simulated.

Each CRM Java iView calls several CRM function modules. The maximum sequential response time of the function modules called by one Java iView must be added to the response time of the portal component to build the total average response time.
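The combination rules in Step 3 can be written down as a small Python sketch. It reflects one reading of the text: sequential interactions add up, parallel interactions take the maximum, and for EP-PCC the slowest iView's chain of sequential function module calls is added to the portal time. The function names and all response times below are invented for illustration.

```python
def sequential_total(times_s):
    """Components called one after another: response times add up."""
    return sum(times_s)


def parallel_total(times_s):
    """Components processed in parallel: the slowest one dominates."""
    return max(times_s)


def pcc_total_response_time(portal_time_s, module_times_per_iview_s):
    """Total response time for a page with several CRM Java iViews.

    portal_time_s: portal response time measured with a simulated back end
        (per the text, this already accounts for parallel iView processing).
    module_times_per_iview_s: for each iView, the response times of the
        function modules it calls sequentially.
    """
    slowest_chain = parallel_total(
        [sequential_total(modules) for modules in module_times_per_iview_s]
    )
    return portal_time_s + slowest_chain


# EP-ESS style page: one iView, so portal and ESS component simply add up.
ess_total = sequential_total([0.4, 0.6])

# EP-PCC style page: three iViews with hypothetical module response times.
pcc_total = pcc_total_response_time(
    portal_time_s=0.8,
    module_times_per_iview_s=[[0.10, 0.05], [0.20], [0.05, 0.05, 0.05]],
)

print(f"ESS total: {ess_total:.2f} s, PCC total: {pcc_total:.2f} s")
```

In the EP-PCC example the second iView has the longest sequential chain (0.20 s), so only that chain is added to the portal time.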

Separating the testing process into the individual system components greatly reduces load-test complexity. But now how do implementation teams interpret the test results to better understand their application's performance?

Understand the Relevant Performance KPIs for Java

An essential part of performance analysis and optimization is using a set of well-defined KPIs. Good KPIs should correctly reflect performance requirements. They must also be accurately measurable, and in an ideal case, should indicate possible optimizations.

The response time of a user interaction step is the most important KPI in terms of the end-user performance experience. Response time in this context is defined as the elapsed clock time between the user input action and the display of the next screen. But comprehensive KPI analysis is about more than just response time. The unique characteristics of Java make a more comprehensive KPI analysis necessary. Next we'll discuss some additional KPIs that are particularly important for Java applications.

CPU Time

Related to response time is CPU time, a critical system resource shared by many concurrent users on a server. CPU time is measured per user interaction step. For Java applications, this KPI has the same advantages as it does for non-Java applications, such as high accuracy and reproducibility. In load tests, it is also possible to measure the server's CPU utilization as an operating system counter. However, CPU utilization is not suitable for accurate performance measurements and comparisons because it depends on the current system throughput.
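To see why CPU time per step is comparable across runs while CPU utilization is not, consider the following rough Python sketch. It assumes that all measured CPU consumption can be attributed to the user interaction steps completed in the measurement interval, which ignores background activity. The function and the numbers are hypothetical and do not describe how the benchmark itself measures CPU time.

```python
def cpu_time_per_step(avg_utilization: float, cpu_cores: int,
                      interval_s: float, steps_completed: int) -> float:
    """Approximate CPU seconds consumed per user interaction step.

    avg_utilization is a fraction between 0 and 1, averaged over all cores
    for the whole measurement interval.
    """
    if steps_completed <= 0:
        raise ValueError("at least one completed step is required")
    busy_cpu_seconds = avg_utilization * cpu_cores * interval_s
    return busy_cpu_seconds / steps_completed


# Two hypothetical 10-minute runs of the same application on a 4-core server.
# Run B has twice the throughput of run A, so utilization doubles as well.
run_a = cpu_time_per_step(0.25, cpu_cores=4, interval_s=600, steps_completed=20_000)
run_b = cpu_time_per_step(0.50, cpu_cores=4, interval_s=600, steps_completed=40_000)

print(f"run A: {run_a:.3f} s/step, run B: {run_b:.3f} s/step")
```

Utilization differs by a factor of two between the runs, yet the CPU time per step is the same, which is what makes it the better KPI for comparisons.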

Memory Consumption

Memory consumption is another important KPI. It is quite a challenge to measure and analyze the memory consumption of Java applications. Both Java's flexible heap configuration and its automatic garbage collection (GC) have significant influence on memory behavior and result in varied performance.2 For a complete observation of Java memory consumption behavior, teams should consider three memory KPIs:

  • The framework space represents the memory requirement on a Java Virtual Machine (VM) after the SAP NetWeaver Application Server Java has been started and warmed up. This KPI depends on the deployment of applications on the Java VM. Using this KPI, you can calculate how much memory in the Java heap is still available to run applications on this Java VM. For measurement, log on to the SAP NetWeaver Application Server Java via the telnet interface, jump to the Java VM, and trigger full GC several times.3 Then you can view the "bytes after GC" counter in the GC log.

  • The user session space is the amount of memory occupied by a logged-on, but inactive, user. After start-up and warm-up, measure the initial heap usage (M1) after several triggered full GCs as described above. Then log N (for example, 100) users on to the system and let them execute some typical transactions, either manually or by using a load test tool. Keeping the users logged on but inactive, measure the heap usage again (M2), applying the same procedure as before. The user session space equals (M2-M1)/N.

  • The processing space is defined as the average garbage-collected bytes per user interaction step. This KPI represents the dynamic memory consumption of an active user when processing specific user interaction steps. You can measure this KPI by totaling the garbage-collected bytes (B) during the execution of a large number (K) of user interaction steps, and then calculating the KPI as B/K. If you use a load test tool to put a moderate load on a system for a certain time period (long enough to ensure that B is greater than the maximum heap size), the measurement results for this KPI are highly reproducible. According to current results, this dynamic part is dominant compared to the static user session space.

The memory consumption of a given Java application depends not only on the number of concurrent users, but also on how active the users are. This is why we use three memory KPIs, instead of the traditional one, to describe the complete Java memory behavior.
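The three memory KPIs reduce to simple arithmetic once the heap readings are available. The Python sketch below shows the calculations described above. All readings (heap size, M1, M2, B, N, K) are invented for illustration, and the helper names are assumptions.

```python
def available_heap_mb(max_heap_mb: float, framework_space_mb: float) -> float:
    """Heap left for applications after start-up and warm-up."""
    return max_heap_mb - framework_space_mb


def user_session_space_mb(m1_mb: float, m2_mb: float, n_users: int) -> float:
    """(M2 - M1) / N: memory held by one logged-on but inactive user."""
    return (m2_mb - m1_mb) / n_users


def processing_space_mb(collected_mb: float, steps: int) -> float:
    """B / K: garbage-collected memory per user interaction step."""
    return collected_mb / steps


# Hypothetical readings taken from the GC log after several full GCs.
max_heap = 1024.0    # configured maximum heap in MB (assumed)
framework = 200.0    # "bytes after GC" after start-up and warm-up, in MB
m1 = 200.0           # heap usage before users log on, in MB
m2 = 280.0           # heap usage with 100 inactive users logged on, in MB

free_heap = available_heap_mb(max_heap, framework)
session = user_session_space_mb(m1, m2, n_users=100)

# 6,000 MB collected during 2,000 interaction steps (B exceeds the max heap).
processing = processing_space_mb(6000.0, steps=2000)

print(f"available heap: {free_heap:.0f} MB")
print(f"user session space: {session:.2f} MB per user")
print(f"processing space: {processing:.2f} MB per step")
```

Keeping the three values separate makes it visible whether memory pressure comes from the deployed applications, from the number of logged-on users, or from how active those users are.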


So far, the response time, CPU, and memory KPIs we have introduced have been application-specific, and independent of actual heap configurations. Therefore, they are used either for optimizing Java applications in development or for the hardware sizing of Java applications in customer projects. From an administrative and operational point of view, we also need to tune the performance of the application by finding the optimal configuration of the SAP NetWeaver Application Server Java and the Java VMs.

Configuration Tuning

Because general discussions of configuration topics aren't within the scope of this article, here we'll focus on GC tuning: specifically, minimizing the performance impact of GC through optimal heap configurations. A lack of memory resources results in very high GC activity. Therefore, when we consider GC tuning, we assume that the application is already optimized with respect to memory consumption and that the hardware infrastructure has been sized correctly.4 To describe GC, two KPIs are key:

  • Interval — The average time between two successive GC occurrences

  • Duration — The average elapsed time for completion of a GC cycle

Both KPIs can be calculated from a GC log for a certain time period. Most Java VMs supported by the SAP NetWeaver Application Server Java, such as the Sun HotSpot VM and the IBM J9 VM, provide generational GC. The heap is separated into a young space and an old space, so that most temporary objects can be recycled through faster, more frequent minor GCs in the young space, while long-lived objects stay in the old space, where they are checked only by less frequent but longer-running full GCs. Our GC tuning recommendations, using our interval and duration KPIs, are as follows:

  1. The interval of full garbage collection should be kept above several minutes, and the duration below 10 seconds.

  2. The interval of minor garbage collection should be above 1 second, and the duration below 200 ms.

Some Java performance-tuning guides recommend a single GC KPI, called the relative garbage collection time, defined as the GC duration divided by the execution time. These guides recommend a relative GC time of between 5% and 20% for a well-behaved Java application. We prefer to consider our two GC KPIs using absolute time values. In our experience, long GC durations can cause a so-called Java resonance phenomenon, because waiting requests accumulate during the "stop-the-world pause" of GC cycles, especially when the application is under high load. In such a situation, performance collapses and system stability is endangered.
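As a rough illustration, the following Python sketch computes the interval and duration KPIs, checks them against the recommendations above, and also derives the relative GC time for comparison. It works on already-parsed GC events rather than on a real GC log, because log formats differ between Java VMs. The `GcEvent` structure, the start-to-start definition of the interval, and the 180-second reading of "several minutes" are assumptions made for this example.

```python
from dataclasses import dataclass
from statistics import mean

FULL_GC_MIN_INTERVAL_S = 180.0   # assumed reading of "several minutes"
FULL_GC_MAX_DURATION_S = 10.0
MINOR_GC_MIN_INTERVAL_S = 1.0
MINOR_GC_MAX_DURATION_S = 0.2


@dataclass
class GcEvent:
    start_s: float      # seconds since VM start
    duration_s: float   # elapsed time of the GC cycle
    full: bool          # True for a full GC, False for a minor GC


def gc_kpis(events):
    """Return (average interval, average duration) for a list of GC events.

    The interval is measured from one GC start to the next and is None
    when fewer than two events are available.
    """
    if not events:
        raise ValueError("need at least one GC event")
    starts = sorted(e.start_s for e in events)
    gaps = [later - earlier for earlier, later in zip(starts, starts[1:])]
    interval = mean(gaps) if gaps else None
    duration = mean(e.duration_s for e in events)
    return interval, duration


def within_recommendation(events, full: bool) -> bool:
    """Check one GC type (full or minor) against the thresholds above."""
    interval, duration = gc_kpis([e for e in events if e.full == full])
    min_interval = FULL_GC_MIN_INTERVAL_S if full else MINOR_GC_MIN_INTERVAL_S
    max_duration = FULL_GC_MAX_DURATION_S if full else MINOR_GC_MAX_DURATION_S
    return (interval is None or interval > min_interval) and duration < max_duration


def relative_gc_time(events, elapsed_s: float) -> float:
    """Share of the observed period spent in GC (the single-KPI approach)."""
    return sum(e.duration_s for e in events) / elapsed_s


# Hypothetical events: five minor GCs two seconds apart, two full GCs.
events = [GcEvent(t, 0.1, full=False) for t in (0.0, 2.0, 4.0, 6.0, 8.0)]
events += [GcEvent(10.0, 3.0, full=True), GcEvent(310.0, 4.0, full=True)]

minor_interval, minor_duration = gc_kpis([e for e in events if not e.full])
full_interval, full_duration = gc_kpis([e for e in events if e.full])

print(f"minor GC: interval {minor_interval:.1f} s, duration {minor_duration:.2f} s")
print(f"full GC: interval {full_interval:.1f} s, duration {full_duration:.2f} s")
print(f"relative GC time: {relative_gc_time(events, elapsed_s=320.0):.1%}")
```

Checking minor and full GCs separately against absolute thresholds exposes a single long pause that a low relative GC time could hide.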

A Look at the Numbers

Figure 3 shows concrete values for all of the KPIs discussed above. These numbers give you a sense of what real load test results could look like, so you can start to benchmark your system performance against that of your counterparts in other companies.

KPI                                                       Value
Number of concurrent users with 10 seconds think time     1440 users
CPU utilization                                           98%
Average response time                                     1.940 s
CPU time per user interaction step                        0.033 s
Framework space                                           208 MB
User session space                                        0.735 MB
Processing space                                          3.250 MB
GC interval                                               1.876 s
GC duration                                               0.200 s
Figure 3
Sample KPI values from the EP-ESS benchmark

These values are the result of the above-named certified EP-ESS benchmark with SAP NetWeaver Portal 6.0, SP13, running on a central server with 1230 SAPS per CPU core.


Teams often use custom load tests in large, mission-critical customer projects to ensure that the installed system landscape is scalable and will perform optimally. These tests can be quite complex, especially in a Java environment, because of garbage collection and other Java-related factors, but they don't have to be.

By testing individual system components and employing suitable CPU and memory KPIs — not just response time — you can significantly reduce the complexity of your load tests, better understand Java-specific performance issues, and optimize overall system performance. And with SAP Standard Application Benchmarks now available for Java-based applications, SAP developers, partners, and customers have a comparative reference for their performance projects.

For more information on performance and sizing, visit And for more on the Java-based SAP NetWeaver Portal benchmarks, visit

1 SAP hardware partners execute SAP Standard Application Benchmarks to demonstrate the performance of their hardware platforms and enable customers to compare different hardware vendors. They are also used in SAP internal performance, scalability, and sizing tests. As well-defined test cases of real-world applications, benchmarks stress system components in a realistic way.

2 Heap is the memory space a Java Virtual Machine uses to create and store objects. There are a variety of Java VM parameters to control the size, internal structure, and behavior of the heap. Automatic garbage collection enables the Java runtime to take over the responsibility of memory management. The Java programmer only decides when to create objects. These objects can then be used through references. Once an object has no more references, it will be automatically detected and removed by the garbage collector.

3 I recommend looking at the heap usage after several full GCs so you see only the live objects, not the garbage. This also ensures a high reproducibility of results.

4 To understand the impact of garbage collection on the average response time of load tests and related Java programming guidelines, see "Taking Out the Trash: Avoid Performance Bottlenecks from Java Garbage Collection" by Susanne Janssen and Rudolf Meier in the July-September 2005 issue of SAP Insider.
