Java Virtual Machine Tuning under JVM 1.4.2

Here’s an article I wrote about tuning Sun Java JRE 1.4.2 some time ago. I’m only posting it now to save it from loss when I leave CBC.ca.

This page is intended to document some proposals and empirical data gathered while attempting to tune the JVM used for running web applications on CBC.ca’s Java servers.

Topics to be covered:

  • Impact of using different garbage collectors
  • Impact of tuning garbage collectors
  • Maximum and minimum heap size settings
  • [potentially] Impact of using JVMs other than the Sun JVM, or alternatives such as compiling Java code into native OS code using gcj, among others.

Overview of Memory Allocation in the JVM

  • eden: where objects are created in the JVM
  • generational garbage collection: the division of the memory space into segments (generations) sorted by object lifetime, and the use of different garbage collection algorithms for each generation
  • minor collection: collection of short-lived objects (those residing in eden or survivor spaces)
  • major collection: collection of long-lived objects (those residing in the tenured generation)
  • permanent generation: a separate region, alongside the young and tenured generations, where the reflective data of the JVM is held (class and method objects)
  • survivor spaces: the two spaces that receive objects which survive a collection of eden, in preparation for promotion into the tenured generation
  • tenured generation: where objects whose lifespan is relatively long are held
  • young generation: eden plus survivor spaces

This diagram (taken from the Hotspot GC tuning document) illustrates the relationships of the various memory segments:

Objects are created in eden. If they survive eden, they move into one of the two survivor spaces (only one of which is in use at a time — it’s a double-buffering scenario) in preparation for a move into the tenured generation. Once moved into the tenured generation they stay there until they die and are garbage collected.
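To see these segments on a running JVM, the memory pools can be listed programmatically. The following is a minimal sketch under one important assumption: it uses the java.lang.management API, which only exists in Java 5 and later, so it does not run on the 1.4.2 JVM discussed here. The class name GenerationPools is hypothetical, and the pool names printed depend on the JVM version and the collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; requires Java 5 or later (not available on 1.4.2).
public class GenerationPools {

    /** Returns one line per memory pool: name, type, used and committed bytes. */
    public static List<String> describePools() {
        List<String> lines = new ArrayList<String>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage(); // null if the pool is no longer valid
            if (usage == null) {
                continue;
            }
            lines.add(pool.getName() + " [" + pool.getType() + "]"
                    + " used=" + usage.getUsed()
                    + " committed=" + usage.getCommitted());
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describePools()) {
            System.out.println(line);
        }
    }
}
```

On a generational HotSpot configuration the output typically includes separate pools for eden, the survivor space and the old (tenured) generation, matching the layout described above.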

Tuning Garbage Collection

Motivation and Background

The 1.4.2 Sun JVM comes with a choice of several garbage collectors. The default collector is single-threaded and designed to be effective for most small applications. Its parameters are not optimal for many server applications and server machines (those with many gigabytes of RAM and several processors). Since these limitations apply to CBC.ca’s applications and server infrastructure, it makes sense to investigate the possibility of using other garbage collectors and/or tuning the GC parameters.

Summary: Garbage Collectors in the 1.4.2 JVM

As a starting point, let’s summarize the different GCs (aside from the default collector) in the 1.4.2 JVM and the recommended circumstances under which they should be used. These are all taken from Sun’s Tuning Garbage Collection with the 1.4.2 Java Virtual Machine document (http://java.sun.com/docs/hotspot/gc1.4.2).

The GCs are all “generational” collectors in the sense that each generation has its own garbage collection strategy. The JVM divides the heap into different “generations”, which are discussed later (under the tuning section entitled “Tuning Generation Sizing”).

Throughput Collector

  • Overview: Use a parallelized young generation collector, but use the default tenured generation collector.
  • When to use: Use to improve performance with multiprocessor machines where young generation minor collections are frequent.

Concurrent Low Pause Collector

  • Overview: Use a concurrent tenured generation collector with a default young generation collector. Can be used in conjunction with a parallelized young generation copying collector.
  • When to use: When the application can afford to share processor resources with the garbage collector. Use on applications with a large tenured generation (long-lived data) running on multiprocessor machines.

Incremental Low Pause Collector

  • Overview: Incrementally collect a portion of the tenured generation to reduce the impact of major collections.
  • When to use: Use on applications which can afford to trade throughput for shorter tenured generation pauses.

Based on these descriptions, it makes sense to hypothesize that both the Throughput Collector (TC) and the Concurrent Low-Pause Collector (CLPC) could be used for different CBC.ca applications. TC can be used for web applications where the young generation is large, i.e. object lifetimes are short. One example of such an application is the Weather engine, where each page load causes the quick creation and destruction of a JavaBean. CLPC could be used for applications such as Program Guide, where the tenured (old) generation is liable to be large due to the in-memory caches.

The next step would be to do performance testing to try out these hypotheses.

Hypothesis Verification Methodology

The following tests may be conducted:

  1. Baseline
  2. Test TC: java -XX:+UseParallelGC ...
  3. Test TC with aggressive heap sizing: java -XX:+AggressiveHeap ... Expect performance to be better than #1.
  4. Test CLPC with default young generation collector: java -XX:+UseConcMarkSweepGC ...
  5. Test CLPC with parallelized copying young generation collector: java -XX:+UseConcMarkSweepGC -XX:+UseParNewGC ...
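Alongside the real applications, a small synthetic workload can help check that each collector setting behaves as expected before committing to full load tests. The following is only a sketch: the class AllocationChurn, the Bean stand-in and the sizes used are all hypothetical and are not taken from any CBC.ca application. It creates many short-lived objects (roughly like one bean per page load) and retains a small fraction (roughly like an in-memory cache), and avoids Java 5 language features so that it should also compile for a 1.4 target.

```java
// Hypothetical synthetic workload; sizes and names are illustrative assumptions.
public class AllocationChurn {

    /** Stand-in for a per-request bean; not a real application class. */
    static final class Bean {
        final byte[] payload = new byte[1024];
        final int id;

        Bean(int id) {
            this.id = id;
        }
    }

    /**
     * Allocates one Bean per simulated request. Every retainEvery-th bean is
     * kept in an array for the duration of the run to simulate long-lived
     * cached data; all others become garbage immediately.
     * Returns the number of beans retained.
     */
    public static int run(int requests, int retainEvery) {
        Bean[] cache = new Bean[retainEvery > 0 ? requests / retainEvery + 1 : 0];
        int retained = 0;
        long checksum = 0;
        for (int i = 0; i < requests; i++) {
            Bean b = new Bean(i);
            checksum += b.id + b.payload.length; // touch the object so it is really used
            if (retainEvery > 0 && i % retainEvery == 0) {
                cache[retained++] = b;
            }
        }
        if (checksum < 0) {
            throw new IllegalStateException("checksum overflow");
        }
        return retained;
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        int retained = run(2000000, 1000);
        long elapsed = System.currentTimeMillis() - start;
        System.out.println("retained=" + retained + " elapsedMs=" + elapsed);
    }
}
```

The same class can then be started once per configuration in the list above (for example with -XX:+UseParallelGC, then with -XX:+UseConcMarkSweepGC) and the elapsed times compared. A microbenchmark like this only hints at relative behaviour and is no substitute for testing the actual applications.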

Quantitative Performance Gathering:

  1. JMeter response times

Qualitative Performance Gathering:

  1. Use jvmstat to observe object creation and GC run frequency

Tuning Generation Sizing

It’s also possible to tune the generation sizes if we know that one is liable to be larger than another. For example, expanding the young generation size may increase performance for applications which create a lot of objects that die young, because the frequency of minor collections will decrease. Some of the parameters that can be tuned are:

  • eden-to-survivor space ratio
  • “new” ratio (tenured-to-young ratio)
  • permanent generation size

We probably don’t need to tune these unless we are really seeing performance problems (e.g. too frequent garbage collections due to eden being too small).
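To make the ratios concrete, here is a small worked calculation. It assumes the ratios above correspond to the HotSpot flags -XX:NewRatio (tenured-to-young) and -XX:SurvivorRatio (eden to one survivor space), which are not named in the text above; the class name GenerationSizes and the values 3 and 6 are purely illustrative. The real JVM also rounds sizes for alignment, so treat the results as approximations.

```java
// Illustrative arithmetic only; the JVM's actual sizes are rounded for alignment.
public class GenerationSizes {

    /** Young generation size when tenured:young = newRatio:1. */
    public static long youngSize(long heap, int newRatio) {
        return heap / (newRatio + 1);
    }

    /** Size of ONE survivor space when eden:survivor = survivorRatio:1. */
    public static long survivorSize(long young, int survivorRatio) {
        // The young generation holds eden plus two survivor spaces.
        return young / (survivorRatio + 2);
    }

    /** Eden is whatever remains of the young generation after both survivors. */
    public static long edenSize(long young, int survivorRatio) {
        return young - 2 * survivorSize(young, survivorRatio);
    }

    public static void main(String[] args) {
        long heapMb = 1024;    // young + tenured, excluding the permanent generation
        int newRatio = 3;      // assumed value of -XX:NewRatio
        int survivorRatio = 6; // assumed value of -XX:SurvivorRatio

        long young = youngSize(heapMb, newRatio);
        System.out.println("young=" + young + "m"
                + " eden=" + edenSize(young, survivorRatio) + "m"
                + " survivor(each)=" + survivorSize(young, survivorRatio) + "m"
                + " tenured=" + (heapMb - young) + "m");
    }
}
```

With a 1024 MB heap, a new ratio of 3 and a survivor ratio of 6, this gives a 256 MB young generation made up of a 192 MB eden and two 32 MB survivor spaces, leaving 768 MB for the tenured generation.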

Tuning The Heap

Total Heap Size

The total heap size can be tuned with the -Xms and -Xmx parameters to the JVM, representing minimum and maximum heap size respectively. If ms < mx then not all of the heap will be committed to the JVM upon startup. However, the garbage collector will only attempt to collect as many objects as it needs to keep the committed heap size between ms and mx.

The implication of this is that if ms is set too high, a given application may consume at least ms megabytes of RAM even if most of its committed heap is full of dead objects. Therefore, the current CBC.ca setting of -Xms1024m -Xmx2048m is likely not appropriate.
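The difference between committed and maximum heap can be checked from inside the JVM with java.lang.Runtime, whose totalMemory, freeMemory and maxMemory methods are available in 1.4. The class name HeapReport below is hypothetical, and this is only a rough sketch: the three values are read one after another, so they are a close approximation rather than an atomic snapshot.

```java
// Hypothetical helper built on java.lang.Runtime (methods available since 1.4).
public class HeapReport {

    /** Returns { used, committed, max } in bytes. */
    public static long[] snapshot() {
        Runtime rt = Runtime.getRuntime();
        long committed = rt.totalMemory(); // heap currently committed to the JVM
        long free = rt.freeMemory();       // unused part of the committed heap
        long max = rt.maxMemory();         // upper bound the heap may grow to
        return new long[] { committed - free, committed, max };
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long[] s = snapshot();
        System.out.println("used=" + (s[0] / mb) + "m"
                + " committed=" + (s[1] / mb) + "m"
                + " max=" + (s[2] / mb) + "m");
    }
}
```

Started with a large -Xms, the committed figure should stay at or above that value however little of it is in use, which is the effect described above.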

Heap Size Ratios

The virtual machine will also try to grow or shrink the heap at each collection to keep the proportion of free space to live objects within a specific range, indicated by the -XX:MinHeapFreeRatio and -XX:MaxHeapFreeRatio parameters. By default these are set to 40% and 70% respectively. In other words, if, after a collection, a generation has less than 40% free space, its size will be expanded such that at least 40% of that generation is free. Conversely, if more than 70% is free, the generation will be shrunk so that only 70% of it is free. In both cases the result stays within the bounds set by the minimum and maximum heap size.
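The resizing rule can be worked through as arithmetic. The following is a simplified model, not the JVM's actual implementation: the class HeapFreeRatio and its parameters are hypothetical, and the real collector applies further constraints such as alignment and growth increments.

```java
// Simplified model of the free-ratio rule; not the JVM's real resizing code.
public class HeapFreeRatio {

    /**
     * Returns the generation size after a collection, given the live data,
     * the current capacity, the min/max free percentages, and the lower and
     * upper size bounds (floor and ceiling).
     */
    public static long targetSize(long live, long capacity,
                                  int minFreePct, int maxFreePct,
                                  long floor, long ceiling) {
        long free = capacity - live;
        long target = capacity;
        if (free * 100 < (long) minFreePct * capacity) {
            // Too little free space: smallest size with at least minFreePct free.
            long denom = 100 - minFreePct;
            target = (live * 100 + denom - 1) / denom; // round up
        } else if (free * 100 > (long) maxFreePct * capacity) {
            // Too much free space: largest size with at most maxFreePct free.
            target = live * 100 / (100 - maxFreePct);  // round down
        }
        return Math.max(floor, Math.min(ceiling, target));
    }

    public static void main(String[] args) {
        // 600 MB live in an 800 MB generation is only 25% free, so it grows.
        System.out.println(targetSize(600, 800, 40, 70, 0, 2048));
        // 300 MB live in a 2000 MB generation is 85% free, so it shrinks.
        System.out.println(targetSize(300, 2000, 40, 70, 0, 2048));
    }
}
```

With the default 40% and 70% ratios, both calls in main settle on 1000 MB: 600 MB live needs 1000 MB to leave 40% free, and 300 MB live allows at most 1000 MB if no more than 70% may be free.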

Next Steps

The most important step to take right away is to determine the appropriate minimum and maximum heap sizes for CBC.ca applications. I propose the following:

  1. Select a body of applications for test.
  2. Temporarily reserve a machine which is representative of server hardware currently in production, e.g. one of the QA boxes.
  3. Halve the minimum heap size on each test run and empirically observe the memory size (both virtual and resident) of Tomcat under load. This observation can be done with “top” or with the same “ps” flags which are being used to run the Cricket graphs.
  4. Also observe the throughput of the application. Throughput is expected to decrease for smaller values of the minimum heap size because the garbage collector has to do more work on each collection.
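The series of settings for step 3 can be generated rather than typed by hand. This sketch assumes a starting point of 1024 MB and a maximum of 2048 MB, as in the current setting mentioned earlier; the stopping point of 128 MB and the class name XmsSchedule are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that lists the heap flags for each test run.
public class XmsSchedule {

    /** Halves the minimum heap size from startMb down to floorMb (inclusive). */
    public static List<String> flags(int startMb, int floorMb, int maxMb) {
        if (floorMb < 1) {
            throw new IllegalArgumentException("floorMb must be at least 1");
        }
        List<String> out = new ArrayList<String>();
        for (int ms = startMb; ms >= floorMb; ms /= 2) {
            out.add("-Xms" + ms + "m -Xmx" + maxMb + "m");
        }
        return out;
    }

    public static void main(String[] args) {
        for (String f : flags(1024, 128, 2048)) {
            System.out.println(f);
        }
    }
}
```

Each printed line is one test run's heap setting, to be recorded next to the observed memory size and throughput for that run.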