by Laurence Vanhelsuwé

Profiling the profilers

Aug 22, 2003 · 20 mins

A comparative review of three commercial Java code profilers

Modern software is such an unwieldy multidimensional beast that no single development tool can ever hope to give programmers the complete picture of their creations. Even performance—that elusive metric we all love to hate when our code struggles with second gear, and love, when our code travels at O(1) speed—consists of many devilishly interlocked facets.

Performance’s proverbial tip of the iceberg consists of the subjective, user-tangible perception of program speed and responsiveness. If we temporarily eliminate the user from the equation, then we can equate performance with the sum effect of objective performance facets such as algorithm choice, overall memory usage, object allocation and de-allocation dynamics, and multithreading design and runtime behavior. Helping you to understand your program’s dynamic behavior in these select dimensions is the burdensome job of code profilers.

In this article, I look at three commercial Java profilers and determine which ones come close to satisfying your, and my, needs:

  • Borland’s Optimizeit Suite
  • Quest Software’s JProbe Suite
  • ej-technologies’ JProfiler

Profiler basics

Not surprisingly, all three products have a lot in common. All modern profilers begin from an identical starting point and constraint: the Java Virtual Machine Profiler Interface (JVMPI) (see sidebar, “The Java Virtual Machine Profiler Interface”). This Sun Microsystems API lets tool vendors interface or connect with a JVMPI-compliant JVM, and monitor the workings and key events of a JVM running any Java program, from standalone application to applet, servlet, and Enterprise JavaBeans (EJB) component.

Given that JVMPI imposes a standardized, level playing field for all profiler tool vendors, it is no surprise that the main differentiating factors vendors compete on boil down to the tools’ metafeatures (i.e., features that add significant value to raw JVMPI data and functionality) and, even more important, the graphical user interface (GUI) fronting those features.

As you’ll see in the rest of this product review, the products’ make-or-break GUIs each take an individual approach to the core problem of how to exploit the raw JVMPI features to maximize analytic and debugging productivity. Unfortunately, as with so many applications tasked with visualizing large datasets in an intuitive and truly user-friendly way, not every profiler convinced me that its makers succeeded in that respect.

The three reviewed profilers have almost identical profiling session configuration capabilities, so I briefly mention them here and move on to the comparison:

  • JVM selection
  • To-be-profiled program selection
  • CLASSPATH and source path selection

To start any profiling session, all three products let you select the JVM on which you normally run your application. Once you select a JVM, you must specify your program’s main class, or executable jar file, and what arguments, if any, your program expects. Finally, setting the CLASSPATH for a profiling session also typically lets you point the tool at your source code hierarchy. Figure 1 shows a typical session configuration dialog.

Launching a program in a profiler implies the generation, capture, and visualization of overwhelming volumes of data, so all profilers include diverse approaches to control this data flood by filtering on various criteria, typically on a per-package basis. This is done using flexible regular expression-style patterns like java.util.* or even jav*.
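To make the filtering idea concrete, here is a minimal sketch of a package filter. It assumes a simple glob syntax in which * matches any run of characters; the PackageFilter class is hypothetical and does not reproduce the exact pattern syntax of any of the three products.

```java
import java.util.regex.Pattern;

/** Hypothetical glob-style class-name filter; not any vendor's actual syntax. */
final class PackageFilter {
    private final Pattern pattern;

    PackageFilter(String glob) {
        // Quote each literal piece, and let every '*' match any run of characters.
        String[] parts = glob.split("\\*", -1);
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                regex.append(".*");
            }
            regex.append(Pattern.quote(parts[i]));
        }
        this.pattern = Pattern.compile(regex.toString());
    }

    /** True if the fully qualified class name falls under this filter. */
    boolean matches(String className) {
        return pattern.matcher(className).matches();
    }

    public static void main(String[] args) {
        PackageFilter utilOnly = new PackageFilter("java.util.*");
        System.out.println(utilOnly.matches("java.util.HashMap"));
        System.out.println(utilOnly.matches("java.lang.String"));
    }
}
```

Under this assumed syntax, a pattern as short as jav* covers both the java and javax package trees, which is why such broad patterns are handy for excluding library code in one stroke.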

A quick product comparison

Before I explore profiler-specific features (and anti-features), Table 1 shows an attribute matrix summarizing each offering’s key points:

Table 1. Attribute matrix

| Attribute | Optimizeit Suite | JProbe Suite | JProfiler |
|---|---|---|---|
| Version | 5.0 | 5.0 | 2.2.1 |
| Price | ,599 | ,000¹ | 99 |
| Free evaluation | Yes | Yes | Yes |
| Online (built-in) help | Yes | Yes (JavaHelp) | Yes (JavaHelp)² |
| Is help context-sensitive? | Yes | Yes | Yes |
| Built-in tutorials | Yes | Yes | No³ |
| Paper documentation | No | Yes | No |
| Number of tool modules | 3 (Profiler, Thread Debugger, Code Coverage) | 4 (Profiler, Coverage, Memory Debugger, Threadalyzer) | 0 (all-in-one) |
| Tool modules sold separately? | No | Yes | No |
| CPU profiler | Yes (not real time) | Yes (not real time) | Yes (real time) |
| Object/heap profiler | Yes | Yes | Yes |
| Thread profiler | Yes | Yes | Yes |
| Deadlock detection | Automated and visual | Automated | Manual |
| Race condition detection | No | Yes | No |
| Code coverage | Yes | Yes | No |
| Multi-JVM support | Yes | Yes | Yes |
| Drill-down to source | Yes | Yes | Yes |
| Drill-down to bytecode | No | Yes | Yes |
| Remote profiling* | Yes | Yes | Yes |
| Automated profiling** | Yes | Yes | Yes |
| IDE integration | Yes | Yes | Yes |
| Report generation | Yes | Yes | Yes |
| Host platform licensing policy | Multiplatform and single-platform licenses | Single platform | Multiplatform |
| Website | www.borland.com/optimizeit | www.jprobe.com | www.jprofiler.com |
| Ease of use | 7/10 | 4/10 | 8/10 |

Explanatory table notes:

* Remote profiling: The ability to profile a Java program executing on a machine other than your development machine

** Automated profiling: The ability to perform unattended overnight profiling sessions; in other words, command-line-driven operation with no GUI

¹ JProbe Suite price includes one year of Gold Support (technical support)

² ej-technologies’ JProfiler Online Help contains almost no screenshots of views or dialogs

³ ej-technologies’ lack of explicit tutorials is partly compensated by some demo sessions

Test platform

I was pleasantly surprised by the profilers’ broad support for diverse platforms, both from a host operating system (OS) and Java implementation point of view. In fact, most profilers support every commercially relevant host and/or JVM implementation (because the number of permutations is large, see the vendors’ product Websites for precise details). One exception is that neither Borland’s nor Quest Software’s profiler suite supports Windows 98. This is, I was told, because Windows 98 isn’t a “serious” OS when it comes to method timing accuracy. (Windows 98 apparently only offers 50-ms tick accuracy via its public API, and sure enough, many methods will fall through such a coarse timer’s net.) So, I tested all three contenders on a standalone PC built around a 900-MHz Athlon CPU, 256 MB RAM, running Windows XP (Service Pack 1).
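The timer granularity point is easy to check for yourself. The sketch below is my own illustration, not something any of the reviewed tools ship: it spins on System.currentTimeMillis() and reports the smallest step it observes between distinct readings. The TimerGranularity class name is hypothetical.

```java
/** Hypothetical probe for the visible tick size of System.currentTimeMillis(). */
final class TimerGranularity {

    /** Smallest forward step seen between successive distinct clock readings. */
    static long observedTickMillis(int samples) {
        long smallest = Long.MAX_VALUE;
        long previous = System.currentTimeMillis();
        for (int i = 0; i < samples; i++) {
            long now;
            // Spin until the clock visibly changes.
            do {
                now = System.currentTimeMillis();
            } while (now == previous);
            // Ignore backward jumps caused by clock adjustments.
            if (now > previous) {
                smallest = Math.min(smallest, now - previous);
            }
            previous = now;
        }
        return smallest;
    }

    public static void main(String[] args) {
        System.out.println("Observed tick: " + observedTickMillis(20) + " ms");
    }
}
```

Any method that completes in less time than the reported tick will often appear to take zero milliseconds when timed with this clock, which is the "falling through the net" effect described above.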

To keep this review manageable, I restricted myself to testing standalone Java 2 Platform, Standard Edition (J2SE) applications. I didn’t test Java 2 Platform, Enterprise Edition (J2EE) applications, although all three vendors try their best to sell into J2EE markets by including product features that explicitly support servlet profiling or EJB components running on various application servers.

You must have intimate knowledge of an application’s architecture and implementation to expect to gain new insights from using a profiler, so I relied mainly on two of my own real-life applications as profiling guinea pigs (see Table 2 below).

Table 2. Profiled applications

| Program name | World-on-a-Disc | Slave |
|---|---|---|
| Description | Map-based multimedia engine for CD/DVD-ROM. See www.worldonadisc.com | Generic pluggable file and directory processor. See www.lv2.clara.co.uk/slave.html |
| Performance “Achilles’ heel” | CPU-bound | I/O-bound |
| Number of classes | 35 | 60+ |

All three profilers come with small demonstration applications. I definitely found it necessary to familiarize myself with each profiler by playing with these demos before letting the tools loose on my own programs.

Borland Optimizeit Suite

Borland’s Optimizeit Suite is the most mature, full-featured profiler package reviewed here. At ,599 a seat, Borland is clearly not trying to grab the lone developer market. The suite consists of three loosely coupled components: Optimizeit Profiler, Optimizeit Thread Debugger, and Optimizeit Code Coverage.

Borland Optimizeit Suite’s core features

Borland’s Optimizeit Profiler is the combined tool façade for CPU and heap/object profiling. Figure 2 shows a typical GUI screenshot.

Figure 2’s class instances view tabulates the distribution of live objects, grouped by class and sorted by number of objects. If you’ve never used a heap profiler before, be prepared for an epiphany: Not only will the underlying reality of your program viewed through this lens completely disorient you, but once you come to terms with it, you’ll never view your program’s source code the same way again.

Borland’s Profiler lets you click on any class and view precisely where each instance of that class has been allocated. Not only does Profiler tell you in which methods the allocations occurred (Figure 3), but if you double-click any method name, a source code viewer pops up with the allocating statement line highlighted.

This feature is invaluable when you suspect your program suffers from object allocation hot spots (i.e., parts of the program that allocate too many objects), or want to track down the source of live objects you think shouldn’t even exist in your program (e.g., Abstract Window Toolkit (AWT) color objects in an XML parser or Swing objects cluttering up a command-line-only utility!).

The Memory Leak Detector lets you compare two snapshots of your program’s heap, and thus hopefully uncover loitering objects that anchor potentially crippling numbers of referenced objects. Note that, despite this feature’s name, detecting memory leaks remains hard, manual work, although this feature vastly reduces the size of the needle’s haystack.
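The snapshot comparison idea can be sketched in a few lines. This is only an illustration of the principle, assuming each snapshot has already been reduced to a map from class name to live-instance count; the SnapshotDiff class and the class names in the example are hypothetical, and Optimizeit's own comparison works on far richer data.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical comparison of two heap snapshots reduced to per-class instance counts. */
final class SnapshotDiff {

    /** Classes whose live-instance count grew between snapshots, largest growth first. */
    static List<Map.Entry<String, Integer>> growth(Map<String, Integer> before,
                                                   Map<String, Integer> after) {
        List<Map.Entry<String, Integer>> grown = new ArrayList<>();
        for (Map.Entry<String, Integer> e : after.entrySet()) {
            int delta = e.getValue() - before.getOrDefault(e.getKey(), 0);
            if (delta > 0) {
                grown.add(new AbstractMap.SimpleEntry<>(e.getKey(), delta));
            }
        }
        grown.sort((a, b) -> Integer.compare(b.getValue(), a.getValue()));
        return grown;
    }

    public static void main(String[] args) {
        Map<String, Integer> before = new HashMap<>();
        before.put("example.Tile", 100);
        before.put("example.Cache", 1);
        Map<String, Integer> after = new HashMap<>();
        after.put("example.Tile", 450);
        after.put("example.Cache", 1);
        after.put("example.Listener", 20);
        System.out.println(growth(before, after));
    }
}
```

A class that keeps growing across repeated runs of the same use case is a leak suspect, not a proven leak; deciding which reference keeps those instances alive is the manual part.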

Profiler’s Instance Display also reveals memory leaks. This tool facet lets you drill down to the dizzying microscopic level of object graphs. You can analyze incoming and outgoing references for any live object in your application. I have one problem with this feature: Borland doesn’t use conventional graph visualization for this part of the tool; instead, the tool relies on a mixture of tables and trees, as Figure 4 shows.

The Thread Debugger lives in the second module of Borland’s suite. Thread Debugger consists of a real-time display of all live threads and their states (running, waiting, blocking, and blocked in native input/output (I/O)), plus several other views designed to help you analyze deadlocks and resource bottlenecks. In the real-time view, each thread row tells you how many object locks—called “monitors” by the program—the thread owns and how long that thread has been blocked. Figure 5 illustrates a typical thread debugging session.

If you’ve ever spent a few days debugging a threading problem, then you’ll agree that the following features make Thread Debugger alone worth its weight in gold: explicit deadlock detection and visualization (Figure 6), plus analysis of thread contention, waiting threads, and excessive nested locking.

Figure 6. Thread Debugger contains logic to detect and visualize deadlocks.

The third module in Borland’s trinity, Code Coverage, does a reasonably good job showing you the parts of your program all your system’s threads visit. The overview lists classes with percentage values, and clicking any class pops up a source code view with executed lines highlighted in yellow.

Borland Optimizeit Suite’s rough edges

Borland’s tool trio architecture isn’t so much an architecture as it is a feeble elastic band holding three completely different programs together. This means if you’re working with the Profiler and then want to switch to some threading analysis using the Thread Debugger, you must switch programs and start a brand new session. That’s not what I call suite integration.

Thread Debugger could really use a filtering feature to let you tune the amount of data you must wade through. Currently, you have to view all live threads (stripped of any thread group information), which is like watching a conveyor belt of rainbow color-coded strips scroll past: pretty, but not a terribly easy way to separate the signal from the noise. A hierarchical viewing feature that reflects Java’s thread group hierarchy also wouldn’t go amiss. In addition, time period formatting could improve: instead of printing Thread-2 waited 23507561 ms, printing Thread-2 waited 6 h:31 m:47.561 s would be more programmer-friendly.
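The friendlier format suggested above takes only a few lines of arithmetic. The DurationFormat helper below is a hypothetical sketch of my own, not part of Optimizeit.

```java
import java.util.Locale;

/** Hypothetical helper that renders a millisecond count as hours, minutes, and seconds. */
final class DurationFormat {

    static String format(long millis) {
        long hours = millis / 3_600_000;
        long minutes = (millis / 60_000) % 60;
        long seconds = (millis / 1_000) % 60;
        long remainder = millis % 1_000;
        return String.format(Locale.ROOT, "%d h:%02d m:%02d.%03d s",
                hours, minutes, seconds, remainder);
    }

    public static void main(String[] args) {
        // prints "Thread-2 waited 6 h:31 m:47.561 s"
        System.out.println("Thread-2 waited " + format(23507561L));
    }
}
```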

Wherever Optimizeit shows object sizes in bytes, it uses a format like 124b. To an old hacker like me who has pored over a few too many hex dumps, this kept triggering hexadecimal “B” déjà vu. My brain would have vastly preferred the same information formatted as 124 bytes.

Code Coverage’s source code display is a lot less useful than it first appears: The visualization approach highlights executed code lines in yellow and leaves all others white (i.e., not highlighted). While this approach sounds like common sense, the feature would be more useful if it also considered any white space (including comments) following any executed code as similarly executed. The result would be far easier on the eyes, only showing gaps to indicate code that a thread hasn’t touched.

Quest Software’s JProbe Suite

Quest Software’s JProbe Suite is similar to Borland’s Optimizeit Suite in terms of maturity and pricing. It also approaches the profiling problem by offering separate tool modules, in this case, four: JProbe Profiler, JProbe Coverage, JProbe Memory Debugger, and JProbe Threadalyzer.

Unlike the two other profiler publishers, Quest Software separately licenses individual suite components; JProbe Profiler on its own, for example, costs 49 (including one year of technical support).

JProbe Suite’s core features

JProbe’s approach to profiling centers on the concept of JVM performance data snapshots. You can save snapshots for future reloading and continued analysis and compare two snapshots to explore differences. Figure 7 shows a typical JProbe Profiler window with two snapshots available for analysis.

At a stroke, this snapshot-centric approach means that most JProbe analysis is done offline, as opposed to real time. This approach makes sense for heap analysis, but it seems an unduly rigid approach for CPU performance profiling.

For the all-important task of CPU profiling, JProbe offers either a tabular view (Figure 8) or a graphical view (Figure 9); it doesn’t provide an intuitive Swing JTree-type view that lets you drill down along call stack branches.

JProbe’s second module, Memory Debugger, lets you navigate a snapshot of the heap in numerous ways. It also contains a simple, but effective, use case analysis feature that lets you reset collected data before starting a use case. Memory Debugger also lets you define any number of instance count thresholds (Quest Software somewhat confusingly calls these “asserts”) that will alert you at the end of a use case if any class has spawned more objects than you expect. Figure 10 displays JProbe’s heap summary view.

Memory Debugger’s Instance Detail view (Figure 11) lets you analyze which objects refer to a focus object, and in turn to which objects the focus object refers.

Figure 11. A typical instance detail view shows part of the referrer tree for a single PictureList object of my World-on-a-Disc test application.

The reference graph view (Figure 12) lets you analyze incoming and outgoing references in glorious, but often overwhelming, graphical detail.

And if untangling spaghetti sounds too 1970s retro for you, the Memory Leak Doctor will display object reference chains that potentially cause memory leaks (Figure 13). The Leak Doctor lets you perform “What if I could remove this reference?” experiments on a heap snapshot, and see whether such operations solve your memory leak problem.

Figure 13. Memory Leak Doctor suggests leak problems rather than diagnoses them.

Another useful Memory Debugger feature is the Garbage Monitor; this tracks allocation hot spots for short-lived objects like the behind-the-scenes StringBuffer objects used by language-level string concatenation (as opposed to API-level concatenation).
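For readers who haven't met these hidden buffers, the sketch below contrasts the two styles of concatenation. The ConcatDemo class is hypothetical. How the compiler translates the + operator varies by Java version, so treat the comments as a description of the compilers of this review's era, not of every JDK.

```java
/** Hypothetical contrast between language-level and API-level string concatenation. */
final class ConcatDemo {

    /**
     * Language-level concatenation: every pass through the loop creates a new
     * String, and compilers of this era also created a hidden StringBuffer each time.
     */
    static String joinWithPlus(String[] words) {
        String result = "";
        for (String word : words) {
            result += word + " ";
        }
        return result.trim();
    }

    /** API-level concatenation: one explicit buffer reused for the whole loop. */
    static String joinWithBuffer(String[] words) {
        StringBuffer buffer = new StringBuffer();
        for (String word : words) {
            buffer.append(word).append(' ');
        }
        return buffer.toString().trim();
    }

    public static void main(String[] args) {
        String[] words = {"short", "lived", "objects"};
        System.out.println(joinWithPlus(words));
        System.out.println(joinWithBuffer(words));
    }
}
```

Both methods return the same text; the difference only shows up in a tool like the Garbage Monitor, where the first version appears as an allocation hot spot inside the loop.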

JProbe’s Threadalyzer is JProbe Suite’s thread analysis module and boasts many useful features to automatically detect problems like:

  • Deadlock (actual and potential)
  • Stalled threads
  • Data races

However, the GUIs fronting these otherwise attractive features are far from intuitive.
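To illustrate what a "potential" deadlock in the list above means, here is a sketch of one common way to reason about it: record which lock was acquired while which other lock was held, then look for a cycle in that ordering. I am not claiming this is how Threadalyzer works internally; the LockOrderGraph class and the lock names are hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical lock-ordering graph; a cycle signals a potential deadlock. */
final class LockOrderGraph {
    private final Map<String, Set<String>> edges = new HashMap<>();

    /** Records that some thread acquired one lock while already holding another. */
    void record(String held, String acquired) {
        edges.computeIfAbsent(held, k -> new HashSet<>()).add(acquired);
    }

    /** True if two or more code paths take the same locks in conflicting orders. */
    boolean hasCycle() {
        Set<String> done = new HashSet<>();
        Set<String> onPath = new HashSet<>();
        for (String lock : edges.keySet()) {
            if (visit(lock, done, onPath)) {
                return true;
            }
        }
        return false;
    }

    // Depth-first search; meeting a lock already on the current path closes a cycle.
    private boolean visit(String lock, Set<String> done, Set<String> onPath) {
        if (onPath.contains(lock)) {
            return true;
        }
        if (!done.add(lock)) {
            return false;
        }
        onPath.add(lock);
        for (String next : edges.getOrDefault(lock, Collections.emptySet())) {
            if (visit(next, done, onPath)) {
                return true;
            }
        }
        onPath.remove(lock);
        return false;
    }

    public static void main(String[] args) {
        LockOrderGraph graph = new LockOrderGraph();
        graph.record("cacheLock", "diskLock"); // thread 1: cache, then disk
        graph.record("diskLock", "cacheLock"); // thread 2: disk, then cache
        System.out.println("Potential deadlock: " + graph.hasCycle());
    }
}
```

The two threads in the example may never actually collide during a test run, which is exactly why a tool that flags the conflicting order is more useful than one that only reports deadlocks after they happen.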

JProbe Suite’s ring-bound (paper) documentation deserves mention for being well written and designed (in terms of layout, screenshot use, and icons). It also goes the extra mile by giving readers many valuable performance analysis techniques and tips along with some fundamental advice about how to integrate profiling into your software development methodology.

JProbe Suite’s rough edges

Despite JProbe’s high standing in the tools market, it still feels disappointingly raw and unpolished to me. It looks and feels like a piece of software that has grown organically over the years without benefiting from the occasional from-scratch redesign such growth demands. Consequently, JProbe Suite 5.0 groans under the stresses imposed by classic “featuritis.” I found few aspects of JProbe truly intuitive or elegantly powerful. Learning to use JProbe was really hard work. The Memory Debugger was especially confusing with unintuitive GUIs that popped up left, right, and center, causing me to feel thoroughly lost on too many occasions.

Other JProbe problems, in no particular order, include:

  • Interpreting and interacting with the graph displays (object graph and method timing) is difficult.
  • Objects are identified by an obfuscating hex number instead of their more enlightening toString() representation.
  • Toolbar icons were minuscule on my 20-inch, 1280 x 1024 resolution screen.
  • The method timings table restricted its information to show alphanumeric data only, instead of combining text, graphics, and colors.
  • It uses three different looks and feels for the tables in the following modules: Code Coverage, Garbage Monitor, and Profiler.
  • It appears that standard Swing components were not used for the tables, resulting in a nonstandard look and feel.
  • Too many different windows/views exhibit little, if any, overall consistency or master design.
  • The online help sometimes rendered corrupted text (probably a bug in JavaHelp, which JProbe uses).
  • Countless types of windows are connected together via a maze of button clicks without a back/forward mechanism.
  • Printing reports is a hack: a PDF file is generated and sent to Acrobat Reader (which you need to have installed). I have Acrobat 4.0 on my test machine, but Acrobat refused to cooperate with this scheme and displayed: “There was an error opening this document. The file does not exist.”
  • In the Code Coverage module, the Generate Report dialog opened taller than my 1024-line desktop (OK and Cancel buttons were completely off the screen!).

ej-technologies’ JProfiler

If you haven’t heard of ej-technologies, it’s not surprising: As a German company founded in 2001, ej-technologies is a new kid on the Java tools market block. ej-technologies launched JProfiler 1.0 in February 2002 and version 2.2.1 (reviewed here) in April 2003. (ej-technologies is also the company behind the open source jclasslib classfile API and class viewer. JProfiler uses the viewer.) The timing of ej-technologies’ birth and flagship product launch means JProfiler’s design has benefited from a significant amount of hindsight: its GUI is much easier to navigate than, for example, Quest Software’s JProbe Suite.

JProfiler’s core features

JProfiler differs significantly from the other profilers because it takes a unified tool approach to exploring different profiling dimensions: the product consists of one tool that simply uses four different view sets to let you analyze your program from different angles without restarting sessions or switching to different programs. This approach’s refreshing simplicity and symmetry are far more sensible than a fragmented collection of tools blighted with mutually inconsistent user interfaces and arbitrarily determined functionality.

The four JProfiler view sets are:

  1. Memory view (heap and object graph analysis)
  2. CPU view (method timing)
  3. Threads view
  4. VM telemetry view

These view sets are subdivided into subviews that form the heart of the product. Figure 14 shows JProfiler with its pivotal Views drop-down menu.

While the Views menu clearly reflects the entire product’s architecture, quickly switching between views simply requires selecting one of the view sets in the wide vertical toolbar on the left, followed by optionally selecting a specific subview type from the tabs along the bottom of the window.

Method timing is available as a method invocation tree showing percentage of time consumed and absolute time consumed. Figure 15 shows a view of such a tree.
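The data behind such a tree is simple. The sketch below models it with a hypothetical CallNode class, assuming each node stores the total time spent in a method including its callees; the method names and timings are invented for illustration and are not JProfiler output.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical node of a method invocation tree with inclusive timings. */
final class CallNode {
    final String method;
    final long inclusiveMillis;
    final List<CallNode> children = new ArrayList<>();

    CallNode(String method, long inclusiveMillis) {
        this.method = method;
        this.inclusiveMillis = inclusiveMillis;
    }

    /** Adds a callee and returns this node so calls can be chained. */
    CallNode add(CallNode child) {
        children.add(child);
        return this;
    }

    /** Time spent in this method itself, excluding its callees. */
    long selfMillis() {
        long self = inclusiveMillis;
        for (CallNode child : children) {
            self -= child.inclusiveMillis;
        }
        return self;
    }

    /** One line per call: share of the root's time, absolute time, method name. */
    String render() {
        StringBuilder out = new StringBuilder();
        render(out, inclusiveMillis, 0);
        return out.toString();
    }

    private void render(StringBuilder out, long rootMillis, int depth) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        double percent = rootMillis == 0 ? 0.0 : 100.0 * inclusiveMillis / rootMillis;
        out.append(String.format(Locale.ROOT, "%.1f%% %d ms %s\n",
                percent, inclusiveMillis, method));
        for (CallNode child : children) {
            child.render(out, rootMillis, depth + 1);
        }
    }

    public static void main(String[] args) {
        CallNode root = new CallNode("Main.run", 200)
                .add(new CallNode("Map.paint", 150))
                .add(new CallNode("Disc.load", 50));
        System.out.print(root.render());
    }
}
```

Drilling down a branch of the tree amounts to following the child with the largest share at each level until self time, not callee time, dominates.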

Figure 15. JProfiler’s invocation tree view (one of three CPU views).

JProfiler lets you drill down to source or bytecode views of cycle-squandering methods via a left-click context menu associated with every line of the invocation tree view. Method-profiling information can be filtered by individual thread or by thread group. JProfiler also has a subview listing JVM-identified code hot spots that have been compiled to native code.

Several memory views support heap analysis. Figure 16 shows the class monitor subview similar to Borland’s equivalent.

JProfiler’s Heap Walker module (Figure 17) is the only aspect of the product that tarnishes the overall achievement of packing a collection of extremely technical features into one easy-to-navigate GUI. Heap Walker’s six sub-subviews feel more like a program within a program and demand far more concentration to effectively use than any other parts of the product.

The threads view set comprises five subviews focusing on past and current thread states (runnable, waiting, and blocked), past and current monitor usage, and monitor statistics. These views are little more than visualized raw data. None of the views feature automated intelligence to help you identify race conditions, deadlocks, or other threading nightmares. Figure 18 shows a typical threads view.

The VM telemetry view set comprises five different real-time scrolling graphs showing used and free heap space, number of objects (helpfully categorized into arrays and nonarrays), number of loaded classes, garbage collector activity, and number of threads. Figure 19 shows this view set.
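Several of these telemetry figures can be sampled from inside a running program. The sketch below uses Runtime and the java.lang.management API, which arrived in Java releases later than the ones reviewed here; it is not how JProfiler gathers its data, since the profilers in this review work through JVMPI. The TelemetrySample class is hypothetical.

```java
import java.lang.management.ManagementFactory;

/** Hypothetical one-shot sample of a few JVM telemetry values. */
final class TelemetrySample {
    final long usedHeapBytes;
    final long freeHeapBytes;
    final int loadedClasses;
    final int liveThreads;

    private TelemetrySample(long used, long free, int classes, int threads) {
        this.usedHeapBytes = used;
        this.freeHeapBytes = free;
        this.loadedClasses = classes;
        this.liveThreads = threads;
    }

    /** Reads current heap usage, loaded class count, and live thread count. */
    static TelemetrySample take() {
        Runtime runtime = Runtime.getRuntime();
        long free = runtime.freeMemory();
        long used = runtime.totalMemory() - free;
        int classes = ManagementFactory.getClassLoadingMXBean().getLoadedClassCount();
        int threads = ManagementFactory.getThreadMXBean().getThreadCount();
        return new TelemetrySample(used, free, classes, threads);
    }

    public static void main(String[] args) {
        TelemetrySample sample = take();
        System.out.println("used heap bytes: " + sample.usedHeapBytes);
        System.out.println("free heap bytes: " + sample.freeHeapBytes);
        System.out.println("loaded classes:  " + sample.loadedClasses);
        System.out.println("live threads:    " + sample.liveThreads);
    }
}
```

Taking such a sample at a fixed interval and plotting the values over time gives a crude version of the scrolling graphs described above.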

JProfiler’s rough edges

A simple feature missing from JProfiler’s CPU profiler is a Reset or Clear Data command. For example, when you profile a Swing application, there is a fundamental difference between the performance data collected during application loading and initialization and data collected when activating one of the application’s features. Once my test applications loaded, I wanted to clean the slate to focus on particular use cases. Currently, JProfiler doesn’t support this fundamental mode of analysis.

JProfiler’s online help was much too terse, lacking examples, detailed tutorials, and even GUI screenshots. Although JProfiler is the most intuitive profiler of the three reviewed here, the lack of sufficient online help made its Heap Walker module even more difficult to comprehend.

Closing remarks

Although it is humanly possible to write stable Java applications without X-raying your program with a profiler, you know deep down that cutting this corner can land you in big trouble. Any application that must run uninterrupted for days or weeks (or even months!) on end must be certified stable after you’ve thoroughly validated your system’s detailed internal runtime behavior.

All three profilers coped with the two real-life applications I threw at them. On my 900-MHz machine, profiling slowed the applications down, but not by so much that it made the task of analyzing or debugging applications an ordeal.

Of the three profilers, I liked Borland’s Optimizeit Suite and ej-technologies’ JProfiler, although there’s potential for future improvement. The aggressive price tag for ej-technologies’ JProfiler will appeal to smaller cash-strapped companies, while larger companies will no doubt favor Borland’s Optimizeit Suite for its richer feature set. If you fancy Optimizeit for your team, note that Borland’s JBuilder 8.0 and 9.0 (Enterprise Editions) already contain Optimizeit Suite 5.0 integrated into the JBuilder IDE.

Finally, I have to be frank and state that Quest Software’s JProbe Suite caused me too much head scratching to recommend it. Profiling a real-life Java system is a highly skilled, technical, and time-consuming job, and the last thing you need is your profiler making your life harder by adding complexity instead of simplicity. The potential of JProbe Suite’s many features is consistently undermined by unacceptably poor GUI design, making the job of analyzing your code unbearably frustrating.

Laurence Vanhelsuwé is a senior software engineer living in Scotland. With more than 20 years of programming experience—seven exclusively on Java—Laurence currently juggles his time between software development, authoring Java-related articles and books, technical editing, and Java training.