Replies: 10 - Pages: 1  

A Few Weekend Benchmarks

At 8:03 AM on Oct 10, 2005, Michael Urban wrote:

Over the weekend, I did a few benchmarks with Java 5, Mustang, and C, just for fun, and to see where we were at. The benchmarks I used are modified (not by me) versions of the fft, life, and infilife benchmarks that fix several problems with the original ones that were used in at least a few Java vs. C benchmarks. The benchmarks are available in a zip file along with the original broken ones and explanations of what was wrong with them.

The system I ran these tests on is an Intel Celeron M with 480 MB of memory, running Windows XP Home Edition. The C tests were run on Cygwin and compiled with gcc 3.4.4 and -O2 optimization. (Increasing optimization to -O3 on gcc yielded no additional increase in performance.) The Java tests were run on both Java 5 and the latest snapshot of Mustang. In both cases, the JVM was run in server mode. In Mustang, I also tested escape analysis, although it only made a difference in the Life test, and that difference was only about 2.11%.

The following charts show the results of the three tests I ran:





C comes out slightly ahead on the Fast Fourier and Life tests, but not by much. The results of the Infilife test are interesting though. Java clearly blows C away here.

In all cases, Mustang performs somewhat better than Java 5 on average. But there are a couple of interesting exceptions to note when looking at the raw data. Take the following rows from the Life test, for example:
Size     Java 5     Mustang
1000     1043       592
2500     174        96.9

For some reason, Mustang does significantly worse than Java 5 at sizes 1000 and 2500 in the Life test. Interestingly, Java 5 significantly beats C at these higher values, whereas Mustang comes in nearly identical to C (C had results of 610 and 99.6 respectively). Despite this performance drop, though, Mustang still comes out ahead of Java 5 on the Life test because it turns in better results for the other values in the test. Anyone have any thoughts on what the slowdown might be at 1000 and 2500 in Mustang?

Anyway, just thought I would share these results. What are your thoughts and comments?
1 . At 10:27 AM on Oct 10, 2005, Chris Rijk wrote:

Re: A Few Weekend Benchmarks

Sounds like some modified versions of some benchmarks I wrote many years ago.

Infilife does a lot of memory allocation/deallocation - lots of small blocks. GC vs malloc, anyone? (Note: this test can be significantly affected by the GC switches used, though with the automatic tuning/sizing in more recent JVMs this is less of an issue.)

I wouldn't worry too much about the difference in speeds at different sizes for the "life" benchmark. That's been happening since the start and shows the importance of testing performance over a range of settings instead of fixed settings (i.e. just one array size). I'd say it is a result of different optimisation strategies, and the Java version can be affected by whether you run small arrays first or large.
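To make the "range of settings" point concrete, here is a minimal sketch of a harness that warms up on every size before timing any of them, so that the order of sizes matters less. Everything in it is an assumption made for illustration: the class name RangeBench, the stand-in workload sumOfSquares (it is not the real life kernel), and the sizes and run counts.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical harness: times one workload at several array sizes,
// after a warm-up pass so the JIT has a chance to compile the hot loop.
class RangeBench {

    // Keeps results live so the timed loop cannot be optimized away.
    static volatile long sink;

    // Stand-in workload (an assumption, not the real "life" kernel).
    static long sumOfSquares(int[] data) {
        long total = 0;
        for (int i = 0; i < data.length; i++) {
            total += (long) data[i] * data[i];
        }
        return total;
    }

    // Returns the smallest elapsed time in nanoseconds over `runs` runs.
    static long bestNanos(int size, int runs) {
        int[] data = new int[size];
        for (int i = 0; i < size; i++) {
            data[i] = i % 7;
        }
        long best = Long.MAX_VALUE;
        for (int r = 0; r < runs; r++) {
            long start = System.nanoTime();
            sink += sumOfSquares(data);
            best = Math.min(best, System.nanoTime() - start);
        }
        return best;
    }

    // Warms up on every size first, then times every size.
    static Map<Integer, Long> measure(int[] sizes, int warmupRuns, int timedRuns) {
        for (int size : sizes) {
            bestNanos(size, warmupRuns);
        }
        Map<Integer, Long> results = new LinkedHashMap<>();
        for (int size : sizes) {
            results.put(size, bestNanos(size, timedRuns));
        }
        return results;
    }

    public static void main(String[] args) {
        int[] sizes = {10, 100, 1000, 2500};
        measure(sizes, 200, 50).forEach((size, nanos) ->
            System.out.println("size " + size + ": best " + nanos + " ns"));
    }
}
```

Reversing the order of the sizes array and comparing the two sets of numbers is a cheap way to see whether run order is influencing the results.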

Incidentally, I did some testing with these benchmarks over a year ago. On an Opteron system, I could get all the Java versions running faster than the GCC binaries, and in several cases faster than binaries from Intel's Linux C compilers. The C binaries did better (relatively) on a Pentium 4 though.

I also did some reworking of the benchmarks and looking into some advanced warm-up issues earlier this year but have been too busy to complete it - I have been trying to write a follow-up to my original "binaries vs bytecodes" article for some time...
2 . At 12:22 PM on Oct 10, 2005, Osvaldo Doederlein wrote:

Escape Analysis

Notice that even though escape analysis is already implemented in Mustang, right now (b55) the optimizations that need information from escape analysis are not yet implemented; the most important are lock elimination and stack allocation. You can check the working of escape analysis with -XX:+PrintEscapeAnalysis -XX:+DoEscapeAnalysis (I think the "print" option is exclusive to the fastdebug build). If you can read Assembly, check also -XX:+PrintOptoAssembly (this is not final ASM code but it's very close), also in fastdebug builds. Pretty hot tool to study any microbenchmark ;-)

The small difference in performance that you may see in current builds when turning escape analysis on should be due to minor effects of this analysis on lesser optimizations that already exploit it. But the big bucks have yet to come. Stack allocation not only saves GC overhead, but it also opens the door to other opportunistic optimizations, by inlining object fields as local variables (which allows register allocation of the hottest fields), and simplifying (inlined) method invocations that depend only on some fields of the inlined objects. Think, for instance, of Iterator objects, which are ubiquitous in modern Java code, never escape methods, and are very small (typically 2-3 fields). Escape analysis + stack allocation + field inlining and other optimizations can produce code that's as fast as iterator-less code(*), e.g. a "for (int...)" to iterate an ArrayList.
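For readers who want to see the two loop shapes side by side, here is a small sketch. The class and method names are made up for illustration; the code only shows where the non-escaping Iterator object appears and makes no claim about what any particular Mustang build does with it.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class IterationDemo {

    // Iterator-based loop: allocates a small Iterator object that never
    // leaves this method, which makes it a candidate for escape analysis.
    static int sumWithIterator(List<Integer> values) {
        int sum = 0;
        for (Iterator<Integer> it = values.iterator(); it.hasNext(); ) {
            sum += it.next();
        }
        return sum;
    }

    // Index-based loop: the "for (int...)" form, with no Iterator at all.
    static int sumWithIndex(ArrayList<Integer> values) {
        int sum = 0;
        for (int i = 0; i < values.size(); i++) {
            sum += values.get(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        ArrayList<Integer> values = new ArrayList<>();
        for (int i = 1; i <= 10; i++) {
            values.add(i);
        }
        System.out.println(sumWithIterator(values)); // prints 55
        System.out.println(sumWithIndex(values));    // prints 55
    }
}
```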

These optimizations may also reduce the need for lightweight objects in Java. I have long proposed this feature, and groups like JavaGrande (numerics) and game developers want it, because when you are forced to use full-blown Java objects for types like Complex or Triangle, the overhead of allocation and GC is horrible even with the best GCs. It's bad enough with mutable objects, but even worse with immutable designs (like BigDecimal) because your calculations will produce and dispose of temp objects for intermediary results like there's no tomorrow, forcing frequent GC and thrashing the L1/L2 caches. For example, in "x=y.square().add(z)", if these vars are objects, "y.square" creates a temp object that is only used as the LHS for add(z), then immediately abandoned. Notice that the same code structure that generates temp objects also generates non-escaping values!(**) The stack allocation of number-like objects has the potential to reduce this problem so much that we don't even need to bother with lightweight objects anymore.
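A minimal sketch of the immutable design described above, assuming a hand-written Complex class (the class and its fields are hypothetical, not taken from any library). The object returned by square() is the temporary that is used once as the receiver of add() and then abandoned.

```java
// Hypothetical immutable complex number, written only for illustration.
final class Complex {
    final double re;
    final double im;

    Complex(double re, double im) {
        this.re = re;
        this.im = im;
    }

    // (a + bi)^2 = (a^2 - b^2) + 2ab i; allocates a new object every call.
    Complex square() {
        return new Complex(re * re - im * im, 2 * re * im);
    }

    // Also allocates: immutability means no operand is modified in place.
    Complex add(Complex other) {
        return new Complex(re + other.re, im + other.im);
    }

    public static void main(String[] args) {
        Complex y = new Complex(1.0, 2.0);
        Complex z = new Complex(0.5, 0.5);
        // y.square() yields a temporary that never escapes this expression.
        Complex x = y.square().add(z);
        System.out.println(x.re + " + " + x.im + "i"); // prints -2.5 + 4.5i
    }
}
```

Two objects are allocated for one line of arithmetic here, and only the second one (x) is still reachable afterwards.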

(*) More precisely, Iterators have extra complexity for fail-fast behavior, but with good inlining and loop optimizations, most of this behavior can be optimized out (e.g., moved out of loops).

(**) Even though my y^2 value escapes to add() as the receiver ("this" parameter), it doesn't matter because (a) add() is probably simple enough to be inlined, and (b) HotSpot can do transitive escape analysis, so if add() doesn't escape its parameters, most optimizations on otherwise non-escaping variables from the caller are still valid even without inlining.
3 . At 1:24 PM on Oct 10, 2005, murphee (Werner Schuster) DeveloperZone Top 100 wrote:

Re: Escape Analysis

> These optimizations may also reduce the need for
> lightweight objects in Java. I have long proposed

Well, there's still the matter of per-object overhead (headers and possible alignment padding) when storing large numbers of small or custom objects, which would be a use for lightweight classes.

Of course, there are possible workaround hacks that simply allocate a large java.nio.Buffer and dump the data inside it; I think the Javolution Struct classes can work like that.
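A rough sketch of that kind of workaround, using a plain java.nio.ByteBuffer. The PointBuffer class and its two-int record layout are assumptions made for illustration; this is not how the Javolution Struct classes are implemented.

```java
import java.nio.ByteBuffer;

// Hypothetical "array of structs" kept in one buffer instead of N objects,
// so there is no per-record object header.
class PointBuffer {
    private static final int RECORD_BYTES = 8; // two 4-byte ints: x, y
    private final ByteBuffer data;

    PointBuffer(int capacity) {
        data = ByteBuffer.allocate(capacity * RECORD_BYTES);
    }

    void set(int index, int x, int y) {
        int base = index * RECORD_BYTES;
        data.putInt(base, x);      // absolute put: does not move the position
        data.putInt(base + 4, y);
    }

    int x(int index) {
        return data.getInt(index * RECORD_BYTES);
    }

    int y(int index) {
        return data.getInt(index * RECORD_BYTES + 4);
    }

    public static void main(String[] args) {
        PointBuffer points = new PointBuffer(1000);
        points.set(42, 7, -3);
        System.out.println(points.x(42) + ", " + points.y(42)); // prints 7, -3
    }
}
```

The price is that records are no longer objects: they cannot be passed around by reference, and every field access goes through an index calculation.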
4 . At 1:26 PM on Oct 10, 2005, Jean-Marie Dautelle DeveloperZone Top 100 wrote:

Re: Escape Analysis

> ... because when you are
> forced to use full-blown Java objects for types like
> Complex or Triangle, the overhead of allocation and
> GC is horrible even with the best GCs. It's bad
> enough with mutable objects, but even worse with
> immutable designs (like BigDecimal) because your
> calculations will produce and dispose temp objects
> for intermediary results like there's no tomorrow,
> forcing frequent GC and thrashing the L1/L2 caches.

Kind of agree with that. For example, using pseudo "stack" allocations (ref. Javolution PoolContext), adding immutable large integers is up to 8x faster (see the JScience benchmark: http://jscience.org/doc/benchmark.html).
But for small objects, it seems that generational collectors are almost as fast as stack allocations.

What about supporting multiple "object spaces" (heap, stack, etc.) which can be shared (or not) between threads? It would be very nice if such a feature were directly supported by the VM (through the "new" keyword instead of using object factories).
Jean-Marie Dautelle - Marlboro, MA
-- Javolution: Everything should be made as simple as possible... -- JScience: But not simpler!
5 . At 3:57 AM on Oct 11, 2005, Artur Biesiadowski DeveloperZone Top 100 wrote:

Re: Escape Analysis

> What about supporting multiple object spaces
> (heap, stack, etc) which can be shared (or not)
> between threads. It would be very nice if such
> feature were directly supported by the VM (through
> the "new" keyword instead of using object factories).

Why should I care whether a given variable escapes its scope if the JVM can determine that for me? To get any benefit from that, you would have to introduce a special kind of object allocation, which could only be used in a local context and never assigned to any instance/static variable. Such a modifier would have to be placed on every method argument and every instance variable, thus requiring you to rewrite most of the methods in every library (as they would expect String, not stack-allocated String, as an argument).

IMHO, it is a lot better to allow jvm to do it for you behind the scenes.
6 . At 10:56 AM on Oct 11, 2005, Jean-Marie Dautelle DeveloperZone Top 100 wrote:

Re: Escape Analysis

> Why should I care whether a given variable escapes its
> scope if the JVM can determine that for me?

Unfortunately, this kind of analysis is difficult at compile time (it has to be conservative and misses all but trivial cases). It might be easier at run time, but it would take up CPU time.

> To get any
> benefit from that, you would have to introduce
> special kind of object allocation, which could be
> only used in local context and never assigned to any
> instance/static variable.

No, you don't need to mark anything and you can use the "new" keyword. Basically, by default "new" would allocate on the current thread object space (heap by default for backward compatibility). If a "stack" object is assigned (referenced) to objects from another "stack" or "heap" (e.g. static) the JVM could detect it and move this object to the heap.

> Such a modifier would have to
> be placed on every method argument and every instance
> variable, thus requiring you to rewrite
> most of the methods in every library (as they would
> expect String, not stack-allocated String, as an
> argument).

No, the JVM can easily detect when an object is assigned from another object space (e.g. an RTSJ VM raises an IllegalAssignmentError when stack objects are referenced by heap objects). As far as the user is concerned, it does not matter if the object is allocated on the stack or not (the JVM will automatically move it to the heap if it is assigned to a static variable).

To summarize: instead of allocating on the heap by default and finding out what can be allocated on the stack, an easier and more effective approach would be to allocate everything on the stack (when executing in a PoolContext) and find out what has to be moved to the heap ;)

PS: All this is already done by Javolution. Unfortunately, Javolution cannot change the JVM's behavior. Therefore, users have to use object factories instead of constructors ("new") and detect (export/preserve) objects being referenced outside of their object space (e.g. static). Having direct JVM support would make the whole thing a lot faster, easier ("new" keyword instead of factories) and safer (export/preserve done by the JVM) :)
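To give a feel for the factory-plus-reset idea without touching the JVM, here is a minimal sketch of a pool that hands out recycled objects and makes them all reusable in a single step. SimplePool and its methods are hypothetical and written only for illustration; this is not the Javolution API, and it leaves out the export/preserve step entirely.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical pool, NOT the Javolution API: objects come from a factory
// instead of "new", and reset() makes every pooled object reusable at once.
class SimplePool<T> {
    private final Supplier<T> factory;
    private final List<T> objects = new ArrayList<>();
    private int used = 0; // how many pooled objects are currently handed out

    SimplePool(Supplier<T> factory) {
        this.factory = factory;
    }

    // Hands out a recycled object if one is free, otherwise creates one.
    T next() {
        if (used == objects.size()) {
            objects.add(factory.get());
        }
        return objects.get(used++);
    }

    // "Resets the stack": nothing is freed, the objects are simply reused.
    void reset() {
        used = 0;
    }

    // Number of objects actually allocated so far.
    int created() {
        return objects.size();
    }

    public static void main(String[] args) {
        SimplePool<StringBuilder> pool = new SimplePool<>(StringBuilder::new);
        for (int round = 0; round < 3; round++) {
            try {
                StringBuilder a = pool.next();
                StringBuilder b = pool.next();
                a.setLength(0); // recycled objects keep their old state
                b.setLength(0);
                a.append("temp").append(round);
                b.append(a);
            } finally {
                pool.reset();
            }
        }
        System.out.println(pool.created()); // prints 2
    }
}
```

The caller has to remember never to keep a reference to a pooled object past reset(), which is exactly the safety burden that direct JVM support would remove.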
7 . At 1:13 PM on Oct 11, 2005, Artur Biesiadowski DeveloperZone Top 100 wrote:

Re: Escape Analysis

> No, you don't need to mark anything and you can use
> the "new" keyword. Basically, by default "new" would
> allocate on the current thread object space (heap by
> default for backward compatibility). If a "stack"
> object is assigned (referenced) to objects from
> another "stack" or "heap" (e.g. static) the JVM could
> detect it and move this object to the heap.

Isn't a very similar thing happening with thread-local heaps? New objects get allocated on a small per-thread heap, and then all non-garbage objects are moved to the next generation when the per-thread heap fills up. I think that one of the JVMs implemented per-thread allocation spaces (was it HotSpot or the IBM one? I cannot recall).

In RTSJ, explicit allocation arenas are needed because
a) there have to be guarantees, not wishes
b) it is designed to run on systems which cannot afford to do any kind of runtime analysis

I still think that for J2SE, it is not worth messing with specialized allocators. And I don't think that you will ever be able to use J2SE for hard real-time systems; it is better to go with a certain subset of libraries plus a superset of operations (which RTSJ seems to be aiming for).
8 . At 2:00 PM on Oct 11, 2005, Jean-Marie Dautelle DeveloperZone Top 100 wrote:

Re: Escape Analysis

> I still think that for J2SE, it is not worth messing
> with specialized allocators.

There is no specialized allocator (no change to your application code). The only thing your application might want to do (optional) is to execute garbage-generating code within a pool context (this can be done at a high level). For example:

    PoolContext.enter();
    try {
        ... // Code demanding a lot of temporary allocations.
    } finally {
        PoolContext.exit(); // Reset stack (almost instantaneous)
    }


> And I don't think that
> you will ever be able to use J2SE for hard real-time
> systems; it is better to go with a certain subset
> of libraries plus a superset of operations (which RTSJ
> seems to be aiming for).

We currently have several J2EE (yes servers) hard real-time systems in development. This has been made possible thanks to the new generation of real-time garbage collectors (ref. http://javolution.org/doc/Man33955.pdf).
The solution proposed here works particularly well with these collectors because it significantly reduces their work (the more heap allocations the more work). Also, accelerating object allocation/collection is beneficial to all (real-time or not).
9 . At 5:00 PM on Oct 11, 2005, Artur Biesiadowski DeveloperZone Top 100 wrote:

Re: Escape Analysis

> We currently have several J2EE (yes servers) hard
> real-time systems in development.

What containers do you use? Have you modified one of the open source ones, or come up with your own?
10 . At 5:40 PM on Oct 11, 2005, Jean-Marie Dautelle DeveloperZone Top 100 wrote:

Re: Escape Analysis

> > We currently have several J2EE (yes servers) hard
> > real-time systems in development.
>
> What containers do you use? Have you modified one of
> the open source ones, or come up with your own?

We have several sub-contractors providing competitive solutions to us (at a price ;) ). Cannot say more...
