A recent discussion with Gavin King got me thinking about bytecode modification and its advantages and disadvantages. I've really avoided bytecode modification because the risks involved always seemed to outweigh the advantages to be gained. Is this really the right approach?
Should We Use Bytecode Modification?
At 12:18 AM on Sep 19, 2007, Daniel Spiewak wrote:
To me, it boils down to one simple fact: bytecode modification changes the behavior of your Java code counter to its specification at runtime. By its very definition, bytecode modification lets you do things which are not enabled by the Java specification (though allowed by bytecode). Unfortunately, this can potentially make things very hard to debug, and lead to very strange errors. For example, if there's a problem in a generated section, the JVM will spit out some weird and difficult-to-fathom error. At the very least, it won't correspond with any specific source code. As a result, fixing the problem may not be as straightforward as one may think.
What's worse, is many framework developers have taken to the use of bytecode modification libraries to handle key features. Hibernate, for example, uses CGLIB to enable attribute lazy-loading. Spring uses bytecode modification for its AOP functionality (actually, all AOP frameworks use this technique). So to use any of this great functionality supplied by these libraries, we're stuck with all the potential repercussions and weird errors.
Is it really best practice for us to be using such hacky techniques to accomplish our aims? Is bytecode modification really a valid tool to use?
19 replies so far
Re: Should We Use Bytecode Modification?
One of your questions attempts to impose the answer. Of course it's not reasonable to use such hacky techniques. But they are not hacky.
Yes, there are risks involved. But all of the same risks apply to using a Java source-to-bytecode compiler. Ever seen bugs in the Java compiler? Or the JIT? Some of the ECJ bugs have been brutal, and some JITs have had similar issues.
The bottom line is that buggy code in places like this can be really bad and difficult to deal with.
Fortunately, a little goes a long way, and if designed well, testing can be quite extensive and comprehensive.
I really don't understand why you keep saying that this "goes against the specification". It doesn't go against the specification any more than compiling Jython code to java bytecode does. It is just a different way of doing things.
Re: Should We Use Bytecode Modification?
Daniel, you should have done your homework. The Spring framework does NOT use bytecode modification for its AOP functionality. Its AOP implementation is completely proxy-based, though Spring can generate bytecode at runtime to proxy classes that can't be handled by java.lang.reflect.Proxy, i.e. non-interface-based ones.
Re: Should We Use Bytecode Modification?
> Daniel, you should have done your homework. Spring
> framework is NOT using bytecode modification for its
> AOP functionality. Its AOP implementation is
> completely proxy-based. Though Spring can generate
> bytecode at the runtime to proxify classes that can't
> be handled by the java.lang.reflect.Proxy, i.e.
> non-interface based ones.
As I understand it, you can use either approach. We were using proxies, but we've switched to the AspectJ approach which weaves at runtime. For us, it avoids certain problems when executing calls within a proxied class. Anyway, the option is there, at least with Spring 2.
Patrick
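To make the two points above concrete, here is a minimal sketch of interface-based proxying with java.lang.reflect.Proxy, and of the "calls within a proxied class" problem Patrick mentions. It is not Spring's implementation; AccountService, SimpleAccountService and ProxyDemo are hypothetical names invented for the illustration.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical service interface, used only for this illustration.
interface AccountService {
    void transfer();
    void audit();
}

class SimpleAccountService implements AccountService {
    @Override
    public void transfer() {
        // Self-invocation: this call goes straight to "this", not through the proxy.
        audit();
    }

    @Override
    public void audit() {
        // Nothing to do in the sketch.
    }
}

class ProxyDemo {
    // Wraps the target in a JDK dynamic proxy that records every intercepted method name.
    static AccountService withLogging(AccountService target, List<String> log) {
        InvocationHandler handler = (proxy, method, methodArgs) -> {
            log.add(method.getName());
            return method.invoke(target, methodArgs);
        };
        return (AccountService) Proxy.newProxyInstance(
                AccountService.class.getClassLoader(),
                new Class<?>[] {AccountService.class},
                handler);
    }

    // Calls transfer() through the proxy and returns the names the proxy saw.
    static List<String> interceptedCalls() {
        List<String> log = new ArrayList<>();
        AccountService service = withLogging(new SimpleAccountService(), log);
        service.transfer();
        return log;
    }

    public static void main(String[] args) {
        System.out.println(interceptedCalls()); // prints [transfer]
    }
}
```

Only transfer() is recorded: the nested audit() call never passes through the proxy. Weaving the advice into the class itself, as AspectJ does, is one way around that.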
Re: Should We Use Bytecode Modification?
> What's worse, is many framework developers have taken
> to the use of bytecode modification libraries to
You're right. So, what problems have you had with it in practice? I rely on several libraries that perform bytecode generation at compile time or at runtime and have had no bugs, that I know of, relating to it. The only issues we've had were JAR-versioning issues, where two libraries were using incompatible versions of a bytecode library (ASM is a common problem), but we got around this using JarJar.
Your argument sounds a little too theoretical. I'd like to hear some arguments against bytecode generation based on experience. In practice, I like what bytecode generation allows for.
Regards
Patrick
Re: Should We Use Bytecode Modification?
>> bytecode modification changes the behavior of your Java code counter to its specification at runtime
DI frameworks DO the same thing. Reflection has the same effect, since you can set private fields and call private methods.
It's like the gun that is given to you, but it's your choice to shoot yourself.
The advantages are huge and mostly related to execution speed, since there is no reflection overhead.
The disadvantage is that if a rookie gets his hands on this "kewl" feature, the consequences can be devastating ... like our Ruby friend who decided that he didn't like the nil object, so he modified it.
I personally like to use bytecode manipulation to add property change events and truly transparent DI. ASM is great, and so is the ASMified bytecode view in Eclipse.
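The point about reflection can be shown with nothing but the JDK. This is a small sketch, not taken from any framework; Counter and ReflectionDemo are hypothetical names.

```java
import java.lang.reflect.Field;

// Hypothetical class with a private field and no setter.
class Counter {
    private int count = 1;

    int count() {
        return count;
    }
}

class ReflectionDemo {
    // Overwrites Counter's private field from outside the class using plain reflection.
    static int overwritePrivateField(Counter target, int newValue) {
        try {
            Field field = Counter.class.getDeclaredField("count");
            field.setAccessible(true); // switches off the "private" access check
            field.setInt(target, newValue);
            return target.count();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(overwritePrivateField(new Counter(), 42)); // prints 42
    }
}
```

No bytecode was touched, yet the class no longer behaves the way its source alone suggests.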
Re: Should We Use Bytecode Modification?
I have no problem with bytecode generation per se, and have used it successfully from time to time. To some extent, one could argue that the Java Proxy class does just that.
Where I have had problems is with secure systems that have to be audited by external specialists. These people generally know a lot about security and what loopholes to look for. However, when faced with tens of thousands of lines of business code, they have their work cut out understanding how it works and ensuring nothing bad can happen.
If you present them with a module that does dynamic bytecode generation, there is a strong chance that their brains will explode.
The same happens when using concurrent code.
Ian
Re: Should We Use Bytecode Modification?
If we live in the POJO world, bytecode instrumentation is the *best* tool to achieve the magic required by POJO-oriented development.
Bytecode instrumentation (modification, weaving, enhancement...) is not for everyone, but it is very useful to frameworks that convert a POJO into a service (or expose it to services) and to AOP (though I don't think many people use AOP).
Reflection and proxies are nice, but they fall short and they are slow (though for database access that is of course not an issue).
When I started to develop JNIEasy I tried reflection and evaluated proxies, but I wanted *real native transparency* with almost *no restrictions*. In the persistence world, ObjectStore C++ was a dream (in C++!). JDO achieved the same transparency as ObjectStore with bytecode enhancement; bytecode enhancement is the feature C++ could only dream of (ObjectStore used memory page faults and required source code). Terracotta needs bytecode enhancement to provide its clustering magic... If we want real transparency, bytecode modification is the best tool.
About debugging and good practices: bytecode modifications must be very small, basically *hooks*: calls to interceptors before/after a field access or Java call, or replacing them fully. The new code shouldn't contain ifs; any decision (including "do nothing") should be made in normal Java code.
When you debug modified bytecode you usually don't need the modified source, because the modification should be trivial and documented (when the field is read, blah blah). Modified bytecode usually has no debug info, but that is not a problem, because you can debug any interceptor call: interceptor code is normal Java code. In any case, a decompiler like JODE may be useful.
For instance:
int a = this.data;
may be enhanced to:
int a = MyInterceptor.processIntField(this.data);
What is the debugging problem? The processIntField call can be debugged! You know the original value (the parameter) and the result (the returned value).
And for bytecode instrumentation, the easiest tool for me is Javassist. With Javassist you can use Java! You don't need to know the bytecode format, because Javassist contains a Java compiler.
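As a source-level picture of the hook described above, the following sketch writes the "before" and "after" classes by hand. A real enhancer would produce the second form by rewriting bytecode. MyInterceptor.processIntField comes from the post; the reads counter and the Original, Enhanced and InterceptorDemo classes are assumptions made for the illustration.

```java
// The interceptor is ordinary Java, so a debugger can step through it.
class MyInterceptor {
    static int reads = 0; // hypothetical bookkeeping, for the demo only

    // All decisions (including "do nothing") live here, not in the injected code.
    static int processIntField(int value) {
        reads++;
        return value;
    }
}

// What the developer wrote.
class Original {
    int data = 7;

    int read() {
        int a = this.data;
        return a;
    }
}

// What the class behaves like after enhancement (written by hand here).
class Enhanced {
    int data = 7;

    int read() {
        int a = MyInterceptor.processIntField(this.data); // the injected hook
        return a;
    }
}

class InterceptorDemo {
    public static void main(String[] args) {
        System.out.println(new Original().read());
        System.out.println(new Enhanced().read());
        System.out.println("intercepted reads: " + MyInterceptor.reads);
    }
}
```

The injected code is a single call with no branching, which is what keeps the enhanced class easy to reason about even without matching source.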
Re: Should We Use Bytecode Modification?
I cannot agree more with Jose Maria.
You should look at byte-code instrumentation as just an extra compilation step.
There are several good libraries to instrument the byte-code, such as BCEL and ASM.
And the resulting byte-code is still 100% compliant with the JVM specification, including the profiling and debugging APIs (don't worry, your lines won't be changed when you debug).
Rgds, Eric.
Re: Should We Use Bytecode Modification?
1. No form of bytecode manipulation will make the JVM do something outside its spec (unless it has a bug :)). It may change the way the code works relative to what the SOURCE says, but the BYTECODE won't even run unless it is verified as valid by the JVM.
2. Problems with side effects from bytecode generation/manipulation may occur if the framework does something wrong, but that is just like any other bug. I mean, a bug in Hibernate's class generation algorithm is the same as a bug in your ActiveObject's dynamic proxy (which works pretty much the same as runtime class generation using CGLib). You just have to correct the bug, not blame the technology.
I certainly don't like too much 'magic'. I don't think bytecode generation/manipulation should be used unless it is made to work in a deterministic (predictable) way, and the one who uses it really knows what he is doing. But I do believe Gavin and Rod know what they are doing.
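Point 1 can be observed directly with the JDK. The sketch below offers deliberately malformed bytes to the JVM through a class loader and shows that they are refused. It only demonstrates the class-file format check at definition time (ClassFormatError). Bytecode that is well-formed but not type-safe is rejected later by the verifier with a VerifyError, which this sketch does not cover. RejectingLoaderDemo and RawLoader are hypothetical names.

```java
class RejectingLoaderDemo {
    // A class loader that exposes defineClass so arbitrary bytes can be offered to the JVM.
    static class RawLoader extends ClassLoader {
        Class<?> define(byte[] bytes) {
            return defineClass(null, bytes, 0, bytes.length);
        }
    }

    // Returns true when the JVM refuses to turn the bytes into a class.
    static boolean jvmRejects(byte[] bytes) {
        try {
            new RawLoader().define(bytes);
            return false;
        } catch (ClassFormatError rejected) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Four arbitrary bytes: not a valid class file.
        System.out.println(jvmRejects(new byte[] {1, 2, 3, 4}));
    }
}
```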
it's your choice
If AspectJ or Hibernate costs you more in debugging difficulty than the benefit you derive from it, don't use it. And if this functionality is important enough to you, lobby Sun to change their language, runtime, and/or specs to support these features better.
Re: Should We Use Bytecode Modification?
Yes, we should use bytecode modification. Is it potentially harmful? Yes. Is it harder to debug? Yes. Can it be beneficial? Yes. The greatest risk arises when you write code which utterly depends on bytecode modification and misbehaves without it.
I might be looking around a project's javadocs and find a useful-looking class. I call the constructor, call some methods, and little do I know that I was supposed to pass this thing to a bytecode modifier before the class can be used correctly. In the meantime, what harm have I caused? Have I destroyed critical data?
Code that depends on bytecode modification should consider having a special mode indicating that it's not ready yet. Perhaps it can throw an exception, and the bytecode modification step sees the exception and removes it. Perhaps the modifier can even require it to be there.
public void doStuff() {
    BytecodeModifierCheck.isModified();
    ...
}
The implementation of the isModified method can simply always throw an exception. The modifier of a method requires that this line of code be present at the very start. It removes it afterwards, so now the modified method works correctly.
carbonado.sourceforge.net
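A minimal sketch of the fail-fast half of this idea, under the assumption that the enhancer (not shown) strips the guard call. All names here (EnhancementGuard, Report, GuardDemo) are hypothetical and not taken from Carbonado or any other library.

```java
class EnhancementGuard {
    // Always throws. An enhancer would be expected to remove calls to this method.
    static void requireEnhanced() {
        throw new IllegalStateException("class was not processed by the bytecode modifier");
    }
}

class Report {
    String render() {
        EnhancementGuard.requireEnhanced(); // stripped by the (assumed) enhancer
        return "rendered";
    }
}

class GuardDemo {
    // Returns true when the un-enhanced class refuses to run.
    static boolean failsFastWhenNotEnhanced() {
        try {
            new Report().render();
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(failsFastWhenNotEnhanced());
    }
}
```

Used without enhancement, the class stops at the first call instead of running half-configured code against real data.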
Re: Should We Use Bytecode Modification?
> Code that depends on bytecode modification should
> consider having a special mode indicating that
> they're not ready yet.
I think you misunderstand bytecode modification: it occurs *before* the class is ready to be used. The class may be modified on the file system or at load time by the class loader. There is no possibility to modify a class already loaded (and registered on the JVM), in fact bytecode may be converted to machine code by the JVM JIT.
In the JDO world (and in JPA, as in JPOX) an object may be "transient". When you make it "persistent", no bytecode modification occurs: the object is already "persistence capable" and its class is already enhanced. While it is not "persistent", the interceptors do nothing; once it is "persisted", the object's internal interceptors attach/synchronize it to the database.
JNIEasy is much the same: when a "native capable" object is made "native", it is attached/synchronized with a native memory fragment; while it is not native, the internal artifacts do nothing.
Re: Should We Use Bytecode Modification?
> There is no possibility to modify a class already loaded (and registered on the JVM), in fact bytecode may be converted to machine code by the JVM JIT.
"Conversion to machine code" is not an irreversible step in a JIT: if you change the byte code that gave rise to some piece of machine code, the machine code just gets invalidated and the JIT recompiles your new bytecode next time around. It has to be able to do that anyway even without bytecode modifications to be able to perform many kinds of optimizations.
And, yes, Java lets you (or will soon let you) change classes on the fly; it's needed for debugging and patching long-running jobs.
Re: Should We Use Bytecode Modification?
Bytecode enhancement DOES NOT change the behavior of your Java code counter to its specification at runtime. A class that is not declared final is specified to ALLOW polymorphism. Learn your OO.
Bytecode generation is simply one of many ways to produce valid bytecode. The technique happens to do this at runtime, but once produced, the bytecode is no different from any other. It happens not to have Java source code. Neither does bytecode created by compiling Groovy or Scala or any of a dozen other mechanisms. If it's the runtime creation part you don't like, you must hate that JDK 6 has a compiler API that allows Java source code to be compiled at runtime. Of course, producing the bytecode is not the real crux of the matter, as bytecode must be loaded by a class loader to run. Maybe this is your objection. If so, please stop using application servers with class loaders that allow you to load webapps at runtime. Or any program with hot-loadable plugins.
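The compiler API mentioned above is javax.tools.JavaCompiler. The sketch below compiles a one-line class at runtime, loads it with a URLClassLoader, and calls it. It assumes a full JDK (on a bare JRE, ToolProvider.getSystemJavaCompiler() returns null) and Java 11 or later for Files.writeString. RuntimeGreeting and RuntimeCompileDemo are hypothetical names.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

class RuntimeCompileDemo {
    // Compiles a tiny class from source at runtime, loads it, and invokes it.
    static String compileAndRun() {
        try {
            JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
            if (compiler == null) {
                throw new IllegalStateException("no system Java compiler available (JRE only?)");
            }

            // Write the source into a fresh temporary directory.
            Path dir = Files.createTempDirectory("runtime-compile");
            Path source = dir.resolve("RuntimeGreeting.java");
            Files.writeString(source,
                    "public class RuntimeGreeting { public static String greet() { return \"hi\"; } }");

            // Compile it; the .class file is written next to the source.
            int status = compiler.run(null, null, null, "-d", dir.toString(), source.toString());
            if (status != 0) {
                throw new IllegalStateException("compilation failed");
            }

            // Load the freshly produced bytecode and call it reflectively.
            try (URLClassLoader loader = new URLClassLoader(new URL[] {dir.toUri().toURL()})) {
                Class<?> generated = loader.loadClass("RuntimeGreeting");
                return (String) generated.getMethod("greet").invoke(null);
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(compileAndRun());
    }
}
```

Once loaded, the generated class is indistinguishable from one compiled ahead of time, which is the poster's point.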