Every now and then (see ticket 117 for a current example), an issue pops up where we fail class file verification with a stack inconsistency error. The problem is that in some specific cases, it's impossible to accumulate function call argument values on the stack: the JVM clears out the stack in these cases.
Besides storing values on the stack, the JVM offers local variables, which don't get cleared out. Historically, we have solved the problem by 'rewriting' the code being compiled in a way that makes it store its values in local variables. This rewriting happens in pass1 of the compiler, based on the outcome of the 'safety scanning' function UNSAFE-P.
However, this function is overeager (it classifies too many cases as unsafe), and calls to it need to be manually added for every case which may end up being problematic in pass2.
My proposal is now to add infrastructure to pass2 to handle the situation: some blocks of code need to be marked as "collecting function call arguments". Then, upon the compilation of each argument, the new infrastructure would analyse the form being compiled. If that form contains 'unsafe' bits, the compiler automatically switches collection of function call arguments to local variables instead of the stack.
My idea would be to add a macro called WITH-STACK-PARAMETER-COLLECTION and an accompanying function COMPILE-STACK-PARAMETER. The macro would set up some special variables for COMPILE-STACK-PARAMETER to inspect and set. At the end of the macro, code will be emitted to make sure all the values are correctly loaded onto the stack, just as if they had always been there.
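To sketch the idea a bit more concretely (this is not working pass2 code; the helper names and the trivial print-only "emitters" below are made up just for illustration), the two could cooperate roughly like this:

;;; Rough sketch only -- not actual pass2 code.  The "emitters" below are
;;; trivial stand-ins that just print what a real back-end would emit, so
;;; the sketch can be loaded and experimented with.
(defvar *next-register* 0)
(defun allocate-local () (incf *next-register*))
(defun emit-store-to-local (register) (format t "~&  astore ~D" register))
(defun emit-load-local (register) (format t "~&  aload ~D" register))
(defun compile-to-stack (form)
  (format t "~&  ;; code leaving the value of ~S on the stack" form))
(defun form-contains-unsafe-bits-p (form)
  ;; Stand-in for the exact analysis pass2 would do on the real form.
  (and (consp form) (member (car form) '(tagbody block catch))))

(defvar *stack-parameters* nil
  "Entries of the form (:stack) or (:local . register), newest first,
recording where each collected argument value currently lives.")
(defvar *parameters-in-locals-p* nil
  "Becomes T as soon as one argument form turns out to be unsafe; from
then on every argument is kept in a local variable.")

(defmacro with-stack-parameter-collection (&body body)
  "Collect the arguments of one function call; afterwards re-load any
spilled values so the operand stack looks as if they had been there
all along."
  `(let ((*stack-parameters* '())
         (*parameters-in-locals-p* nil))
     ,@body
     (dolist (entry (reverse *stack-parameters*))
       (when (eq (car entry) :local)
         (emit-load-local (cdr entry))))))

(defun compile-stack-parameter (form)
  "Compile one argument FORM for the surrounding collection."
  (when (and (not *parameters-in-locals-p*)
             (form-contains-unsafe-bits-p form))
    ;; FORM may clear the operand stack (TAGBODY/GO, BLOCK/RETURN,
    ;; CATCH/THROW), so rescue everything collected so far into locals
    ;; (top of stack first) and switch to locals from here on.
    (dolist (entry *stack-parameters*)
      (let ((register (allocate-local)))
        (emit-store-to-local register)
        (setf (car entry) :local
              (cdr entry) register)))
    (setf *parameters-in-locals-p* t))
  (compile-to-stack form)
  (if *parameters-in-locals-p*
      (let ((register (allocate-local)))
        (emit-store-to-local register)
        (push (cons :local register) *stack-parameters*))
      (push (list :stack) *stack-parameters*)))

Pass2 would then wrap the argument-compiling loop of a function call in WITH-STACK-PARAMETER-COLLECTION and feed each argument form through COMPILE-STACK-PARAMETER; the re-loads emitted at the end of the macro restore the intended stack layout.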
The pass1 form rewriting could be eliminated, and our decision rules about what's 'unsafe' and what's not would be completely correct, instead of the current "overeager, but at least not missing any cases" approach (which obviously does miss cases, since pass1 can only estimate what pass2 will end up compiling).
Any comments?
Regards,
Erik.
On 30 December 2010 23:39, Erik Huelsmann ehuels@gmail.com wrote:
My idea would be to add a macro called WITH-STACK-PARAMETER-COLLECTION
Some snipping done. :)
The pass1 form rewriting could be eliminated and our decision rules
Here too. :)
Any comments?
This seems to correlate with the overall (vague) direction in which I have envisioned our compiler evolving. The stack inconsistencies etc. are things that I, at least, have seen as bugs caused by our code being too manual/low-level. The with-stack-parameter-collection macro sounds very much like a proper tool to help with such problems, and seems to be on the same track as some similar cleanups we've done inside pass2.
So, regarding the snips above, I wholeheartedly support the macro idea. Regarding pass1/pass2, please remember that the split between the two was never clear-cut to begin with: the division of responsibilities was muddied at best, and the file split I did at some point ages ago was more a hatchet/chainsaw job than any artistic sculpting. :) The use of unsafe-p in pass1, half-depending on what pass2 will do, is probably a clear sign of an incomplete design to begin with. Therefore this sounds very much like a step in the right direction.
If we ever aspire to have several pass2 back-ends, doing stack analysis in pass1 is probably a bad idea anyway.
a) I don't know the internals well enough to decipher the proposal. Is allocation on the stack a performance optimization? If so, has any metering been done to see whether it actually impacts performance? If you would like opinions from me and perhaps others, perhaps you could say a few more words about what the safety issue is? When is the JVM stack cleared?
b) If you are going to be thinking about compiler architecture, I would very much like to see some thought going into debuggability. My current impression is that the trend is downward, with more and more cases of not even being able to see function call arguments in SLIME. For example, at this point in ABCL's development, I think having a compiler option that trades performance for the ability to view local variables in the debugger should be an increasing priority. Others' mileage may vary, but in my case, where the bulk of time is spent in Java libraries of various sorts, improving Lisp performance is a distinctly lower priority than improving developer productivity by making it easier to debug.
2¢, Alan
On 31 December 2010 05:30, Alan Ruttenberg alanruttenberg@gmail.com wrote:
a) I don't know internals well enough to decipher the proposal. Is allocation on the stack a performance optimization? If so has any metering been done to see whether it actually impacts performance? If
Erik apparently speaks of the following chapter: http://java.sun.com/docs/books/jvms/second_edition/html/Overview.doc.html#17...
On 12/31/10 4:30 AM, Alan Ruttenberg wrote: […]
b) If you are going to be thinking about compiler architecture, I would very much like to see some thought going into debuggability. My current impression is that the trend is downward, with more and more cases of not even being able to see function call arguments in SLIME. For example, at this point in ABCL's development, I think having a compiler option that trades performance for the ability to view local variables in the debugger should be an increasing priority. Others' mileage may vary, but in my case, where the bulk of time is spent in Java libraries of various sorts, improving Lisp performance is a distinctly lower priority than improving developer productivity by making it easier to debug.
Improving the ability to debug code is one of my main concerns as well. It is indeed good to know that this point of view is shared within the ABCL user community, and rest assured that I will continue to advocate/promote this line of thought in the development discussions on #abcl.
Steps in the direction of better debug facilities for ABCL that I have in mind would be 1) the ability to inspect local variables from a given stack frame (probably by implementing EVAL with respect to a given frame), 2) implementing STEP, 3) the ability to restart computation at a given frame, and 4) better integration with source location information, like the necessary hooks to support XREF. I have no concrete plans for implementation yet, as I am just trying to analyze what would be involved. Any others out there on private wishlists?
Currently I am investigating how to do these things in interpreted mode, as it seems to be the easier path, and having working debug facilities there would greatly assist the construction of a corresponding compiler. The one exception to the observation that we should try these debug implementations in interpreted mode first is that it might be possible to implement STEP in compiled code via "native" JVM breakpoints, but this would entail outfitting the compiler with a much better mapping of source to object code (and might involve using one JVM to instrument a second one via JVMTI). This "source to object" mapping in the compiler seems to be needed in general, as it would (eventually) allow things like finding the value of a local variable in compiled code and debugging the compiler's behaviour, as was needed in the current problem with ticket #117, which I asked Erik to help out with in order to get callbacks working in JNA.
I have fought hard to prevent regressions in the information presented in SLIME. Are you observing a complete lack of function call arguments in the SLIME debugger presentation? If so, we should debug your installation details a bit, as SLIME HEAD with ABCL trunk should always work, since that's what I test with. In order to prevent the pretty printer's elision of frame arguments, I often have to set *PRINT-RIGHT-MARGIN* to some large finite value, but I assume you would have recognized this. The compiler also often optimizes away stack frames, which can be restored quickly by interpreting the given form at the current point in SLIME via M-x slime-eval-defun. Please provide some more details when you get the chance.
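For reference, what I mean is something like the following at the REPL before provoking the error (500 is just an arbitrary large-but-finite value):

;; Keep the pretty printer from eliding frame arguments in the SLIME
;; debugger backtrace.
(setf *print-right-margin* 500)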
ABCL: arming bears since 1997!
Hi Alan,
On Fri, Dec 31, 2010 at 4:30 AM, Alan Ruttenberg alanruttenberg@gmail.com wrote:
a) I don't know internals well enough to decipher the proposal.
That's no problem; I'll gladly answer any questions you may have. I was really deep into the details when writing the proposal, so I probably missed some of the context when trying to write down the higher-level issue.
Is allocation on the stack a performance optimization?
Yes and no. It's a logical consequence of the structure of the JVM: it doesn't have registers like most CPUs do, meaning that the only way to pass operands to JVM instructions is by pushing them onto the stack. Equally, all operations return their values, if any, on the stack. This is true for all operations, including function calls.
However, the choice to actually leave the values on the stack, instead of saving them to local variables and reloading them later, could be considered an optimization. Saving to local vars - if taken to its extreme - would lead to the following byte code as the compiled version of "(if (eq val1 val2) :good :bad)", assuming val1 and val2 are local variables (let bindings):
  aload 1
  astore 3
  aload 2
  astore 4
  aload 3
  aload 4
  ifacmp :label1
  getstatic ThisClass.SYMBOL_BAD
  astore 5
  goto :label2
:label1
  getstatic ThisClass.SYMBOL_GOOD
  astore 5
:label2
  aload 5
Currently the same code compiles to:
  aload 1
  aload 2
  ifacmp :label1
  getstatic ThisClass.SYMBOL_BAD
  goto :label2
:label1
  getstatic ThisClass.SYMBOL_GOOD
:label2
Although I admit that I have no idea about the performance cost, I'd estimate it would at least cause a significant growth in the size of our JAR file.
If so has any metering been done to see whether it actually impacts performance?
No (because I haven't perceived it purely as a performance optimization, but also as a size optimization).
If you would like opinions from me and perhaps others perhaps you could say a few more words about what the safety issue is? When is the JVM stack cleared?
The JVM clears the stack when it enters an exception handler (in Java typically a catch{} or finally {} block). This is by itself not necessarily a problem: LET/LET* and PROGV just rethrow the exception after unbinding the specials they might have bound.
However, TAGBODY/GO, BLOCK/RETURN and CATCH/THROW are constructs which catch exceptions in the Java world and continue processing in the current frame. This is a problem: if any values had been collected on the stack before the TAGBODY form, they've now "disappeared" in some of the code-paths. Most of the negative effects have been eliminated by rewriting code into constructs which don't cause the same behaviour, so, normally, you shouldn't see this happening.
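To make this a bit more concrete, here's a made-up example (not taken from our sources) of the kind of form where the pattern bites, assuming the RETURN-FROM out of the closure ends up being implemented as a Java throw/catch:

;; LIST first pushes :FIRST onto the operand stack, then compiles its
;; second argument.  The RETURN-FROM inside the LAMBDA is a non-local
;; exit implemented with a Java exception; entering its handler discards
;; the operand stack, so :FIRST would be gone on that code path unless
;; it had been saved to a local variable.
(list :first
      (block found
        (mapc (lambda (x)
                (when (evenp x)
                  (return-from found x)))
              '(1 2 3))
        nil))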
b) If you are going to be thinking about compiler architecture, I would very much like to see some thought going into debuggability. My current impression is that the trend is downward, with more and more cases of not even being able to see function call arguments in SLIME. For example, at this point in ABCL's development, I think having a compiler option that trades performance for the ability to view local variables in the debugger should be an increasing priority.
This compiler option should be hooked to the OPTIMIZE DEBUG declaration, if you ask me.
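I.e., something you would then request through a standard declaration, along the lines of:

;; Hypothetical usage: favour debuggability over speed for a whole file
;; (or per function with a local DECLARE).
(declaim (optimize (debug 3) (speed 0)))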
Others' mileage may vary, but in my case, where the bulk of time is spent in Java libraries of various sorts, improving Lisp performance is a distinctly lower priority than improving developer productivity by making it easier to debug.
Thanks! This is very valuable feedback. It's often very hard to determine what next steps to take; it's easy to focus on performance, since it's easily measurable. However, performance isn't the only thing which influences the development cycle. It'd be great to discuss the kind of things ABCL would need to add for debuggability using a number of real-world problems: it'll make the problem much more tangible (at least to me).
2¢, Alan
Thanks for your comments!
Bye,
Erik.
The JVM clears the stack when it enters an exception handler (in Java
Minor clarification: the call stack frames are unwound rather than cleared. The stack doesn't clear completely; it unwinds to the catching frame.
If any values were stored on the stack, they're gone at that point, _unless_ they were pushed before invoking the throwing operation. I actually don't see how local variables are different, as both operand stacks and local variables are stored into call stack frames. If the frame is unwound, the values are gone. If the frame and/or local variables are established before the throwing call, they still exist after a throw, otherwise not.
Hi Ville,
On Fri, Dec 31, 2010 at 5:26 PM, Ville Voutilainen ville.voutilainen@gmail.com wrote:
The JVM clears the stack when it enters an exception handler (in Java
Minor clarification: the call stack frames are unwound rather than cleared. The stack doesn't clear completely, it unwinds to the catching frame.
If any values were stored on the stack, they're gone at that point, _unless_ they were pushed before invoking the throwing operation.
That's really not what I see happening in the verification process. What I see is that the stack of the current frame gets completely cleared. There's only 1 value on the stack when an exception handler is invoked, which is the exception object. Or at least, this is what the verifier assumes. This link (10th bullet) seems to support that: "When an exception is thrown, the contents of the operand stack are discarded." [http://java.sun.com/docs/books/jvms/second_edition/html/ClassFile.doc.html#1...].
This is my point: we are currently doing things which - if written in Java - would look like this:
MyClass.execute(Symbol.KEYWORD,
                try { invoke_some_lisp_code(); }
                catch (Go g) { return-from-try ThisClass.constantValue; });
which first puts Symbol.KEYWORD on the stack, but the stack gets cleared out when "catch (Go g)" is triggered.
I actually don't see how local variables are different, as both operand stacks and local variables are stored into call stack frames. If the frame is unwound, the values are gone.
Right. But the frame where the exception is "handled" isn't being unwound. Its stack is being cleared, though; meaning that anything that was on that stack isn't there anymore. *That* is our stack inconsistency issue.
If the frame and/or local variables are established before the throwing call, they still exist after a throw, otherwise not.
Well, it would have been great if that were the case, because then there wouldn't be much of a stack issue. But as I said, I see different behaviour. And for the first time in 2.5 years of searching the JVM spec, I found supporting documentation for it.
I hope this clears up some of the mysteries related to the hoops we have to jump through to work on the JVM.
Bye,
Erik.
On 1 January 2011 01:52, Erik Huelsmann ehuels@gmail.com wrote:
Minor clarification: the call stack frames are unwound rather than cleared. The stack doesn't clear completely, it unwinds to the catching frame. If any values were stored on the stack, they're gone at that point, _unless_ they were pushed before invoking the throwing operation.
That's really not what I see happening in the verification process. What I see is that the stack of the current frame gets completely cleared. There's only 1 value on the stack when an exception handler is invoked, which is the exception object. Or at least, this is what the verifier assumes. This link (10th bullet) seems to support that: "When an exception is thrown, the contents of the operand stack are discarded." [http://java.sun.com/docs/books/jvms/second_edition/html/ClassFile.doc.html#1...].
Ok, it clears the operand stack, and unwinds the call stack to the frame that has the catch. The terminology is nicely overloaded. :) That explains your findings, if the opstack is cleared but local variables aren't (well, there's no reason why the latter would be).
This is my point: we are currently doing things which - if written in Java - would look like this: MyClass.execute(Symbol.KEYWORD, try { invoke_some_lisp_code(); } catch (Go g) { return-from-try ThisClass.constantValue; } ); which first puts Symbol.KEYWORD on the stack, but the stack gets cleared out when "catch (Go g)" is triggered.
It puts Symbol.KEYWORD on the operand stack, yes. We need to distinguish these ambiguities. ;) The local variables are on a different stack, which contains the stack frames, which contain the opstack and the local variables, ultimately, as far as I understand. Hence my confusion about the difference.
Right. But the frame where the exception is "handled" isn't being unwound. Its stack is being cleared, though; meaning that anything that was on that stack isn't there anymore. *That* is our stack inconsistency issue.
Sounds comprehensible, yes. It also sounds somewhat logical: the opstack is for building operands for an instruction, and if a throw occurs, the instruction may or may not have popped the opstack, so it's illogical to assume that anything left on the opstack would be useful for anything, as one can't reasonably know where the throw occurred and what's left on the opstack. It makes sense to me.
So yes, the catching frame of the call stack isn't unwound, but the opstack in that frame is cleared. If we're trying to assume that the opstack contains sane values in the presence of exceptions, it seems to me we're playing tricks with the JVM facilities.
Hi Ville,
On Sat, Jan 1, 2011 at 1:08 AM, Ville Voutilainen ville.voutilainen@gmail.com wrote:
On 1 January 2011 01:52, Erik Huelsmann ehuels@gmail.com wrote:
Minor clarification: the call stack frames are unwound rather than cleared. The stack doesn't clear completely, it unwinds to the catching frame. If any values were stored on the stack, they're gone at that point, _unless_ they were pushed before invoking the throwing operation.
That's really not what I see happening in the verification process. What I see is that the stack of the current frame gets completely cleared. There's only 1 value on the stack when an exception handler is invoked, which is the exception object. Or at least, this is what the verifier assumes. This link (10th bullet) seems to support that: "When an exception is thrown, the contents of the operand stack are discarded." [http://java.sun.com/docs/books/jvms/second_edition/html/ClassFile.doc.html#1...].
Ok, it clears the operand stack, and unwinds the call stack to the frame that has the catch. The terminology is nicely overloaded. :) That explains your findings, if the opstack is cleared but local variables aren't (well, there's no reason why the latter would be).
This is my point: we are currently doing things which - if written in Java - would look like this: MyClass.execute(Symbol.KEYWORD, try { invoke_some_lisp_code(); } catch (Go g) { return-from-try ThisClass.constantValue; } ); which first puts Symbol.KEYWORD on the stack, but the stack gets cleared out when "catch (Go g)" is triggered.
It puts Symbol.KEYWORD on the operand stack, yes. We need to distinguish these ambiguities. ;) The local variables are on a different stack, which contains the stack frames, which contain the opstack and the local variables, ultimately, as far as I understand. Hence my confusion about the difference.
Heh. Good thing we have established the meaning of the JVM vocabulary here :-)
[snipped]
So yes, the catching frame of the call stack isn't unwound, but the opstack in that frame is cleared. If we're trying to assume that the opstack contains sane values in the presence of exceptions, it seems to me we're playing tricks with the jvm facilities.
Right. Well, it isn't so much that we're expecting sane values on the opstack in the presence of exceptions; it's more that we currently have infrastructure to 'roughly' estimate whether there may be exception-related special forms in the code to be compiled. However, this infrastructure only provides estimates, trying to err on the safe side. It does err on the safe side, marking most potentially problematic situations (and many more besides) as 'unsafe'. However, since we still see stack inconsistency errors, my guess is that we need an exact solution (instead of the current estimate).
The solution - which is what my proposal is about - is to stop rewriting code in pass1 and instead address the issue in pass2, where it can be detected accurately.
Bye,
Erik.
On 1 January 2011 02:42, Erik Huelsmann ehuels@gmail.com wrote:
The solution - which is in my proposal - is to stop rewriting code in pass1, but address the true issue in pass2, where the actual issue can accurately be detected.
That sounds like a post-0.24 job; we should probably fix the currently reported stack inconsistencies in whatever way we can and do this structural work for 0.25.
On Sat, Jan 1, 2011 at 1:10 PM, Ville Voutilainen ville.voutilainen@gmail.com wrote:
On 1 January 2011 02:42, Erik Huelsmann ehuels@gmail.com wrote:
The solution - which is in my proposal - is to stop rewriting code in pass1, but address the true issue in pass2, where the actual issue can accurately be detected.
That sounds like a post-0.24 job, we should probably fix the current stack inconsistencies reported in whatever way we can and do this structural work for 0.25.
Ok. Well, ticket #117 is fixed, at a slight performance cost in the compiler. But the fix was extremely simple; that's the advantage: there's hardly any chance of regressions.
Bye,
Erik.
On Fri, Dec 31, 2010 at 8:47 AM, Erik Huelsmann ehuels@gmail.com wrote:
Hi Alan,
On Fri, Dec 31, 2010 at 4:30 AM, Alan Ruttenberg alanruttenberg@gmail.com wrote:
a) I don't know internals well enough to decipher the proposal.
That's no problem, I'll gladly answer any questions that you may have. I was really deep into the details when writing the proposal, so, I probably missed some of the context when trying to write down the higher level issue.
Is allocation on the stack a performance optimization?
Yes and no. It's a logical consequence of the structure of the JVM: doesn't have registers like most CPUs, meaning that the only way to pass operands to the JVM instructions is by pushing them onto the stack. Equally, all operations return values on the stack, if any. This is true for all operations, including function calls.
However, the choice to actually leave the values on the stack, instead of saving them to local variables and reloading them later, could be considered an optimization. Saving to local vars - if taken to its extreme - would lead to the following byte code as the compiled version of "(if (eq val1 val2) :good :bad)", assuming val1 and val2 are local variables (let bindings):
  aload 1
  astore 3
  aload 2
  astore 4
  aload 3
  aload 4
  ifacmp :label1
  getstatic ThisClass.SYMBOL_BAD
  astore 5
  goto :label2
:label1
  getstatic ThisClass.SYMBOL_GOOD
  astore 5
:label2
  aload 5
Currently the same code compiles to:
  aload 1
  aload 2
  ifacmp :label1
  getstatic ThisClass.SYMBOL_BAD
  goto :label2
:label1
  getstatic ThisClass.SYMBOL_GOOD
:label2
Although I admit that I have no idea about the cost of performance, I'd estimate a significant growth in the size of our JAR file at least.
If so has any metering been done to see whether it actually impacts performance?
No (because I haven't perceived it as a performance optimization per se, but as a size optimization too).
Not sure that small JAR size is an important consideration. However, the JVM has a 64 KB per-method bytecode limit - does ABCL have a workaround for that? If yes, then doing the experiment (no optimization, profile) might be worth it. I expect that the JIT would do peephole optimization and eliminate the performance issue (though http://nerds-central.blogspot.com/2009/09/tuning-jvm-for-unusual-uses-have-s... seems worth having a look at). Even if not, it might be worth handling this in a separate optimization phase within ABCL's compiler.
If you would like opinions from me and perhaps others perhaps you could say a few more words about what the safety issue is? When is the JVM stack cleared?
The JVM clears the stack when it enters an exception handler (in Java typically a catch{} or finally {} block). This is by itself not necessarily a problem: LET/LET* and PROGV just rethrow the exception after unbinding the specials they might have bound.
The text that Ville pointed out to me doesn't mention this. I'm looking at http://java.sun.com/docs/books/jvms/second_edition/html/Concepts.doc.html#22... to try to understand better. I still don't get it. Are you referring to "When an exception is thrown, control is transferred from the code that caused the exception to the nearest dynamically enclosing catch clause of a try statement that handles the exception."? That would imply unwinding the stack. Interested in the response to Ville's message.
However, TAGBODY/GO, BLOCK/RETURN and CATCH/THROW are constructs which catch exceptions in the Java world and continue processing in the current frame. This is a problem: if any values had been collected on the stack before the TAGBODY form, they've now "disappeared" in some of the code-paths. Most of the negative effects have been eliminated by rewriting code into constructs which don't cause the same behaviour, so, normally, you shouldn't see this happening.
b) If you are going to be thinking about compiler architecture, I would very much like to see some thought going into debuggability. My current impression is that the trend is downward, with more and more cases of not even being able to see function call arguments in SLIME. For example, at this point in ABCL's development, I think having a compiler option that trades performance for the ability to view local variables in the debugger should be an increasing priority.
This compiler option should be hooked to the OPTIMIZE DEBUG declaration, if you ask me.
Sounds right.
Others' mileage may vary, but in my case, where the bulk of time is spent in Java libraries of various sorts, improving Lisp performance is a distinctly lower priority than improving developer productivity by making it easier to debug.
Thanks! This is very valuable feedback. It's often very hard to determine what next steps to take; it's easy to focus on performance, since it's easily measurable. However, performance isn't the only thing which influences the development cycle. It'd be great to discuss the kind of things ABCL would need to add for debuggability using a number of real-world problems: it'll make the problem much more tangible (at least to me).
Mark named a few. For me the current bottleneck is visibility - in an exception I can't see where I am and what's going on. I'd like to know where in my code I am, and what the state of variables is. Another one would be beefing up trace and adding an advise facility. One wants to do things like trace methods, to have code run before or after, etc. Can supply more details if desired, but the description in http://ccl.clozure.com/ccl-documentation.html is what I am familiar with.
Glad to elaborate further. I'll respond to Mark's message.
-Alan
armedbear-devel@common-lisp.net