[armedbear-devel] Design issue regarding condition printing (was: Error reporting problems)

In order to start addressing our issue with error reporting, I thought I'd start poking around at the source of all Java based exception classes (of which both PROGRAM-ERROR and SIMLE-ERROR are part). This source is the Condition java class. In the class 3 methods have been defined, all related to printing of (java based) errors: * getMessage() [overridden 13 times, 8 call sites] * getConditionReport() [not overridden, 1 call site] * writeToString() While the role of "writeToString()" is clear (it's called by PRINT to print all Java-based objects) and hence should function differently under different circumstances, the role separation between getMessage() and getConditionReport() isn't quite so well defined. This conclusion comes both from the extreme lack of documentation as well as the following: getConditionReport() calls getMessage() to find an "alternative message" to be reported. getMessage() finds out if there's a formatControl string; if there is, it calls writeToString() to generate a message, returning that to getConditionReport(). If getConditionReport() doesn't get an alternative from getMessage() to return back to the caller, it calls Lisp.format() to generate a formatted string to be presented to the user. However, this second path will never get called, except when there's no format string.... Lisp.format() explicitly checks for a null format control string and returns an empty string - which explains why we haven't seen crashes from the above code paths. So, how does that tie into the grand scheme of error reporting? We have two places where errors are being reported: 1. the lisp debugger, which prints errors using the real (lisp based) FORMAT function if it's autoloaded (but falls back to Lisp.format() if FORMAT isn't autoloaded yet) which is invoked by INVOKE-DEBUGGER-REPORT-CONDITION 2. the ERROR place holder Primitives.ERROR() which calls Condition.getConditionReport() The Lisp based FORMAT uses PRINT-OBJECT to print conditions. PRINT-OBJECT.lisp defines specialized instances for several classes in the condition class hieranchy. I know I added some in order to try to address the exact issue we're talking about here. The Java based Lisp.format() function uses <condition>.writeToString() to produce its output. With all these facts on the table, I think we need to do a few things to clean up the mess: - Any condition overriding getMessage() in order to call FORMAT should be removed - getMessage() should be documented *not* to use writeToString() on its own instance - getConditionReport() should be documented not to be overridden, but to override getMessage() instead (maybe declare 'final'?) - Remove all the special methods for classes derived from CONDITION in the print-object.lisp file: they should all fall back to the general method for printing conditions. - Both INVOKE-DEBUGGER-REPORT-CONDITION and Lisp.ERROR should bind the following variables to these listed values, in order to make sure conditions are printed for humans: *print-readably* nil --> all other *print-** variables bound like the values specified for WITH-STANDARD-IO-SYNTAX (except maybe *PRINT-ESCAPE*) The above looks like it should solve our issues. I'd greatly welcome any comments: it's a rather big change, judging by the description above. Ville, you were going to look at it, maybe you can verify if I'm half way right? Bye, Erik.

On Jan 21, 2011, at 16:48, Erik Huelsmann wrote:
*print-readably* nil --> all other *print-** variables bound like the values specified for WITH-STANDARD-IO-SYNTAX (except maybe *PRINT-ESCAPE*)
I don't think it's a good idea to explicitly bind *print-circle* to nil in anything debugging-related, including error printing, because then it's a source of nontermination (silent until out of memory, if the output is going to a string) if user code is working with circular structure. -- Kevin Reid <http://switchb.org/kpreid/>

Hi Kevin, On Fri, Jan 21, 2011 at 11:40 PM, Kevin Reid <kpreid@switchb.org> wrote:
On Jan 21, 2011, at 16:48, Erik Huelsmann wrote:
*print-readably* nil --> all other *print-** variables bound like the values specified for WITH-STANDARD-IO-SYNTAX (except maybe *PRINT-ESCAPE*)
I don't think it's a good idea to explicitly bind *print-circle* to nil in anything debugging-related, including error printing, because then it's a source of nontermination (silent until out of memory, if the output is going to a string) if user code is working with circular structure.
That's a good point. Rather, it may even be better to explicitly bind it to T. This remark made me realize that our Java side writeToString actually doesn't support *PRINT-CIRCLE* which can be an issue when trying to debug ABCL before it has loaded the ERROR and PRINT-OBJECT infrastructures. To make sure it's not forgotten, I've filed a ticket for that. Thanks for your feedback! Bye, Erik.

On 21 January 2011 23:48, Erik Huelsmann <ehuels@gmail.com> wrote:
With all these facts on the table, I think we need to do a few things to clean up the mess: - Any condition overriding getMessage() in order to call FORMAT should be removed - getConditionReport() should be documented not to be overridden, but to override getMessage() instead (maybe declare 'final'?) I'd greatly welcome any comments: it's a rather big change, judging by the description above. Ville, you were going to look at it, maybe you can verify if I'm half way right?
First of all, thanks for looking at the mess. Second, you really should leave some bits of the work to the rest of us. :) Anyway. Some comments, without having looked at the situation in detail: 1) disallowing overriding getMessage() feels wrong. The design may, and likely is, incomplete, but I got a very vague feeling that the function is serving a purpose. Now, I'm not entirely certain whether that purpose exists on the lisp or the java side. I don't have a clear picture of that yet. 2) if getConditionReport is not supposed to be overridden, then YES, absolutely make it final. This actually is a point of endless philosophical language debate. :) But, when we have a language at hand that can do such static semantics checks, let's use the checks and forbid overriding by using existing language facilities. Off the top of my head, the checklist doesn't look incorrect - I just have some doubts about it. I personally would like to do a bit of fiddling with the codebase, and see if I get the same picture. That will at most take an hour or two, and will probably result in my having a pretty good idea about how to fix the problems. So, let's give it a timeout of 1-2 days, if I can't produce anything useful within that timeframe, feel free to hack it? :) (I'm exhausted by work, no chance whatsoever to do fun stuff between Mon-Fri).

Hi Ville, On Fri, Jan 21, 2011 at 11:46 PM, Ville Voutilainen <ville.voutilainen@gmail.com> wrote:
On 21 January 2011 23:48, Erik Huelsmann <ehuels@gmail.com> wrote:
With all these facts on the table, I think we need to do a few things to clean up the mess: - Any condition overriding getMessage() in order to call FORMAT should be removed - getConditionReport() should be documented not to be overridden, but to override getMessage() instead (maybe declare 'final'?) I'd greatly welcome any comments: it's a rather big change, judging by the description above. Ville, you were going to look at it, maybe you can verify if I'm half way right?
First of all, thanks for looking at the mess. Second, you really should leave some bits of the work to the rest of us. :)
Anyway. Some comments, without having looked at the situation in detail: 1) disallowing overriding getMessage() feels wrong. The design may, and likely is, incomplete, but I got a very vague feeling that the function is serving a purpose. Now, I'm not entirely certain whether that purpose exists on the lisp or the java side. I don't have a clear picture of that yet.
Right. I think I should have said "discourage overriding". However, the original getMessage() should be reduced to "return null" - I think, because it's calling writeToString() which itself is calling getMessage() again...
2) if getConditionReport is not supposed to be overridden, then YES, absolutely make it final. This actually is a point of endless philosophical language debate. :) But, when we have a language at hand that can do such static semantics checks, let's use the checks and forbid overriding by using existing language facilities.
Off the top of my head, the checklist doesn't look incorrect - I just have some doubts about it. I personally would like to do a bit of fiddling with the codebase, and see if I get the same picture. That will at most take an hour or two, and will probably result in my having a pretty good idea about how to fix the problems.
So, let's give it a timeout of 1-2 days, if I can't produce anything useful within that timeframe, feel free to hack it? :) (I'm exhausted by work, no chance whatsoever to do fun stuff between Mon-Fri).
Sure. It's just that I thought I'd help you get started :-) Bye, Erik.

On Sat, Jan 22, 2011 at 10:41 AM, Erik Huelsmann <ehuels@gmail.com> wrote:
Anyway. Some comments, without having looked at the situation in detail: 1) disallowing overriding getMessage() feels wrong. The design may, and likely is, incomplete, but I got a very vague feeling that the function is serving a purpose. Now, I'm not entirely certain whether that purpose exists on the lisp or the java side. I don't have a clear picture of that yet.
Right. I think I should have said "discourage overriding". However, the original getMessage() should be reduced to "return null" - I think, because it's calling writeToString() which itself is calling getMessage() again...
getMessage() is useful when an exception escapes to Java and you want to understand what happened (it's part of the API of Throwable and used to log the exception). Currently ABCL is not very helpful in that regard: if you disable the Lisp debugger to have exceptions propagated to Java (as it's often a case when you use ABCL as a library) you get back anonymous exceptions with no information on what happened. That is, I think, in important area of improvement. Bye, Alessio

Hi Alessio, On Sat, Jan 22, 2011 at 11:12 AM, Alessio Stalla <alessiostalla@gmail.com> wrote:
On Sat, Jan 22, 2011 at 10:41 AM, Erik Huelsmann <ehuels@gmail.com> wrote:
Anyway. Some comments, without having looked at the situation in detail: 1) disallowing overriding getMessage() feels wrong. The design may, and likely is, incomplete, but I got a very vague feeling that the function is serving a purpose. Now, I'm not entirely certain whether that purpose exists on the lisp or the java side. I don't have a clear picture of that yet.
Right. I think I should have said "discourage overriding". However, the original getMessage() should be reduced to "return null" - I think, because it's calling writeToString() which itself is calling getMessage() again...
getMessage() is useful when an exception escapes to Java and you want to understand what happened (it's part of the API of Throwable and used to log the exception). Currently ABCL is not very helpful in that regard: if you disable the Lisp debugger to have exceptions propagated to Java (as it's often a case when you use ABCL as a library) you get back anonymous exceptions with no information on what happened. That is, I think, in important area of improvement.
Am I right to think that you'd do so by using Interpreter._DEBUGGER_HOOK_FUNCTION_ ? If that's true, then you're using Interpreter.UnhandledCondition to pass the error on to Java, including its description. Interpreter.UnhandledCondition uses <condition>.writeToString() instead of using getConditionReport(). At least with my proposed design, I'd advocate changing all calls to condition.writeToString() to condition.getConditionReport() in those cases which are meant for human readable output. I'd say the above should solve your issue. It did make me realize that it's not Primitives.ERROR which should bind the *PRINT-** variables, but it should be getConditionReport() because it may be invoked in multiple places. Summarizing: I think getMessage() should be protected; it's a method to change the behaviour of getConditionReport(). The latter should be public, because *that*'s the method which should be used all over the place. Agreed? Bye, Erik.

On Sat, Jan 22, 2011 at 12:03 PM, Erik Huelsmann <ehuels@gmail.com> wrote:
Hi Alessio,
On Sat, Jan 22, 2011 at 11:12 AM, Alessio Stalla <alessiostalla@gmail.com> wrote:
On Sat, Jan 22, 2011 at 10:41 AM, Erik Huelsmann <ehuels@gmail.com> wrote:
Anyway. Some comments, without having looked at the situation in detail: 1) disallowing overriding getMessage() feels wrong. The design may, and likely is, incomplete, but I got a very vague feeling that the function is serving a purpose. Now, I'm not entirely certain whether that purpose exists on the lisp or the java side. I don't have a clear picture of that yet.
Right. I think I should have said "discourage overriding". However, the original getMessage() should be reduced to "return null" - I think, because it's calling writeToString() which itself is calling getMessage() again...
getMessage() is useful when an exception escapes to Java and you want to understand what happened (it's part of the API of Throwable and used to log the exception). Currently ABCL is not very helpful in that regard: if you disable the Lisp debugger to have exceptions propagated to Java (as it's often a case when you use ABCL as a library) you get back anonymous exceptions with no information on what happened. That is, I think, in important area of improvement.
Am I right to think that you'd do so by using Interpreter._DEBUGGER_HOOK_FUNCTION_ ? If that's true, then you're using Interpreter.UnhandledCondition to pass the error on to Java, including its description.
That's true.
Interpreter.UnhandledCondition uses <condition>.writeToString() instead of using getConditionReport(). At least with my proposed design, I'd advocate changing all calls to condition.writeToString() to condition.getConditionReport() in those cases which are meant for human readable output.
I'd say the above should solve your issue. It did make me realize that it's not Primitives.ERROR which should bind the *PRINT-** variables, but it should be getConditionReport() because it may be invoked in multiple places.
Summarizing: I think getMessage() should be protected; it's a method to change the behaviour of getConditionReport(). The latter should be public, because *that*'s the method which should be used all over the place.
Agreed?
Yes. I was incorrectly believing that getMessage() was referring to Throwable.getMessage(), while it is an unrelated method in Condition.
participants (4)
-
Alessio Stalla
-
Erik Huelsmann
-
Kevin Reid
-
Ville Voutilainen