forwarding
---------- Forwarded message ----------
From: Alessio Stalla alessiostalla@gmail.com
Date: Wed, Oct 28, 2009 at 11:20 PM
Subject: Re: [j-devel] Improving startup time: sanity check
To: Erik Huelsmann ehuels@gmail.com
Cc: armedbear-j-devel@lists.sourceforge.net, Alex Muscar muscar@gmail.com
On Wed, Oct 28, 2009 at 9:15 PM, Erik Huelsmann ehuels@gmail.com wrote:
Last weekend, we experimented with better autoloading. It turned out to shave roughly 0.4 seconds off a cold startup time of 1.7 seconds, roughly a 25% improvement.
However, the reason we started on the startup time improvements in the first place was ABCL's startup time on Google App Engine. It turns out that our CPU usage during startup hasn't really decreased much (as per their benchmark indicator - they can't give an actual figure).
So I asked for advice on #appengine (on freenode). Their reaction was "we can't imagine the startup time being related to the size of the JAR", even though Peter Graves calculated a 34% ratio between ABCL and Clojure jar sizes and a 35% ratio between startup times - which looks like a linear relationship. Their reaction continued: "you're probably just doing too much work during the init() phase."
The init() phase is where the ABCL environment gets loaded and all function objects get created.
Let's assume for a second that they're right. In that case we must assume it's not I/O holding us up: it's the work the CPU must do to get us up and running. If that's true, profiling the application should tell us something about the bottlenecks we're running into. I happen to have done quite a number of such profiles over the course of the last week. The conclusion that stands out is that ABCL, during the startup process, spends roughly 40% of its time finding class constructors: the main component of creating function objects.
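To make that concrete: the hot path is essentially a reflective lookup-and-instantiate sequence along the lines sketched below. This is a simplified stand-in for illustration only, not ABCL's actual loader code; the classloader and class name are placeholders.

import java.lang.reflect.Constructor;

// Simplified stand-in for the reflective work done per compiled function:
// locate the class, find its constructor, create the function object.
// Not ABCL's actual loader code; the names here are placeholders.
public class ReflectiveFunctionLoad {
    static Object instantiate(ClassLoader loader, String className) throws Exception {
        Class<?> c = loader.loadClass(className);   // locate the compiled-function class
        Constructor<?> ctor = c.getConstructor();   // constructor lookup: the ~40% hot spot
        return ctor.newInstance();                  // build the function object
    }
}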
This brought me to the conclusion that our startup process could be much faster if we delayed function object creation until the function is actually used: instead of constructing every function object as soon as one of its siblings is loaded, we would only create it the first time it is needed.
The idea is to create another Autoload derivative which will be "installed" in the appropriate places and which, when invoked, loads the actual class from the byte array. I'm hoping this will result in a more evenly spread "initialization load". The performance hit will only be on the first call to the function: after the function has been converted from the byte array, the autoload object will remove itself from the function call chain.
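A rough sketch of what such a proxy could look like follows. The names (LazyFunctionProxy, resolve) are hypothetical, not existing ABCL classes; in ABCL it would presumably be another Autoload subclass, but the mechanism is the same: keep the byte array, define and instantiate the class on first call, then get out of the way.

// Hypothetical sketch of the proposed proxy: the compiled function's
// bytecode is kept as a byte[], and the class is only defined and
// instantiated on the first call; afterwards the real function object
// should be installed in place of the proxy.
public class LazyFunctionProxy {
    private final String className;      // binary name of the compiled-function class
    private final byte[] classBytes;     // bytecode read eagerly, class defined lazily
    private volatile Object realFunction;

    public LazyFunctionProxy(String className, byte[] classBytes) {
        this.className = className;
        this.classBytes = classBytes;
    }

    // Called on the first invocation; the caller then replaces the proxy
    // with the returned object so it drops out of the call chain.
    public synchronized Object resolve() {
        if (realFunction == null) {
            try {
                Class<?> c = new ClassLoader(getClass().getClassLoader()) {
                    Class<?> define() {
                        return defineClass(className, classBytes, 0, classBytes.length);
                    }
                }.define();
                realFunction = c.getDeclaredConstructor().newInstance();
            } catch (ReflectiveOperationException e) {
                throw new RuntimeException("failed to load " + className, e);
            }
        }
        return realFunction;
    }
}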
So, how about it? Comments most welcome!
I have mixed feelings about the idea. I think it's clever; but I also think we (I, at least) need more data to know whether it will actually be beneficial.
If the goal is speeding up startup time in a context like App Engine - where not only Lisp, but the whole user application, will be loaded from scratch from time to time - then it is critical to know how many Lisp functions a generic application uses on average (both directly and indirectly). If it turns out that, say, 50% of Lisp is commonly used, then no matter how clever an autoloading scheme you implement, you'll cut loading times by roughly 50% at best. If getting constructors through reflection is really the bottleneck, and if we determine that using new instead of reflection is significantly faster (from a quick test of mine, it seems it *really* is [1]), then it might be sensible to avoid reflection altogether and devise another scheme. For example, the compiler-generated class X could contain in its static initialization block the equivalent of something like
Lisp.someThreadLocal.set(new X())
and loadCompiledFunction (or whatever it's called) could just fetch the instance from the thread-local; not very elegant, but if it speeds things up...
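Concretely, a toy version of that handoff might look like the following; the names FUNCTION_HANDOFF and CompiledFn42 are made up for illustration and are not ABCL's.

// Toy illustration of the thread-local handoff; FUNCTION_HANDOFF stands in
// for Lisp.someThreadLocal and CompiledFn42 for a compiler-generated class.
public class ThreadLocalHandoff {
    static final ThreadLocal<Object> FUNCTION_HANDOFF = new ThreadLocal<>();

    // A compiler-generated function class would publish an instance of
    // itself from its static initializer instead of waiting to be
    // instantiated reflectively.
    public static class CompiledFn42 {
        static {
            FUNCTION_HANDOFF.set(new CompiledFn42());
        }
    }

    // Stand-in for loadCompiledFunction: force static initialization of
    // the class, then pick the instance up from the thread-local.
    static Object loadCompiledFunction(String className) throws ClassNotFoundException {
        Class.forName(className, true, ThreadLocalHandoff.class.getClassLoader());
        Object fn = FUNCTION_HANDOFF.get();
        FUNCTION_HANDOFF.remove();
        return fn;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(loadCompiledFunction("ThreadLocalHandoff$CompiledFn42"));
    }
}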
Alessio
[1] This is the astounding result of a couple of runs of 50000 iterations (test files attached; times in ns):

REFLECTION: 16262373155  NEW:  84267527  % SLOWER: 19298
REFLECTION: 15917190176  NEW: 103681915  % SLOWER: 15351
REFLECTION: 15838714133  NEW:  77235481  % SLOWER: 20507

i.e. reflection as we use it is roughly 150-200 times slower than new, and that's on a very simple class with no superclasses and a single constructor! The test might be wrong, as I wrote it quickly and it's quite tricky. It does use the very same classloader as abcl, though (copy-pasted).
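The attached test files are not reproduced here; a minimal stand-alone comparison in the same spirit (constructor lookup through reflection on every iteration versus plain new, but without abcl's classloader, so the numbers will not match those above) could look like this:

import java.lang.reflect.Constructor;

// Minimal stand-alone comparison of reflective instantiation vs. new.
// This is not the attached test: it skips abcl's classloader entirely,
// so the numbers will not match those quoted above.
public class ReflectionVsNew {
    public static class Probe {}   // trivial class: no superclass logic, one default constructor

    public static void main(String[] args) throws Exception {
        final int iterations = 50000;
        Object sink = null;        // keep results reachable so the JIT can't drop the loops

        long t0 = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            Constructor<Probe> ctor = Probe.class.getConstructor();  // lookup inside the loop
            sink = ctor.newInstance();
        }
        long reflectionNs = System.nanoTime() - t0;

        long t1 = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink = new Probe();
        }
        long newNs = System.nanoTime() - t1;

        System.out.println("REFLECTION: " + reflectionNs);
        System.out.println("NEW:        " + newNs);
        System.out.println("RATIO:      " + (double) reflectionNs / Math.max(newNs, 1));
        if (sink == null) throw new AssertionError();
    }
}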
Erik Huelsmann writes:
This brought me to the conclusion that our startup process could be much faster if we delayed function object creation until the function is actually used: instead of constructing every function object as soon as one of its siblings is loaded, we would only create it the first time it is needed.
I haven't followed the actual issue, but the autoloading stuff makes abcl-svn feel pretty unresponsive at times, for example when it has to load in the actual pretty-printer code when trying to print a backtrace.
If you want to add more laziness, please make it optional.
-T.
On Thu, Oct 29, 2009 at 10:49 AM, Tobias C. Rittweiler tcr@freebits.de wrote:
I haven't followed the actual issue, but the autoloading stuff makes abcl-svn feel pretty unresponsive at times, for example when it has to load in the actual pretty-printer code when trying to print a backtrace.
If you want to add more laziness, please make it optional.
We could add a function, say system:resolve-all-autoloads, that you can call in your init file. Then abcl's startup will be pretty long, but later it won't be unresponsive.
On Thu, Oct 29, 2009 at 12:02 PM, Alessio Stalla alessiostalla@gmail.com wrote:
On Thu, Oct 29, 2009 at 10:49 AM, Tobias C. Rittweiler tcr@freebits.de wrote:
I haven't followed the actual issue, but the autoloading stuff makes abcl-svn feel pretty unresponsive at times, for example when it has to load in the actual pretty-printer code when trying to print a backtrace.
If you want to add more laziness, please make it optional.
We could add a function, say system:resolve-all-autoloads, that you can call in your init file. Then abcl's startup will be pretty long, but later it won't be unresponsive.
Well, the thing I'm talking about now is expressly *not* to autoload all the stuff from a file at once: just load the byte arrays (which is not the performance bottleneck) once and leave it at that until a specific function is required for the first time. This speeds up the autoload itself, but makes the first call to each function slower.
This mechanism is thus not really the same kind of autoloading as the one we already have. However, the idea of having a proxy which delays some resolving/loading work is the same.
I hope that clarifies the intent.
Bye,
Erik.