On Fri, May 21, 2010 at 5:22 AM, Alessio Stalla alessiostalla@gmail.com wrote:
On Fri, May 21, 2010 at 9:39 AM, Erik Huelsmann ehuels@gmail.com wrote:
A follow-up on my progress this week:
As described by Alessio, our loading-process profiles look to be dominated by reader functions. So, I've taken a look at what it actually is that we serialize. I found that many of the things we serialize today - which currently have to be restored by the reader - can be serialized in a form that doesn't need the reader at all: for example lists of symbols and other lists.
Except for the DECLARE-* functions related to function references, I have changed the externalization code to go through a single function: EMIT-LOAD-EXTERNALIZED-OBJECT. This function externalizes the object (if that hasn't happened already) and emits code to load a reference to the restored object. The actual serialization doesn't differ much from the original; the difference is that the boilerplate which used to be repeated in each of the DECLARE-* functions is no longer part of the serialization functions. I use a dispatch table to find the serialization function matching the object to be externalized.
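To give an idea of the shape - this is only a sketch for this mail, not the code that's on trunk, and all names in it are made up:

  ;; Sketch only: a table mapping object types to serializer functions.
  ;; *SERIALIZERS* and the lambdas below are placeholders, not the
  ;; identifiers used in compiler-pass2.lisp.
  (defvar *serializers*
    (list (cons 'symbol (lambda (obj) (format nil "code restoring symbol ~S" obj)))
          (cons 'cons   (lambda (obj) (format nil "code restoring list ~S" obj)))))

  (defun emit-load-externalized-object (object)
    ;; Find the first serializer whose type OBJECT satisfies and call it;
    ;; the real function also emits code to load the restored reference.
    (let ((entry (assoc object *serializers* :test #'typep)))
      (if entry
          (funcall (cdr entry) object)
          (error "Don't know how to externalize ~S" object))))

With a table like that, externalizing a nested object can simply call back into the same entry point for each component, instead of every DECLARE-* function carrying its own dispatch and boilerplate.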
That profiling result is what made me look at our current serialization mechanism in the first place. Roughly speaking, it consists of the functions in compiler-pass2.lisp whose names start with DECLARE-; that namespace contains functions for externalizing objects as well as for caching constant values.
The caching / pre-evaluation code is still in the DECLARE-* namespace; nothing has changed there, not even the boilerplate :-)
On trunk, I'm working to:

 * separate the caching namespace from the externalizing namespace
 * split serialization and restoring into separate functions (they were conflated in a single function for each type of object)
 * define serialization functions which allow recursive calling patterns, so nested objects can be serialized - and restored - without requiring the reader
These changes are mostly complete - enough to try the effect of serializing lists differently. We have lots of lists with symbols in them. These lists don't need to be read back; instead they can be constructed directly, along the lines of "new Cons(new Fixnum(1), new Cons(..., NIL));".
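To illustrate the idea (toy code only, not what I committed; I'm writing the output as Java source for readability, while the compiler of course emits the equivalent byte code, and intern(...) stands in for however symbols actually get restored):

  ;; Toy sketch: walk a list of fixnums, symbols and sublists and
  ;; produce the constructor chain the compiled class could execute,
  ;; so the fasl reader never sees the list at all.
  (defun externalize-as-cons-chain (object)
    (typecase object
      (null    "NIL")
      (fixnum  (format nil "new Fixnum(~D)" object))
      (symbol  (format nil "intern(~S)" (symbol-name object)))  ; placeholder
      (cons    (format nil "new Cons(~A, ~A)"
                       (externalize-as-cons-chain (car object))
                       (externalize-as-cons-chain (cdr object))))
      (t       (error "Would still need the reader for ~S" object))))

For example, (externalize-as-cons-chain '(1 2 foo)) yields
new Cons(new Fixnum(1), new Cons(new Fixnum(2), new Cons(intern("FOO"), NIL))).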
Yesterday I wrote code which does exactly that. Unfortunately, it had no measurable impact on our boot time.
So the conclusion must be that our fasl reader is good as it is: it gives us human-readable fasls, but with the side effect that we start up too slowly to be usable on - for example - Google App Engine.
Any ideas on improving our FASL format?
Ideas I've had myself:
* Reduce the length of the names of the functions ABCL uses to create fasls
* Embed documentation strings in the CLS files instead of keeping them in the FASL
* <Other things which reduce the size of a fasl>
Can we assume that the textual part of a FASL is ASCII and thus avoid UTF-8 conversions? My profiling suggests that conversion takes a lot of time.
Would that preclude having unicode string constants?
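Or could non-ASCII characters be escaped so the textual part stays pure ASCII while string constants keep their Unicode content? Something along these lines - only a sketch, and the \uXXXX notation is made up, not an existing fasl feature:

  ;; Sketch only: write STRING using ASCII characters exclusively;
  ;; anything outside the ASCII range becomes a \uXXXX escape that
  ;; the loader would have to decode again.
  (defun write-ascii-escaped (string stream)
    (loop for ch across string
          for code = (char-code ch)
          do (if (< code 128)
                 (write-char ch stream)
                 (format stream "\\u~4,'0X" code))))

That only pays off if non-ASCII strings are rare, of course, since the decoding cost just moves from the charset conversion to the escape handling.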
Alessio