Hi Sven,
I don't know if you or anyone else is interested, but I have implemented xml and sexp serialization/deserialization of simple arrays -- I needed it for an app that uses cl-prevalence. I've attached the patch.
BTW, I would like to say that cl-prevalence is fantastic. We've been using it for five non-trivial (>25 classes, avg. 3000 instances per class) webapps without a hitch for almost a year now.
Regards,
Mike
-- Michael J. Forster mike@sharedlogic.ca
Mike,
On 05 Apr 2007, at 17:19, Michael J. Forster wrote:
> I don't know if you or anyone else is interested, but I have implemented xml and sexp serialization/deserialization of simple arrays -- I needed it for an app that uses cl-prevalence. I've attached the patch.
The patch is OK in terms of code (I guess it is working fine in your situation), but I am not sure that it is conceptually correct (but maybe I am wrong).
According to my reading of the CLHS, the type simple-array by itself does not guarantee what I would call a homogeneous array (an array whose elements all have the same type). The typespecs '(simple-array *) and '(simple-array <element-type>) would refer to this, but I don't know whether you can use them in method signatures.
Even so, the array-element-type could very well be too general, like T or cons or array. In that case, your serialization code fails to take shared and circular references into account (you are effectively assuming more primitive, non-shared, non-circular element types, which probably works in the way you are using CL-PREVALENCE).
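The sharing concern can be illustrated with nothing but the standard printer and reader. This is a minimal sketch, not cl-prevalence code; the function names are hypothetical. A vector with element type T holds the same list twice, and a round trip that ignores sharing returns two separate copies, while one that labels shared parts keeps them identical.

```lisp
;; Illustration only: standard printer and reader, not cl-prevalence.
;; All function names here are hypothetical.

(defun make-sharing-vector ()
  "Return a general vector whose two elements are the very same list."
  (let ((cell (list 1 2)))
    (vector cell cell)))

(defun first-two-elements-shared-p (vector)
  "True if elements 0 and 1 of VECTOR are the same object."
  (eq (aref vector 0) (aref vector 1)))

(defun print-read-round-trip (object &key detect-sharing)
  "Print OBJECT to a string and read it back.
With DETECT-SHARING true, *PRINT-CIRCLE* makes the printer emit
#n= and #n# labels, which the reader turns back into shared structure."
  (with-standard-io-syntax
    (let ((*print-circle* detect-sharing))
      (values (read-from-string (prin1-to-string object))))))
```

Serialization code that walks the elements one by one without tracking identity behaves like the first kind of round trip, which is harmless for numbers or characters but changes the object graph for conses, arrays, or instances.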
So, as I see and understand it now, your code would be OK if we further qualify it with a test that the array-element-type is somewhat 'primitive'. But I am not sure how to express that in the method signature or how to test/enforce it in code. Maybe we need a custom type predicate?
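One way such a predicate might look is sketched below. Everything in it is an assumption for illustration: the names are hypothetical, "primitive" is taken to mean a non-empty subtype of NUMBER or CHARACTER, and none of it is part of cl-prevalence. Since a method can specialize on the class ARRAY but not on a compound typespec, the check happens inside the method body.

```lisp
;; Hypothetical sketch; none of these names belong to cl-prevalence.

(defun primitive-element-type-p (type)
  "True if TYPE is a non-empty subtype of NUMBER or CHARACTER."
  (and (not (subtypep type nil))
       (or (subtypep type 'number)
           (subtypep type 'character))))

(defun primitive-array-p (object)
  "True if OBJECT is an array whose actual element type is primitive."
  (and (arrayp object)
       (primitive-element-type-p (array-element-type object))))

(defgeneric serialization-strategy (object)
  (:documentation "Return :FLAT when elements cannot be shared, else :GENERAL."))

;; Dispatch is on the class ARRAY; the element type is tested at run time.
(defmethod serialization-strategy ((object array))
  (if (primitive-array-p object) :flat :general))

(defmethod serialization-strategy ((object t))
  :general)
```

Note that ARRAY-ELEMENT-TYPE reports the upgraded element type, which is implementation-dependent. An array requested with :element-type 'rational may well be upgraded to T, in which case this predicate would reject it and the general path would be taken.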
Also, it would be very helpful if we had unit tests covering your extended serialization special cases.
Anyway, your patch would be an important optimization for better/faster serialization in some important cases!
> BTW, I would like to say that cl-prevalence is fantastic. We've been using it for five non-trivial (>25 classes, avg. 3000 instances per class) webapps without a hitch for almost a year now.
That is very nice to hear: could you give some more details, like:
- what CL implementation you are using?
- what serialization you are using?
- the typical sizes of your transaction and snapshot files?
- total number of objects under prevalence, 75000?
- rate of change (transaction log growth per day or so)?
- size of the image?
- machine details?
- do you have any GC problems?
- anything else you want to share
Regards,
Sven
On 2007-04-06, at 03:34, Sven Van Caekenberghe wrote:
> Mike,
> On 05 Apr 2007, at 17:19, Michael J. Forster wrote:
>> I don't know if you or anyone else is interested, but I have implemented xml and sexp serialization/deserialization of simple arrays -- I needed it for an app that uses cl-prevalence. I've attached the patch.
> The patch is OK in terms of code (I guess it is working fine in your situation), but I am not sure that it is conceptually correct (but maybe I am wrong).
No, you are correct, and, in my haste, I posted the patch without fully describing my scenario or intentions. My apologies.
> According to my reading of the CLHS, the type simple-array by itself does not guarantee what I would call a homogeneous array (an array whose elements all have the same type). The typespecs '(simple-array *) and '(simple-array <element-type>) would refer to this, but I don't know whether you can use them in method signatures.
> Even so, the array-element-type could very well be too general, like T or cons or array. In that case, your serialization code fails to take shared and circular references into account (you are effectively assuming more primitive, non-shared, non-circular element types, which probably works in the way you are using CL-PREVALENCE).
> So, as I see and understand it now, your code would be OK if we further qualify it with a test that the array-element-type is somewhat 'primitive'. But I am not sure how to express that in the method signature or how to test/enforce it in code. Maybe we need a custom type predicate?
Yes, method signatures, one of my bigger CL gripes, though I do appreciate the reasons that the CLOS designers allowed dispatch on class rather than type, including compound typespecs. (It's like complaining that Feanor's Silmarils didn't come in orange. ;-)
I think you nailed the issue in your second-to-last sentence above. To my thinking, non-vector arrays are concrete types as opposed to the more abstract vectors and lists and even more abstract sequences. One has to qualify the non-vector array element type on a case-by-case basis, which is perfectly acceptable -- and expected -- at the application level, but not for reusable libraries. Hence, the non-viability of my patch.
Really, what I wanted to do was extend the cl-prevalence serialization/deserialization for my-application-specific-2D-array-of-rationals by writing methods in my application sources. However, while serialize-xml-internal and serialize-sexp-internal are generic functions, the corresponding deserialization functions are not. So, with barely an hour to deliver a feature, I hacked the ugly hack ;-)
Perhaps the deserialization functions could be reworked as GFs, allowing complete application-specific extension? I would be happy to help out if you're interested.
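A rough, self-contained sketch of what tag-dispatched deserialization through a generic function could look like follows. It does not use or reproduce the cl-prevalence API or its file formats: the tag :rational-matrix, the form layout, and every function name are assumptions made up for illustration. The point is only that an application can add a DEFMETHOD with an EQL specializer for its own tag without touching library code.

```lisp
;; Hypothetical sketch; not the cl-prevalence API or its sexp format.

(defgeneric deserialize-tagged (tag body)
  (:documentation "Rebuild an object from BODY, dispatching on TAG."))

(defun deserialize-form (form)
  "Deserialize FORM if it looks like (keyword . body), else return it as is."
  (if (and (consp form) (keywordp (first form)))
      (deserialize-tagged (first form) (rest form))
      form))

;; Application-side extension for a 2D array of rationals.

(defun array-rows (array)
  "Return the contents of the 2D ARRAY as a list of row lists."
  (loop for i below (array-dimension array 0)
        collect (loop for j below (array-dimension array 1)
                      collect (aref array i j))))

(defun serialize-rational-matrix (matrix)
  "Return a tagged list holding the dimensions and rows of MATRIX."
  (list :rational-matrix (array-dimensions matrix) (array-rows matrix)))

(defmethod deserialize-tagged ((tag (eql :rational-matrix)) body)
  (destructuring-bind (dimensions rows) body
    (make-array dimensions :initial-contents rows)))
```

Because the elements are rationals, which cannot contain references to other objects, this particular extension does not need the shared or circular reference bookkeeping discussed earlier in the thread.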
>> BTW, I would like to say that cl-prevalence is fantastic. We've been using it for five non-trivial (>25 classes, avg. 3000 instances per class) webapps without a hitch for almost a year now.
> That is very nice to hear: could you give some more details, like:
> - what CL implementation you are using?
We develop with LW 4.4 and 5.0 on Mac and Windows; we deploy to CMUCL 19b on FreeBSD and LW 5.0 on Mac.
> - what serialization you are using?
We've tried both and would prefer to use the sexp format for its greater readability. However, we started with xml and haven't had an opportunity to change it.
> - machine details?
Dell 2U Intel P4 3.2GHz 4GB RAM 160GB usable disk, RAID1
Apple Xserve G5 Dual 2.3GHz 2GB RAM 140GB usable disk, RAID 5
> - the typical sizes of your transaction and snapshot files?
> - total number of objects under prevalence, 75000?
> - rate of change (transaction log growth per day or so)?
> - size of the image?
I will collect some stats over the next few weeks and post them.
> - do you have any GC problems?
None that we've detected, though, without any outward signs of memory exhaustion, dying processes, or poor overall application performance, we haven't gone looking for trouble. I will start recording the GC stats as well.
> - anything else you want to share
Probably, yes, though I need to find some time to organize my thoughts.
Suffice it to say, we've built a substantial database management layer atop cl-prevalence, and, often, when I try to explain it to customers or business partners, most can't understand why we didn't just use SQL, some object-relational mapping package, and so forth.
It's hard to explain, given my rather unique experience in the database application market. My first employer and mentor, Dave Voorhis, is the author of one of only a handful of true relational database management systems:
http://dbappbuilder.sourceforge.net/Rel.html
If I can't convince someone that a SQL DBMS is not an RDBMS, then I can't begin to explain why we don't use SQL, why we went to the trouble of building our own DBMS, and why we can, legitimately, call it an RDBMS in spite of the word "prevalence" and the associated flame-fest.
Anyway, sorry, the rant wasn't meant for you. :-) Simply covering my corporate butt in case a customer or competitor ever reads this and attempts to misrepresent our position. In the end, cl-prevalence is a real boon to our work. If you have a PayPal button for the project, I would happily click it!
Regards,
Mike
-- Michael J. Forster mike@sharedlogic.ca
On 06 Apr 2007, at 19:24, Michael J. Forster wrote:
This sounds like an awesome application! I am glad CL-PREVALENCE helped you in achieving your goals.
Regards,
Sven
cl-prevalence-devel@common-lisp.net