Hi again folks,
Maybe it's a bit early to be asking about hunchentoot, sbcl and performance... but heck, let me ask anyway. :-)
Trying the hunchentoot beta on SBCL 0.9.17, I ran tbnl-test, and uploaded a ~0.75MB binary file. (Which worked!) Testing with 'ab', I downloaded the file -- but I wasn't able to get a download faster than 359KB/sec, or 2 seconds per download, on a fast network. I also observed >95% CPU activity during the download (about 50/50 between user and system).
Profiling the :TBNL package, and downloading the file again, showed this:
  seconds  |     consed     | calls | sec/call |  name
--------------------------------------------------------
    1.740  |     35,748,752 |     1 | 1.739901 | HANDLE-STATIC-FILE
    0.004  |          8,184 |     1 | 0.003998 | TBNL::READ-HTTP-HEADERS
    0.004  |         24,912 |     6 | 0.000665 | TBNL::WRITE-HEADER-LINE/HTTP
and profiling :FLEX instead, then repeating the download, showed:
  seconds  |     consed     |   calls   | sec/call |  name
-------------------------------------------------------------
    0.009  |         81,920 |       352 | 0.000027 | FLEXI-STREAMS::READ-BYTE*
    0.000  |    106,051,824 | 1,488,124 | 0.000000 | FLEXI-STREAMS::WRITE-BYTE*
The consing numbers seem pretty high. :-) I'm not sure how to interpret the 0.0 seconds for the write-byte* calls -- a hundred million conses might be fast, but not instantaneous! Notably, the reported "profiling overhead" was ~15 seconds, so perhaps those zeros are mostly profiling error. (I'm not a SLIME-profiling expert -- advice is welcome.)
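(For reference, a minimal sketch of the per-package profiling used above, done directly with SBCL's built-in SB-PROFILE -- which is what SLIME's profiling commands should be driving under SBCL. The package name is the one from the report above; REPORT also prints the estimated profiling overhead mentioned there.)

(sb-profile:profile "TBNL")    ; instrument every function in the TBNL package
;; ... run the download with ab, then:
(sb-profile:report)            ; seconds, consed, calls and sec/call per function
(sb-profile:reset)             ; clear the counters between runs
(sb-profile:unprofile "TBNL")  ; remove the instrumentation again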
Is this degree of performance similar to what you see under Lispworks? I'm not throwing stones at beta code, just trying to interpret what I'm seeing here.
Thanks, Graham
I replied to your other email before I saw this one.
On Wed, 4 Oct 2006 15:50:59 -0400, "Graham Fawcett" graham.fawcett@gmail.com wrote:
Is this degree of performance similar to what you see under Lispworks? I'm not throwing stones at beta code, just trying to interpret what I'm seeing here.
I haven't really checked the performance yet - I usually try to make it work correctly first. If you want to make the whole thing faster, I think starting with FLEXI-STREAMS is a good idea. Throw a couple of optimization declarations at SBCL and see what Python (SBCL's compiler) has to say about them. CHUNGA might be a worthwhile target as well.
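(For illustration only -- WRITE-OCTET and its arguments are hypothetical, not actual FLEXI-STREAMS code; this just shows the style of declarations meant here. With (speed 3), SBCL's compiler prints efficiency notes wherever it cannot open-code the byte accesses.)

;; Globally: (declaim (optimize (speed 3) (safety 1) (debug 0)))
;; Or per hot function:
(defun write-octet (octet buffer index)
  ;; Hypothetical hot-path helper, shown only for the declaration style.
  (declare (optimize (speed 3) (safety 1) (debug 0))
           (type (unsigned-byte 8) octet)
           (type (simple-array (unsigned-byte 8) (*)) buffer)
           (type fixnum index))
  (setf (aref buffer index) octet))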
On Wed, 4 Oct 2006 15:50:59 -0400, "Graham Fawcett" graham.fawcett@gmail.com wrote:
  seconds  |     consed     |   calls   | sec/call |  name
-------------------------------------------------------------
    0.009  |         81,920 |       352 | 0.000027 | FLEXI-STREAMS::READ-BYTE*
    0.000  |    106,051,824 | 1,488,124 | 0.000000 | FLEXI-STREAMS::WRITE-BYTE*
The consing numbers seem pretty high. :-)
It just occurred to me that I know what the reason for this excessive consing is, and I think I also know how to fix it. I'll do that tomorrow - have to go to bed now... :)
On Thu, 05 Oct 2006 01:19:22 +0200, Edi Weitz edi@agharta.de wrote:
It just occurred to me that I know what the reason for this excessive consing is, and I think I also know how to fix it. I'll do that tomorrow - have to go to bed now... :)
Please try with the latest (0.6.4) version of FLEXI-STREAMS. That should make a difference.
Cheers, Edi.
On 10/5/06, Edi Weitz edi@agharta.de wrote:
Please try with the latest (0.6.4) version of FLEXI-STREAMS. That should make a difference.
I updated flexi-streams (and trivial-gray-streams, since the new flexi-streams required an update of it). I see quite an improvement:
  seconds  |  consed  | calls | sec/call |  name
-----------------------------------------------------
    1.044  |   36,832 |     1 | 1.043898 | TBNL:HANDLE-STATIC-FILE
    0.000  |        0 |     1 | 0.000000 | TBNL::READ-HTTP-HEADERS
    0.000  |   28,664 |     6 | 0.000000 | TBNL::WRITE-HEADER-LINE/HTTP
That reduces the conses for handle-static-file by about 99.9% (from ~35.7M down to ~37K).
On the FLEX side:
  seconds  |    consed    |   calls   | sec/call |  name
------------------------------------------------------------
    0.000  |   71,507,496 | 1,488,124 | 0.000000 | FLEXI-STREAMS::WRITE-BYTE*
    0.000  |       49,152 |       349 | 0.000000 | FLEXI-STREAMS::READ-BYTE*
That's ~71M conses, down from ~106M, for the write-byte* calls, and ~49K, down from ~82K, for read-byte*.
According to the wall-clock (never trust the wall-clock!), I'm getting downloads of my large binary at ~1800KB/sec, an apparent five-fold improvement over the earlier 359KB/sec.
Thanks! Though not exactly fast yet, this is usable for development purposes. I'll play with some optimization declarations, as you suggested, and see if I can squeeze out a bit more performance.
Regards, Graham
On Thu, 5 Oct 2006 11:24:26 -0400, "Graham Fawcett" graham.fawcett@gmail.com wrote:
Though not exactly fast yet, this is usable for development purposes.
Please try the just-released latest version of FLEXI-STREAMS (0.6.5) and see if you see more improvements.
On Fri, 06 Oct 2006 09:15:28 +0200, Edi Weitz edi@agharta.de wrote:
Please try the just-released latest version of FLEXI-STREAMS (0.6.5) and see if you see more improvements.
Sorry for the noise today, but a) FLEXI-STREAMS 0.6.5 was buggy, so you should use 0.6.6, and b) you should also use Chunga 0.2.0 for (hopefully) more performance improvements.
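(A quick way to confirm which versions actually got loaded -- a sketch that assumes the systems' .asd files declare a :version field:)

(asdf:component-version (asdf:find-system :flexi-streams))  ; expect "0.6.6"
(asdf:component-version (asdf:find-system :chunga))         ; expect "0.2.0"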
If all goes as planned, we probably won't need those ugly declarations to increase the speed of Hunchentoot.
Expect more betas and bugs this weekend... :)
On 10/6/06, Edi Weitz edi@agharta.de wrote:
Sorry for the noise today, but a) FLEXI-STREAMS 0.6.5 was buggy, so you should use 0.6.6, and b) you should also use Chunga 0.2.0 for (hopefully) more performance improvements.
Expect more betas and bugs this weekend... :)
I look forward to them! :-) Thanks very much, I'll keep testing.
Graham
On 10/6/06, Edi Weitz edi@agharta.de wrote:
Sorry for the noise today, but a) FLEXI-STREAMS 0.6.5 was buggy, so you should use 0.6.6, and b) you should also use Chunga 0.2.0 for (hopefully) more performance improvements.
You weren't kidding! I've just installed the updated Flexi-streams and Chunga. Now I'm seeing, oh, about a 150x improvement! No time right now for any real stats, but here's the dump of a (localhost) 'ab' test:
------
$ ab -n 250 http://localhost:3000/tbnl/test/files/imgp2926.jpg ....
Server Software:        Hunchentoot
Server Hostname:        localhost
Server Port:            3000

Document Path:          /tbnl/test/files/imgp2926.jpg
....
Document Length:        812011 bytes

Concurrency Level:      1
Time taken for tests:   2.928872 seconds
Complete requests:      250
Failed requests:        0
Write errors:           0
Total transferred:      203049750 bytes
HTML transferred:       203002750 bytes
Requests per second:    85.36 [#/sec] (mean)
Time per request:       11.715 [ms] (mean)
Time per request:       11.715 [ms] (mean, across all concurrent requests)
Transfer rate:          67701.83 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:     6   11   6.6      7      54
Waiting:        2    3   2.7      2      33
Total:          6   11   6.6      7      54

Percentage of the requests served within a certain time (ms)
  50%      7
  66%     15
  75%     15
  80%     15
  90%     16
  95%     17
  98%     20
  99%     43
 100%     54 (longest request)
------
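(As a sanity check on these numbers: 85.36 requests/sec x 812,011 bytes per document is roughly 69.3 MB/sec, which agrees with the reported transfer rate of 67,701.83 KBytes/sec.)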
Brilliant!
Cheers, Graham