Hi,
I just wanted to report on a little success story.
In the recent few months, I have been working on a new project that concerns itself with parallel programming. For that purpose, I experimented with variations of a home-grown work-stealing scheduler. (See http://en.wikipedia.org/wiki/Cilk#Work-stealing for a description of what work stealing is.) The development platform is LispWorks 6.0, which comes with an excellent library for SMP. (I would say it's the best of any dynamic language I am currently aware of, not just those of Lisp dialects.)
Performance matters in parallel programming. The benchmark application was a structured grid computation, like is for example used for computing heat equations, for running game of life, or for similar applications. This is easy to parallelize, because each processor can perform part of the computation on a sub-grid independently of all the other processors.
The hardware that I ran the experiments on is a 4x 6-core Intel Xeon "Dunnington" 2.67 GHz running Ubuntu.
For 200 iterations on a 4096x4096 matrix of single float values, I got the following runtimes: - with 8 cores ca. 19-20 secs. - with 16 cores ca. 9-10 secs. - with 24 cores ca. 8 secs.
The same experiment was also implemented by others in our group in the following languages: Unified Parallel C, Cilk++, C with MPI, OpenMP, and C++ with Threading Building Blocks. They all have slight variations in runtime, but all roughly in the same ballpark. The runtimes were: - with 8 cores ca. 10-11 secs. - with 16 cores ca. 8-10 secs. - with 24 cores ca. 8-10 secs.
The advantage of using Common Lisp is that I could still use higher-order constructs (closures) and have a pretty decent separation of concerns in the software architecture, while still getting very good efficiency. Of course, this required some pretty low-level tricks with type and optimization declarations in several places, but the nice thing about Common Lisp is that you don't have to be low-level right from the start, and only at the places that you identify as bottlenecks.
Unfortunately, I cannot present a lot more details at this stage, because most of this is covered by an NDA. I hope, though, that I can make the work-stealing scheduler available as an open source library at some later stage.
Pascal