Pascal Costanza wrote:
Parallel GC is no problem and implemented.
Which CL implementations have a parallel GC?
I'm a little late to the conversation, but we have been experimenting with a multi-threaded garbage collector that runs mostly in parallel with the lisp application code. The collector uses a mark-sweep algorithm in which the application code runs in parallel with the collector except for a few brief pauses.
Application threads must pause briefly during the mark phase to allow references in their active frames to be recorded. This pause is relatively short. Our current test system, heavily instrumented for validation and performance-measurement purposes, and using just one thread for the marking, is showing typical pauses for this initial stack scan under 1 msec. After these initial references are collected, the application threads run while the marker follows all references to produce a live-object map that is almost correct. With that initial map complete, all threads are stopped briefly to collect any new references and add them to the map. In our test system these reconciliation pauses are typically under 15 msecs and almost always under 25 msec.
After the live map has been completed, the application threads are released to continue processing while the collector scans the live map, collecting the storage occupied by dead objects into free pools.
Once this is done, the collector can cycle back to do a new mark phase.
While we are currently testing the marker running on just one thread, the code is essentially the same as the multi-threaded global gc Allegro has had for several years now. The number of threads used by the marker is a configurable option; experience suggests we can cut the pause times in half by using three or four threads.
Mark-sweep collectors can be subject to storage fragmentation. Memory is divided into areas each of which holds objects of one size. As objects leave the live set, their storage is collected into pools according to size. When several areas holding objects of the same size become partially free, it can be useful to relocate live objects from some of those areas into the others, reducing the memory footprint of the live set and making the abandoned area available for objects of some other size. Once the mark phase is complete, we have enough information to recognize areas that could benefit from this sort of reorganization. In order to move an object, however, we must adjust all pointers to that object. This can only be done while all the application threads are paused, and in order to keep application pauses short, it's essential that we find all the in-heap references to those objects without having to scan the entire heap.
Fortunately, the mark phase can do the work for us, providing us with a map of those areas holding references to one of the objects we wish to relocate. At the end of the second pause-the-application-threads mark phase, we can see how much time we've kept the application inactive, and estimate how much time it will take to fix the pointers to the objects we want to move. If the pase so far plus the estimated additional time is less than a settable amount, we can perform the relocation and reap the benefits. If not, we can check again next time.
Current efforts revolve around global optimization and the proper set of user-visible controls and reports to help optimize performance. The goal of the concurrent gc is to let us apply some of the extra memory and cpu power available in today's machines toward elimination of the lengthy and unpredictable gc pauses that can make lisp unattractive to human users. Real-time response isn't achieved, and isn't the goal.
I know people will want to know when it's going to be released, but we don't currently have a release date set.
Kevin Layer