I think I worked out the problem here: there are accesses to HashTables from unsynchronized get and put methods. Adding a 'synchronized' to the get/put methods in EqHashTable fixes this particular threading problem.
When the example below fails, it looks like EqHashTable.get is being called from Layout.getSlotIndex(LispObject slotName) (line 191). However, I can't find a way to fix the problem by synchronizing on getSlotIndex or on the slotTable variable. Perhaps there are direct get calls to the slotTable from someplace else?
I've attached a patch that shows where I had to add 'synchronized' to fix these problems (which have become a fairly large problem for me!). I'm unsure of the fix though, because it looks like HashTable was designed for accesses through the getHash and putHash methods, which are already synchronized. If you apply this fix, you probably also want to update EqlHashTable and EqualHashTable.
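To make the idea concrete, here is a minimal sketch of what synchronizing get and put buys. This is not the attached patch and not ABCL's actual EqHashTable: the class SyncEqTable, its Entry type, and the stress helper are all hypothetical names made up for illustration. The sketch assumes an identity-keyed (EQ-style) chained table that grows by rehashing, which is the kind of structure where an unsynchronized get can race with a put that is in the middle of rebuilding the bucket array.

```java
// Hypothetical stand-in for an identity-keyed table; NOT ABCL's EqHashTable.
final class SyncEqTable {
    private static final class Entry {
        final Object key;
        Object value;
        Entry next;

        Entry(Object key, Object value, Entry next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private Entry[] buckets = new Entry[16];
    private int count;

    // synchronized so a reader never sees a half-finished put or rehash.
    public synchronized Object get(Object key) {
        int index = indexFor(key, buckets.length);
        for (Entry e = buckets[index]; e != null; e = e.next) {
            if (e.key == key) {          // identity comparison, like EQ
                return e.value;
            }
        }
        return null;
    }

    // synchronized so two writers cannot corrupt a chain or lose an entry.
    public synchronized void put(Object key, Object value) {
        int index = indexFor(key, buckets.length);
        for (Entry e = buckets[index]; e != null; e = e.next) {
            if (e.key == key) {
                e.value = value;
                return;
            }
        }
        if (count + 1 > buckets.length * 3 / 4) {
            rehash();                    // runs while the lock is held
            index = indexFor(key, buckets.length);
        }
        buckets[index] = new Entry(key, value, buckets[index]);
        count++;
    }

    public synchronized int size() {
        return count;
    }

    private static int indexFor(Object key, int length) {
        return (System.identityHashCode(key) & 0x7fffffff) % length;
    }

    // Only called from put, so the caller already holds the lock.
    private void rehash() {
        Entry[] old = buckets;
        Entry[] fresh = new Entry[old.length * 2];
        for (Entry head : old) {
            for (Entry e = head; e != null; ) {
                Entry next = e.next;
                int i = indexFor(e.key, fresh.length);
                e.next = fresh[i];
                fresh[i] = e;
                e = next;
            }
        }
        buckets = fresh;
    }

    // Two threads each insert perThread distinct keys; returns the final size.
    static int stress(int perThread) {
        SyncEqTable table = new SyncEqTable();
        Runnable work = () -> {
            for (int i = 0; i < perThread; i++) {
                table.put(new Object(), i);
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ex);
        }
        return table.size();
    }
}
```

With both methods synchronized, SyncEqTable.stress(5000) should always report 10000 entries. Dropping the synchronized keywords from this sketch makes lost or corrupted entries possible, though a race like that will not necessarily show up on every run.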
Cheers,
-david k.
P.S.
Here is an updated code fragment that always fails for me before adding the synchronized keywords, and which now works (along with, as far as I can tell, the rest of my code).
(labels ((make-many ()
           #'(lambda ()
               (dotimes (i 5000)
                 (make-instance 'slot-value-test)))))
  (let ((thread1 (make-thread (make-many)))
        (thread2 (make-thread (make-many))))
    (threads:thread-join thread1)
    (threads:thread-join thread2)))
P.P.S.
Is testing with 'ant test.ansi.compiled' currently broken? I get an end-of-file error from test/lisp/ansi/parse-ansi-errors.lisp when I try to run the tests. It looks like the (defvar *default-database-file* form is not closed.
On Tue, Apr 27, 2010 at 11:37 AM, Erik Huelsmann ehuels@gmail.com wrote:
Hi David,
On Mon, Apr 26, 2010 at 8:36 PM, David Kirkman dkirkman@ucsd.edu wrote:
On Sat, Apr 24, 2010 at 9:17 AM, David Kirkman dkirkman@ucsd.edu wrote:
On Sat, Apr 24, 2010 at 7:50 AM, Erik Huelsmann ehuels@gmail.com wrote:
Can you share some examples which show the issues here? Adding thread-safety to clos.lisp - in specific, targeted places - should be well doable.
Here is a second example that reliably gives me an error. This time, I only need two threads:
(defclass counter () ((count :initform 0)))
Yesterday, I took a little time to look at the issue you mention. You said with 1 big "lock around everything" it all worked. Where did you put that lock? I tried putting some locks in a number of places (educated guesses), but nothing helped so far. I don't have data regarding the stack trace you see when running into this issue - there wasn't enough time for it.
I know now that the error comes from one of two functions in StandardGenericFunction.java, but that's simply based on the text you're seeing in the error message.
The above is to let you know I did some work on it, and what the current state is. Maybe someone has time to work on it before me: I may have time to work on it some more sometime next week.
Bye,
Erik.