On 3/25/13 1129 , Anton Vodonosov wrote: […]
Mark, the test does not always hang.
Thanks for the confirmation, as I was just coming around to this realization as the only way to explain the inconsistencies.
I think this is a bug in the test, because the test is not guaranteed to work.
It creates 100 threads, and each thread waits for (= i *shared*). In every thread i has a different value, from 0 to 99. So the threads are chained, and each thread waits until the previous one has incremented *shared*.
But the threads use bt:condition-notify to interact, which delivers a notification to only one of the waiting threads, and there is no guarantee that it will be the right thread.
The test passes on SBCL. Maybe SBCL always chooses to notify the first thread in the waiting queue.
But the bt:condition-notify contract does not require this.
In short, I think what we see is not a bug in ABCL or bordeaux-threads, but a bug in the test.
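One way to make such a chain independent of which waiter a notification wakes is to give every counter value its own condition variable, so that each variable has at most one waiter. The following is only a minimal sketch of that idea, not the test suite's code: RUN-CHAINED-WORKERS and its local names are hypothetical, and it assumes bordeaux-threads is already loaded under its BT nickname.

```lisp
;; Assumes bordeaux-threads is already loaded,
;; for example with (asdf:load-system "bordeaux-threads").
(defun run-chained-workers (&optional (num-procs 100))
  "Hypothetical helper: run NUM-PROCS chained workers, return the final counter."
  (let ((lock (bt:make-lock))
        (shared 0)
        ;; One condition variable per counter value.
        ;; The extra slot at index NUM-PROCS is for the calling thread.
        (cvs (coerce (loop repeat (1+ num-procs)
                           collect (bt:make-condition-variable))
                     'vector)))
    (let ((threads
            (loop for i below num-procs
                  collect (let ((i i)) ; fresh binding for each closure
                            (bt:make-thread
                             (lambda ()
                               (bt:with-lock-held (lock)
                                 (loop until (= i shared)
                                       do (bt:condition-wait (aref cvs i) lock))
                                 (incf shared)
                                 ;; Wake only the thread waiting for the new value.
                                 (bt:condition-notify (aref cvs shared)))))))))
      (bt:with-lock-held (lock)
        (loop until (= num-procs shared)
              do (bt:condition-wait (aref cvs num-procs) lock)))
      (mapc #'bt:join-thread threads)
      shared)))
```

Because every predicate is checked while holding the lock, a notification sent before its target has started waiting is harmless: that thread sees the updated counter and never waits.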
Digging into the test:

  (test condition-variable
    (setf *shared* 0)
    (let ((num-procs 100))
      (dotimes (i num-procs)
        (make-thread
         (compile nil
                  `(lambda ()
                     (with-lock-held (*lock*)
                       (loop until (= ,i *shared*)
                             do (condition-wait *condition-variable* *lock*))
                       (incf *shared*))
                     (condition-notify *condition-variable*)))))
      (with-lock-held (*lock*)
        (loop until (= num-procs *shared*)
              do (condition-wait *condition-variable* *lock*)))
      (is (equal num-procs *shared*))))
I really don't understand what is being tested here. Since there is no delay in starting the threads, on an unloaded CPU each thread never really invokes CONDITION-WAIT. Instead, each thread "sees" that it is the correct worker in the chain, makes what is essentially a no-op call to CONDITION-NOTIFY, and then exits. Wouldn't one want to delay the execution of each thread by some random amount before it starts its work?
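To illustrate the kind of delay meant here, the following is a rough sketch of starting threads with a random pause so that they reach the lock in a scrambled order. MAKE-DELAYED-THREAD and RUN-STAGGERED are hypothetical names, the 0.05 second default is an arbitrary assumption, and bordeaux-threads is assumed to be loaded under its BT nickname.

```lisp
;; Assumes bordeaux-threads is already loaded,
;; for example with (asdf:load-system "bordeaux-threads").
(defun make-delayed-thread (function &key (max-delay 0.05))
  "Hypothetical helper: run FUNCTION in a new thread after a random pause."
  ;; Draw the delay in the creating thread so that the new threads
  ;; do not call RANDOM concurrently on a shared *RANDOM-STATE*.
  (let ((delay (random max-delay)))
    (bt:make-thread (lambda ()
                      (sleep delay)
                      (funcall function)))))

(defun run-staggered (n)
  "Start N delayed threads that record their index; return indices in arrival order."
  (let ((lock (bt:make-lock))
        (arrivals '()))
    (let ((threads
            (loop for i below n
                  collect (let ((i i)) ; fresh binding for each closure
                            (make-delayed-thread
                             (lambda ()
                               (bt:with-lock-held (lock)
                                 (push i arrivals))))))))
      (mapc #'bt:join-thread threads)
      (reverse arrivals))))
```

The arrival order returned by RUN-STAGGERED will usually differ from creation order, which is the situation in which a worker in the test above would find that it is not yet its turn and actually call CONDITION-WAIT.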