James M. Lawrence writes:
On Wed, Sep 2, 2015 at 6:09 AM, Daniel Kochmański daniel@turtleware.eu wrote:
Hm, then I can't reproduce neither of them. Spawning too many threads blows the heap, but it's understandable. I think it might be that i have x86_64 and a new kernel, (maybe it happens only on x86, or linux 3.2 had some bug?).
64-bit systems are sometimes better at concealing such threading bugs. The default assumption should be that 64-bit systems just haven't been jiggled to the right tune yet. Over the course of several years, and across several kernels, I haven't seen any version of ECL pass these stress tests. This problem would need to be understood before even thinking about kernel bugs, which seem quite unlikely in this case.
Ok, so I guess I'll just set up 32-bit VM. I hope this will be able to reproduce the problems (next test doesn't fail neither on my x86_64 for my tests). Yeah, I also think that kernel bug is unlikely, but I'll keep that possibility in mind.
I've added your tests to repository and will integrate them soon (infinite loop isn't a good pick for regression tests :-)).
Thanks, Daniel
Moving up the ladder of complexity, the next test case requires a blocking queue and involves repeatedly creating/destroying worker threads. It has three different types of failures; they are sporadic and not determined by the inputs. The following are just example runs and shouldn't be considered representative:
(qtest 0 64) ;=> segfault (qtest 1 64) ; => hang (qtest 10000 64) ; => error "Attempted to recursively lock..."
;;;; raw-queue
(defstruct (raw-queue (:conc-name nil)) (head nil) (tail nil))
(defun push-raw-queue (value queue) (let ((new (cons value nil))) (if (head queue) (setf (cdr (tail queue)) new) (setf (head queue) new)) (setf (tail queue) new)))
(defun pop-raw-queue (queue) (let ((node (head queue))) (if node (multiple-value-prog1 (values (car node) t) (when (null (setf (head queue) (cdr node))) (setf (tail queue) nil)) (setf (car node) nil (cdr node) nil)) (values nil nil))))
;;;; queue
(defstruct queue (impl (make-raw-queue)) (lock (mp:make-lock)) (cvar (mp:make-condition-variable)))
(defun push-queue (object queue) (mp:with-lock ((queue-lock queue)) (push-raw-queue object (queue-impl queue)) (mp:condition-variable-signal (queue-cvar queue))))
(defun pop-queue (queue) (mp:with-lock ((queue-lock queue)) (loop (multiple-value-bind (value presentp) (pop-raw-queue (queue-impl queue)) (if presentp (return value) (mp:condition-variable-wait (queue-cvar queue) (queue-lock queue)))))))
;;;; qtest
(defun qtest (message-count worker-count) (loop (let ((to-workers (make-queue)) (from-workers (make-queue))) (loop repeat worker-count do (mp:process-run-function "test" (lambda () (loop (let ((message (pop-queue to-workers))) (push-queue message from-workers) (unless message (return))))))) (loop repeat message-count do (push-queue t to-workers)) (loop repeat message-count do (pop-queue from-workers)) (loop repeat worker-count do (push-queue nil to-workers)) (loop repeat worker-count do (pop-queue from-workers)) (format t ".") (finish-output))))