During my testing of multi-threaded use of ABCL over the past weeks, I found that ABCL performs reliably with 2 threads on 2 cores (fully dedicated to running ABCL code). However, there's one notable exception: any code which requires CLOS, such as the pretty printer and other operations which go through generic functions. I'm able to break ABCL quite reliably by starting a specific series of actions on 2 threads at the same time. (The program code itself uses no global values!)
The cause can be traced back to SLOW-METHOD-LOOKUP; if I wrap it in SYNCHRONIZED-ON, the tests work fine. However, SLOW-METHOD-LOOKUP has friends which also manipulate the EMF-CACHE, so I don't think this would be the full solution.
There is an additional issue, however: java.util.HashMap is documented as not thread-safe. Two concurrent readers are fine, but concurrent writers, or a writer running concurrently with readers, are not guaranteed to work. So, to guarantee results with the current EMF-CACHE implementation, we need to make sure that at most one thread reads or writes the cache at any time, using SYNCHRONIZED-ON. For most applications this would be horribly inefficient: most methods will be added once and read huge numbers of times.
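To make the cost concrete, here is a minimal Java sketch of what "one reader/writer at a time" amounts to. The class name, the String keys and values, and the demo() helper are all hypothetical stand-ins for illustration; this is not ABCL's actual EMF-CACHE code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the EMF cache: a plain HashMap behind one monitor.
final class SynchronizedEmfCache {
    private final Map<String, String> cache = new HashMap<>();

    // Readers take the same lock as writers, so even lookups are serialised.
    String lookup(String key) {
        synchronized (cache) {
            return cache.get(key);
        }
    }

    void store(String key, String emf) {
        synchronized (cache) {
            cache.put(key, emf);
        }
    }

    int size() {
        synchronized (cache) {
            return cache.size();
        }
    }

    // Two threads each store 1000 distinct keys; returns the final entry count.
    static int demo() {
        SynchronizedEmfCache c = new SynchronizedEmfCache();
        Thread[] threads = new Thread[2];
        for (int t = 0; t < threads.length; t++) {
            final int id = t;
            threads[t] = new Thread(() -> {
                for (int i = 0; i < 1000; i++) {
                    c.store(id + ":" + i, "emf");
                }
            });
            threads[t].start();
        }
        for (Thread th : threads) {
            try {
                th.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return c.size();
    }
}
```

This is correct, but every lookup on the hot path now contends for the same monitor, which is exactly the inefficiency described above.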
ECL chose a rather unconventional direction in this case: instead of searching for lock-less semantics for global state, they have an EMF-CACHE per thread, meaning that read/write operations on the cache are thread-local and hence can be done without locks.
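In Java terms, the per-thread approach could look roughly like the sketch below. It only illustrates the idea; it says nothing about how ECL actually implements it, and all names (PerThreadEmfCache, lookupFromOtherThread) are made up for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical per-thread cache: each thread gets its own private HashMap.
final class PerThreadEmfCache {
    private static final ThreadLocal<Map<String, String>> CACHE =
        ThreadLocal.withInitial(HashMap::new);

    // No locking needed: the map is only ever touched by its owning thread.
    static String lookup(String key) {
        return CACHE.get().get(key);
    }

    static void store(String key, String emf) {
        CACHE.get().put(key, emf);
    }

    // Shows that an entry stored by the calling thread is invisible elsewhere.
    static String lookupFromOtherThread(String key) {
        String[] seen = new String[1];
        Thread t = new Thread(() -> seen[0] = lookup(key));
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen[0];
    }
}
```

The price is that every thread starts with an empty cache and has to take the slow path itself once per entry. Presumably any cache invalidation (say, when a method is redefined) would also have to reach every thread's copy.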
ABCL could easily go with java.util.concurrent.ConcurrentHashMap, whose reads generally do not block and which allows a limited number of concurrent writers (~16 by default). However, this assumes that the actual threading issue is mainly with writing values to the cache, i.e., that before an "entry to be cached" is available, nothing happens which can't safely be overwritten immediately after it has been added to the cache. It's that last bit I have problems assessing: do we simply need to use another caching hash class, or do we need to do much broader "global" locking? (The "global" locking proposed would simply be locking that specific generic function; CLOS could easily run in other threads for other GFs...)
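The two options can be put side by side in a small Java sketch. Everything here is hypothetical (the class, the String keys, the slowMethodLookup stand-in, and the idea of passing a per-GF lock object); it is meant to show where the two approaches differ, not how ABCL should implement either.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

final class ConcurrentEmfCache {
    private final ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>();
    final AtomicInteger slowLookups = new AtomicInteger();

    // Hypothetical stand-in for the slow path; counts how often it runs.
    private String slowMethodLookup(String key) {
        slowLookups.incrementAndGet();
        return "emf-for-" + key;
    }

    // Option 1: only the map itself is thread-safe. Two threads that miss at
    // the same time may both run the slow path; the first store wins. This is
    // safe only if the slow path has no side effects that must not be repeated.
    String lookupUnlocked(String key) {
        String emf = cache.get(key);
        if (emf == null) {
            String computed = slowMethodLookup(key);
            String previous = cache.putIfAbsent(key, computed);
            emf = (previous != null) ? previous : computed;
        }
        return emf;
    }

    // Option 2: additionally lock the (hypothetical) generic function object
    // on a miss, so the slow path runs at most once per key for that GF.
    // Hits still go through without taking the lock.
    String lookupLocked(Object gfLock, String key) {
        String emf = cache.get(key);
        if (emf == null) {
            synchronized (gfLock) {
                emf = cache.get(key); // re-check: another thread may have filled it
                if (emf == null) {
                    emf = slowMethodLookup(key);
                    cache.put(key, emf);
                }
            }
        }
        return emf;
    }
}
```

Option 1 is enough if computing an entry twice is harmless. Option 2 is what the per-GF locking would look like if the slow path touches other state that has to stay consistent with the cache.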
Help welcome!
Bye,
Erik.
armedbear-devel@common-lisp.net