I see.. I'm starting to grok your point of view now.
It would be acceptable to have each ant be its own process in a language like Erlang which has lightweight processes. You can write programs in Erlang with thousands of processes and it doesn't skip a beat..
But to have each ant be its own OS thread would be very - even unreasonably - heavy.
(SIDE NOTE Is that actually what's going on in the Ants simulation? Clojure agents are something I haven't yet studied. I actually think agents run on a thread pool which has a number of threads equal to the number of processors. See this 2007 thread on "Erlang vs Clojure":
http://groups.google.com/group/clojure/browse_thread/thread/2a2b24ffef5d1631...
It might be that what they refer to as "actors" in this thread are presently being called "agents". Rich Hickey says:
"Well, they are bigger than 4 bytes each, having a per-actor linked- queue for pending messages (there are some delivery order semantics) and some flags, but pretty small overall, with nothing heavyweight (e.g. threads) per actor. I've had millions of (idle) actors loaded with no CPU overhead."
So we might be criticising Clojure incorrectly here. )
I've had Rich Hickey's talk on Clojure concurrency (http://clojure.blip.tv/file/812787/) queued up to watch for a while now. When I watch it I'll be looking especially for him to address these sort of concerns. In any case, this discussion has primed my mind to understand his talk better.
Thanks, Vish
On Tue, Apr 14, 2009 at 7:12 PM, doug@hcsw.org wrote:
Hi Vish,
On Mon, Apr 13, 2009 at 10:57:12PM -0400 or thereabouts, Vishvajit Singh wrote:
Rich Hickey's ant simulation is a good case study for Clojure concurrency in action. You may find it interesting to read through:
I briefly looked at this program and it seems to me that it creates 49 threads, each of which represent an ant. The ants then walk around a 2D grid subject to conditions like food and pheromones. The threads use transactional memory to synchronise access to the grid.
Let's apply The System.
Step 0: What is the reason for using threads (pre-emptively scheduled, shared memory processes) in this program? To multiplex IO? No. To coordinate access to multiple disks? No. To take advantage of multiple CPUs for speed reasons? Possibly. Let's assume for now that this is the reason.
Step 1: Make this program use 1 thread for all the ants. This will give the following performance benefits (remember, we are concerned about speed):
- Reduced context switching. Unless you have 49 or more CPUs in
your machine, threads will be pre-empted frequently (or will waste much of their time sleeping).
Reduce the memory and cache pressure of having 49 active threads.
Eliminate the overhead of transactional memory (it's not free).
You can now use mutable data-structures for performance reasons
(remember we care about speed).
It will also give the following correctness benefits (we are always concerned about correctness):
- How an ant moves will no longer be affected by artifacts
of your system's thread scheduler. Instead, ants will move according to time as defined by your simulation.
- Eliminate possibility of deadlocks and livelocks.
Step 2: Since we are concerned about speed we will want to run this simulation many times with different initial conditions and/or PRNG seeds (otherwise we wouldn't care about speed).
Build a manager/worker system that fork()s N worker processes (where N is the number of CPUs in your machine). Each worker accepts messages over a socket. These messages contain initial conditions and PRNG seeds. The worker then runs the simulation and returns the pertinent statistics to the manager.
An advantage of this is that when you need to compute huge simulations (the only situation where you would be concerned about speed) you can spawn worker processes on separate physical machines and have them connect to the manager over an internet socket. This is because we have divided the concurrent tasks along the most natural and effective concurrency boundary, the process.
Note: There is one use case for threads that The System doesn't cover: testing or benchmarking threading implementations. I have a feeling that this is the real purpose of the ant program.
It will take me some time to fully work through your arguments, as I haven't worked very much with concurrency myself.
I see. Having created many concurrent programs, some of which are multi-threaded, my advice is to avoid threads altogether. Only once you have acquired significant experience with concurrency will you be able to recognise the very rare situations where threading is actually warranted. At this point you will no longer need The System.
How do you feel your opinions hold up on the issue of simple convenience for the programmer? I mean, using select() for IO, fork() for concurrency, and Berkeley DB for transactions throughout your programs takes extra effort as compared to using the built-in language mechanisms.
Aha! But remember that in lisp having something built-in to the language is only a defmacro away.
Hope this helps,
Doug