Re: [toronto-lisp] April meeting minutes available on the website

15 Apr 2009

      I see.. I'm starting to grok your point of view now.

It would be acceptable to have each ant be its own process in a
language like Erlang which has lightweight processes. You can write
programs in Erlang with thousands of processes and it doesn't skip a
beat..

But to have each ant be its own OS thread would be very - even
unreasonably - heavy.

(SIDE NOTE
    Is that actually what's going on in the Ants simulation? Clojure
agents are something I haven't yet studied. I actually think agents
run on a thread pool which has a number of threads equal to the number
of processors. See this 2007 thread on "Erlang vs Clojure":

http://groups.google.com/group/clojure/browse_thread/thread/2a2b24ffef5d1631...

   It might be that what they refer to as "actors" in this thread are
presently being called "agents". Rich Hickey says:

"Well, they are bigger than 4 bytes each, having a per-actor linked-
queue for pending messages (there are some delivery order semantics)
and some flags, but pretty small overall, with nothing heavyweight
(e.g. threads) per actor. I've had millions of (idle) actors loaded
with no CPU overhead."

   So we might be criticising Clojure incorrectly here.
)

I've had Rich Hickey's talk on Clojure concurrency
(http://clojure.blip.tv/file/812787/) queued up to watch for a while
now. When I watch it I'll be looking especially for him to address
these sort of concerns. In any case, this discussion has primed my
mind to understand his talk better.

Thanks,
Vish

On Tue, Apr 14, 2009 at 7:12 PM,  <doug@hcsw.org> wrote:
...
Hi Vish,
On Mon, Apr 13, 2009 at 10:57:12PM -0400 or thereabouts, Vishvajit Singh wrote:
...
Rich Hickey's ant simulation is a good case study for Clojure
concurrency in action. You may find it interesting to read through:
I briefly looked at this program and it seems to me that it creates
49 threads, each of which represent an ant. The ants then walk around
a 2D grid subject to conditions like food and pheromones. The threads
use transactional memory to synchronise access to the grid.
Let's apply The System.
Step 0: What is the reason for using threads (pre-emptively scheduled,
shared memory processes) in this program? To multiplex IO? No. To
coordinate access to multiple disks? No. To take advantage of multiple
CPUs for speed reasons? Possibly. Let's assume for now that this is
the reason.
Step 1: Make this program use 1 thread for all the ants. This will
give the following performance benefits (remember, we are concerned
about speed):
* Reduced context switching. Unless you have 49 or more CPUs in
 your machine, threads will be pre-empted frequently (or will
 waste much of their time sleeping).
* Reduce the memory and cache pressure of having 49 active threads.
* Eliminate the overhead of transactional memory (it's not free).
* You can now use mutable data-structures for performance reasons
 (remember we care about speed).
It will also give the following correctness benefits (we are always
concerned about correctness):
* How an ant moves will no longer be affected by artifacts
 of your system's thread scheduler. Instead, ants will move
 according to time as defined by your simulation.
* Eliminate possibility of deadlocks and livelocks.
Step 2: Since we are concerned about speed we will want to run
this simulation many times with different initial conditions
and/or PRNG seeds (otherwise we wouldn't care about speed).
Build a manager/worker system that fork()s N worker processes
(where N is the number of CPUs in your machine). Each worker
accepts messages over a socket. These messages contain initial
conditions and PRNG seeds. The worker then runs the simulation
and returns the pertinent statistics to the manager.
An advantage of this is that when you need to compute huge
simulations (the only situation where you would be concerned about
speed) you can spawn worker processes on separate physical machines
and have them connect to the manager over an internet socket. This
is because we have divided the concurrent tasks along the most
natural and effective concurrency boundary, the process.
Note: There is one use case for threads that The System doesn't
cover: testing or benchmarking threading implementations.
I have a feeling that this is the real purpose of the ant program.
...
It will take me some time to fully work through your arguments, as I
haven't worked very much with concurrency myself.
I see. Having created many concurrent programs, some of which are
multi-threaded, my advice is to avoid threads altogether. Only
once you have acquired significant experience with concurrency
will you be able to recognise the very rare situations where
threading is actually warranted. At this point you will no
longer need The System.
...
How do you feel your
opinions hold up on the issue of simple convenience for the
programmer? I mean, using select() for IO, fork() for concurrency, and
Berkeley DB for transactions throughout your programs takes extra
effort as compared to using the built-in language mechanisms.
Aha! But remember that in lisp having something built-in to the
language is only a defmacro away.
Hope this helps,
Doug