Hi everyone,
I've written up the minutes for the April meeting, and put them up here: http://www.lisptoronto.org/past-meetings/2009-04-meeting Feel free to add anything you can remember. If you don't have permissions for the website, ask Telman for them.
I've also put up that Sudoku solver I wrote in Clojure -- it's attached to that page. I've tried to make the code as neat as possible. Take a look!
Another thing.. in the spirit of what Abram said to me in the subway station -- if you want our club to work on projects, start them yourself! -- I'm going to start studying Compojure, and when I've learned it I'll start a project for the club. Right now I'm thinking of making a Reddit-clone.. but I'm open to suggestions.
Cheers, Vish Singh
We discussed Clojure's concurrency mechanism (Software Transactional Memory).. Doug wonders whether it is overkill to build such a complex mechanism into the language. Much disagreement here among the members.
I have never used Clojure so I can't comment much about it specifically, but that is not what I said about concurrency. Let me elaborate again:
Before you use threads (by which I mean pre-emptively scheduled, shared memory processes) ask yourself why:
* To handle multiple streams of IO? Solution: Schedule one process to handle all the IO by using an event API like select(), non-pre-emptively scheduled threads like GNU pth (http://www.gnu.org/software/pth/), or "coroutines", "green threads", whatever you wanna call em.
* To take advantage of multiple CPUs or multiple disks? Solution: Learn how the fork() system call works and use sockets or pipes for message passing.
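The first suggestion (one process multiplexing every IO stream with select()) looks roughly like this in Python; the two pipes are hypothetical stand-ins for network connections:

```python
import os
import select

# Two pipes standing in for two IO streams (e.g. network clients).
r1, w1 = os.pipe()
r2, w2 = os.pipe()

os.write(w1, b"from stream 1")
os.write(w2, b"from stream 2")
os.close(w1)
os.close(w2)

# One process services every stream: select() tells us which
# descriptors are ready, so nothing blocks and no threads are needed.
messages = []
open_fds = {r1, r2}
while open_fds:
    readable, _, _ = select.select(list(open_fds), [], [])
    for fd in readable:
        data = os.read(fd, 4096)
        if data:
            messages.append(data)
        else:  # empty read means EOF: the writer closed its end
            os.close(fd)
            open_fds.discard(fd)

print(sorted(messages))  # [b'from stream 1', b'from stream 2']
```

The same shape scales to sockets; an event loop library just wraps this pattern.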
To me there is no reason to ruin a great language like Common Lisp with thread-related non-determinism. I think having to ban mutable data structures from your language as in Clojure is a symptom of a poor concurrency strategy. As I said at the meeting, in my opinion the best implementation of transactional memory is Berkeley DB:
* Clojure's memory isn't persistent so if your application crashes or the power goes out you will lose all your memory. BDB can persist the memory to disk guaranteeing transactional consistency (although this is optional).
* As far as I know, Clojure transactional memory can only be accessed by threads running in the same process. BDB DBs are multi-process AND multi-thread.
* BDB allows replication so that the memory is distributed to physically separate machines. Each will have a transactionally consistent view of the data.
* BDB DBs can be larger than your address space on 32 bit systems. I'd guess that Clojure's transactional storage is limited on such systems.
* BDB hotbackups let you snapshot a DB without interrupting anything. I use the included db_hotbackup utility.
* BDB DBs can be accessed by programs in any language: Perl, Common Lisp, C, etc. Probably from Clojure too.
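The transactional properties Doug lists (commit-or-rollback, persistence across a crash, visibility to separate processes) can be demonstrated with any ACID store. Since Berkeley DB bindings may not be installed everywhere, this sketch uses the stdlib's sqlite3 as a stand-in; it is not BDB, but the idea is the same:

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "store.db")

# "Process 1": create the store and commit a value, then disconnect
# (standing in for a crash/restart: committed data must survive).
con = sqlite3.connect(path)
with con:  # commits on success, rolls back on exception
    con.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")
    con.execute("INSERT INTO kv VALUES ('colour', 'blue')")
con.close()

# "Process 2": a completely separate connection sees the committed state.
con2 = sqlite3.connect(path)
value = con2.execute("SELECT v FROM kv WHERE k = 'colour'").fetchone()[0]
con2.close()
print(value)  # blue
```

BDB adds the replication, hot-backup, and larger-than-address-space features on top of this basic transactional contract.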
Thanks for the clarification, Doug. I see that your opinion is more nuanced than the version I put up. I'll update it soon and put up a more accurate version of your opinion.
Rich Hickey's ant simulation is a good case study for Clojure concurrency in action. You may find it interesting to read through:
http://clojure.googlegroups.com/web/ants.clj
It will take me some time to fully work through your arguments, as I haven't worked very much with concurrency myself. How do you feel your opinions hold up on the issue of simple convenience for the programmer? I mean, using select() for IO, fork() for concurrency, and Berkeley DB for transactions throughout your programs takes extra effort as compared to using the built-in language mechanisms. In the case of the ant colony simulation, you'd have to worry about the problem of distributing the ants among N forked processes, where N is the number of processors, whereas if you use shared-memory threads, you let the OS handle that.
Regards, Vish Singh
On Mon, Apr 13, 2009 at 7:20 PM, doug@hcsw.org wrote:
toronto-lisp mailing list
toronto-lisp@common-lisp.net
http://common-lisp.net/cgi-bin/mailman/listinfo/toronto-lisp
Papers #4 and #6 on this page: http://www.cs.ubc.ca/~norm/508/2007W1/readinglist.html
might help out those of us who aren't well-versed in the debate between event-based and thread-based concurrency. A lot of the topics that are coming up in this thread have been discussed (and discussed and discussed and...) in O/S-related literature for a looong time :)
-Debo
On Mon, Apr 13, 2009 at 10:57 PM, Vishvajit Singh vishvajitsingh@gmail.com wrote:
Thanks Mike! Those papers answer a lot of my questions.
I would love to learn more about this stuff.. I think I'll head off to Chapters today and see if I can find a good book on concurrency.
Vish
On Tue, Apr 14, 2009 at 8:46 AM, Michael DiBernardo mikedebo@gmail.com wrote:
Hi Vish,
On Mon, Apr 13, 2009 at 10:57:12PM -0400 or thereabouts, Vishvajit Singh wrote:
Rich Hickey's ant simulation is a good case study for Clojure concurrency in action. You may find it interesting to read through:
I briefly looked at this program and it seems to me that it creates 49 threads, each of which represents an ant. The ants then walk around a 2D grid subject to conditions like food and pheromones. The threads use transactional memory to synchronise access to the grid.
Let's apply The System.
Step 0: What is the reason for using threads (pre-emptively scheduled, shared memory processes) in this program? To multiplex IO? No. To coordinate access to multiple disks? No. To take advantage of multiple CPUs for speed reasons? Possibly. Let's assume for now that this is the reason.
Step 1: Make this program use 1 thread for all the ants. This will give the following performance benefits (remember, we are concerned about speed):
* Reduced context switching. Unless you have 49 or more CPUs in your machine, threads will be pre-empted frequently (or will waste much of their time sleeping).
* Reduced memory and cache pressure from keeping 49 threads active.
* Eliminated transactional memory overhead (it's not free).
* You can now use mutable data-structures for performance reasons (remember we care about speed).
It will also give the following correctness benefits (we are always concerned about correctness):
* How an ant moves will no longer be affected by artifacts of your system's thread scheduler. Instead, ants will move according to time as defined by your simulation.
* Eliminate possibility of deadlocks and livelocks.
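Step 1 might look something like this single-threaded sketch (the grid size, ant count, and random-walk rule here are invented for illustration; the real demo is richer):

```python
import random

GRID = 10    # hypothetical world size (the real demo uses 80x80)
N_ANTS = 49

def step(ants, rng):
    # One deterministic simulation tick: every ant takes one random-walk
    # step on a mutable list. No locks, no STM, no scheduler artifacts:
    # ants move according to simulation time, not OS thread timing.
    for i, (x, y) in enumerate(ants):
        dx, dy = rng.choice([(-1, 0), (1, 0), (0, -1), (0, 1)])
        ants[i] = ((x + dx) % GRID, (y + dy) % GRID)

rng = random.Random(42)   # fixed seed => the whole run is reproducible
ants = [(rng.randrange(GRID), rng.randrange(GRID)) for _ in range(N_ANTS)]
for _ in range(100):
    step(ants, rng)

print(len(ants))  # 49
```

Running it twice with the same seed yields identical ant positions, which is exactly the correctness property a thread-per-ant design gives up.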
Step 2: Since we are concerned about speed we will want to run this simulation many times with different initial conditions and/or PRNG seeds (otherwise we wouldn't care about speed).
Build a manager/worker system that fork()s N worker processes (where N is the number of CPUs in your machine). Each worker accepts messages over a socket. These messages contain initial conditions and PRNG seeds. The worker then runs the simulation and returns the pertinent statistics to the manager.
An advantage of this is that when you need to compute huge simulations (the only situation where you would be concerned about speed) you can spawn worker processes on separate physical machines and have them connect to the manager over an internet socket. This is because we have divided the concurrent tasks along the most natural and effective concurrency boundary, the process.
Note: There is one use case for threads that The System doesn't cover: testing or benchmarking threading implementations. I have a feeling that this is the real purpose of the ant program.
It will take me some time to fully work through your arguments, as I haven't worked very much with concurrency myself.
I see. Having created many concurrent programs, some of which are multi-threaded, my advice is to avoid threads altogether. Only once you have acquired significant experience with concurrency will you be able to recognise the very rare situations where threading is actually warranted. At this point you will no longer need The System.
How do you feel your opinions hold up on the issue of simple convenience for the programmer? I mean, using select() for IO, fork() for concurrency, and Berkeley DB for transactions throughout your programs takes extra effort as compared to using the built-in language mechanisms.
Aha! But remember that in lisp having something built-in to the language is only a defmacro away.
Hope this helps,
Doug
I see.. I'm starting to grok your point of view now.
It would be acceptable to have each ant be its own process in a language like Erlang which has lightweight processes. You can write programs in Erlang with thousands of processes and it doesn't skip a beat..
But to have each ant be its own OS thread would be very - even unreasonably - heavy.
(SIDE NOTE: Is that actually what's going on in the ants simulation? Clojure agents are something I haven't yet studied. I actually think agents run on a thread pool with as many threads as processors. See this 2007 thread on "Erlang vs Clojure":
http://groups.google.com/group/clojure/browse_thread/thread/2a2b24ffef5d1631...
It might be that what they refer to as "actors" in this thread are presently being called "agents". Rich Hickey says:
"Well, they are bigger than 4 bytes each, having a per-actor linked- queue for pending messages (there are some delivery order semantics) and some flags, but pretty small overall, with nothing heavyweight (e.g. threads) per actor. I've had millions of (idle) actors loaded with no CPU overhead."
So we might be criticising Clojure incorrectly here.)
I've had Rich Hickey's talk on Clojure concurrency (http://clojure.blip.tv/file/812787/) queued up to watch for a while now. When I watch it I'll be looking especially for him to address these sort of concerns. In any case, this discussion has primed my mind to understand his talk better.
Thanks, Vish
On Tue, Apr 14, 2009 at 7:12 PM, doug@hcsw.org wrote:
On Tue, Apr 14, 2009 at 10:38:44PM -0400 or thereabouts, Vishvajit Singh wrote:
Is that actually what's going on in the Ants simulation? Clojure
agents are something I haven't yet studied. I actually think agents run on a thread pool which has a number of threads equal to the number of processors.
Let's find out. Before I ran anything:
doug@eclipse:~$ ps -C java -L
  PID   LWP TTY          TIME CMD
doug@eclipse:~$
So no java running. Let's fire up Clojure:
doug@eclipse:~/tp12/clojure$ ls ..
clojure  clojure_20090320.zip
doug@eclipse:~/tp12/clojure$ /home/doug/java/jre1.6.0_13/bin/java -jar clojure.jar
Clojure
user=>
Let's see how many threads:
doug@eclipse:~$ ps -C java -L
  PID   LWP TTY          TIME CMD
31212 31212 pts/12   00:00:00 java
31212 31213 pts/12   00:00:01 java
31212 31214 pts/12   00:00:00 java
31212 31215 pts/12   00:00:00 java
31212 31216 pts/12   00:00:00 java
31212 31217 pts/12   00:00:00 java
31212 31218 pts/12   00:00:00 java
31212 31219 pts/12   00:00:00 java
31212 31220 pts/12   00:00:00 java
doug@eclipse:~$ ps -C java -L|grep -v PID|wc -l
9
So 9 OS threads. Next I loaded the ants.clj file:
user=> (load-file "ants.clj")
nil
Window pops up. Blank except for blue square in middle. CPU idle. Now we have 13 threads:
doug@eclipse:~$ ps -C java -L|grep -v PID|wc -l
13
Next I pasted in the following to the clojure REPL (it's from the bottom of the ants.clj file):
(def ants (setup))
(send-off animator animation)
(dorun (map #(send-off % behave) ants))
(send-off evaporator evaporation)
Simulation starts. CPU pegged at 100%.
doug@eclipse:~$ ps -C java -L|grep -v PID|wc -l
89
89 OS threads. Wow that is even more than I figured.
Oh and I get it now... The ants crawl around searching for food and bring it back to the nest. ;)
If the objective is to simulate ant movement (as opposed to testing/benchmarking a thread implementation) you will have a more efficient and correct program if each ant is not implemented as its own thread. See The System described in my previous message for a solution.
So we might be criticising Clojure incorrectly here.
Actually I was never criticising Clojure (I don't know enough about it to do that). I was only discussing the trade-offs of different concurrency strategies.
Doug
Ouch.. 89 threads?
I was confused about agents, apparently.. it seems that:
send     -> runs on the thread pool
send-off -> creates a new thread
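As a rough analogy in Python terms (an analogy only, not Clojure's actual agent machinery; the pool size and task names are made up): send-style work shares a bounded pool of threads, while send-off-style work gets a thread of its own.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

results = []
lock = threading.Lock()

def work(tag):
    with lock:  # guard the shared list while appending
        results.append(tag)

# send-style: four tasks share a small, fixed pool of threads.
with ThreadPoolExecutor(max_workers=2) as pool:
    for _ in range(4):
        pool.submit(work, "pooled")
# (leaving the with-block waits for all submitted tasks)

# send-off-style: each task gets a dedicated thread.
threads = [threading.Thread(target=work, args=("dedicated",))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(results))  # 8
```

The dedicated-thread route is why the ant demo's thread count grows with the number of ants.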
Mea culpa :)
Vish
On Wed, Apr 15, 2009 at 3:15 AM, doug@hcsw.org wrote:
On Wednesday 15 April 2009 10:26:26 am Vishvajit Singh wrote:
Ouch.. 89 threads?
I was confused about agents, apparently.. it seems that:
send     -> runs on the thread pool
send-off -> creates a new thread
Mea culpa :)
Not so fast :-).
I just finished watching the concurrency screencast, but the exact answer isn't in there.
Here is what I think he might be saying:
A classical (preemptive) process requires that you carve out a piece of memory for the stack and the context for the process. If you have 300 processes, then you have 300 stacks. If you have only one CPU, then 299 of the stacks are idly wasting space. (This is one of the reasons why people resort to uglyosities such as thread pools).
In VF and stackless programming (and maybe in Clojure, given the statement about millions of agents in memory), there is only one stack - the hardware stack.
In VF, each instance of a part uses up only enough memory for two queues (due to VF semantics) and a flag[+]. VF parts - basically state machines - run each action to completion. Parts all share the same stack and they leave no stacked context lying around "for next time".
To run a set of parts, you need a thread of execution. The thread of execution winds its way through all of the activated parts and then dies out when there's no work left to perform.
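Paul's run-to-completion parts can be sketched as a toy dispatcher (all names here are invented; this is a cartoon of the idea, not VF itself):

```python
from collections import deque

class Part:
    # A tiny state machine: each delivered message runs to completion
    # on the caller's single stack; no per-part stack is kept around
    # "for next time", only the inbox queue and whatever state it holds.
    def __init__(self, name, downstream=None):
        self.name = name
        self.downstream = downstream
        self.inbox = deque()
        self.log = []

    def react(self, msg, scheduler):
        self.log.append(msg)
        if self.downstream is not None:
            scheduler.send(self.downstream, msg + 1)

class Scheduler:
    def __init__(self):
        self.runnable = deque()

    def send(self, part, msg):
        part.inbox.append(msg)
        self.runnable.append(part)

    def run(self):
        # The single "thread of execution" winds its way through the
        # activated parts and dies out when no work is left.
        while self.runnable:
            part = self.runnable.popleft()
            part.react(part.inbox.popleft(), self)

b = Part("b")
a = Part("a", downstream=b)
sched = Scheduler()
sched.send(a, 1)
sched.run()
print(b.log)  # [2]
```

Every part here shares one stack: react() returns before the next part runs, which is exactly the run-to-completion discipline described above.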
On bare hardware, the thread of execution is provided by a hardware interrupt (if the hardware supports prioritized interrupts, you can have more than one thread of execution stacked up).
On an O/S, a thread of execution is whatever you want it to be, e.g. a user process or a thread.
If you wanted to spray the threads of execution across a bunch of CPUs, you would have to have at least one thread of execution on each of the CPUs.
I wonder if this is what Hickey is getting at? The JVM will schedule threads onto various CPUs. In the concurrency screencast, he uses a dual core machine and shows that his simulation is using more than 100% of the CPU (i.e. more than one core is active).
His ant simulation could benefit from the addition of CPUs, up to 89 of them.
He stresses (repeatedly) that there is no explicit locking in the user-written code. He stresses that the GUI would be extremely hard / wasteful to write using locks. The GUI takes a snapshot in time of the 80x80 world and then renders this snapshot on the screen. According to him, it never gets it wrong - there are always the correct number of ants on the screen, they never overlap, etc.
pt
[+] Plus any static state variables declared by the programmer.
From what I understand here, concurrency is being combined with parallelism. Some systems do not combine these models. I generally view concurrency as operations occurring at the same time that aren't related and don't necessarily rely on each other, whereas parallelism is where you can cut up a problem into sequences which can be run concurrently and then combined.
I really want to emphasize that concurrency and parallelism are quite different, and that which kind of concurrency or parallelism best maps to a problem, and which gives the best performance, often depends on the problem itself.
One example I have is that the ocaml bytecode vm handles its own threads, while the native ocaml binaries were using pthreads. I was able to have 10000+ vm threads which all were relatively quiet, while I could only have about 1000 pthreads.
I think some of the patterns Doug mentions are valuable, but they aren't applicable to everything. Remember that programming is also about representing the problem for the programmer, the computer, and the stakeholder who might use or need the code. If we enforce a certain style, we can get in the way of people representing their problems.
I haven't really said anything other than some problems are better suited to different models of concurrency or parallelism than others. Different models give you different performance and on different operating systems and VMs will have different advantages and disadvantages.
This doesn't even touch distributed concurrency and parallelism. If you try out MPI you will see where the "shared memory" approach starts to fall apart as the latency of your links increases and the maximum throughput decreases.
Summary: There is no easy way out
abram
On Wed, 15 Apr 2009, Paul Tarvydas wrote: