CMUCL integration is complete.
Great. Do you have process kill working in CMUCL, SBCL, Allegro, in a way that is safe wrt locks and that propagates along the process tree? That's a major feature required of an erlang implementation.
I have not done any work on process killing. As far as I know, it is not a part of ErLisp right now. This will definitely require a kill-thread function in compatibility.lisp. Beyond that I'm not sure how to proceed. Manually calling kill-thread? Automatically calling it following a time-out? Also, I would like to give each process the responsibility for killing its child processes, but if it's being killed because it is no longer responsive then that won't work... I would like to see some discussion on this topic in erlisp-devel because I have no experience with such situations.
This leads me to wonder how they do a reliable detection of a remote node being dead, as opposed to the communication channel being down -- and how they cope with a mistake between the two. Surely the point is tackled somewhere in some Erlang documentation...
Best I can think of is for the node itself to be represented as a process whose only job is communication with the outside world. Since the user won't control this process directly, it should be possible to make it fairly durable so that we can assume it is alive. *crosses fingers*
Next up is CLisp.
I don't think clisp has a complete threading implementation yet. Does it? If it doesn't, then it might be time to begin a distributed implementation -- and to fork clisp processes as a way to build new threads.
It looks like CLisp threading is very much a work in progress. I didn't research this well enough in advance. Forking CLisp processes is a nice idea, but we might be better off just letting it go until CLisp has better thread support. The original reason for including CLisp was so that Windows users would have access to ErLisp. Allegro support solves that problem well enough for now.
So what next? Dirk, does ErLisp already have some sort of slot waiting for process-linking code? Or will we be designing something from scratch? I'm thinking that my next step is to read Faré's thesis again. A robust distributed process management system does not sound like an easy task.
Eric
I think within Erlang nodes use a keep alive message to make sure they have not been split off. Quite often, though, you cannot detect if the foreign process exists. If it's a local process id and you send it a message and it's dead then you will get a no_proc exception, not so with a foreign process.
It does seem to me that the easiest way to verify if a node is alive is to send a keep-alive message to a housekeeping process on that node and consider it split off if a reply is not received after a timeout.
I'm not sure if it matters if a node is dead or just split off.
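To make the keep-alive idea concrete, here is a rough sketch of the bookkeeping side, written in Python only for brevity. Everything in it is hypothetical: the NodeMonitor class, its method names, and the timeout value are made up for illustration and are not part of Erlisp or Erlang. It assumes some housekeeping process on each node answers keep-alive messages and that the local side calls record_reply whenever an answer arrives.

```python
import time


class NodeMonitor:
    """Tracks the last keep-alive reply seen from each remote node (hypothetical helper)."""

    def __init__(self, timeout, clock=time.monotonic):
        self.timeout = timeout    # seconds of silence before a node counts as down
        self.clock = clock        # injectable so the logic can be exercised without sleeping
        self.last_reply = {}      # node name -> time of the most recent reply

    def record_reply(self, node):
        """Call whenever the housekeeping process on `node` answers a keep-alive."""
        self.last_reply[node] = self.clock()

    def is_down(self, node):
        """True if `node` never replied or has been silent longer than the timeout.

        A crashed node and a broken link look identical here: both are just silence.
        """
        seen = self.last_reply.get(node)
        return seen is None or self.clock() - seen > self.timeout

    def down_nodes(self):
        """Names of previously seen nodes that are now considered down."""
        return sorted(n for n in self.last_reply if self.is_down(n))
```

Note that the sketch has no way to tell a dead node from one that is merely split off, which is the open question above.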
On Jul 27, 2005, at 2:11 AM, Eric Lavigne wrote:
This leads me to wonder how they do a reliable detection of a remote node being dead, as opposed to the communication channel being down -- and how they cope with a mistake between the two. Surely the point is tackled somewhere in some Erlang documentation...
Best I can think of is for the node itself to be represented as a process whose only job is communication with the outside world. Since the user won't control this process directly, it should be possible to make it fairly durable so that we can assume it is alive. *crosses fingers*
Joel Reymont wrote:
I think within Erlang nodes use a keep alive message to make sure they have not been split off. Quite often, though, you cannot detect if the foreign process exists. If it's a local process id and you send it a message and it's dead then you will get a no_proc exception, not so with a foreign process.
It does seem to me that the easiest way to verify if a node is alive is to send a keep-alive message to a housekeeping process on that node and consider it split off if a reply is not received after a timeout.
Sounds perfectly reasonable to me.
I'm not sure if it matters if a node is dead or just split off.
Erlang doesn't make the distinction, does it?
My thinking is that if it's good enough for Erlang, it is /at least/ good enough for Erlisp stage 1. We can always reconsider later on (even though it might be more painful then ;)).
- Dirk
On 26/07/05, Dirk Gerrits dirk@dirkgerrits.com wrote:
Joel Reymont wrote:
I think within Erlang nodes use a keep alive message to make sure they have not been split off. Quite often, though, you cannot detect if the foreign process exists. If it's a local process id and you send it a message and it's dead then you will get a no_proc exception, not so with a foreign process.
It does seem to me that the easiest way to verify if a node is alive is to send a keep-alive message to a housekeeping process on that node and consider it split off if a reply is not received after a timeout.
Sounds perfectly reasonable to me.
Way to go! Of course, we could add a hook here. And actually, the keep-alive should itself be an option in the meta-level protocol, which is on by default (but would be off on top of, say, SNAIL).
I'm not sure if it matters if a node is dead or just split off.
Erlang doesn't make the distinction, does it?
Dunno. That's what we'll have to find out at some point.
Eric: you're paid to know before the end of summer :-)
My thinking is that if it's good enough for Erlang, it is /at least/ good enough for Erlisp stage 1. We can always reconsider later on (even though it might be more painful then ;)).
Yup.
In any case, the process linking model is the foundation of any robust programming in Erlang. It is much more important to think about doing it properly than about any kind of object marshalling.
Joel: can you confirm? Which features of Erlang do you miss most?
[ François-René ÐVB Rideau | Reflection&Cybernethics | http://fare.tunes.org ]
I love deadlines. I love the whooshing sound they make as they fly by. -- Douglas Adams
Joel Reymont:
I think within Erlang nodes use a keep alive message to make sure they have not been split off. Quite often, though, you cannot detect if the foreign process exists.
To check whether a _node_ is still running and connected, monitor_node can be used; from http://www.erlang.org/download/erlang-book-part1.pdf page 98:
"If no connection exists, and monitor_node/2 is called, the system will try to setup a connection and deliver a nodedown message if the connection fails."
To check whether a _process_ is still alive, linking to it and checking for the corresponding EXIT message is the way to go.
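For readers who have not used links, the following is a toy, single-threaded model of how an exit propagates along them, again in Python for brevity. The Process class and its methods are hypothetical names for illustration, not Erlisp or Erlang API. It models only the basic rules: links are symmetric, an abnormal exit kills linked processes that do not trap exits, and a trapping process receives an EXIT message instead.

```python
class Process:
    """Toy stand-in for an Erlang-style process (hypothetical, single-threaded)."""

    def __init__(self, name, trap_exit=False):
        self.name = name
        self.trap_exit = trap_exit  # if True, exit signals arrive as mailbox messages
        self.alive = True
        self.links = set()
        self.mailbox = []

    def link(self, other):
        """Links are symmetric: each side learns of the other's exit."""
        self.links.add(other)
        other.links.add(self)

    def exit(self, reason):
        """Terminate this process and signal every linked process."""
        if not self.alive:
            return
        self.alive = False
        linked, self.links = self.links, set()
        for peer in linked:
            peer.links.discard(self)
            peer._exit_signal(self, reason)

    def _exit_signal(self, sender, reason):
        if not self.alive:
            return
        if self.trap_exit:
            # A trapping process survives and can react, e.g. restart a worker.
            self.mailbox.append(("EXIT", sender.name, reason))
        elif reason != "normal":
            # A non-trapping process dies too, passing the signal along its own links.
            self.exit(reason)
```

A usage example: if a trapping supervisor is linked to worker a, and a is linked to worker b, then a.exit("crash") takes b down as well and leaves ("EXIT", "a", "crash") in the supervisor's mailbox.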
Dirk Gerrits:
I'm not sure if it matters if a node is dead or just split off.
Erlang doesn't make the distinction, does it?
From the same page:
"The BIF [=built-in function] monitor_node(Node, Flag) can be used to monitor nodes. An Erlang process evaluating the expression monitor_node(Node, true) will be notified with a {nodedown, Node} message if Node fails or if the network connection to Node fails. Unfortunately it is not possible to differentiate between node failures and network failures."
By the way, my experience with Erlang is mostly in trying to develop a Gnutella client a few years ago. To the reasonably nice original Gnutella protocol, many ill-specified ad-hoc extensions were added by the existing clients. Implementing these extensions, which is necessary in order to play well with the other clients, became less and less interesting, so I never finished it. Although I still find Erlang interesting, these days I'm more interested in Common Lisp.
- Willem
Eric Lavigne wrote:
The original reason for including CLisp was so that Windows users would have access to ErLisp. Allegro support solves that problem well enough for now.
Well the AllegroCL (and CMUCL) support at the moment is rather suboptimal (ie polling). Of course using threads in the first place is suboptimal, but we can at least make our thread support as good as possible while other methods are developed. ;)
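As an illustration of the alternative to polling, here is a minimal sketch of a mailbox whose receive blocks on a condition variable until a message arrives. It is written in Python for brevity and is not how Erlisp does it: the Mailbox class and its methods are hypothetical, and whether a given Lisp's threading library offers an equivalent primitive is an assumption that would need checking per implementation.

```python
import threading
from collections import deque


class Mailbox:
    """Message queue whose receive sleeps until woken, instead of polling (hypothetical)."""

    def __init__(self):
        self._messages = deque()
        self._cond = threading.Condition()

    def send(self, message):
        """Append a message and wake one waiting receiver."""
        with self._cond:
            self._messages.append(message)
            self._cond.notify()

    def receive(self, timeout=None):
        """Block until a message arrives; return None if `timeout` seconds pass first."""
        with self._cond:
            arrived = self._cond.wait_for(
                lambda: len(self._messages) > 0, timeout=timeout
            )
            if not arrived:
                return None
            return self._messages.popleft()
```

The waiting thread uses no CPU while the mailbox is empty, and the optional timeout gives the same hook a keep-alive check would need.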
So what next? Dirk, does ErLisp already have some sort of slot waiting for process-linking code? Or will we be designing something from scratch?
The latter. I was just starting to think about process linking when the storm of schoolwork kicked in.
I'm thinking that my next step is to read Faré's thesis again. A robust distributed process management system does not sound like an easy task.
No, no it does not. ;)
- Dirk
P.S. it's Erlisp with a lowercase "L". The LispNYC wiki lists "ErLisp" only because that's a WikiWord and "Erlisp" isn't.