I found this very interesting discussion on sbcl-devel. It shows the difficulty of doing things right if we are to kill processes asynchronously yet still want to do things properly wrt catching them, ensuring atomicity of memory operations, freeing resources (notably alien data), etc.
[ François-René ÐVB Rideau | Reflection&Cybernethics | http://fare.tunes.org ] Perhaps those of us who care about quality programs have not spoken up often enough -- `for bad programs to triumph requires only that good programmers remain silent.' I call this passivity the `Silence of the Lambdas.' -- hbaker
---------- Forwarded message ---------- From: Gábor Melis mega@hotpop.com Date: 30-Aug-2005 06:15 Subject: Re: [Sbcl-devel] async signals To: Alexander Kjeldaas astor@fast.no Cc: sbcl-devel@lists.sourceforge.net
On Monday 22 August 2005 22:23, Alexander Kjeldaas wrote:
OPEN, CLOSE and other functions that need cleanup could assert that *INTERRUPTS-ENABLED* is NIL. Thus unclean code could be weeded out.
Maybe something like this:
(let (x)
  (unwind-protect-without-interrupts-in-cleanup
      (progn
        (setq-without-interrupts x (open ...))
        ...)
    (when x (close x))))
would be easier to read?
astor
Yes, it's easier to read and within the bowels of sbcl it might even be utilized. Since async signals are not standard, people just don't expect arbitrary things to happen at any time. It's just hard.
Something must be done to bend async signals into Lisp. I thought about making serve-event check for pending interrupts: when a possibly blocking I/O call is made, the user should be well prepared to handle arbitrary conditions anyway, right? Yeah, it is a bit of a stretch, especially considering that the caller would need to know exactly which calls can do I/O or call serve-event. Not likely, and around methods and such make it even less so.
At this point I got disheartened and came to think that abandoning a side-effecting computation (with-timeout, terminate-thread, C-c + abort) is practically unsafe wrt unwind-protect. With the exception of with-timeout, they should not "normally" be used (only during development or as a last resort). Ordinary code is just full of assumptions about what can and cannot fail.
That leaves us with some documentation to write and with-timeout. A quick audit of the paserve code base revealed that its needs could be satisfied with:
- deadlines on streams
Currently, timeouts are per-operation: sending one byte every timeout seconds is enough to keep a reader on the other end of the stream busy indefinitely. For a web server, which has a stream for each request but performs multiple operations on that stream, an upper limit on the total duration of those operations is needed.
(setf (stream-deadline stream) (+ 180 (get-universal-time)))
(loop repeat 100 do (read-char stream))
A slight problem is clobbering a deadline that has already been set. The following macro temporarily overrides the deadline, but only if the new deadline is earlier than the old one.
(with-stream-deadline (stream (+ 5 (get-universal-time))) (read stream))
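Something along these lines might do, as a minimal sketch; STREAM-DEADLINE here is the same hypothetical accessor as in the example above, not an existing interface:

(defmacro with-stream-deadline ((stream deadline) &body body)
  (let ((s (gensym "STREAM")) (new (gensym "NEW")) (old (gensym "OLD")))
    `(let* ((,s ,stream)
            (,new ,deadline)
            (,old (stream-deadline ,s)))
       (unwind-protect
            (progn
              ;; Only tighten: never move an already-set deadline later.
              (when (or (null ,old) (< ,new ,old))
                (setf (stream-deadline ,s) ,new))
              ,@body)
         ;; Always restore whatever deadline was in effect before.
         (setf (stream-deadline ,s) ,old)))))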
- a timeout on socket-connect, open (?), close
The current implementation of with-timeout should work for aborting the system calls (connect, open). That would mean adding a :timeout parameter to socket-connect and open on the Lisp side.
close with :abort t should not block, unless SO_LINGER is set.
The plan:
1) implement a with-timeout that works with threads
2) add timeout parameters to socket-connect, open, ... (?)
3) implement stream deadlines
Unfortunately, 1) needs a scheduler (like Zach's timer http://www.cliki.net/TIMER, 500 lines), but - on the bright side - it's still possible to implement with setitimer and does not require a dedicated thread.
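To make 1) concrete, here is a rough sketch, assuming a TIMER-style scheduler exporting SCHEDULE-TIMER and UNSCHEDULE-TIMER (both hypothetical here) and a thread-interruption primitive along the lines of SB-THREAD:INTERRUPT-THREAD; none of these names are claimed to be the eventual interface:

(define-condition timeout (serious-condition) ())

(defmacro with-timeout (seconds &body body)
  `(call-with-timeout ,seconds (lambda () ,@body)))

(defun call-with-timeout (seconds thunk)
  (let* ((self sb-thread:*current-thread*)
         (timer (schedule-timer      ; hypothetical scheduler entry point
                 seconds
                 (lambda ()
                   ;; Runs in the scheduler's context: ask the waiting
                   ;; thread to signal TIMEOUT at its next safe point.
                   (sb-thread:interrupt-thread
                    self (lambda () (error 'timeout)))))))
    ;; The race between the body finishing and the timer firing is
    ;; glossed over here; a real implementation has to deal with it.
    (unwind-protect (funcall thunk)
      (unschedule-timer timer))))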
Please, flame this approach properly.
Cheers, Gábor
Faré wrote:
I found this very interesting discussion on sbcl-devel. It shows the difficulty of doing things right if we are to kill processes asynchronously yet still want to do things properly wrt catching them, ensuring atomicity of memory operations, freeing resources (notably alien data), etc.
In the case of shared memory concurrency, killing processes asynchronously is a minefield; it is so difficult to do right that IMO it should not be attempted.
Fortunately, in an Erlang-like message passing language we don't have to deal with shared memory at the semantic level. The only effect that a process can have on other processes is by sending messages. That is:
- sending a message is an atomic event,
- the desired semantics of asynchronous termination is that all messages that the process attempts to send before a given point in its history are sent, and no messages that it attempts to send after that point are sent.
Of course, given that we have to call code in libraries that do not just use pure message passing, implementing this model may be easier said than done. But at least it's a clean semantics to aim for.
The specific reason why asynchronous termination is harder for a language using shared memory concurrency is that, if a thread is killed while it is holding a lock on an object, that object will still be accessible to other threads afterward. At best, the invariants of the object may be violated; at worst, the invariants of the language implementation may be violated, leading to loss of dynamic type safety.
In a language using message passing concurrency, all objects are local to some process, and become inaccessible once that process is killed, so their state at that point does not matter.
At the implementation level, it's probably not a good idea to attempt to kill threads at arbitrary points; they should only be killed at safe points. Some Lisp implementations may already have a suitable notion of safe point that is used for GC, for example.
On 9/17/05, David Hopwood david.nospam.hopwood@blueyonder.co.uk wrote:
In the case of shared memory concurrency, killing processes asynchronously is a minefield; it is so difficult to do right that IMO it should not be attempted.
Well, of course it shouldn't be done as is. But in as much as we're implementing an Erlang-like system on top of multithreaded Lisps, we precisely have to take into account such things.
That's why a thread should only actually die at a safe point, which is a matter of checking a flag when (yield) is called -- and of course the flag should also be checked by the message system *before* any message is received.
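Concretely, the flag check might look something like this minimal sketch; every name in it is illustrative, not an existing Erlisp API:

(defvar *termination-requested* nil
  "Per-process flag set by whoever asks this process to die.")

(defun check-termination ()
  ;; Safe point: the process only notices the request here.
  (when *termination-requested*
    (throw 'process-exit :killed)))

(defun yield ()
  ;; Die here if asked to, then hand control back to the scheduler
  ;; (scheduler hand-off elided).
  (check-termination))

(defun receive ()
  ;; The flag is also checked *before* any message is received.
  (check-termination)
  (dequeue-message *current-process*))   ; hypothetical mailbox accessor

(defun run-process (thunk)
  (catch 'process-exit
    (funcall thunk)))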
On the other hand, if SBCL does (eventually) provide ways to throw and safely catch asynchronous exceptions, there's no reason not to use them.
The specific reason why asynchronous termination is harder for a language using shared memory concurrency is that, if a thread is killed while it is holding a lock on an object, that object will still be accessible to other threads afterward. At best, the invariants of the object may be violated; at worst, the invariants of the language implementation may be violated, leading to loss of dynamic type safety.
I've formalized this phenomenon in my thesis :) This is why any higher-level concurrent language system must be able to define application invariants that the system will enforce. Erlang makes it possible by decomposing programs into linked processes.
At the implementation level, it's probably not a good idea to attempt to kill threads at arbitrary points; they should only be killed at safe points. Some Lisp implementations may already have a suitable notion of safe point that is used for GC, for example.
Once again, THE SYSTEM-LEVEL NOTION OF SAFE POINT IS NOT THE APPLICATION-LEVEL NOTION OF SAFE POINT, and cannot possibly be unless you allow the internalization of user-defined atomicity constraints in the language. Erlang works around this in a clever way rather than attempting to solve the problem. Or maybe, in a way, it *does* provide a solution, but it is unclear (to me) how this solution would translate into terms of imperative programming with internal side-effects.
[ François-René ÐVB Rideau | Reflection&Cybernethics | http://fare.tunes.org ] We will encourage you to develop the three great virtues of a programmer: laziness, impatience, and hubris. -- LarryWall, Programming Perl (1st edition)
On 9236 day of my life David Hopwood wrote:
In the case of shared memory concurrency, killing processes asynchronously is a minefield; it is so difficult to do right that IMO it should not be attempted.
On Unix-like (or POSIX-like) systems, the only safe thing you can do during asynchronous interrupt handling is to read or set a variable of type sig_atomic_t.
Doing anything else is opening a can of worms.
And it doesn't matter whether it is shared memory concurrency or a single-threaded program: what if the process receives yet another asynchronous signal inside the signal handler? It may corrupt the variable you write to if it is anything other than sig_atomic_t.
On 9/18/05, Ivan Boldyrev lispnik@gmail.com wrote:
On Unix-like (or POSIX-like) systems, the only safe thing you can do during asynchronous interrupt handling is to read or set a variable of type sig_atomic_t.
Maybe you didn't realize that we were talking about Lisp and not C. Asynchronous interrupts at the Lisp level need not be asynchronous signals at the C level. Lisp interrupts can be either higher-level than the C level (already handling your sig_atomic_t and then doing something when back from the signal handler), or they can be lower-level (talking directly to the kernel, to avoid the limitations of the C library). And whatever level we are at, the problem of propagating asynchronous interrupts to the level above (that of the application -- which itself could be layered into many levels) is a non-trivial problem with essentially the same kind of solutions, only at a higher level.
And the challenge for concurrent languages is PRECISELY to offer a way to deal with asynchronousness that deals nicely with the above problem.
[ François-René ÐVB Rideau | Reflection&Cybernethics | http://fare.tunes.org ] If a vegetarian eats vegetables, what does a humanitarian eat? -- Mark Twain
On 9236 day of my life fahree@gmail.com wrote:
Maybe you didn't realize that we were talking about Lisp and not C. Asynchronous interrupts at the Lisp level need not be asynchronous signals at the C level. Lisp interrupts can be either higher-level than the C level (already handling your sig_atomic_t and then doing something when back from the signal handler),
Then proper handling of sig_atomic_t is a problem for the Lisp system, not the Lisp programmer.
or they can be lower-level (talking directly to the kernel, to avoid the limitations of the C library).
It's not a limitation of the C library, it is a limitation of the architecture. The layer of the C library over the signal API is too thin to limit anything. No matter whether you write your program in C, asm or Lisp, if the processor can't write something atomically, you can't use it in async interrupts.
For example, no matter which language you use, you can't atomically write or read a 64-bit integer (aka long long) on the x86 platform.
Ivan Boldyrev wrote:
On 9236 day of my life fahree@gmail.com wrote:
or they can be lower-level (talking directly to the kernel, to avoid the limitations of the C library).
It's not a limitation of the C library, it is a limitation of the architecture. The layer of the C library over the signal API is too thin to limit anything.
It isn't particularly thin, and it certainly can be limiting.
No matter whether you write your program in C, asm or Lisp, if the processor can't write something atomically, you can't use it in async interrupts.
For example, no matter which language you use, you can't atomically write or read a 64-bit integer (aka long long) on the x86 platform.
Bad example: 'lock cmpxchg8b'.
On 9/19/05, Ivan Boldyrev lispnik@gmail.com wrote:
On 9236 day of my life fahree@gmail.com wrote:
Maybe you didn't realize that we were talking about Lisp and not C. Asynchronous interrupts at the Lisp level need not be asynchronous signals at the C level. Lisp interrupts can be either higher-level than the C level (already handling your sig_atomic_t and then doing something when back from the signal handler),
Then proper handling of sig_atomic_t is a problem for the Lisp system, not the Lisp programmer.
That's only the case if you're using C signals and want to write to shared memory in a C-multithreaded environment. A Lisp implementation may offer user-visible threads that have asynchronous interrupts at the Lisp level and NOT USE ANY OF THE ABOVE AT ALL. And actually, that's what Erlang does and what Erlisp could typically do: spawn one unix process per processor, each being the host of zillions of Erlang processes that do not rely at all on either unix signals or unix shared memory for their internal asynchronous interrupts and communication.
It's not a limitation of the C library, it is a limitation of the architecture.
The C library definitely does add limitations in terms of reentrancy, memory model, lack of PCLSRing, etc. There are only a few system calls you can safely make from a C signal handler, yet the kernel itself would happily oblige you on all of them.
The layer of the C library over the signal API is too thin to limit anything.
Bullshit. errno alone is enough to cause quite a lot of problems.
No matter whether you write your program in C, asm or Lisp, if the processor can't write something atomically, you can't use it in async interrupts.
You seem unable to imagine that async interrupts in Lisp don't have to be async interrupts in C. And you don't need your processor to write anything atomically if you're not actually using multiple concurrent processors, which won't happen if you have only one process and one thread at the C level.
For example, no matter which language you use, you can't atomically write or read a 64-bit integer (aka long long) on the x86 platform.
64-bit, you can actually (but then you'll tell me about some bigger size). However, it doesn't really matter what you can or cannot write atomically to shared memory, since the whole point of Erlisp is to avoid the shared memory model of concurrency.
I grant you this: if we want to catch unix signals and send asynchronous interrupts to Erlisp processes from them, then whenever we call foreign C code that may be non-reentrant, we must wrap that code in a proper flag-polling mechanism (which needn't be atomic since it is per-thread) that allows us to achieve PCLSRing by calling the rest of the signal handler once the thread is safely out of the potentially problematic C code. None of this need apply when we're running "normal" code generated by a compiler we control, at which point we may use more hacking techniques for PCLSRing.
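As a rough sketch of that wrapping, assuming the C-level handler does nothing but record the Lisp handler in a per-thread special variable (all names hypothetical):

(defvar *pending-interrupt* nil)   ; per-thread, so no atomicity needed
(defvar *in-foreign-code* nil)

(defun note-interrupt (handler)
  ;; Called on behalf of the signal: run HANDLER right away if we are in
  ;; code we control, otherwise defer it until the foreign call returns.
  (if *in-foreign-code*
      (setf *pending-interrupt* handler)
      (funcall handler)))

(defmacro with-deferred-interrupts (&body foreign-call)
  `(multiple-value-prog1
       (let ((*in-foreign-code* t))
         ,@foreign-call)
     ;; Back from the possibly non-reentrant C code: PCLSR now by running
     ;; whatever handler was deferred while we were inside it.
     (run-pending-interrupt)))

(defun run-pending-interrupt ()
  (let ((handler (shiftf *pending-interrupt* nil)))
    (when handler (funcall handler))))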
[ François-René ÐVB Rideau | Reflection&Cybernethics | http://fare.tunes.org ] Amateurs talk strategy. Professionals talk logistics. - old military saying
Faré wrote:
On 9/19/05, Ivan Boldyrev lispnik@gmail.com wrote:
For example, no matter which language you use, you can't atomically write or read a 64-bit integer (aka long long) on the x86 platform.
64-bit, you can actually (but then you'll tell me about some bigger size). However, it doesn't really matter what you can or cannot write atomically to shared memory, since the whole point of Erlisp is to avoid the shared memory model of concurrency.
OTOH, there are some very efficient lock-free implementations of message queues that you can use if pointer-sized words can be written atomically. But all multiprocessor platforms can do that.
On 9236 day of my life fahree@gmail.com wrote:
Maybe you didn't realize that we were talking about Lisp and not C.
I thought a little more and I think we are both right :)