Hi there,
Quoting Andrey Moskvitin (archimag@gmail.com):
I wrote a very simple library, iolib.process, which allows you to run child processes and interact with them through the standard IO streams. In contrast to sb-ext:run-program and similar tools offered by implementations, iolib.process does not depend on a specific implementation, but only on iolib.syscalls and iolib.streams. iolib.process should work on all Unix systems; it has been tested on Linux with SBCL, Clozure CL and CLISP. Perhaps, after appropriate revision, it would make sense to include this library in iolib.
having such a library sounds like a great idea, and I like your code in the sense that it looks somewhat similar architecturally to what I did when I needed something similar in Hemlock.
Unfortunately, it would also run into the same problems as my code did:
- On MacOS, SBCL doesn't survive a call to fork() if Lisp code is being run in the child process -- something about threading going wrong after the fork.
The solution, unattractive as it may sound, is to write the code for the child process as a glue function written in C, which also implies doing the fork in C.
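A minimal sketch of such a glue function, assuming the Lisp side calls it through the FFI and only needs stdin/stdout redirection. The name spawn_child and its argument list are made up for illustration; they are not part of iolib or Hemlock.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical glue function: fork, redirect stdin/stdout, then exec.
   Pass -1 for a descriptor that should be left alone.
   Returns the child's pid in the parent, or -1 if fork() failed. */
pid_t spawn_child(const char *path, char *const argv[],
                  int stdin_fd, int stdout_fd)
{
    pid_t pid = fork();
    if (pid != 0)
        return pid;            /* parent (pid > 0) or fork failure (-1) */

    /* Child: only C code runs from here on, no Lisp runtime involved. */
    if (stdin_fd >= 0 && dup2(stdin_fd, STDIN_FILENO) < 0)
        _exit(127);
    if (stdout_fd >= 0 && dup2(stdout_fd, STDOUT_FILENO) < 0)
        _exit(127);
    execv(path, argv);
    _exit(127);                /* only reached if execv() failed */
}
```

The parent still has to collect the exit status itself, for example with waitpid() on the returned pid.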
- I'm a bit surprised that it works with CCL out of the box for you, because I recall having to disable GC or interrupts (or something like that) to avoid a crash there.
Perhaps writing the code in C isn't that bad an idea after all, because it also reduces this kind of portability issue.
- When using the C code approach, some flexibility would get lost. In practise, user code often needs to set up the child process environment in ways that are hard to foresee for the library author, e.g. for FD redirection, tty and session handling, environment variables etc. (and attempts to implement a general API with lots of keyword arguments for those use cases do not lead to good API design, I think).
What I would like to see is a little domain-specific language that describes common syscalls and library functions (dup2, open, setenv, ...). The Lisp side would then compile those calls into a byte array, and pass that to the C function. Following the fork, the C code would execute the bytecode.
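To make the bytecode idea concrete, here is a rough sketch of the C side. The opcodes, their one-byte operand encoding and the name run_child_ops are all assumptions invented for this example; a real design would need wider operands and more calls (open, setsid, ...). Note also that setenv() may allocate memory, which is not strictly safe after fork() in a multi-threaded parent.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical opcodes; the encoding is an assumption made for this sketch. */
enum { OP_END = 0, OP_DUP2 = 1, OP_CLOSE = 2, OP_SETENV = 3 };

/* Execute the program in code[0..len). Returns 0 when OP_END is reached,
   -1 on a failed call or malformed input. Intended to run in the child,
   between fork() and exec(). */
int run_child_ops(const unsigned char *code, size_t len)
{
    size_t i = 0;
    while (i < len) {
        switch (code[i++]) {
        case OP_END:
            return 0;
        case OP_DUP2:                 /* operands: oldfd, newfd (one byte each) */
            if (len - i < 2) return -1;
            if (dup2(code[i], code[i + 1]) < 0) return -1;
            i += 2;
            break;
        case OP_CLOSE:                /* operand: fd (one byte) */
            if (len - i < 1) return -1;
            if (close(code[i]) < 0) return -1;
            i += 1;
            break;
        case OP_SETENV: {             /* operands: NUL-terminated name, then value */
            const char *name = (const char *)code + i;
            const unsigned char *p = memchr(code + i, 0, len - i);
            if (!p) return -1;
            i = (size_t)(p - code) + 1;
            const char *value = (const char *)code + i;
            p = memchr(code + i, 0, len - i);
            if (!p) return -1;
            i = (size_t)(p - code) + 1;
            if (setenv(name, value, 1) < 0) return -1;
            break;
        }
        default:                      /* unknown opcode */
            return -1;
        }
    }
    return -1;                        /* ran off the end without OP_END */
}
```

The Lisp side would only have to fill a byte vector such as { OP_DUP2, 5, 1, OP_CLOSE, 5, OP_END } and hand it to the glue function along with the program to exec.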
- As Stelian explained, there are certain issues with SIGCHLD that make this code unportable, because CLISP works very hard to keep iolib from getting its hands on the SIGCHLD handler.
I think there are several approaches to this:
a. Ignore the problem, declare CLISP unsupported.
b. Solve the problem by clever SIGCHLD handler chaining.
c. Write a separate "fork server", where the Lisp just asks the server to spawn processes instead of doing that itself. (I believe Stelian wrote something like that, but I don't know where the code is.)
d. Like c., but in particular run that "fork server" process as a child process of the Lisp. So the process hierarchy would be:
        <Lisp process>
            |  ^
            |  |    socketpair for communication
            v  v
        <Helper process written in C>
            |                   |                ...
            v                   v
        Forked process 1    Forked process 2     ...
The advantage is that the helper process never dies, so it doesn't lead to SIGCHLD, and the SIGCHLDs for the other processes get handled in C.
e. Distinguish between Lisps that need the hack described in d and those which don't. I.e., use method d on CLISP, but skip the helper process on SBCL, CCL and others.
The disadvantage would be that code behaves differently depending on the Lisp used.
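For what it's worth, the helper from approach d could start out as small as the following sketch. The wire protocol is invented for this example (request: one shell command terminated by a newline; reply: the raw wait status as an int), as are the names start_helper and helper_loop. As a simplification, the helper waits for each child synchronously instead of handling SIGCHLD, so it runs one command at a time, and commands longer than the buffer are silently truncated.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Helper side: read one command per line, run it, report its wait status. */
static void helper_loop(int fd)
{
    char cmd[256];
    size_t n = 0;
    char c;
    while (read(fd, &c, 1) == 1) {
        if (c != '\n') {
            if (n < sizeof cmd - 1)
                cmd[n++] = c;
            continue;
        }
        cmd[n] = '\0';
        n = 0;
        int status = -1;
        pid_t pid = fork();
        if (pid == 0) {                      /* grandchild: run the command */
            close(fd);
            execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
            _exit(127);
        }
        if (pid > 0)
            waitpid(pid, &status, 0);        /* reaped here, not in the Lisp */
        if (write(fd, &status, sizeof status) != (ssize_t)sizeof status)
            break;
    }
    _exit(0);                                /* Lisp closed its end of the socket */
}

/* Lisp side (called through the FFI): start the helper, return its pid and
   store the Lisp end of the socketpair in *sock_out. Returns -1 on failure. */
pid_t start_helper(int *sock_out)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    pid_t pid = fork();
    if (pid < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (pid == 0) {
        close(sv[0]);
        helper_loop(sv[1]);                  /* never returns */
    }
    close(sv[1]);
    *sock_out = sv[0];
    return pid;
}
```

The Lisp would write "some command\n" to the socket and read back sizeof(int) bytes; only C code ever runs in the forked processes.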
Personally I would strongly prefer approaches a. or b.
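As an illustration of what handler chaining in approach b. means at the C level, here is a small sketch. It assumes a single watched child and that the previously installed handler stays put; whether that holds with CLISP is exactly the open question. All names (install_chained_sigchld, watched_pid, watched_status) are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>

static struct sigaction previous_action;           /* handler installed before ours */
static volatile sig_atomic_t watched_pid = 0;      /* the one child we own, or 0 */
static volatile sig_atomic_t watched_status = -1;  /* its wait status once reaped */

static void chained_sigchld(int signo, siginfo_t *info, void *context)
{
    int saved_errno = errno;
    int status;

    /* Reap only our own child, so the runtime still sees its own children. */
    if (watched_pid > 0 &&
        waitpid((pid_t)watched_pid, &status, WNOHANG) == (pid_t)watched_pid) {
        watched_status = status;
        watched_pid = 0;
    }

    /* Pass the signal on to whatever handler was there before us. */
    if (previous_action.sa_flags & SA_SIGINFO)
        previous_action.sa_sigaction(signo, info, context);
    else if (previous_action.sa_handler != SIG_DFL &&
             previous_action.sa_handler != SIG_IGN)
        previous_action.sa_handler(signo);

    errno = saved_errno;
}

/* Install our handler and remember the old one. Returns 0 on success. */
int install_chained_sigchld(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = chained_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_SIGINFO | SA_RESTART | SA_NOCLDSTOP;
    return sigaction(SIGCHLD, &sa, &previous_action);
}
```

To avoid losing a child that exits before watched_pid is set, the caller would block SIGCHLD around the fork and unblock it afterwards.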
David