You are receiving this message because you signed up to the streams-standard-discuss mailing list. It appears to be the first message sent to said list, so if you'd forgotten you subscribed and you don't wish to be, there should be details of how to change this in the footer.
So. This list and its siblings (for threads and MOP) were set up after discussions at the LSM 2004 in Bordeaux about areas of CL that we thought were worth de facto standardization. The discussion on threads centered around a short presentation given by Rudi Schlatte
http://www-jcsu.jesus.cam.ac.uk/~csr21/papers/lightning/lightning.html#htoc9
and discussion. Note that I'm going from my unreliable memory, reinforced by subsequent talks on IRC; this summary is by now more indicative of my own opinions than any consensus we reached at the time. So, if you were there (and even if not), chime in.
The prime existing contenders for extensible streams are "Gray" streams and Franz' simple-streams. In fact these solve different problems.
Simple-streams are for when you have an external octet store of some kind (e.g. a file) and simply want to write some code that fills it with octets (when writing) or empties it of octets and makes them available to client code (when reading). It has some kind of translations for "special" characters but in the end - correct me if this is not true - everything comes down to a block of characters/bytes/similar.
We suspect the simple-streams specification to have infelicities "around the edges" where it's not clear how much is intended to become a standard and how much is just Allegro implementation that happens to be based on simple-streams. e.g. the interface in ACL to create a socket stream, last time I looked at it, involved a call to some MAKE-SOCKET function instead of the tidier MAKE-INSTANCE of SOCKET-STREAM that you might have been hoping for. I may be misremembering this or I may be out of date; please feel free to correct me.
Gray streams are intended to operate at a very different level. An implementation of a Gray stream might want e.g. to recognise objects as they're output and remember their location on a graphical display; it seems clear that digesting all the objects to text strings before doing this is not The Right Thing, so the higher level interface is also needed.
The common complaints about Gray streams are: (1) that it's not specified which of the interface methods are used in the default implementation of which of the others, so you end up having to implement /everything/ yourself instead of knowing that, for example, you can provide methods for stream-read-char and stream-listen and will automatically get a working stream-read-char-no-hang. (2) CLOS dispatch in single character output is likely to be really rather slow.
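A minimal sketch of what complaint (1) asks for, assuming a made-up protocol whose names carry a SKETCH- prefix (these are not the Gray streams functions, and STRING-SOURCE is a hypothetical toy class). The default method documents exactly which other generic functions it relies on, so an implementor only has to supply those two. End-of-file handling in the default is deliberately left out.

```lisp
;; Hypothetical protocol, NOT the real Gray streams package.
(defgeneric sketch-read-char (stream)
  (:documentation "Return the next character, or :EOF. Must be implemented."))

(defgeneric sketch-listen (stream)
  (:documentation "Return true if a character is available. Must be implemented."))

(defgeneric sketch-read-char-no-hang (stream)
  (:documentation "Return a character, or NIL if none is available right now.")
  ;; Documented default: built only on SKETCH-LISTEN and SKETCH-READ-CHAR.
  (:method (stream)
    (if (sketch-listen stream)
        (sketch-read-char stream)
        nil)))

;; A toy stream reading from a string; it implements just the two
;; required methods and inherits the no-hang behaviour.
(defclass string-source ()
  ((text :initarg :text :reader source-text)
   (pos :initform 0 :accessor source-pos)))

(defmethod sketch-listen ((s string-source))
  (< (source-pos s) (length (source-text s))))

(defmethod sketch-read-char ((s string-source))
  (if (sketch-listen s)
      (prog1 (char (source-text s) (source-pos s))
        (incf (source-pos s)))
      :eof))
```

With a specification written this way, a user could tell from the documentation alone that two methods are enough for this part of the protocol.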
After reading a bit on some of the issues faced in i18n, I think Gray streams have other issues too: I'm not convinced that an interface that still talks in terms of column positions is necessarily going to be so useful in a world of proportional fonts, bidirectional rendering, Asian languages and Arabic, etc. Just what _is_ STREAM-START-LINE-P when rendering decimal numbers left-to-right in the middle of a line of right-to-left Hebrew text?
The primary hat I'm wearing here is as an SBCL implementor: the current streams mechanisms in SBCL are pretty grungy and really could use some redesign at the 'device' (simple-streams-like) level. It's unlikely that we'd adopt simple-streams wholesale - apart from anything else our build process says we need streams before CLOS is available - but something that is at least philosophically compatible with the good bits would be nice.
OK, discussion is open to the floor.
-dan
Daniel Barlow writes:
CLOS dispatch in single character output is likely to be really rather slow
Yes. Could this be solved by simply adding stream-read-sequence and stream-write-sequence methods to the protocol? Probably not if we need to keep track of column position as well, which is another reason for eliminating the GFs that talk about columns.
Quoting Robert Strandh (strandh@labri.fr):
CLOS dispatch in single character output is likely to be really rather slow
Yes. Could this be solved by simply adding stream-read-sequence and stream-write-sequence methods to the protocol?
Well, no. I see the following kinds of streams:
- unbuffered I/O
This kind of stream is available in Unix as read() and write() (although there is no stream "object" in this layer other than the file descriptor). But it is an important concept and usually the lower layer other concepts are building on.
Common Lisp does not know unbuffered I/O.
Simple Streams call this the "device level" and provide support for it by documenting the device level functions, allowing the user to implement them using CLOS and also to call them directly, bypassing upper layers (not very elegant but at least possible).
Gray streams would allow for a stream to be unbuffered, but then there are all the single-character/single-byte functions in the API which do not make a lot of sense at this layer, because callers would only want to use the read-/write-sequence functions on such a stream for speed reasons.
- buffered I/O (of characters or bytes)
This is provided by C's stdio and by Common Lisp streams. (But not in an extensible way using portable features in either language.)
It is an abstraction needed for file I/O, sockets, etc. A very common application that should be fast.
Gray streams mirror the Common Lisp functions for this layer, but have a speed and design problem: If buffering is used anyway, the right abstraction is to document the buffer instead of having many streams reimplement it poorly.
Or at least that must be the idea behind simple streams.
Answering your question: It does not help at all to provide fast read-/write-sequence at this layer if read-/write-byte are not fast, since this layer is all about the buffering. If I wanted to prebuffer my output into long sequences and write those instead, I would not need this layer in the first place (it would buffer the same data a second time) and use unbuffered I/O instead.
- more general stream-like objects which don't involve such a buffer
stdio and Common Lisp do not know this.
An example would be CLIM, which takes the Common Lisp API for streams with similar purposes but a completely different implementation. A buffer of bytes would not help for such streams at all, yet re-using the existing stream API appears better than reinventing it.
Gray streams try to provide for this using CLOS.
CMUCL's ansi-streams solve it without CLOS by simulating their own object system.
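For the buffered layer in the list above, here is a rough sketch of why per-byte speed is the whole point of that layer. All names are hypothetical and belong to no existing stream API; the "device" is just a function of three arguments (octets, start, end). A real implementation would presumably inline the fast path rather than go through slot accessors as this sketch does.

```lisp
;; Hypothetical names; a sketch of a buffering layer, not a real API.
(defclass buffered-output ()
  ((octets :initarg :octets :reader buffer-octets)
   (used :initform 0 :accessor buffer-used)
   (flush-function :initarg :flush-function :reader buffer-flush-function)))

(defun make-buffered-output (flush-function &optional (size 4096))
  (make-instance 'buffered-output
                 :octets (make-array size :element-type '(unsigned-byte 8))
                 :flush-function flush-function))

(defun flush-buffer (out)
  "Hand the filled part of the buffer to the device in one call."
  (funcall (buffer-flush-function out)
           (buffer-octets out) 0 (buffer-used out))
  (setf (buffer-used out) 0))

(defun buffered-write-byte (out byte)
  "Common case: one array store and one increment. The device is
touched only when the buffer is already full."
  (when (= (buffer-used out) (length (buffer-octets out)))
    (flush-buffer out))
  (setf (aref (buffer-octets out) (buffer-used out)) byte)
  (incf (buffer-used out))
  byte)
```

Writing ten bytes through a four-byte buffer and then flushing reaches the device three times, not ten.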
Problems with the simple stream approach include streams which are buffered in a sense, but in a different way than the original protocol designer hoped they would be. For example, string streams are a special case which users would not be able to implement if the simple streams implementation had not provided for it.
Problems with CMUCL's ANSI-STREAM include that they are structures and that other implementations might not willingly adopt some ad-hoc object system in its stream layer just for compatibility with, say, SBCL. OTOH simple streams, while simpler due to CLOS, seem to struggle with that CLOS usage: As far as I understand it, there are funny attempts at optimizing CLOS access away which are not exactly beautiful.
Extensibility for buffered I/O must be provided at the device level as with simple streams. Extensibility for arbitrary streams at or slightly below the ANSI CL level. So ANSI-STREAMs and simple streams are not alternative proposals for the same problem, but for different problems.
My claim is that both abstractions are needed. Buffered I/O because it is the most common case. More general streams because Lisp people are used to them. It would be nice if buffered I/O was implemented using the general abstractions. Right now I do not see how to implement, say, simple streams as ANSI-STREAMs largely because the latter are structures.
Probably not if we need to keep track of column position as well, which is another reason for eliminating the GFs that talk about columns.
The column position is optional already in that it may be NIL if the stream does not know the concept. So other than not being general enough, it does not do any harm either. Its importance would depend upon who the major users are. FRESH-LINE probably. Does the pretty-printer need it?
d.
David Lichteblau david@lichteblau.com writes:
Well, no. I see the following kinds of streams:
[nice summary deleted]
more general stream-like objects which don't involve such a buffer
An example would be CLIM, which takes the Common Lisp API for streams with similar purposes but a completely different implementation. A buffer of bytes would not help for such streams at all, yet re-using the existing stream API appears better than reinventing it.
I suspect that a credible GSLO extension interface has to have significantly more hookage into the pretty printer than Gray streams do, simply to allow things like pretty printing with proportional fonts. Note that this would probably break the "write-to-string/render" equivalence that users of monospaced Lisps have grown to expect. For example,
(format stream format-specifier arg ...)
and
(write-sequence (format nil format-specifier arg ...) stream)
would, when sufficiently complicated tabulation/indentation/logical block constructs are in use, line up differently: in a sane world, the "current font" would be a per-stream attribute, so there's no way for (format nil ...) to know what the font metrics are.
(That's leaving aside the infelicities of the stream interface for non-English applications (e.g. l->r rendering assumptions, format ~p, ~@p directives) about which I'm mostly not qualified to speculate.)
-dan
David Lichteblau:
Problems with the simple stream approach include streams which are buffered in a sense, but in a different way than the original protocol designer hoped they would be. For example, string streams are a special case which users would not be able to implement if the simple streams implementation had not provided for it.
Can you elaborate on that? It seems that a basic "read buffer, write and empty buffer, seek to position" interface could easily implement a string output stream if the write-and-empty-buffer didn't empty it but instead saved it and gave the caller a new buffer (perhaps contiguous) to scribble the next block in. Am I missing something important, or does simple-streams not allow this?
Problems with CMUCL's ANSI-STREAM include that they are structures and that other implementations might not willingly adopt some ad-hoc object system in its stream layer just for compatibility with, say, SBCL. OTOH simple streams, while simpler due to CLOS, seem to struggle with that CLOS usage: As far as I understand it, there are funny attempts at optimizing CLOS access away which are not exactly beautiful.
I wouldn't expect that any homebrew object system will get widespread traction any time soon: I think this is a problem that CMUCL/SBCL face and will have to solve for themselves with some kind of proxy/adaptor interface. </handwaving>
-dan
Quoting Daniel Barlow (dan@telent.net):
Problems with the simple stream approach include streams which are buffered in a sense, but in a different way than the original protocol designer hoped they would be. For example, string streams are a special case which users would not be able to implement if the simple streams implementation had not provided for it.
Can you elaborate on that? It seems that a basic "read buffer, write and empty buffer, seek to position" interface could easily implement a string output stream if the write-and-empty-buffer didn't empty it but instead saved it and gave the caller a new buffer (perhaps contiguous) to scribble the next block in. Am I missing something important, or does simple-streams not allow this?
Simple streams allow for just that by permitting device level methods to exchange buffers, yes.
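A small sketch of the buffer-exchange idea discussed above, under the assumption of a made-up EXCHANGING-SINK class (none of these names come from simple-streams). When the buffer fills up, it is kept as is and a fresh one is handed out, so no characters are copied until the final string is requested.

```lisp
;; Hypothetical names throughout; this is not the simple-streams API.
(defclass exchanging-sink ()
  ((buffer-size :initarg :buffer-size :initform 4 :reader buffer-size)
   (buffer :accessor sink-buffer)
   (used :initform 0 :accessor sink-used)
   (saved :initform '() :accessor sink-saved)))

(defmethod initialize-instance :after ((sink exchanging-sink) &key)
  (setf (sink-buffer sink) (make-string (buffer-size sink))))

(defun exchange-buffer (sink)
  "Keep the current buffer (with its fill count) and install a fresh one."
  (push (cons (sink-buffer sink) (sink-used sink)) (sink-saved sink))
  (setf (sink-buffer sink) (make-string (buffer-size sink))
        (sink-used sink) 0))

(defun sink-write-char (sink char)
  (when (= (sink-used sink) (buffer-size sink))
    (exchange-buffer sink))             ; "write and empty" becomes "save"
  (setf (char (sink-buffer sink) (sink-used sink)) char)
  (incf (sink-used sink))
  char)

(defun sink-output-string (sink)
  "Concatenate the saved buffers and the current one, oldest first."
  (with-output-to-string (out)
    (dolist (entry (reverse (sink-saved sink)))
      (write-string (car entry) out :end (cdr entry)))
    (write-string (sink-buffer sink) out :end (sink-used sink))))
```

Whether a real device layer can express this cleanly depends on how the buffering object and the device are allowed to see each other, which is the open question in this part of the thread.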
But string simple streams happen to be neither single nor dual channel simple streams, and users would not be able to define their own instance flag for this. I _believe_ this is because string streams are bypassing external formats and are not bivalent. (And I hope there are simple stream experts on this list who can correct me if I'm wrong here.)
But implementing a bivalent string-stream-like class as a single-channel-simple stream should be possible.
(Except that when I tried something like this, I failed. My application was BLOBs stored in an OO database, and I tried to exchange buffers and frob the current output pointer. Finally I gave up and copied the buffer instead of exchanging it. So it may well just have been my lack of understanding of the details... In a way that may be a point, however. Looking at the documentation now trying to recall what I did back then, I again do not have a clue what DEVICE-EXTEND is for. In a new spec for such streams, please make sure that the buffering layer is as simple as possible. :))
d.
Time to join the fun ...
On 15. Sep 2004, at 13:31, David Lichteblau wrote:
unbuffered I/O
This kind of stream is available in Unix as read() and write() (although there is no stream "object" in this layer other than the file descriptor). But it is an important concept and usually the lower layer other concepts are building on.
Bruno and I talked about that layer a little bit at LSM2004; I'm attaching the result. This is a possible interface for "device object that can be plugged in a buffering layer" that can be implemented by a user. We also had this neat idea that these device-streams can implement the Gray byte-oriented protocol as well, for users that want to operate on unbuffered streams.
First draft, perhaps oriented toward files a bit, etc. (I know Dan is thinking about a buffering layer; perhaps this lower part is useful for inspiration or something.)
Cheers,
Rudi
Rudi Schlatte wrote:
This is a possible interface for "device object that can be plugged in a buffering layer" that can be implemented by a user.
Thanks Rudi for posting this.
Let me explain the differences w.r.t. the simple-streams proposal (http://www.franz.com/support/documentation/6.0/doc/streams.htm, http://www.franz.com/support/documentation/6.2/doc/streams.htm):
- The common thing is that it's about interfacing to low-level devices that transport bytes. Examples: Sockets, ssh tunnels, gzipped data.
- The control-character processing has been removed from this layer. Rationale: Since the control-character processing is limited to bytes between 0 and 31, it's obviously meant for *terminal-io*, i.e. for a stream whose speed is irrelevant. Such control-character processing can be handled in upper layers without complicating the device layer.
- The external-format processing (conversion from byte sequence to character sequence or multi-byte integer sequence) is left in an upper layer. It is expected that the stream that does external-format processing delegates to the device-stream.
- The buffering is left to an upper layer. An implementation can thus offer buffered _or_ unbuffered streams that delegate to a user-written device-stream. (Whereas buffering in the simple-streams proposal is mandatory.)
The architecture thus looks like this:
                   Lisp stream
                       |
                       |
      +----------------------------------+
      | (optional) control               |
      | character processing             |
      | - provided by the implementation |
      +----------------------------------+
                       |
                       |
                   Lisp stream
                       |
                       |
      +----------------------------------+
      | external-format processing       |
      | - provided by the implementation |
      +----------------------------------+
                       |
                       |
      Lisp stream of element type (unsigned-byte 8)
                       |
                       |
      +----------------------------------+
      | (optional) buffering             |
      +----------------------------------+
                       |
                       |  Device-stream interface
                       |
      +----------------------------------+
      | a device-stream                  |
      | provided by the user             |
      +----------------------------------+
- The device-read and device-write calling conventions have been simplified (no need to distinguish two kinds of EOF, no need for "no-hang queries" since the caller can always pass a 1-byte buffer instead).
- The sequence type of the buffer is restricted to an array of (unsigned-byte 8), to avoid typechecks inside device-read and device-write.
Bruno
Quoting Rudi Schlatte (rudi@constantly.at):
| Add a class 'device-stream' which is a subclass of stream of metaclass
| standard-class. An instance of a subclass of device-stream is meant
| to manage an "underlying device", e.g. a file or socket.
OK, so there will be at least two objects for each Common Lisp stream, the stream itself and the device beneath it. That is nice because it allows applications to call device layer functions directly without creating a full stream first and then "bypassing" the upper layers.
OTOH, I am unclear about how the device and the buffering layer would interact in such a scheme for streams more complicated than file-streams. We discussed string-stream-like approaches where buffers need to be exchanged after writing them to the device, and with this approach the device methods would need to access the buffering object, a layering violation? Perhaps a question that can only be answered when there is also a proposed design for the buffering layer.
| @smalldisplay
| ;;;; rudi (2004-07-09): we should also specify the conditions that are
| ;;;; thrown, if any. My gut feeling would be to work with return
| ;;;; values only and throw conditions farther up, what do you think?
| @end smalldisplay
So EOF is not a condition here because it is not really an exceptional situation, but rather expected. Fine. Actual errors, however, should be conditions, I think. (- -10 errno) does not look very lispy to me...
| @defun device-open device &rest initargs @result{} result
|
| @var{device}: an instance of device-stream
|
| @var{initargs}: options specific to the actual type of @var{device}
|
| @var{result}: a generalized boolean
|
| @code{device-open} performs the device-specific operations to open the
| underlying device, taking any needed information from @var{initargs}.
| @var{result} is true if the device could be successfully opened, false
| if not.
| @end defun
Can you elaborate on that? Given that CLOS is used here anyway, why is initialization of the stream not done by INITIALIZE-INSTANCE and friends? I know that simple streams have a DEVICE-OPEN method, but I am unclear about why that was invented, too.
Is it necessary to be able to create streams and open them only later? Or can streams be re-opened?
| @defun device-close device
|
| @var{device}: an instance of device-stream
|
| @code{device-close} closes the underlying device.
| @end defun
Might need an ABORT argument?
| @defun device-read device buffer start end blocking @result{} result
| @defun device-write device buffer start end blocking @result{} result
Sounds good.
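To make the two quoted signatures concrete, here is a sketch of an in-memory device. Several things are assumptions and not part of the quoted draft: DEVICE-STREAM is a plain class here instead of a subclass of STREAM (so the example runs in any implementation), the return value is taken to be the number of octets transferred with 0 meaning end of file, and MEMORY-DEVICE is a hypothetical class.

```lisp
;; Sketch only; see the assumptions stated above.
(defclass device-stream () ())

(defgeneric device-read (device buffer start end blocking)
  (:documentation "Fill BUFFER between START and END. Returns the number
of octets read (assumed convention: 0 means end of file)."))

(defgeneric device-write (device buffer start end blocking)
  (:documentation "Write octets START..END of BUFFER. Returns the number
of octets written."))

(defclass memory-device (device-stream)
  ((octets :initform (make-array 0 :element-type '(unsigned-byte 8)
                                   :adjustable t :fill-pointer 0)
           :reader device-octets)
   (read-position :initform 0 :accessor device-read-position)))

(defmethod device-write ((device memory-device) buffer start end blocking)
  (declare (ignore blocking))           ; memory never blocks
  (loop for i from start below end
        do (vector-push-extend (aref buffer i) (device-octets device)))
  (- end start))

(defmethod device-read ((device memory-device) buffer start end blocking)
  (declare (ignore blocking))
  (let* ((available (- (length (device-octets device))
                       (device-read-position device)))
         (n (min available (- end start))))
    ;; Copy N octets from the store into the caller's buffer.
    (replace buffer (device-octets device)
             :start1 start :end1 (+ start n)
             :start2 (device-read-position device))
    (incf (device-read-position device) n)
    n))
```

A buffering layer would call these with its own (unsigned-byte 8) array; an unbuffered caller could pass a one-element array, as the earlier message suggests.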
| @defun device-clear-input device
|
| @var{device}: an instance of @code{device-stream}
|
| @code{device-clear-input} performs any device-specific operations necessary
| to discard any input pending on the underlying device, including
| clearing any os-level buffers or similar.
| @end defun
The only correct implementation I can imagine for this function with Unix files is to loop in read() until nothing more is returned. Bad idea with, say, /dev/random, because it will loop forever. Is this really what is meant? If not, can we make it more precise or just drop it completely?
| @defun device-clear-output device
|
| @var{device}: an instance of @code{device-stream}
|
| @code{device-clear-output} performs any device-specific operations necessary
| to discard any output pending on the underlying device, including
| clearing any os-level buffers or similar.
| @end defun
Similar question as for clear-input: This sounds like the device layer implementation of CLEAR-OUTPUT. Taking Unix file descriptors as an example, how would this work? (If there is no reasonable implementation of this on current operating systems, why not assume that CLEAR-OUTPUT flushes the higher-level buffer, but does not reach the device layer at all?)
| @defun device-flush-output device blocking
|
| @smalldisplay
| ;;;; rudi (2004-10-09): device-flush-output instead of
| ;;;; device-finish-output, device-force-output since the blocking
| ;;;; parameter exists everywhere else as well (it seems more in line
| ;;;; with the other methods, but I don't insist on this change)
| @end smalldisplay
|
| @var{device}: an instance of @code{device-stream}
|
| @var{blocking}: a generalized boolean
|
| @code{device-flush-output} performs any device-specific operations
| necessary to flush any output pending on the underlying device. If
| @var{blocking} is true, @code{device-flush-output} will make a best effort to
| block until the underlying device confirms completion of the output.
| If @var{blocking} is false, @code{device-finish-output} returns immediately.
| @end defun
To clarify, with a unix file, this should be fsync(), right?
(I have always assumed that FORCE-OUTPUT flushes a stream's buffer, and FINISH-OUTPUT does more: It calls fsync(). In reality, the Lisps I have looked at just flush the buffer for both functions. :()
On the BLOCKING argument: Not sure. Assuming it is possible, what good does syncing do when you do not wait for it to complete?
| @defun (setf device-file-position) device new-position @result{} result
@defun (setf device-file-position) new-position device @result{} result
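The second signature puts the new value first, which is how Common Lisp defines SETF functions: a function named (SETF FOO) receives the new value as its first argument. A short sketch, assuming a hypothetical POSITIONED-DEVICE class and assuming the new position is returned as the result (the draft's own result convention is not shown in the quoted text):

```lisp
;; Hypothetical device class, for illustration only.
(defclass positioned-device ()
  ((offset :initform 0 :accessor device-offset)))

(defgeneric device-file-position (device))

;; The new value is the FIRST parameter of a SETF function.
(defgeneric (setf device-file-position) (new-position device))

(defmethod device-file-position ((device positioned-device))
  (device-offset device))

(defmethod (setf device-file-position)
    (new-position (device positioned-device))
  (setf (device-offset device) new-position))
```

With this order, the ordinary (setf (device-file-position device) 42) form works without any DEFSETF glue.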
Wishlist item: I have often missed a portable function FILE-TRUNCATE. Would such a function fit into this interface?
David
Robert Strandh asked:
Yes. Could this be solved by simply adding stream-read-sequence and stream-write-sequence methods to the protocol?
Definitely yes. Doing the preparations (the CLOS dispatch, fetching the buffer pointer etc.) once and then 100 operations of the same kind is always faster than doing 100 times the preparations plus one operation.
For example, CLISP's internal FILL-STREAM became 5 or 10 times faster when we implemented not only the STREAM-WRITE-CHAR function but also the STREAM-WRITE-CHAR-SEQUENCE function.
Btw, I propose to use STREAM-READ-CHAR-SEQUENCE and STREAM-READ-BYTE-SEQUENCE instead of STREAM-READ-SEQUENCE, and similarly STREAM-WRITE-CHAR-SEQUENCE and STREAM-WRITE-BYTE-SEQUENCE instead of STREAM-WRITE-SEQUENCE. Reasons:
- There are streams with element type (OR INTEGER CHARACTER).
- It's unnecessary overhead to let the element type guessing be done at each invocation of these functions.
- The precise rules for this guessing are unspecified by ANSI CL and therefore implementation dependent.
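The amortization argument can be sketched as follows, using a made-up protocol with a SKETCH- prefix (not the Gray streams or CLISP functions) and two hypothetical toy classes. A counter stands in for the per-call cost: the default sequence method falls back to one generic call per character, while a stream that specializes the sequence function pays the cost once.

```lisp
;; Hypothetical protocol names; not the Gray streams package.
(defvar *dispatch-count* 0
  "Counts calls that reach a method specialized on one of the toy streams.")

(defgeneric sketch-write-char (stream char))

(defgeneric sketch-write-char-sequence (stream string start end)
  ;; Default: one generic function call per character.
  (:method (stream string start end)
    (loop for i from start below end
          do (sketch-write-char stream (char string i)))
    string))

(defclass collecting-stream ()
  ((chars :initform (make-string-output-stream) :reader stream-chars)))

(defmethod sketch-write-char ((s collecting-stream) char)
  (incf *dispatch-count*)
  (write-char char (stream-chars s)))

;; A subclass that handles a whole run of characters in a single call.
(defclass bulk-stream (collecting-stream) ())

(defmethod sketch-write-char-sequence ((s bulk-stream) string start end)
  (incf *dispatch-count*)
  (write-string string (stream-chars s) :start start :end end)
  string)
```

Because the function name already says "char", no element-type guessing is needed at call time, which is the point of the proposed split.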
Bruno
Daniel Barlow wrote:
The common complaints about Gray streams are: (1) that it's not specified which of the interface methods are used in the default implementation of which of the others, so you end up having to implement /everything/ yourself instead of knowing that
Agreed. The user needs to know
1. what the default method of each of the functions does,
2. for what kind of stream this default behaviour is sufficient.
As a start, you can look at the CLISP documentation of these: http://clisp.cons.org/impnotes/clos-stream.html#gray
Bruno
streams-standard-discuss@common-lisp.net