If a byte stream s has exactly n bytes remaining, then (read-stream-content-into-byte-vector s :initial-size n) will result in two unnecessary copies and a total allocation of 4n bytes:
- n for the initial buffer;
- 2n when the buffer is resized and copied since r-s-c-i-b-v doesn't yet realize that EOF has been reached;
- n when the now oversized buffer is downsized and copied at the end.
Perversely, this means that a size overestimate is more efficient than a correct estimate. I've attached a possible patch for this below. Sadly there is no peek-byte, so I had to opt for a rather awkward read-byte construction; maybe someone knows a better approach.
The patch additionally gets rid of the pitfall where idioms similar to (r-s-c-i-b-v s :initial-size (file-length s)) fail when (file-length s) is zero, but I don't know if broadening the type of initial-size would be considered a breaking change. The internal %length argument also becomes basically unnecessary, but I'd bet someone out there uses it as a workaround.
diff --git a/alexandria-1/io.lisp b/alexandria-1/io.lisp --- a/alexandria-1/io.lisp +++ b/alexandria-1/io.lisp @@ -95,29 +95,35 @@ unless it's NIL, which means the system default." (defun read-stream-content-into-byte-vector (stream &key ((%length length)) (initial-size 4096)) "Return "content" of STREAM as freshly allocated (unsigned-byte 8) vector." - (check-type length (or null non-negative-integer)) - (check-type initial-size positive-integer) - (do ((buffer (make-array (or length initial-size) - :element-type '(unsigned-byte 8))) - (offset 0) - (offset-wanted 0)) - ((or (/= offset-wanted offset) - (and length (>= offset length))) - (if (= offset (length buffer)) - buffer - (subseq buffer 0 offset))) - (unless (zerop offset) - (let ((new-buffer (make-array (* 2 (length buffer)) - :element-type '(unsigned-byte 8)))) - (replace new-buffer buffer) - (setf buffer new-buffer))) - (setf offset-wanted (length buffer) - offset (read-sequence buffer stream :start offset)))) + (check-type length (or null non-negative-integer)) ; for compatibility + (check-type initial-size non-negative-integer) + (setf initial-size (or length initial-size)) + (let ((result (make-array initial-size :element-type '(unsigned-byte 8))) + (bytes-read 0)) + (loop + (setf bytes-read (read-sequence result stream :start bytes-read)) + (when (and length (>= bytes-read length)) + (return)) + ;; There is no PEEK-BYTE, so we just try to read a byte. + (let ((next-byte (read-byte stream nil nil))) + (when (null next-byte) + (return)) + (let ((new-result (make-array (if (zerop (length result)) + 4096 + (* 2 (length result))) + :element-type '(unsigned-byte 8)))) + (replace new-result result :end1 bytes-read :end2 bytes-read) + (setf (aref new-result bytes-read) next-byte + result new-result) + (incf bytes-read)))) + (if (= bytes-read (length result)) + result + (subseq result 0 bytes-read))))
(defun read-file-into-byte-vector (pathname) "Read PATHNAME into a freshly allocated (unsigned-byte 8) vector." (with-input-from-file (stream pathname :element-type '(unsigned-byte 8)) - (read-stream-content-into-byte-vector stream '%length (file-length stream)))) + (read-stream-content-into-byte-vector stream :initial-size (file-length stream))))
(defun write-byte-vector-into-file (bytes pathname &key (if-exists :error) if-does-not-exist)
alexandria-devel@common-lisp.net