#43: unread-char doesn't change file-position
---------------------+------------------------------------------------------
 Reporter:  rtoy     |       Owner:
     Type:  defect   |      Status:  new
 Priority:  major    |   Milestone:
Component:  Unicode  |     Version:  20b
 Keywords:           |
---------------------+------------------------------------------------------
This test is from Douglas Crosher, on the maxima mailing list.
{{{
(with-open-file (ostream "ctest.txt" :direction :output
                 :external-format #+clisp "utf-8" #-clisp :utf-8)
  (dotimes (i 1000)
    (write-char (code-char #x1234) ostream)))

(with-open-file (stream "ctest.txt" :direction :input
                 :external-format #+clisp "utf-8" #-clisp :utf-8)
  (let ((p0 (file-position stream))
        (ch (read-char stream)))
    (unread-char ch stream)
    (let ((p0* (file-position stream)))
      (if (eql p0* p0) "Ok" "Broken"))))
}}}
Cmucl returns "Broken" because {{{p0}}} = 0 but {{{p0*}}} = 2. I think {{{unread-char}}} didn't update everything needed by our Unicode stream buffers.
#43: unread-char doesn't change file-position
----------------------+-----------------------------------------------------
  Reporter:  rtoy     |        Owner:
      Type:  defect   |       Status:  closed
  Priority:  major    |    Milestone:
 Component:  Unicode  |      Version:  20b
Resolution:  fixed    |     Keywords:
----------------------+-----------------------------------------------------
Changes (by rtoy):

  * status:  new => closed
  * resolution:  => fixed
Comment:
Fixed and should be available in the April snapshot.
The issue was caused by {{{FAST-READ-CHAR-STRING-REFILL}}} not updating the ibuf head pointer when some octets in the buffer were not converted to characters because the last octets in the buffer do not form a complete character.
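The refill situation described above can be sketched in a few lines. This is only an illustration, not the Cmucl code: {{{utf8-sequence-length}}} and {{{complete-prefix}}} are hypothetical helpers, the input is assumed to be well-formed UTF-8, and the head index stands in for the ibuf head pointer. The point is that the head must advance by exactly the octets that were converted, leaving the incomplete trailing octets in the buffer.

```lisp
;; Hypothetical sketch, not taken from stream.lisp.
(defun utf8-sequence-length (lead)
  "Length in octets of the UTF-8 sequence that starts with LEAD.
Assumes LEAD is a valid lead octet."
  (cond ((< lead #x80) 1)
        ((= (logand lead #xE0) #xC0) 2)
        ((= (logand lead #xF0) #xE0) 3)
        (t 4)))

(defun complete-prefix (octets head tail)
  "Consume complete characters from OCTETS between HEAD (inclusive)
and TAIL (exclusive).  Returns the new head index and the number of
characters converted; a trailing partial character is left unconsumed."
  (let ((count 0))
    (loop
      (when (>= head tail) (return))
      (let ((len (utf8-sequence-length (aref octets head))))
        ;; Stop if the last character is cut off by the end of the buffer.
        (when (> (+ head len) tail) (return))
        (incf head len)
        (incf count)))
    (values head count)))
```

For an 8-octet buffer holding two full copies of U+1234 (octets E1 88 B4) followed by the first two octets of a third, the sketch returns a new head of 6 and a count of 2. If the head were left where it was before the refill, later position calculations would be off by the converted octets.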
#43: unread-char doesn't change file-position
----------------------+-----------------------------------------------------
  Reporter:  rtoy     |        Owner:
      Type:  defect   |       Status:  reopened
  Priority:  major    |    Milestone:
 Component:  Unicode  |      Version:  20b
Resolution:           |     Keywords:
----------------------+-----------------------------------------------------
Changes (by rtoy):

  * status:  closed => reopened
  * resolution:  fixed =>
Comment:
The fix that was implemented in stream.lisp does fix this issue. However, it causes the Unicode tests in {{{src/i18n/tests}}} to fail: Cmucl ends up reading the whole file into one string, which causes an error.
The change has been reverted, pending a better fix. This ticket is then, of course, reopened.
#43: unread-char doesn't change file-position
----------------------+-----------------------------------------------------
  Reporter:  rtoy     |        Owner:
      Type:  defect   |       Status:  closed
  Priority:  major    |    Milestone:
 Component:  Unicode  |      Version:  20b
Resolution:  fixed    |     Keywords:
----------------------+-----------------------------------------------------
Changes (by rtoy):

  * status:  reopened => closed
  * resolution:  => fixed
Comment:
This has been fixed (again) in a different way. The fix is in fd-stream.lisp in {{{FILE-POSITION}}}, which needs to account for any unprocessed characters as well as any octets that are in the buffer but have not been converted to characters yet.
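The accounting described in this comment can be sketched roughly as follows. This is not the code in fd-stream.lisp: {{{logical-file-position}}} and {{{demo}}} are hypothetical names, and the numbers (a 512-octet read, 3-octet characters) are made up for illustration. The idea is that the position reported to the caller is the number of octets fetched from the file, minus the octets still waiting to be converted, minus the encoded size of every character that has been converted but not yet consumed.

```lisp
;; Hypothetical sketch, not taken from fd-stream.lisp.
(defun logical-file-position (os-offset unconverted-octets pending-char-octets)
  "OS-OFFSET is the number of octets fetched from the file so far.
UNCONVERTED-OCTETS is the number of octets in the input buffer that
have not been decoded yet.  PENDING-CHAR-OCTETS is a list with the
encoded length of each character decoded but not yet consumed."
  (- os-offset
     unconverted-octets
     (reduce #'+ pending-char-octets)))

(defun demo ()
  "Position before read-char, after read-char, and after unread-char."
  (let* ((os-offset 512)                 ; assumed size of the first read
         (unconverted 2)                 ; 512 = 170 * 3 + 2 leftover octets
         (pending (make-list 170 :initial-element 3))
         (p0 (logical-file-position os-offset unconverted pending))
         ;; read-char consumes one pending character
         (after-read (logical-file-position os-offset unconverted (rest pending)))
         ;; unread-char makes that character pending again
         (p0* (logical-file-position os-offset unconverted pending)))
    (list p0 after-read p0*)))
```

Under these assumptions {{{(demo)}}} returns {{{(0 3 0)}}}, so the position after {{{unread-char}}} matches the position before {{{read-char}}}. Leaving out either subtraction would make the two positions disagree.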