Hello,
Attached are fixes for ecl_string_case, which does not handle Unicode correctly. This shows up in two places in the CL run-time.
Currently the READTABLE-CASE :INVERT issue is preventing Parenscript 2.8 from compiling.
-- Vladimir Sedach Software engineering services in Los Angeles https://oneofus.la
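For readers unfamiliar with the READTABLE-CASE :INVERT mode mentioned above, here is a minimal sketch of what the reader is expected to do. The helper name invert-read-name is hypothetical and is not part of ECL or Parenscript. The ASCII results follow from the standard; how non-ASCII letters are cased depends on the implementation's Unicode support, so no result is claimed for the last form.

```lisp
;; Hypothetical helper (not part of ECL or Parenscript): read NAME-STRING as a
;; symbol under READTABLE-CASE :INVERT and return the resulting symbol name.
(defun invert-read-name (name-string)
  (let ((*readtable* (copy-readtable nil))    ; fresh copy of the standard readtable
        (*package* (find-package :cl-user)))
    (setf (readtable-case *readtable*) :invert)
    (symbol-name (read-from-string name-string))))

;; ASCII behaviour required by the standard:
(invert-read-name "foo")   ; => "FOO"  (all lower case: inverted)
(invert-read-name "FOO")   ; => "foo"  (all upper case: inverted)
(invert-read-name "Foo")   ; => "Foo"  (mixed case: left unchanged)

;; On a Unicode-aware reader, the same rule should apply to cased
;; non-ASCII letters such as Cyrillic:
(invert-read-name "абракадабра")
```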
The next problem is that printing of Unicode symbol names does not work correctly. You can see this by trying:
(format nil "~A" (read-from-string "абракадабра"))
I do not understand enough about ECL's printing code yet to figure out which part of the symbol printing code is doing the wrong thing, or what the fix would be. I would appreciate any help.
Attached are two test cases where this issue surfaces; they build on the previous patches.
-- Vladimir Sedach Software engineering services in Los Angeles https://oneofus.la
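One way to turn the printing problem above into a yes/no check is to compare the ~A output of a freshly read symbol with its name. This is only a sketch: prints-as-name-p is a hypothetical helper, not one of the attached test cases, and it assumes standard I/O syntax (readtable case and *print-case* both :upcase), under which the two strings should be identical.

```lisp
;; Hypothetical helper (not one of the attached test cases): true when printing
;; a freshly read symbol with ~A yields exactly its symbol name.
(defun prints-as-name-p (name-string)
  ;; Standard I/O syntax: standard readtable, *print-case* :upcase,
  ;; *package* bound to CL-USER.
  (with-standard-io-syntax
    (let ((sym (read-from-string name-string)))
      (string= (format nil "~A" sym) (symbol-name sym)))))

(prints-as-name-p "foo")            ; => T
;; Expected to be true on a build where Unicode symbol names print correctly:
(prints-as-name-p "абракадабра")
```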
Ok, I figured out a fix (attached).
write_symbol_string was attempting to use some kind of buffering similar to writestr_stream in file.d; I replaced that buffering with calls to ecl_write_char.
This simplified the code, and does not seem to have had a noticeable negative effect on the performance of printing symbols:
--8<---------------cut here---------------start------------->8---
(defvar j (read-from-string "абракадабра"))
J
(time (dotimes (i 1000000) (with-output-to-string (x1) (print j x1))))
--8<---------------cut here---------------end--------------->8---
Before:
real time : 5.349 secs
run time  : 6.827 secs
gc count  : 552 times
consed    : 1055997968 bytes
NIL
After:
real time : 5.278 secs
run time  : 7.756 secs
gc count  : 257 times
consed    : 1039999200 bytes
NIL
-- Vladimir Sedach Software engineering services in Los Angeles https://oneofus.la
Hello Vladimir,
thank you very much for the bug reports and patches. I have incorporated your changes in the develop branch of the git repository. If you have any more bug fixes in the future, it would be preferable to directly open a merge request at https://gitlab.com/embeddable-common-lisp/ecl/ instead of sending patches to the mailing list, since it makes things easier for us.
Best regards, Marius Gerbershagen
On 05.05.20 at 04:46, Vladimir Sedach wrote:
Ok, I figured out a fix (attached).
Marius Gerbershagen marius.gerbershagen@gmail.com writes:
If you have any more bug fixes in the future, it would be preferable to directly open a merge request at https://gitlab.com/embeddable-common-lisp/ecl/ instead of sending patches to the mailing list, since it makes things easier for us.
Ok, I will do that. I was a bit confused about why I was not seeing the changes in develop for a while; then I noticed that you had made the changes by hand. There is no need to do that: you can apply patches with git-am (this is what makes patches on mailing lists faster and easier than using third-party GitHubLabFace etc. web interfaces). That keeps the repository history and other metadata, such as commit messages, intact.
-- Vladimir Sedach Software engineering services in Los Angeles https://oneofus.la
Marius Gerbershagen marius.gerbershagen@gmail.com writes:
thank you very much for the bug reports and patches. I have incorporated your changes in the develop branch of the git repository. If you have any more bug fixes in the future, it would be preferable to directly open a merge request at https://gitlab.com/embeddable-common-lisp/ecl/ instead of sending patches to the mailing list, since it makes things easier for us.
One more question. Is there a reason you decided to keep buffer_write_char in write_symbol.d? This seems like double-buffering to me - shouldn't the stream take care of the buffering? In my simple test there did not seem to be a performance difference, and the code was simpler. Am I missing something?
-- Vladimir Sedach Software engineering services in Los Angeles https://oneofus.la
Not all streams are necessarily buffered; think, for example, of user-defined Gray streams. And even if they implement buffering, it will still be faster to have one generic function dispatch per symbol instead of maybe 10 or 20, one for each character. Since symbols are such a common data type, I think the small optimization is worth it there.
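The dispatch-count argument above can be illustrated with a small sketch. Everything here is a stand-in: counting-sink, sink-write-char and sink-write-string are hypothetical names, not ECL internals or Gray stream functions, and the sink simply counts how many times one of its two generic functions is called.

```lisp
;; Illustrative stand-ins only: COUNTING-SINK, SINK-WRITE-CHAR and
;; SINK-WRITE-STRING are hypothetical, not ECL or Gray stream names.
(defclass counting-sink ()
  ((dispatches :initform 0 :accessor sink-dispatches)
   (out :initform (make-string-output-stream) :reader sink-out)))

(defgeneric sink-write-char (sink char))
(defgeneric sink-write-string (sink string))

(defmethod sink-write-char ((sink counting-sink) char)
  (incf (sink-dispatches sink))          ; count this write call
  (write-char char (sink-out sink)))

(defmethod sink-write-string ((sink counting-sink) string)
  (incf (sink-dispatches sink))          ; count this write call
  (write-string string (sink-out sink)))

;; One write call per character; returns the number of write calls made.
(defun write-name-unbuffered (name sink)
  (loop for ch across name
        do (sink-write-char sink ch))
  (sink-dispatches sink))

;; Collect the characters in a local buffer, then make a single write call.
(defun write-name-buffered (name sink)
  (let ((buffer (make-string (length name))))
    (loop for ch across name
          for i from 0
          do (setf (char buffer i) ch))
    (sink-write-string sink buffer)
    (sink-dispatches sink)))

(write-name-unbuffered "ABRACADABRA" (make-instance 'counting-sink)) ; => 11
(write-name-buffered "ABRACADABRA" (make-instance 'counting-sink))   ; => 1
```

Both functions write the same characters to the sink; only the number of generic function calls differs, which is the cost the local buffer is meant to avoid for user-defined streams.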
On 11.05.20 at 01:06, Vladimir Sedach wrote:
One more question. Is there a reason you decided to keep buffer_write_char in write_symbol.d? This seems like double-buffering to me - shouldn't the stream take care of the buffering? In my simple test there did not seem to be a performance difference, and the code was simpler. Am I missing something?
-- Vladimir Sedach Software engineering services in Los Angeles https://oneofus.la