[parenscript-devel] suggestion for unicode output
Hi, do you think it would be a good idea to output Unicode characters as UTF-8 bytes rather than as code point escapes like \u53D7? That would be more readable for output that is full of Unicode characters.
UTF-8 is by no means more readable than the other encoding schemata if your code points are beyond ISO-8859-1. You can't generally even say how long your string is by just looking at the UTF-8 output!!

On 12/31/2012 04:07 AM, Canhua wrote:
> hi,
> Do you think it good idea to output unicode as utf-8 bytes rather than unicode code number like \u53D7 ? That would be more readable for output that is full of unicode characters.
_______________________________________________ parenscript-devel mailing list parenscript-devel@common-lisp.net http://lists.common-lisp.net/cgi-bin/mailman/listinfo/parenscript-devel
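To make the point about string length concrete, here is a minimal JavaScript sketch comparing three different "lengths" of the same string. The helper name `describeLengths` is hypothetical and not part of Parenscript; it assumes a runtime where `TextEncoder` is available as a global (modern browsers and recent Node.js).

```javascript
// Hypothetical helper (not from Parenscript): report three "lengths" of a string.
function describeLengths(s) {
  return {
    codeUnits: s.length,                           // UTF-16 code units, what .length counts
    codePoints: Array.from(s).length,              // Unicode code points
    utf8Bytes: new TextEncoder().encode(s).length, // bytes once encoded as UTF-8
  };
}

// The character from the original question, written as an escape.
console.log(describeLengths("\u53D7"));
// The Cyrillic string used later in the thread.
console.log(describeLengths("фоо бар"));
```

For "\u53D7" this reports 1 code unit, 1 code point, and 3 UTF-8 bytes; for "фоо бар" it reports 7 code units, 7 code points, and 13 UTF-8 bytes. The byte count alone does not tell you how many characters the string holds.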
I never noticed that Parenscript was escaping those characters in string literals. For the record, what happens is:

PS> (ps (фоо бар "фоо бар"))
"фоо(бар, '\\u0444\\u043E\\u043E \\u0431\\u0430\\u0440');"
PS>

I agree it's better to not escape non-ASCII, non-control characters in strings. Encoding itself is something that is up to the external format of the stream you are outputting Parenscript code to.

I will make a patch to fix this soon.

Happy hacking,
Vladimir

On Mon, Dec 31, 2012 at 6:06 AM, Nitralime <nitralime@googlemail.com> wrote:
> UTF-8 is by no means more readable than the other encoding schemata if your code points are beyond ISO-8859-1. You can't generally even say how long your string is by just looking at the UTF-8 output!!
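As an illustration of the behaviour discussed above, the following JavaScript sketch shows that the escaped and unescaped forms denote the same string, and outlines one possible escaping policy that leaves printable non-ASCII characters alone. The function `toJsStringLiteral` is hypothetical and is not Parenscript's actual printer. The choice to also escape U+2028 and U+2029 is an assumption of this sketch, made because those two characters were treated as line terminators inside string literals before ES2019.

```javascript
// Hypothetical sketch, not Parenscript's implementation:
// build a single-quoted JavaScript string literal for s.
function toJsStringLiteral(s) {
  let out = "'";
  for (const ch of s) {
    const cp = ch.codePointAt(0);
    if (ch === "'" || ch === "\\") {
      // Quote and backslash need a backslash escape.
      out += "\\" + ch;
    } else if (cp < 0x20 || cp === 0x7f || cp === 0x2028 || cp === 0x2029) {
      // Control characters and the two Unicode line separators become \uXXXX.
      out += "\\u" + cp.toString(16).toUpperCase().padStart(4, "0");
    } else {
      // Everything else, including printable non-ASCII, is emitted as-is.
      out += ch;
    }
  }
  return out + "'";
}

// Both spellings denote the same string value.
console.log("\u0444\u043E\u043E \u0431\u0430\u0440" === "фоо бар");
console.log(toJsStringLiteral("фоо бар"));
console.log(toJsStringLiteral("a\nb"));
```

Under this policy the Cyrillic text stays readable in the generated source, while the newline in "a\nb" is still written as \u000A. How the unescaped characters end up as bytes then depends only on the encoding of the output stream.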
On 31 Dec 2012, at 20:42, Vladimir Sedach wrote:
> I never noticed that Parenscript was escaping those characters in string literals. For the record, what happens is:
>
> PS> (ps (фоо бар "фоо бар"))
> "фоо(бар, '\\u0444\\u043E\\u043E \\u0431\\u0430\\u0440');"
> PS>
>
> I agree it's better to not escape non-ASCII, non-control characters in strings. Encoding itself is something that is up to the external format of the stream you are outputting Parenscript code to.
>
> I will make a patch to fix this soon.
Are you, however, aware of this pitfall: https://medium.com/joys-of-javascript/42a28471221d ?
 -- B. Smilga
participants (4)
- Boris Smilga
- Canhua
- Nitralime
- Vladimir Sedach