Hi,
Do you think it would be a good idea to output Unicode characters as UTF-8 bytes rather than as Unicode escapes like \u53D7? That would be more readable for output that is full of Unicode characters.
UTF-8 is by no means more readable than the other encoding schemes if your code points are beyond ISO-8859-1. In general, you can't even tell how long your string is just by looking at the UTF-8 output!
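To make the length point concrete, here is a small JavaScript sketch. The helper name utf8ByteLength is made up for illustration, and it assumes a runtime that provides the standard TextEncoder global (recent Node.js or a browser). It only shows that the escaped and unescaped spellings denote the same string, and that the UTF-8 byte count differs from the character count.

```javascript
// Hypothetical helper (not from Parenscript): number of bytes the string
// occupies once encoded as UTF-8.
function utf8ByteLength(text) {
  return new TextEncoder().encode(text).length;
}

// The escape sequence and the literal character are the same string value.
console.log("\u53D7" === "受");           // true

// Seven characters, but thirteen UTF-8 bytes: each Cyrillic letter takes
// two bytes, the space takes one.
const sample = "фоо бар";
console.log(sample.length);               // 7
console.log(utf8ByteLength(sample));      // 13
console.log(utf8ByteLength("受"));        // 3
```

Whether the raw bytes or the \uXXXX form is easier to read depends mostly on whether the viewer decodes the output as UTF-8.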
On 12/31/2012 04:07 AM, Canhua wrote:
Hi,
Do you think it would be a good idea to output Unicode characters as UTF-8 bytes rather than as Unicode escapes like \u53D7? That would be more readable for output that is full of Unicode characters.
parenscript-devel mailing list parenscript-devel@common-lisp.net http://lists.common-lisp.net/cgi-bin/mailman/listinfo/parenscript-devel
I never noticed that Parenscript was escaping those characters in string literals. For the record, what happens is:
PS> (ps (фоо бар "фоо бар"))
"фоо(бар, '\u0444\u043E\u043E \u0431\u0430\u0440');"
PS>
I agree it's better to not escape non-ASCII, non-control characters in strings. Encoding itself is something that is up to the external format of the stream you are outputting Parenscript code to.
I will make a patch to fix this soon.
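As a rough illustration of the policy described above (escape only what has to be escaped, pass other non-ASCII characters through), here is a JavaScript sketch. The function escapeStringLiteral is hypothetical and is not Parenscript's actual printer. It additionally escapes U+2028 and U+2029, which is an assumption on my part: those two are line terminators in JavaScript source and were not allowed unescaped inside string literals before ES2019.

```javascript
// Hypothetical sketch, not Parenscript's implementation: build a
// single-quoted JavaScript string literal from a string value.
function escapeStringLiteral(text) {
  let out = "'";
  for (const ch of text) {            // iterates by code point
    const code = ch.codePointAt(0);
    if (ch === "'" || ch === "\\") {
      // Characters that would end or corrupt the literal.
      out += "\\" + ch;
    } else if (code < 0x20 || code === 0x7f ||
               code === 0x2028 || code === 0x2029) {
      // Control characters and the two Unicode line terminators
      // become \uXXXX escapes.
      out += "\\u" + code.toString(16).toUpperCase().padStart(4, "0");
    } else {
      // Everything else, including non-ASCII text, is emitted as-is;
      // the output stream's external format decides the bytes.
      out += ch;
    }
  }
  return out + "'";
}

console.log(escapeStringLiteral("фоо бар"));  // 'фоо бар'
console.log(escapeStringLiteral("a\nb"));     // 'a\u000Ab'
```

With a policy like this the Cyrillic example stays readable in the generated code, while characters that can break a literal are still escaped.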
Happy hacking, Vladimir
On Mon, Dec 31, 2012 at 6:06 AM, Nitralime nitralime@googlemail.com wrote:
UTF-8 is by no means more readable than the other encoding schemes if your code points are beyond ISO-8859-1. In general, you can't even tell how long your string is just by looking at the UTF-8 output!
On 31 Dec 2012, at 20:42, Vladimir Sedach wrote:
I never noticed that Parenscript was escaping those characters in string literals. For the record, what happens is:
PS> (ps (фоо бар "фоо бар"))
"фоо(бар, '\u0444\u043E\u043E \u0431\u0430\u0440');"
PS>
I agree it's better to not escape non-ASCII, non-control characters in strings. Encoding itself is something that is up to the external format of the stream you are outputting Parenscript code to.
I will make a patch to fix this soon.
Are you, however, aware of this pitfall: https://medium.com/joys-of-javascript/42a28471221d ?
— B. Smilga