[parenscript-devel] suggestion for unicode output - parenscript-devel - mailman3.common-lisp.net

[parenscript-devel] suggestion for unicode output

Canhua

31 Dec 2012

3:07 a.m.

Hi, do you think it would be a good idea to output Unicode characters as UTF-8 bytes rather than as code-point escapes like \u53D7? That would be more readable for output that is full of Unicode characters.
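To make the two spellings in the question concrete: in JavaScript source, the escape \u53D7 and the literal character it stands for denote the same string, so the choice only affects how the generated file reads. A minimal sketch (the variable names are illustrative and not taken from Parenscript):

```javascript
// Both literals denote the same one-character string, U+53D7.
const escaped = '\u53D7';
const literal = '受';

// Strict equality compares the string contents, not the source spelling.
const sameString = escaped === literal;

// Recover the code point as four uppercase hex digits.
const codePoint = literal.codePointAt(0).toString(16).toUpperCase();

console.log(sameString, codePoint);
```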

Nitralime

31 Dec 2012

2:06 p.m.

UTF-8 is by no means more readable than the other encoding schemata if your code points are beyond ISO-8859-1. You can't generally even say how long your string is by just looking at the UTF-8 output!!

On 12/31/2012 04:07 AM, Canhua wrote:

hi,

Do you think it good idea to output unicode as utf-8 bytes rather than unicode code number like \u53D7 ? That would be more readable for output that is full of unicode characters.

_______________________________________________ parenscript-devel mailing list parenscript-devel@common-lisp.net http://lists.common-lisp.net/cgi-bin/mailman/listinfo/parenscript-devel
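The point about length can be illustrated with a short sketch. It assumes a runtime that provides the standard `TextEncoder` global (current browsers and Node.js do); the sample string is borrowed from later in this thread and the variable names are illustrative only.

```javascript
// Sample text outside ISO-8859-1: six Cyrillic letters and one space.
const text = 'фоо бар';

// String.prototype.length counts UTF-16 code units.
const codeUnits = text.length;

// TextEncoder always produces UTF-8, so this is the serialized byte count.
const utf8Bytes = new TextEncoder().encode(text).length;

console.log({ codeUnits, utf8Bytes });
```

Each Cyrillic letter here takes two bytes in UTF-8 while the space takes one, so the byte count and the character count diverge.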

Vladimir Sedach

4:42 p.m.

I never noticed that Parenscript was escaping those characters in string literals. For the record, what happens is:

PS> (ps (фоо бар "фоо бар"))
"фоо(бар, '\\u0444\\u043E\\u043E \\u0431\\u0430\\u0440');"
PS>

I agree it's better to not escape non-ASCII, non-control characters in strings. Encoding itself is something that is up to the external format of the stream you are outputting Parenscript code to.

I will make a patch to fix this soon.

Happy hacking,
Vladimir

On Mon, Dec 31, 2012 at 6:06 AM, Nitralime <nitralime@googlemail.com> wrote:

UTF-8 is by no means more readable than the other encoding schemata if your code points are beyond ISO-8859-1. You can't generally even say how long your string is by just looking at the UTF-8 output!!

On 12/31/2012 04:07 AM, Canhua wrote:

hi,

Do you think it good idea to output unicode as utf-8 bytes rather than unicode code number like \u53D7 ? That would be more readable for output that is full of unicode characters.

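The policy described in this message (leave non-ASCII, non-control characters alone and escape only what must be escaped) can be sketched as follows. This is not Parenscript's printer, which is written in Common Lisp; `quoteJsString` is a hypothetical helper written in JavaScript purely to illustrate the rule, and treating code points below U+0020 plus U+007F as "control characters" is an assumption.

```javascript
// Illustrative only: NOT Parenscript's implementation.
// Produces a single-quoted JavaScript string literal for the given text.
function quoteJsString(s) {
  let out = "'";
  for (const ch of s) {
    const cp = ch.codePointAt(0);
    if (ch === "'" || ch === '\\') {
      // The delimiter and the escape character must always be escaped.
      out += '\\' + ch;
    } else if (cp < 0x20 || cp === 0x7f) {
      // Assumed definition of "control character": C0 controls and DEL.
      out += '\\u' + cp.toString(16).toUpperCase().padStart(4, '0');
    } else {
      // Everything else, including non-ASCII text, passes through unchanged.
      out += ch;
    }
  }
  return out + "'";
}

console.log(quoteJsString('фоо бар'));
console.log(quoteJsString('a\nb'));
```

With this rule the Cyrillic argument from the example above would be printed as written, and how it ends up as bytes is left to whatever encoding the output stream uses.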

Boris Smilga

5:04 p.m.

On 31 Dec 2012, at 20:42, Vladimir Sedach wrote:

I never noticed that Parenscript was escaping those characters in string literals. For the record, what happens is:

PS> (ps (фоо бар "фоо бар"))
"фоо(бар, '\\u0444\\u043E\\u043E \\u0431\\u0430\\u0440');"
PS>

I agree it's better to not escape non-ASCII, non-control characters in strings. Encoding itself is something that is up to the external format of the stream you are outputting Parenscript code to.

I will make a patch to fix this soon.

Are you, however, aware of this pitfall: https://medium.com/joys-of-javascript/42a28471221d ?

-- B. Smilga
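The linked article is not quoted in this thread, so the following is only an assumption about the kind of pitfall meant. One well-known case is U+2028 (LINE SEPARATOR) and U+2029 (PARAGRAPH SEPARATOR): they are not control characters, yet before ECMAScript 2019 they counted as line terminators, so a raw one inside a string literal was a syntax error. A printer that stops escaping non-ASCII text may therefore still want to escape these two. `escapeLineSeparators` is a hypothetical helper name used only for this sketch.

```javascript
// Assumption: the pitfall in question is U+2028 / U+2029 in string literals.
// Replaces each raw separator with its \uXXXX escape and leaves all else alone.
function escapeLineSeparators(source) {
  return source.replace(/[\u2028\u2029]/g, (ch) =>
    '\\u' + ch.codePointAt(0).toString(16).toUpperCase());
}

console.log(escapeLineSeparators('фоо\u2028бар'));
```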
