Revision: 3603
Author: edi
URL: http://bknr.net/trac/changeset/3603
Update cl-unicode to release version
U trunk/thirdparty/cl-unicode/CHANGELOG.txt
U trunk/thirdparty/cl-unicode/doc/index.html
U trunk/thirdparty/cl-unicode/test/simple
U trunk/thirdparty/cl-unicode/util.lisp
Modified: trunk/thirdparty/cl-unicode/CHANGELOG.txt
===================================================================
--- trunk/thirdparty/cl-unicode/CHANGELOG.txt 2008-07-23 23:01:07 UTC (rev 3602)
+++ trunk/thirdparty/cl-unicode/CHANGELOG.txt 2008-07-23 23:02:44 UTC (rev 3603)
@@ -1,3 +1,3 @@
 Version 0.1.0
-2008-07-23
+2008-07-24
 Initial release
Modified: trunk/thirdparty/cl-unicode/doc/index.html
===================================================================
--- trunk/thirdparty/cl-unicode/doc/index.html 2008-07-23 23:01:07 UTC (rev 3602)
+++ trunk/thirdparty/cl-unicode/doc/index.html 2008-07-23 23:02:44 UTC (rev 3603)
@@ -867,8 +867,8 @@
 look-ups by removing all whitespace, hyphens, and underline
 characters.
 <p>
-Tries not to remove hyphens preceded by spaces if this could lead to
-ambiguities as described in
+Tries not to remove hyphens preceded by spaces or underlines if this
+could lead to ambiguities as described in
 <a href="http://unicode.org/unicode/reports/tr18/#Name_Properties">http://unicode.org/unicode/reports/tr18/#Name_Properties</a>.
 <p>
 All CL-UNICODE functions which accept string <em>names</em> for characters
@@ -895,8 +895,11 @@

 CL-USER 7 > (canonicalize-name (canonicalize-name "TIBETAN LETTER -A"))
 "TIBETANLETTER -A"
+
+CL-USER 8 > (canonicalize-name "Tibetan_Letter_-A")
+"TibetanLetter -A"
 </pre>
-Note that the preceding space is relevant in the ambiguous cases (but
+Note that the preceding chracter is relevant in the ambiguous cases (but
 there are only three of them):
 <pre>
 CL-USER 8 > (char= (<a href="#character-named" class=none>character-named</a> "TibetanLetter A") (<a href="#character-named" class=none>character-named</a> "TibetanLetter -A"))
@@ -1160,6 +1163,9 @@
 set <a href="#*try-lisp-syntax-p*"><code>*TRY-LISP-SYNTAX-P*</code></a> to a true value when
 enabling the alternative syntax, so that you can still use the short
 syntax (like <code>#\a</code>) for characters.)
+<p>
+For an alternative syntax for <em>strings</em>
+see <a href="http://weitz.de/cl-interpol/">CL-INTERPOL</a>.
 </blockquote>

 <!-- End of entry for ENABLE-ALTERNATIVE-CHARACTER-SYNTAX -->
@@ -1264,7 +1270,7 @@
 This documentation was prepared with <a href="http://weitz.de/documentation-template/">DOCUMENTATION-TEMPLATE</a>.
 </p>
 <p>
-$Header: /usr/local/cvsrep/cl-unicode/doc/index.html,v 1.10 2008/07/23 02:22:20 edi Exp $
+$Header: /usr/local/cvsrep/cl-unicode/doc/index.html,v 1.12 2008/07/23 14:55:26 edi Exp $
 <p><a href="http://weitz.de/index.html">BACK TO MY HOMEPAGE</a>

 </body>
Modified: trunk/thirdparty/cl-unicode/test/simple
===================================================================
--- trunk/thirdparty/cl-unicode/test/simple 2008-07-23 23:01:07 UTC (rev 3602)
+++ trunk/thirdparty/cl-unicode/test/simple 2008-07-23 23:02:44 UTC (rev 3603)
@@ -1,5 +1,5 @@
 ;;; -*- Mode: LISP; Syntax: COMMON-LISP; Package: CL-UNICODE-TEST; Base: 10 -*-
-;;; $Header: /usr/local/cvsrep/cl-unicode/test/simple,v 1.13 2008/07/21 23:12:56 edi Exp $
+;;; $Header: /usr/local/cvsrep/cl-unicode/test/simple,v 1.14 2008/07/23 14:11:42 edi Exp $

 ;;; some simple tests for CL-UNICODE - entered manually and to be read
 ;;; in the CL-UNICODE-TEST package; all forms are expected to return a
@@ -390,8 +390,10 @@

 ;; ambiguous names (see NORMALIZE-NAME)
 (= #xf68 (character-named "TIBETAN LETTER A" :want-code-point-p t))
+(= #xf68 (character-named "Tibetan_Letter_A" :want-code-point-p t))
 (= #xf68 (character-named "TIBETANLETTERA" :want-code-point-p t))
 (= #xf60 (character-named "TIBETAN LETTER -A" :want-code-point-p t))
+(= #xf60 (character-named "Tibetan_Letter_-A" :want-code-point-p t))
 (= #xfb8 (character-named "TIBETAN SUBJOINED LETTER A" :want-code-point-p t))
 (= #xfb8 (character-named "TIBETANSUBJOINEDLETTERA" :want-code-point-p t))
 (= #xfb0 (character-named "TIBETAN SUBJOINED LETTER -A" :want-code-point-p t))
Modified: trunk/thirdparty/cl-unicode/util.lisp
===================================================================
--- trunk/thirdparty/cl-unicode/util.lisp 2008-07-23 23:01:07 UTC (rev 3602)
+++ trunk/thirdparty/cl-unicode/util.lisp 2008-07-23 23:02:44 UTC (rev 3603)
@@ -1,5 +1,5 @@
 ;;; -*- Mode: LISP; Syntax: COMMON-LISP; Package: CL-UNICODE; Base: 10 -*-
-;;; $Header: /usr/local/cvsrep/cl-unicode/util.lisp,v 1.26 2008/07/22 12:20:14 edi Exp $
+;;; $Header: /usr/local/cvsrep/cl-unicode/util.lisp,v 1.27 2008/07/23 14:11:40 edi Exp $

 ;;; Copyright (c) 2008, Dr. Edmund Weitz. All rights reserved.

@@ -45,10 +45,11 @@
 All CL-UNICODE functions which accept string "names" for characters
 or properties will canonicalize the name first using this function
 and will then look up the name case-insensitively."
-  (values (ppcre:regex-replace-all "( -A| O-E)$|[-_\s]" name
+  (values (ppcre:regex-replace-all "[ _](-A|O-E)$|[-_\s]" name
                                    (lambda (match register)
                                      (declare (ignore match))
-                                     (or register ""))
+                                     (cond (register (format nil " ~A" register))
+                                           (t "")))
                                    :simple-calls t)))

 (defun property-symbol (name)
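
For readers who want to see the effect of the util.lisp change without loading CL-UNICODE, here is a minimal sketch that ports both regular expressions to Python's re module. It is an illustration only, not part of the changeset: the function names canonicalize_old and canonicalize_new are hypothetical, and the port assumes the two patterns behave the same way in Python as they do in CL-PPCRE. The point is that the old pattern protected a trailing "-A" or "O-E" only when a space preceded it, while the new one also accepts an underline and always emits a space before the protected suffix.

```python
import re

# Pattern before rev 3603: only a preceding space protects "-A" / "O-E".
OLD_PATTERN = re.compile(r"( -A| O-E)$|[-_\s]")
# Pattern after rev 3603: a preceding space or underline protects them.
NEW_PATTERN = re.compile(r"[ _](-A|O-E)$|[-_\s]")


def canonicalize_old(name: str) -> str:
    """Hypothetical port of the old replacement: keep the register as matched."""
    def repl(match: re.Match) -> str:
        return match.group(1) or ""
    return OLD_PATTERN.sub(repl, name)


def canonicalize_new(name: str) -> str:
    """Hypothetical port of the new replacement: re-emit the register after a space."""
    def repl(match: re.Match) -> str:
        register = match.group(1)
        return " " + register if register else ""
    return NEW_PATTERN.sub(repl, name)


if __name__ == "__main__":
    for name in ("TIBETAN LETTER -A", "Tibetan_Letter_-A", "Tibetan_Letter_A"):
        print(repr(name), "old:", repr(canonicalize_old(name)),
              "new:", repr(canonicalize_new(name)))
```

With the old pattern, "Tibetan_Letter_-A" collapses to "TibetanLetterA", the same string that "Tibetan_Letter_A" produces, so the two names become indistinguishable. With the new pattern it becomes "TibetanLetter -A", matching the example added to doc/index.html and the new test in test/simple.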