Raymond Toy pushed to branch issue-150-add-aliases-cp949-euckr at cmucl / cmucl
Commits:
cf5111ea by Raymond Toy at 2022-11-01T17:26:19-07:00
Fix #150: Add aliases for cp949 and euckr
Add alias for cp949 which is the same as euc-kr, which we already
support. As a convenience make euckr be an alias of euc-kr too.
Add a simple test for this that just verifies that the
`stream::find-external-format` doesn't fail for these formats.
- - - - -
2 changed files:
- src/pcl/simple-streams/external-formats/aliases
- tests/issues.lisp
Changes:
=====================================
src/pcl/simple-streams/external-formats/aliases
=====================================
@@ -223,6 +223,8 @@ windows-cp1252 cp1252
windows-latin1 cp1252
ms-ansi cp1252
+euckr euc-kr
+cp949 euc-kr
;; These are not yet implemented
;;iso-2022-jp iso2022-jp
;;iso2022jp iso2022-jp
=====================================
tests/issues.lisp
=====================================
@@ -745,3 +745,10 @@
(assert-equal (map 'list #'char-name string)
(map 'list #'char-name (read-line s))))))
+
+(define-test issue.150
+ (:tag :issues)
+ (let ((ext:*gc-verbose* nil)
+ (*compile-print* nil))
+ (assert-true (stream::find-external-format :euckr))
+ (assert-true (stream::find-external-format :cp949))))
View it on GitLab: https://gitlab.common-lisp.net/cmucl/cmucl/-/commit/cf5111ea89421f8be2bd35a…
You're receiving this email because of your account on gitlab.common-lisp.net.
Raymond Toy pushed to branch issue-135-unix-namestring-dot at cmucl / cmucl
Commits:
e8a0cc6c by Raymond Toy at 2022-10-30T15:03:27+00:00
Fix #147: Add method for stream-line-column
- - - - -
0dad5a1a by Raymond Toy at 2022-10-30T15:03:28+00:00
Merge branch 'issue-147-stream-line-column-impl' into 'master'
Fix #147: Add method for stream-line-column
Closes #147
See merge request cmucl/cmucl!104
- - - - -
1300830b by Raymond Toy at 2022-10-31T17:12:48+00:00
Address #139: *default-external-format* is :utf-8
- - - - -
649a4f1e by Raymond Toy at 2022-10-31T17:12:49+00:00
Merge branch 'issue-139-default-external-format-utf8' into 'master'
Address #139: *default-external-format* is :utf-8
See merge request cmucl/cmucl!103
- - - - -
88f6852f by Raymond Toy at 2022-11-01T12:04:55-07:00
Change :iso-8859-1 to :iso8859-1 in find-encoding
While there's an alias for `:iso-8859-1`, it's safer to use
`:iso8859-1` which is builtin. Using `:iso-8859-1` requires the alias
database to be loaded, which isn't (currently) guaranteed when
`find-encoding` is called. Thus use the builtin name instead.
Besides, `:iso8859-1` is used in other places in "intl.lisp".
(This is hard to test, but I noticed it when running
```
LANG=ko_KR.utf8 lisp
```
on the branch `issue-139-add-alias-local-external-format`.)
- - - - -
d5f1aa5e by Raymond Toy at 2022-11-01T20:35:49+00:00
Update release-21e.md with closed issues.
- - - - -
76f7ea5d by Raymond Toy at 2022-11-01T14:35:41-07:00
Merge branch 'master' into issue-135-unix-namestring-dot
- - - - -
7 changed files:
- src/code/extfmts.lisp
- src/code/intl.lisp
- src/general-info/release-21e.md
- src/pcl/gray-streams.lisp
- + tests/.gitignore
- tests/issues.lisp
- + tests/utf8.txt
Changes:
=====================================
src/code/extfmts.lisp
=====================================
@@ -22,7 +22,7 @@
describe-external-format))
(defvar *default-external-format*
- :iso8859-1
+ :utf-8
"The default external format to use if no other external format is
specified")
=====================================
src/code/intl.lisp
=====================================
@@ -105,7 +105,7 @@
(defun find-encoding (domain)
(when (null (domain-entry-encoding domain))
- (setf (domain-entry-encoding domain) :iso-8859-1)
+ (setf (domain-entry-encoding domain) :iso8859-1)
;; Domain lookup can call the compiler, so set the locale to "C"
;; so things work.
(let* ((*locale* "C")
=====================================
src/general-info/release-21e.md
=====================================
@@ -22,6 +22,7 @@ public domain.
* Feature enhancements
* Changes
* Update to ASDF 3.3.6
+ * The default external format is `:utf-8` instead of `:iso8859-1`
* ANSI compliance fixes:
* Bug fixes:
* ~~#97~~ Fixes stepping through the source forms in the debugger. This has been broken for quite some time, but it works now.
@@ -50,13 +51,17 @@ public domain.
* ~~#113~~ REQUIRE on contribs can pull in the wrong things via ASDF..
* ~~#121~~ Wrong column index in FILL-POINTER-OUTPUT-STREAM
* ~~#122~~ gcc 11 can't build cmucl
+ * ~~#124~~ directory with `:wild-inferiors` doesn't descend subdirectories
* ~~#125~~ Linux `unix-stat` returning incorrect values
* ~~#127~~ Linux unix-getpwuid segfaults when given non-existent uid..
* ~~#128~~ `QUIT` accepts an exit code
+ * ~~#130~~ Move file-author to C
* ~~#132~~ Ansi test `RENAME-FILE.1` no fails
* ~~#134~~ Handle the case of `(expt complex complex-rational)`
* ~~#136~~ `ensure-directories-exist` should return the given pathspec
+ * #139 `*default-external-format*` defaults to `:utf-8`
* ~~#142~~ `(random 0)` signals incorrect error
+ * ~~#147~~ `stream-line-column` method missing for `fundamental-character-output-stream`
* Other changes:
* Improvements to the PCL implementation of CLOS:
* Changes to building procedure:
=====================================
src/pcl/gray-streams.lisp
=====================================
@@ -235,6 +235,9 @@
defined for this function, although it is permissible for it to
always return NIL."))
+(defmethod stream-line-column ((stream fundamental-character-output-stream))
+ nil)
+
;;; Stream-line-length is a CMUCL extension to Gray streams.
(defgeneric stream-line-length (stream)
(:documentation _N"Return the stream line length or Nil."))
=====================================
tests/.gitignore
=====================================
@@ -0,0 +1 @@
+/out-utf8.txt
=====================================
tests/issues.lisp
=====================================
@@ -5,6 +5,12 @@
(in-package "ISSUES-TESTS")
+(defparameter *test-path*
+ (merge-pathnames (make-pathname :name :unspecific :type :unspecific
+ :version :unspecific)
+ *load-truename*)
+ "Path to where this file is.")
+
(defun square (x)
(expt x 2))
@@ -676,7 +682,21 @@
;; work and not return NIL.
(assert-true (file-author "."))
(assert-true (file-author "bin/build.sh"))
- (assert-true (file-author "tests/안녕하십니까.txt")))
+ (let ((unix::*filename-encoding* :utf-8))
+ ;; Set filename encoding to utf-8 so that we can encode the
+ ;; filename properly.
+ (assert-true
+ (file-author
+ (merge-pathnames
+ (concatenate 'string
+ ;; Write the test file name this way so
+ ;; that it's independent of the encoding
+ ;; used to load this file. The name is
+ ;; "안녕하십니까".
+ '(#\Hangul_Syllable_An #\Hangul_Syllable_Nyeong #\Hangul_Syllable_Ha
+ #\Hangul_Syllable_Sib #\Hangul_Syllable_Ni #\Hangul_Syllable_Gga)
+ ".txt")
+ *test-path*)))))
(define-test issue.135
(:tag :issues)
@@ -704,3 +724,51 @@
:type "lisp")
(pathname bar))))
(assert-true (delete-file "foo.txt"))))
+
+(define-test issue.139-default-external-format
+ (:tag :issues)
+ (assert-eq :utf-8 stream:*default-external-format*))
+
+(define-test issue.139-default-external-format-read-file
+ (:tag :issues)
+ (let ((string (concatenate 'string
+ ;; This is "hello" in Korean
+ '(#\Hangul_syllable_an
+ #\Hangul_Syllable_Nyeong
+ #\Hangul_Syllable_Ha
+ #\Hangul_Syllable_Se
+ #\Hangul_Syllable_Yo))))
+ ;; Test that opening a file for reading uses the default :utf8
+ ;; encoding.
+ (with-open-file (s (merge-pathnames "utf8.txt"
+ *test-path*)
+ :direction :input)
+ ;; The first line should be "hello" in Hangul.
+ (assert-equal (map 'list #'char-name string)
+ (map 'list #'char-name (read-line s))))))
+
+(define-test issue.139-default-external-format-write-file
+ (:tag :issues)
+ ;; Test that opening a file for writing uses the default :utf8.
+ ;; First write something out to the file. Then read it back in
+ ;; using an explicit format of utf8 and verifying that we got the
+ ;; right contents.
+ (let ((string (concatenate 'string
+ ;; This is "hello" in Korean
+ '(#\Hangul_syllable_an
+ #\Hangul_Syllable_Nyeong
+ #\Hangul_Syllable_Ha
+ #\Hangul_Syllable_Se
+ #\Hangul_Syllable_Yo))))
+ (with-open-file (s (merge-pathnames "out-utf8.txt"
+ *test-path*)
+ :direction :output
+ :if-exists :supersede)
+ (write-line string s))
+ (with-open-file (s (merge-pathnames "out-utf8.txt"
+ *test-path*)
+ :direction :input
+ :external-format :utf-8)
+ (assert-equal (map 'list #'char-name string)
+ (map 'list #'char-name (read-line s))))))
+
=====================================
tests/utf8.txt
=====================================
@@ -0,0 +1,2 @@
+안녕하세요
+UTF8 test. The above line is "Hello" in Hangul.
View it on GitLab: https://gitlab.common-lisp.net/cmucl/cmucl/-/compare/ef2a1fc306f966f78c8663…
Raymond Toy pushed to branch issue-140-stream-element-type-two-way-stream at cmucl / cmucl
Commits:
e8a0cc6c by Raymond Toy at 2022-10-30T15:03:27+00:00
Fix #147: Add method for stream-line-column
- - - - -
0dad5a1a by Raymond Toy at 2022-10-30T15:03:28+00:00
Merge branch 'issue-147-stream-line-column-impl' into 'master'
Fix #147: Add method for stream-line-column
Closes #147
See merge request cmucl/cmucl!104
- - - - -
1300830b by Raymond Toy at 2022-10-31T17:12:48+00:00
Address #139: *default-external-format* is :utf-8
- - - - -
649a4f1e by Raymond Toy at 2022-10-31T17:12:49+00:00
Merge branch 'issue-139-default-external-format-utf8' into 'master'
Address #139: *default-external-format* is :utf-8
See merge request cmucl/cmucl!103
- - - - -
88f6852f by Raymond Toy at 2022-11-01T12:04:55-07:00
Change :iso-8859-1 to :iso8859-1 in find-encoding
While there's an alias for `:iso-8859-1`, it's safer to use
`:iso8859-1` which is builtin. Using `:iso-8859-1` requires the alias
database to be loaded, which isn't (currently) guaranteed when
`find-encoding` is called. Thus use the builtin name instead.
Besides, `:iso8859-1` is used in other places in "intl.lisp".
(This is hard to test, but I noticed it when running
```
LANG=ko_KR.utf8 lisp
```
on the branch `issue-139-add-alias-local-external-format`.)
- - - - -
d5f1aa5e by Raymond Toy at 2022-11-01T20:35:49+00:00
Update release-21e.md with closed issues.
- - - - -
4e75e96f by Raymond Toy at 2022-11-01T14:33:54-07:00
Merge branch 'master' into issue-140-stream-element-type-two-way-stream
- - - - -
7 changed files:
- src/code/extfmts.lisp
- src/code/intl.lisp
- src/general-info/release-21e.md
- src/pcl/gray-streams.lisp
- + tests/.gitignore
- tests/issues.lisp
- + tests/utf8.txt
Changes:
=====================================
src/code/extfmts.lisp
=====================================
@@ -22,7 +22,7 @@
describe-external-format))
(defvar *default-external-format*
- :iso8859-1
+ :utf-8
"The default external format to use if no other external format is
specified")
=====================================
src/code/intl.lisp
=====================================
@@ -105,7 +105,7 @@
(defun find-encoding (domain)
(when (null (domain-entry-encoding domain))
- (setf (domain-entry-encoding domain) :iso-8859-1)
+ (setf (domain-entry-encoding domain) :iso8859-1)
;; Domain lookup can call the compiler, so set the locale to "C"
;; so things work.
(let* ((*locale* "C")
=====================================
src/general-info/release-21e.md
=====================================
@@ -22,6 +22,7 @@ public domain.
* Feature enhancements
* Changes
* Update to ASDF 3.3.6
+ * The default external format is `:utf-8` instead of `:iso8859-1`
* ANSI compliance fixes:
* Bug fixes:
* ~~#97~~ Fixes stepping through the source forms in the debugger. This has been broken for quite some time, but it works now.
@@ -50,14 +51,18 @@ public domain.
* ~~#113~~ REQUIRE on contribs can pull in the wrong things via ASDF..
* ~~#121~~ Wrong column index in FILL-POINTER-OUTPUT-STREAM
* ~~#122~~ gcc 11 can't build cmucl
+ * ~~#124~~ directory with `:wild-inferiors` doesn't descend subdirectories
* ~~#125~~ Linux `unix-stat` returning incorrect values
* ~~#127~~ Linux unix-getpwuid segfaults when given non-existent uid..
* ~~#128~~ `QUIT` accepts an exit code
+ * ~~#130~~ Move file-author to C
* ~~#132~~ Ansi test `RENAME-FILE.1` no fails
* ~~#134~~ Handle the case of `(expt complex complex-rational)`
* ~~#136~~ `ensure-directories-exist` should return the given pathspec
+ * #139 `*default-external-format*` defaults to `:utf-8`
* ~~#140~~ External format of `two-way-stream`
* ~~#142~~ `(random 0)` signals incorrect error
+ * ~~#147~~ `stream-line-column` method missing for `fundamental-character-output-stream`
* Other changes:
* Improvements to the PCL implementation of CLOS:
* Changes to building procedure:
=====================================
src/pcl/gray-streams.lisp
=====================================
@@ -235,6 +235,9 @@
defined for this function, although it is permissible for it to
always return NIL."))
+(defmethod stream-line-column ((stream fundamental-character-output-stream))
+ nil)
+
;;; Stream-line-length is a CMUCL extension to Gray streams.
(defgeneric stream-line-length (stream)
(:documentation _N"Return the stream line length or Nil."))
=====================================
tests/.gitignore
=====================================
@@ -0,0 +1 @@
+/out-utf8.txt
=====================================
tests/issues.lisp
=====================================
@@ -5,6 +5,12 @@
(in-package "ISSUES-TESTS")
+(defparameter *test-path*
+ (merge-pathnames (make-pathname :name :unspecific :type :unspecific
+ :version :unspecific)
+ *load-truename*)
+ "Path to where this file is.")
+
(defun square (x)
(expt x 2))
@@ -676,8 +682,69 @@
;; work and not return NIL.
(assert-true (file-author "."))
(assert-true (file-author "bin/build.sh"))
- (assert-true (file-author "tests/안녕하십니까.txt")))
+ (let ((unix::*filename-encoding* :utf-8))
+ ;; Set filename encoding to utf-8 so that we can encode the
+ ;; filename properly.
+ (assert-true
+ (file-author
+ (merge-pathnames
+ (concatenate 'string
+ ;; Write the test file name this way so
+ ;; that it's independent of the encoding
+ ;; used to load this file. The name is
+ ;; "안녕하십니까".
+ '(#\Hangul_Syllable_An #\Hangul_Syllable_Nyeong #\Hangul_Syllable_Ha
+ #\Hangul_Syllable_Sib #\Hangul_Syllable_Ni #\Hangul_Syllable_Gga)
+ ".txt")
+ *test-path*)))))
+
+(define-test issue.139-default-external-format
+ (:tag :issues)
+ (assert-eq :utf-8 stream:*default-external-format*))
+(define-test issue.139-default-external-format-read-file
+ (:tag :issues)
+ (let ((string (concatenate 'string
+ ;; This is "hello" in Korean
+ '(#\Hangul_syllable_an
+ #\Hangul_Syllable_Nyeong
+ #\Hangul_Syllable_Ha
+ #\Hangul_Syllable_Se
+ #\Hangul_Syllable_Yo))))
+ ;; Test that opening a file for reading uses the default :utf8
+ ;; encoding.
+ (with-open-file (s (merge-pathnames "utf8.txt"
+ *test-path*)
+ :direction :input)
+ ;; The first line should be "hello" in Hangul.
+ (assert-equal (map 'list #'char-name string)
+ (map 'list #'char-name (read-line s))))))
+
+(define-test issue.139-default-external-format-write-file
+ (:tag :issues)
+ ;; Test that opening a file for writing uses the default :utf8.
+ ;; First write something out to the file. Then read it back in
+ ;; using an explicit format of utf8 and verifying that we got the
+ ;; right contents.
+ (let ((string (concatenate 'string
+ ;; This is "hello" in Korean
+ '(#\Hangul_syllable_an
+ #\Hangul_Syllable_Nyeong
+ #\Hangul_Syllable_Ha
+ #\Hangul_Syllable_Se
+ #\Hangul_Syllable_Yo))))
+ (with-open-file (s (merge-pathnames "out-utf8.txt"
+ *test-path*)
+ :direction :output
+ :if-exists :supersede)
+ (write-line string s))
+ (with-open-file (s (merge-pathnames "out-utf8.txt"
+ *test-path*)
+ :direction :input
+ :external-format :utf-8)
+ (assert-equal (map 'list #'char-name string)
+ (map 'list #'char-name (read-line s))))))
+
;;; Test stream-external-format for various types of streams.
;; Test two-way-stream where both streams have the same external
=====================================
tests/utf8.txt
=====================================
@@ -0,0 +1,2 @@
+안녕하세요
+UTF8 test. The above line is "Hello" in Hangul.
View it on GitLab: https://gitlab.common-lisp.net/cmucl/cmucl/-/compare/5944241c9f506d25e64af7…
Raymond Toy pushed to branch issue-149-add-setlocale at cmucl / cmucl
Commits:
d5f1aa5e by Raymond Toy at 2022-11-01T20:35:49+00:00
Update release-21e.md with closed issues.
- - - - -
ab0b8a63 by Raymond Toy at 2022-11-01T14:05:16-07:00
Merge branch 'master' into issue-149-add-setlocale
- - - - -
a4694c6d by Raymond Toy at 2022-11-01T14:06:03-07:00
Update release notes for #149
- - - - -
1 changed file:
- src/general-info/release-21e.md
Changes:
=====================================
src/general-info/release-21e.md
=====================================
@@ -51,14 +51,18 @@ public domain.
* ~~#113~~ REQUIRE on contribs can pull in the wrong things via ASDF.
* ~~#121~~ Wrong column index in FILL-POINTER-OUTPUT-STREAM
* ~~#122~~ gcc 11 can't build cmucl
+ * ~~#124~~ directory with `:wild-inferiors` doesn't descend subdirectories
* ~~#125~~ Linux `unix-stat` returning incorrect values
* ~~#127~~ Linux unix-getpwuid segfaults when given non-existent uid.
* ~~#128~~ `QUIT` accepts an exit code
+ * ~~#130~~ Move file-author to C
* ~~#132~~ Ansi test `RENAME-FILE.1` no fails
* ~~#134~~ Handle the case of `(expt complex complex-rational)`
* ~~#136~~ `ensure-directories-exist` should return the given pathspec
* #139 `*default-external-format*` defaults to `:utf-8`
* ~~#142~~ `(random 0)` signals incorrect error
+ * ~~#147~~ `stream-line-column` method missing for `fundamental-character-output-stream`
+ * ~~#149~~ Call setlocale(3C) on startup
* Other changes:
* Improvements to the PCL implementation of CLOS:
* Changes to building procedure:
View it on GitLab: https://gitlab.common-lisp.net/cmucl/cmucl/-/compare/5e164e8315e896736a8872…
Raymond Toy pushed to branch master at cmucl / cmucl
Commits:
d5f1aa5e by Raymond Toy at 2022-11-01T20:35:49+00:00
Update release-21e.md with closed issues.
- - - - -
1 changed file:
- src/general-info/release-21e.md
Changes:
=====================================
src/general-info/release-21e.md
=====================================
@@ -51,14 +51,17 @@ public domain.
* ~~#113~~ REQUIRE on contribs can pull in the wrong things via ASDF.
* ~~#121~~ Wrong column index in FILL-POINTER-OUTPUT-STREAM
* ~~#122~~ gcc 11 can't build cmucl
+ * ~~#124~~ directory with `:wild-inferiors` doesn't descend subdirectories
* ~~#125~~ Linux `unix-stat` returning incorrect values
* ~~#127~~ Linux unix-getpwuid segfaults when given non-existent uid.
* ~~#128~~ `QUIT` accepts an exit code
+ * ~~#130~~ Move file-author to C
* ~~#132~~ Ansi test `RENAME-FILE.1` no fails
* ~~#134~~ Handle the case of `(expt complex complex-rational)`
* ~~#136~~ `ensure-directories-exist` should return the given pathspec
* #139 `*default-external-format*` defaults to `:utf-8`
* ~~#142~~ `(random 0)` signals incorrect error
+ * ~~#147~~ `stream-line-column` method missing for `fundamental-character-output-stream`
* Other changes:
* Improvements to the PCL implementation of CLOS:
* Changes to building procedure:
View it on GitLab: https://gitlab.common-lisp.net/cmucl/cmucl/-/commit/d5f1aa5e51624159c61bdf0…
Raymond Toy pushed to branch issue-139-add-alias-local-external-format at cmucl / cmucl
Commits:
88f6852f by Raymond Toy at 2022-11-01T12:04:55-07:00
Change :iso-8859-1 to :iso8859-1 in find-encoding
While there's an alias for `:iso-8859-1`, it's safer to use
`:iso8859-1` which is builtin. Using `:iso-8859-1` requires the alias
database to be loaded, which isn't (currently) guaranteed when
`find-encoding` is called. Thus use the builtin name instead.
Besides, `:iso8859-1` is used in other places in "intl.lisp".
(This is hard to test, but I noticed it when running
```
LANG=ko_KR.utf8 lisp
```
on the branch `issue-139-add-alias-local-external-format`.)
- - - - -
35b1282d by Raymond Toy at 2022-11-01T12:08:51-07:00
Merge branch 'master' into issue-139-add-alias-local-external-format
- - - - -
7b17a82e by Raymond Toy at 2022-11-01T12:38:48-07:00
Add interface unix-get-locale-codeset
* src/lisp/os-common.c
* Add the function `os_get_locale_codeset` to get the codeset for our
locale.
* src/code/unix.lisp
* Add alien interface to `os_get_locale_codeset`, named `unix-get-locale-codeset`.
* src/code/save.lisp
* Use `unix-get-locale-codeset` to figure out how to set up an alias
for the external format named `:locale`. Set this up in the
initial function when we save lisp.
- - - - -
4 changed files:
- src/code/intl.lisp
- src/code/save.lisp
- src/code/unix.lisp
- src/lisp/os-common.c
Changes:
=====================================
src/code/intl.lisp
=====================================
@@ -105,7 +105,7 @@
(defun find-encoding (domain)
(when (null (domain-entry-encoding domain))
- (setf (domain-entry-encoding domain) :iso-8859-1)
+ (setf (domain-entry-encoding domain) :iso8859-1)
;; Domain lookup can call the compiler, so set the locale to "C"
;; so things work.
(let* ((*locale* "C")
=====================================
src/code/save.lisp
=====================================
@@ -145,53 +145,25 @@
(defun set-up-locale-external-format ()
"Add external format alias for :locale to the format specified by
the envvar LANG and friends if available."
- ;; Find the envvar that will tell us what encoding to use.
- ;;
- ;; See https://pubs.opengroup.org/onlinepubs/7908799/xbd/envvar.html
- ;;
- (let* ((lang (or (unix:unix-getenv "LC_ALL")
- (unix:unix-getenv "LC_MESSAGES")
- (unix:unix-getenv "LANG")))
- (length (length lang)))
- ;; If LANG isn't set, just set :locale to alias to the
- ;; default-external-format.
- (unless lang
- (setf (gethash :locale stream::*external-format-aliases*) *default-external-format*)
- (return-from set-up-locale-external-format (values)))
- ;; Extract the external format from the envvar and set up the
- ;; :locale alias.
- (let ((new-alias
- (cond
- ((or (string-equal "C" lang :end2 (min 1 length))
- (string-equal "POSIX" lang :end2 (min 5 length)))
- ;; If the lang is "C" or "POSIX", ignoring anything after
- ;; that, default to :iso8859-1.
- :iso8859-1)
- ((string-equal "/" lang :end2 (min 1 length))
- ;; Also, we don't handle the case where the locale starts
- ;; with a slash which means a pathname to a file created by
- ;; the localedef utility. So use our defaults for that case
- ;; as well.
- :iso8859-1)
- (t
- ;; Simple parsing of LANG. We assume it looks like
- ;; "language[_territory][.codeset]". We're only interested
- ;; in the codeset, if given. Some LC_ vars also have an
- ;; optional @modifier after the codeset; we ignore that too.
- (let ((dot (position #\. lang))
- (at (or (position #\@ lang) nil)))
- (when dot
- (let* ((codeset (subseq lang (1+ dot) at))
- (format (intern codeset "KEYWORD")))
- (cond ((stream::find-external-format format nil)
- format)
- (t
- (warn "Unknown or unsupported external format: ~S"
- codeset)
- *default-external-format*)))))))))
- (assert new-alias)
- (setf (gethash :locale stream::*external-format-aliases*) new-alias))
- (values)))
+ (let ((codeset (unix::unix-get-locale-codeset)))
+ (cond ((zerop (length codeset))
+ ;; Codeset was the empty string, so just set :locale to
+ ;; alias to the default external format.
+ (setf (gethash :locale stream::*external-format-aliases*)
+ *default-external-format*))
+ (t
+ (let ((codeset-format (intern codeset "KEYWORD")))
+ ;; If we know the format, we can set the alias.
+ ;; Otherwise, print a warning and use :iso8859-1 as the
+ ;; alias.
+ (setf (gethash :locale stream::*external-format-aliases*)
+ (if (stream::find-external-format codeset-format nil)
+ codeset-format
+ (progn
+ (warn "Unsupported external format; using :iso8859-1 instead: ~S"
+ codeset-format)
+ :iso8859-1)))))))
+ (values))
(defun save-lisp (core-file-name &key
@@ -301,6 +273,7 @@
(reinit)
(environment-init)
(dolist (f *after-save-initializations*) (funcall f))
+ (stream::load-external-format-aliases)
(intl::setlocale)
(ext::process-command-strings process-command-line)
(setf *editor-lisp-p* nil)
=====================================
src/code/unix.lisp
=====================================
@@ -2893,3 +2893,13 @@
of the child in the parent if it works, or NIL and an error number if it
doesn't work."
(int-syscall ("fork")))
+
+(defun unix-get-locale-codeset ()
+ _N"Get the codeset from the locale"
+ (with-alien ((codeset (array c-call:char 512)))
+ (alien-funcall
+ (extern-alien "os_get_locale_codeset"
+ (function void (* char) int))
+ (cast codeset (* c-call:char))
+ 512)
+ (cast codeset c-string)))
=====================================
src/lisp/os-common.c
=====================================
@@ -7,6 +7,8 @@
#include <assert.h>
#include <errno.h>
+#include <langinfo.h>
+#include <locale.h>
#include <math.h>
#include <netdb.h>
#include <pwd.h>
@@ -773,3 +775,15 @@ exit:
return result;
}
+
+void
+os_get_locale_codeset(char* codeset, int len)
+{
+ char *code;
+
+ setlocale(LC_ALL, "");
+
+ code = nl_langinfo(CODESET);
+
+ strncpy(codeset, code, len);
+}
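For readers who want to try the codeset lookup outside the runtime, here is a minimal standalone sketch of what `os_get_locale_codeset` does. The helper name `get_locale_codeset` and the `size_t` length parameter are assumptions made for illustration and are not part of the commit. One deliberate difference: a bare `strncpy` leaves the destination unterminated when the source fills the buffer, so this version copies at most `len - 1` bytes and always writes a trailing NUL.

```c
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from the commit): copy the codeset name of
 * the current locale into `codeset`, which holds `len` bytes.  The
 * result is always NUL-terminated when len > 0, truncating if needed.
 */
void
get_locale_codeset(char *codeset, size_t len)
{
    const char *code;

    if (codeset == NULL || len == 0) {
        return;
    }

    /* Adopt the locale named by the environment (LC_ALL, LANG, ...). */
    setlocale(LC_ALL, "");

    /* For example "UTF-8"; the exact spelling is platform dependent. */
    code = nl_langinfo(CODESET);

    /* Copy at most len - 1 bytes, then terminate explicitly. */
    strncpy(codeset, code, len - 1);
    codeset[len - 1] = '\0';
}
```

Note that `setlocale(LC_ALL, "")` changes the locale for the whole process, so a caller that relies on the "C" locale elsewhere would need to restore it afterwards.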
View it on GitLab: https://gitlab.common-lisp.net/cmucl/cmucl/-/compare/69f2a9909142a2e6d5b91d…
Raymond Toy pushed to branch master at cmucl / cmucl
Commits:
88f6852f by Raymond Toy at 2022-11-01T12:04:55-07:00
Change :iso-8859-1 to :iso8859-1 in find-encoding
While there's an alias for `:iso-8859-1`, it's safer to use
`:iso8859-1` which is builtin. Using `:iso-8859-1` requires the alias
database to be loaded, which isn't (currently) guaranteed when
`find-encoding` is called. Thus use the builtin name instead.
Besides, `:iso8859-1` is used in other places in "intl.lisp".
(This is hard to test, but I noticed it when running
```
LANG=ko_KR.utf8 lisp
```
on the branch `issue-139-add-alias-local-external-format`.)
- - - - -
1 changed file:
- src/code/intl.lisp
Changes:
=====================================
src/code/intl.lisp
=====================================
@@ -105,7 +105,7 @@
(defun find-encoding (domain)
(when (null (domain-entry-encoding domain))
- (setf (domain-entry-encoding domain) :iso-8859-1)
+ (setf (domain-entry-encoding domain) :iso8859-1)
;; Domain lookup can call the compiler, so set the locale to "C"
;; so things work.
(let* ((*locale* "C")
View it on GitLab: https://gitlab.common-lisp.net/cmucl/cmucl/-/commit/88f6852f1f3dd8687b43989…
Raymond Toy pushed to branch issue-139-add-alias-local-external-format at cmucl / cmucl
Commits:
4a7207b6 by Raymond Toy at 2022-10-17T18:58:45+00:00
Fix #130: Implement file_author in C
- - - - -
ba5c5d2a by Raymond Toy at 2022-10-17T18:58:45+00:00
Merge branch 'issue-130-file-author-in-c' into 'master'
Fix #130: Implement file_author in C
Closes #130
See merge request cmucl/cmucl!88
- - - - -
e8a0cc6c by Raymond Toy at 2022-10-30T15:03:27+00:00
Fix #147: Add method for stream-line-column
- - - - -
0dad5a1a by Raymond Toy at 2022-10-30T15:03:28+00:00
Merge branch 'issue-147-stream-line-column-impl' into 'master'
Fix #147: Add method for stream-line-column
Closes #147
See merge request cmucl/cmucl!104
- - - - -
1300830b by Raymond Toy at 2022-10-31T17:12:48+00:00
Address #139: *default-external-format* is :utf-8
- - - - -
649a4f1e by Raymond Toy at 2022-10-31T17:12:49+00:00
Merge branch 'issue-139-default-external-format-utf8' into 'master'
Address #139: *default-external-format* is :utf-8
See merge request cmucl/cmucl!103
- - - - -
69f2a990 by Raymond Toy at 2022-10-31T10:14:48-07:00
Merge branch 'master' into issue-139-add-alias-local-external-format
- - - - -
9 changed files:
- src/code/extfmts.lisp
- src/code/filesys.lisp
- src/general-info/release-21e.md
- src/lisp/os-common.c
- src/pcl/gray-streams.lisp
- + tests/.gitignore
- tests/issues.lisp
- + tests/utf8.txt
- + tests/안녕하십니까.txt
Changes:
=====================================
src/code/extfmts.lisp
=====================================
@@ -22,7 +22,7 @@
describe-external-format))
(defvar *default-external-format*
- :iso8859-1
+ :utf-8
"The default external format to use if no other external format is
specified")
=====================================
src/code/filesys.lisp
=====================================
@@ -1079,13 +1079,21 @@ optionally keeping some of the most recent old versions."
:pathname file
:format-control (intl:gettext "~S doesn't exist.")
:format-arguments (list file)))
- (multiple-value-bind (winp dev ino mode nlink uid)
- (unix:unix-stat name)
- (declare (ignore dev ino mode nlink))
- (when winp
- (let ((user-info (unix:unix-getpwuid uid)))
- (when user-info
- (unix:user-info-name user-info))))))))
+ ;; unix-namestring converts "." to "". Convert it back to
+ ;; "." so we can stat the current directory. (Perhaps
+ ;; that's a bug in unix-namestring?)
+ (when (zerop (length name))
+ (setf name "."))
+ (let (author)
+ (unwind-protect
+ (progn
+ (setf author (alien:alien-funcall
+ (alien:extern-alien "os_file_author"
+ (function (alien:* c-call:c-string) c-call:c-string))
+ (unix::%name->file name)))
+ (unless (alien:null-alien author)
+ (alien:cast author c-call:c-string)))
+ (alien:free-alien author))))))
;;;; DIRECTORY.
=====================================
src/general-info/release-21e.md
=====================================
@@ -22,6 +22,7 @@ public domain.
* Feature enhancements
* Changes
* Update to ASDF 3.3.6
+ * The default external format is `:utf-8` instead of `:iso8859-1`
* ANSI compliance fixes:
* Bug fixes:
* ~~#97~~ Fixes stepping through the source forms in the debugger. This has been broken for quite some time, but it works now.
@@ -56,6 +57,7 @@ public domain.
* ~~#132~~ Ansi test `RENAME-FILE.1` no fails
* ~~#134~~ Handle the case of `(expt complex complex-rational)`
* ~~#136~~ `ensure-directories-exist` should return the given pathspec
+ * #139 `*default-external-format*` defaults to `:utf-8`
* ~~#142~~ `(random 0)` signals incorrect error
* Other changes:
* Improvements to the PCL implementation of CLOS:
=====================================
src/lisp/os-common.c
=====================================
@@ -5,12 +5,16 @@
*/
+#include <assert.h>
#include <errno.h>
#include <math.h>
#include <netdb.h>
+#include <pwd.h>
#include <stdio.h>
+#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
+#include <unistd.h>
#include <time.h>
#include "os.h"
@@ -715,3 +719,57 @@ os_lstat(const char* path, u_int64_t *dev, u_int64_t *ino, unsigned int *mode, u
return rc;
}
+
+/*
+ * Interface for file-author. Given a pathname, returns a new string
+ * holding the author of the file or NULL if some error occurred. The
+ * caller is responsible for freeing the memory used by the string.
+ */
+char *
+os_file_author(const char *path)
+{
+ struct stat sb;
+ char initial[1024];
+ char *buffer, *obuffer;
+ size_t size;
+ struct passwd pwd;
+ struct passwd *ppwd;
+ char *result;
+
+ if (stat(path, &sb) != 0) {
+ return NULL;
+ }
+
+ result = NULL;
+ buffer = initial;
+ obuffer = NULL;
+ size = sizeof(initial) / sizeof(initial[0]);
+
+ /*
+ * Keep trying with larger buffers until a maximum is reached. We
+ * assume (1 << 20) is large enough for any OS.
+ */
+ while (size <= (1 << 20)) {
+ switch (getpwuid_r(sb.st_uid, &pwd, buffer, size, &ppwd)) {
+ case 0:
+ /* Success, though we might not have a matching entry */
+ result = (ppwd == NULL) ? NULL : strdup(pwd.pw_name);
+ goto exit;
+ case ERANGE:
+ /* Buffer is too small, double its size and try again */
+ size *= 2;
+ obuffer = (buffer == initial) ? NULL : buffer;
+ if ((buffer = realloc(obuffer, size)) == NULL) {
+ goto exit;
+ }
+ continue;
+ default:
+ /* All other errors */
+ goto exit;
+ }
+ }
+exit:
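+ /* buffer is NULL only after a failed realloc; then obuffer still owns the old block */
+ free((buffer == NULL) ? obuffer : (buffer == initial) ? NULL : buffer);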
+
+ return result;
+}
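The retry loop in os_file_author moves ownership of the heap block between buffer and obuffer, which makes it hard to see which pointer must be freed on each exit path. Below is a minimal standalone sketch of the same getpwuid_r retry pattern, assuming the buffer is heap-allocated from the start so that a single pointer always owns the live block. The names uid_to_name and path_owner_name are hypothetical and are not part of CMUCL; the 1024-byte starting size and the (1 << 20) cap are taken from the code above.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <pwd.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a CMUCL function): look up the login name
 * for uid.  Returns a malloc'ed string that the caller must free, or
 * NULL if there is no matching entry or an error occurred.
 */
char *
uid_to_name(uid_t uid)
{
    size_t size = 1024;
    char *buffer = malloc(size);
    char *result = NULL;
    struct passwd pwd;
    struct passwd *ppwd = NULL;

    while (buffer != NULL && size <= ((size_t) 1 << 20)) {
        int rc = getpwuid_r(uid, &pwd, buffer, size, &ppwd);

        if (rc == 0) {
            /* Success, but ppwd is NULL when no entry matches the uid. */
            if (ppwd != NULL) {
                result = strdup(ppwd->pw_name);
            }
            break;
        }
        if (rc != ERANGE) {
            /* Any error other than "buffer too small": give up. */
            break;
        }

        /* Buffer too small: double it.  On failure the old block is
         * still valid and still owned by buffer. */
        size *= 2;
        char *bigger = realloc(buffer, size);
        if (bigger == NULL) {
            break;
        }
        buffer = bigger;
    }

    /* buffer always names the one live block (or is NULL). */
    free(buffer);
    return result;
}

/*
 * Hypothetical wrapper: owner name of the file at path, or NULL if the
 * file cannot be stat'ed or the owner has no passwd entry.
 */
char *
path_owner_name(const char *path)
{
    struct stat sb;

    assert(path != NULL);
    if (stat(path, &sb) != 0) {
        return NULL;
    }
    return uid_to_name(sb.st_uid);
}
```

A caller would write something like `char *owner = path_owner_name("."); ... free(owner);`. Because the realloc result is stored in a temporary first, every exit path is covered by one free call. The trade-off is one malloc even in the common case where the committed code would use its stack buffer.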
=====================================
src/pcl/gray-streams.lisp
=====================================
@@ -235,6 +235,9 @@
defined for this function, although it is permissible for it to
always return NIL."))
+(defmethod stream-line-column ((stream fundamental-character-output-stream))
+ nil)
+
;;; Stream-line-length is a CMUCL extension to Gray streams.
(defgeneric stream-line-length (stream)
(:documentation _N"Return the stream line length or Nil."))
=====================================
tests/.gitignore
=====================================
@@ -0,0 +1 @@
+/out-utf8.txt
=====================================
tests/issues.lisp
=====================================
@@ -5,6 +5,12 @@
(in-package "ISSUES-TESTS")
+(defparameter *test-path*
+ (merge-pathnames (make-pathname :name :unspecific :type :unspecific
+ :version :unspecific)
+ *load-truename*)
+ "Path to where this file is.")
+
(defun square (x)
(expt x 2))
@@ -670,3 +676,72 @@
(err (relerr value answer)))
(assert-true (<= err eps) base err eps)))))))
+(define-test issue.130
+ (:tag :issues)
+ ;; Just verify that file-author works. In particular "." should
+ ;; work and not return NIL.
+ (assert-true (file-author "."))
+ (assert-true (file-author "bin/build.sh"))
+ (let ((unix::*filename-encoding* :utf-8))
+ ;; Set filename encoding to utf-8 so that we can encode the
+ ;; filename properly.
+ (assert-true
+ (file-author
+ (merge-pathnames
+ (concatenate 'string
+ ;; Write the test file name this way so
+ ;; that it's independent of the encoding
+ ;; used to load this file. The name is
+ ;; "안녕하십니까".
+ '(#\Hangul_Syllable_An #\Hangul_Syllable_Nyeong #\Hangul_Syllable_Ha
+ #\Hangul_Syllable_Sib #\Hangul_Syllable_Ni #\Hangul_Syllable_Gga)
+ ".txt")
+ *test-path*)))))
+
+(define-test issue.139-default-external-format
+ (:tag :issues)
+ (assert-eq :utf-8 stream:*default-external-format*))
+
+(define-test issue.139-default-external-format-read-file
+ (:tag :issues)
+ (let ((string (concatenate 'string
+ ;; This is "hello" in Korean
+ '(#\Hangul_syllable_an
+ #\Hangul_Syllable_Nyeong
+ #\Hangul_Syllable_Ha
+ #\Hangul_Syllable_Se
+ #\Hangul_Syllable_Yo))))
+ ;; Test that opening a file for reading uses the default :utf8
+ ;; encoding.
+ (with-open-file (s (merge-pathnames "utf8.txt"
+ *test-path*)
+ :direction :input)
+ ;; The first line should be "hello" in Hangul.
+ (assert-equal (map 'list #'char-name string)
+ (map 'list #'char-name (read-line s))))))
+
+(define-test issue.139-default-external-format-write-file
+ (:tag :issues)
+ ;; Test that opening a file for writing uses the default :utf8.
+ ;; First write something out to the file. Then read it back in
+ ;; using an explicit format of utf8 and verify that we got the
+ ;; right contents.
+ (let ((string (concatenate 'string
+ ;; This is "hello" in Korean
+ '(#\Hangul_syllable_an
+ #\Hangul_Syllable_Nyeong
+ #\Hangul_Syllable_Ha
+ #\Hangul_Syllable_Se
+ #\Hangul_Syllable_Yo))))
+ (with-open-file (s (merge-pathnames "out-utf8.txt"
+ *test-path*)
+ :direction :output
+ :if-exists :supersede)
+ (write-line string s))
+ (with-open-file (s (merge-pathnames "out-utf8.txt"
+ *test-path*)
+ :direction :input
+ :external-format :utf-8)
+ (assert-equal (map 'list #'char-name string)
+ (map 'list #'char-name (read-line s))))))
+
=====================================
tests/utf8.txt
=====================================
@@ -0,0 +1,2 @@
+안녕하세요
+UTF8 test. The above line is "Hello" in Hangul.
=====================================
tests/안녕하십니까.txt
=====================================
@@ -0,0 +1,3 @@
+The file name of this file is "안녕하십니까.txt" ("Hello" in Korean.)
+
+
View it on GitLab: https://gitlab.common-lisp.net/cmucl/cmucl/-/compare/b627c1e36a02140adc6264…