Raymond Toy pushed to branch rtoy-unicode-collation-ducet at cmucl / cmucl

Commits:
ca615266 by Raymond Toy at 2026-06-16T19:34:27-07:00
Speed up collation conformance tests

The tests ran in ~110s, dominated by COLLATION-HEX-LIST: profiling
showed it consing 3.5 GB across 2.2M calls, almost all of it inside
PARSE-INTEGER. Accumulate the (16-bit, fixnum) hex values directly
instead. The suite now runs in ~14s with identical results.

Also compile all the helper functions because they do a lot of
processing of the test file and each test file has over 200K tests.

- - - - -

1 changed file:

- tests/unicode-collation.lisp

Changes:

=====================================
tests/unicode-collation.lisp
=====================================
@@ -26,16 +26,26 @@
 (defun collation-hex-list (string)
   "Parse all space-separated hexadecimal numbers in STRING into a list
   of integers, in order. Non-hex runs are skipped."
-  (let ((result nil) (i 0) (n (length string)))
+  (let ((result nil)
+        (i 0)
+        (n (length string)))
     (loop
-      (loop while (and (< i n) (not (digit-char-p (char string i) 16)))
+      ;; Skip any non-hexadecimal characters.
+      (loop while (and (< i n)
+                       (null (digit-char-p (char string i) 16)))
             do (incf i))
       (when (>= i n) (return))
-      (let ((j i))
-        (loop while (and (< j n) (digit-char-p (char string j) 16))
-              do (incf j))
-        (push (parse-integer string :start i :end j :radix 16) result)
-        (setf i j)))
+      ;; Accumulate one hexadecimal number. PARSE-INTEGER is avoided
+      ;; here because it conses, and this runs several times per line
+      ;; over hundreds of thousands of conformance lines; the values are
+      ;; 16-bit and fit in a fixnum.
+      (let ((val 0)
+            (d nil))
+        (loop while (and (< i n)
+                         (setf d (digit-char-p (char string i) 16)))
+              do (setf val (+ (* val 16) d))
+                 (incf i))
+        (push val result)))
     (nreverse result)))
 
 (defun collation-split-on-bar (string)
@@ -133,3 +143,18 @@ must match the expected key in the line's comment."
   (:tag :unicode)
   (run-collation-conformance (ducet) *collation-non-ignorable-test*
                              :non-ignorable))
+
+;; A DEFINE-TEST body is stored as source and run interpreted, and the
+;; test runner (tests/run-tests.lisp) loads this file as source, so its
+;; functions would otherwise run interpreted. The per-line parsing and
+;; string building run on every one of several hundred thousand
+;; conformance lines, so interpreted they make the suite about ten times
+;; slower. Compile the hot functions on load.
+(eval-when (:load-toplevel :execute)
+  (dolist (name '(collation-hex-list
+                  collation-split-on-bar
+                  collation-parse-expected-key
+                  collation-parse-test-line
+                  collation-test-string
+                  run-collation-conformance))
+    (compile name)))

View it on GitLab: https://gitlab.common-lisp.net/cmucl/cmucl/-/commit/ca615266d431bc6f92022a13...