While profiling, I found that a good portion of time and memory was being spent in rune-in-range-p, among other functions. Looking at the source, I found a comment in German that Google translates as something like "that can be done nevertheless better". So I did.
rune-in-range-p now uses a binary search instead of a linear search. Type declarations (simple-vector) helped a bit on SBCL. In addition, rune-name-char-p had been calling rune-in-range-p once for each of several different lists, so I combined these lists at compile time and now search the sorted result once.
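For readers who want to see the shape of the change, here is a minimal sketch of the idea, not the actual patch. All names (merge-ranges, define-range-table, code-in-ranges-p, *name-char-ranges*) are hypothetical, the ranges shown are a small ASCII subset chosen for illustration, and representing each range as a (LO . HI) cons of character codes is an assumption.

```lisp
;; Hypothetical sketch; names and data are illustrative, not the cxml source.

(eval-when (:compile-toplevel :load-toplevel :execute)
  (defun merge-ranges (&rest range-lists)
    "Combine lists of (LO . HI) code ranges into one sorted simple-vector,
coalescing ranges that overlap or are adjacent."
    (let ((sorted (sort (copy-list (apply #'append range-lists))
                        #'< :key #'car))
          (merged '()))
      (dolist (r sorted)
        (if (and merged (<= (car r) (1+ (cdr (first merged)))))
            ;; R overlaps or touches the previous range: extend that range.
            (setf (cdr (first merged)) (max (cdr (first merged)) (cdr r)))
            ;; Otherwise start a new range (fresh cons, inputs stay untouched).
            (push (cons (car r) (cdr r)) merged)))
      (coerce (nreverse merged) 'simple-vector))))

;; The merge runs at macroexpansion time, so the compiled code only
;; contains the finished, sorted vector as a literal.
(defmacro define-range-table (name &rest range-lists)
  `(defparameter ,name ,(apply #'merge-ranges range-lists)))

(define-range-table *name-char-ranges*
  ((#x41 . #x5A) (#x61 . #x7A))     ; A-Z, a-z
  ((#x30 . #x39))                   ; 0-9
  ((#x2D . #x2E) (#x5F . #x5F)))    ; - . _

(defun code-in-ranges-p (code ranges)
  "True if the integer CODE lies inside one of the sorted, disjoint RANGES."
  (declare (type fixnum code) (type simple-vector ranges))
  (let ((lo 0)
        (hi (1- (length ranges))))
    (loop while (<= lo hi)
          do (let* ((mid (ash (+ lo hi) -1))
                    (range (svref ranges mid)))
               (cond ((< code (car range)) (setf hi (1- mid)))
                     ((> code (cdr range)) (setf lo (1+ mid)))
                     (t (return t)))))))
```

With a table built this way, a single call such as (code-in-ranges-p (char-code #\a) *name-char-ranges*) replaces one linear scan per list, and the search itself allocates nothing.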
Here are the profiling results (each version was run twice):
;;ORIGINAL
  seconds |    consed |  calls |  sec/call | name
--------------------------------------------------------
    0.075 | 1,266,472 | 43,752 | 0.000002  | RUNE-IN-RANGE-P
    0.010 |   804,632 | 34,574 | 0.0000003 | RUNE-NAME-CHAR-P
    0.000 | 1,018,912 |  4,670 | 0.000000  | VALID-NAME-P
--------------------------------------------------------
    0.085 | 3,090,016 | 82,996 |           | Total

  seconds |    consed |  calls |  sec/call | name
--------------------------------------------------------
    0.067 | 1,222,936 | 43,752 | 0.000002  | RUNE-IN-RANGE-P
    0.003 | 1,078,888 | 34,574 | 0.0000001 | RUNE-NAME-CHAR-P
    0.000 | 1,036,544 |  4,670 | 0.000000  | VALID-NAME-P
--------------------------------------------------------
    0.070 | 3,338,368 | 82,996 |           | Total
;;NEW
  seconds |    consed |  calls |  sec/call | name
--------------------------------------------------------
    0.043 |         0 | 37,001 | 0.000001  | RUNE-IN-RANGE-P
    0.000 |   833,704 | 34,574 | 0.000000  | RUNE-NAME-CHAR-P
    0.000 | 1,056,296 |  4,670 | 0.000000  | VALID-NAME-P
--------------------------------------------------------
    0.043 | 1,890,000 | 76,245 |           | Total

  seconds |    consed |  calls |  sec/call | name
--------------------------------------------------------
    0.028 |         0 | 37,001 | 0.000001  | RUNE-IN-RANGE-P
    0.000 |   808,528 | 34,574 | 0.000000  | RUNE-NAME-CHAR-P
    0.000 | 1,116,776 |  4,670 | 0.000000  | VALID-NAME-P
--------------------------------------------------------
    0.028 | 1,925,304 | 76,245 |           | Total
Nathan Bird