Hi,
I'm working on a parser that converts hand-written documents into XHTML, and for this purpose I'm using CL-WHO integrated with META-SEXP. Because the parsing is character-by-character, I need to escape unrecognized atoms on the fly. At the moment, I'm using the method below.
(elt (cl-who:escape-string (make-string 1 :initial-element character-needs-escaping)) 0)
Yep, that's quite a nasty piece of code for escaping a single character. Therefore, I'd like to ask whether it would be possible to expose the character escaping routines. (If you approve the proposal, I volunteer to send a patch.)
Regards.
On Mon, 16 Jul 2007 14:13:27 +0300, Volkan YAZICI yazicivo@ttnet.net.tr wrote:
I'm working on a parser that converts hand-written documents into XHTML, and for this purpose I'm using CL-WHO integrated with META-SEXP. Because the parsing is character-by-character, I need to escape unrecognized atoms on the fly. At the moment, I'm using the method below.
(elt (cl-who:escape-string (make-string 1 :initial-element character-needs-escaping)) 0)
Hmm, I don't think I understand that. Grabbing just the first character will usually just give you the ampersand:
CL-USER 1 > (let ((character-needs-escaping #\>)) (elt (cl-who:escape-string (make-string 1 :initial-element character-needs-escaping)) 0))
#\&
Is that really what you want?
Edi Weitz edi@agharta.de writes:
On Mon, 16 Jul 2007 14:13:27 +0300, Volkan YAZICI yazicivo@ttnet.net.tr wrote: Hmm, I don't think I understand that. Grabbing just the first character will usually just give you the ampersand:
CL-USER 1 > (let ((character-needs-escaping #\>)) (elt (cl-who:escape-string (make-string 1 :initial-element character-needs-escaping)) 0))
#\&
Is that really what you want?
Excuse me, that's my fault. I realized the mistake right after I pressed C-c C-c. Here's a small snippet from the real code:
(write-string (cl-who:escape-string (make-string 1 :initial-element c)) some-output-stream)
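To illustrate the kind of routine being asked for, a character-level version could look roughly like the sketch below. The name ESCAPE-CHAR-SKETCH and its small entity table are assumptions made for illustration only; this is not CL-WHO's actual code, and it covers only the five usual XML-special characters.

```lisp
;; Hypothetical helper (not part of CL-WHO): writes one character to a
;; stream, substituting an entity when the character is XML-special.
(defun escape-char-sketch (char &optional (stream *standard-output*))
  "Write CHAR to STREAM, escaping <, >, &, \" and ' as entities."
  (case char
    (#\< (write-string "&lt;" stream))
    (#\> (write-string "&gt;" stream))
    (#\& (write-string "&amp;" stream))
    (#\" (write-string "&quot;" stream))
    (#\' (write-string "&#039;" stream))
    ;; Everything else passes through unchanged.
    (t (write-char char stream)))
  char)
```

With something like this, the call site shrinks to (escape-char-sketch c some-output-stream), and no one-character string has to be allocated per input character.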
Regards.
On Mon, 16 Jul 2007 14:29:26 +0300, Volkan YAZICI yazicivo@ttnet.net.tr wrote:
Excuse me, that's my fault. I realized the mistake right after I pressed C-c C-c. Here's a small snippet from the real code:
(write-string (cl-who:escape-string (make-string 1 :initial-element c)) some-output-stream)
OK, I see.
It's fine with me if you want to isolate the corresponding code and export a function which works on characters, as long as your patch adheres to these guidelines:
Go wild, Edi.
Edi Weitz edi@agharta.de writes:
On Mon, 16 Jul 2007 14:29:26 +0300, Volkan YAZICI yazicivo@ttnet.net.tr wrote:
(write-string (cl-who:escape-string (make-string 1 :initial-element c)) some-output-stream)
OK, I see.
It's fine with me if you want to isolate the corresponding code and export a function which works on characters, as long as your patch adheres to these guidelines:
I've attached the patch to this post. But if you ask for my opinion, the escaping functions are just polluting the function namespace. IMHO, it would be better to collect them under a single generic function (while preserving the old ones for compatibility). For instance:
(defmethod escape ((input character) &optional test) ...)
(defmethod escape ((input string) &optional test) ...)
And then we just supply the related escape predicates as global variables. (This time we pollute the variable namespace.) Another suggestion:
(defmethod escape ((type (eql :ascii)) (input character)) ...)
(defmethod escape ((type (eql :ascii)) (input string)) ...)
(defmethod escape ((type (eql :minimal)) ...) ...)
...
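To make the first variant concrete, here is a minimal sketch of how such a generic function might be fleshed out. Everything in it is an assumption for illustration: the predicate DEFAULT-ESCAPE-P, the choice of hexadecimal numeric entities, and the method bodies are not taken from the attached patch or from CL-WHO.

```lisp
;; Hypothetical default predicate: true for the usual XML-special characters.
(defun default-escape-p (char)
  (find char "<>&'\""))

(defgeneric escape (input &optional test)
  (:documentation "Return INPUT as a string, escaping characters that satisfy TEST."))

;; Character method: returns either the character as a string or a
;; hexadecimal numeric entity for it.
(defmethod escape ((input character) &optional (test #'default-escape-p))
  (if (funcall test input)
      (format nil "&#x~x;" (char-code input))
      (string input)))

;; String method: escapes each character in turn by reusing the
;; character method with the same predicate.
(defmethod escape ((input string) &optional (test #'default-escape-p))
  (with-output-to-string (out)
    (loop for char across input
          do (write-string (escape char test) out))))
```

A caller could then write (escape c) for a single character or (escape "a < b") for a whole string, and pass a different predicate as the second argument to change what gets escaped.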
By the way, the (eq *html-mode* :xml) checks in the code make FORMAT optimization impossible for the character escaping routines. I haven't tested the performance impact of this, but how many clients are there that don't support hexadecimal in escaped entities? (Maybe make that check a compile-time parameter?)
Anyway, I'm just thinking out loud, and I'm sure you'll reach the best conclusion.
Regards.
Sorry for the delay. Busy...
On Mon, 16 Jul 2007 16:35:40 +0300, Volkan YAZICI yazicivo@ttnet.net.tr wrote:
I attached the related patch with the post.
Thanks. That's OK with me except that I'd use FLET instead of LET for the test functions. But the patch for the HTML documentation is missing.
But if you ask for my opinion, the escaping functions are just polluting the function namespace.
I don't think that's a big issue because we have packages. CL-WHO only exports two dozen symbols or so.
IMHO, it would be better to collect them under a single generic function.
Yeah, but it'd be "harder" to use. Again, I think this is not a big issue and mainly a matter of taste.
By the way, the (eq *html-mode* :xml) checks in the code make FORMAT optimization impossible for the character escaping routines. I haven't tested the performance impact of this, but how many clients are there that don't support hexadecimal in escaped entities? (Maybe make that check a compile-time parameter?)
I agree that it'd be nicer to make this a compile-time decision. I'm not so much concerned about performance, but it'd be good for consistency.
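One way such a compile-time decision could look is sketched below. The names *USE-HEX-ENTITIES*, ENTITY-FORMAT-STRING and NUMERIC-ENTITY are hypothetical and only illustrate the idea; this is not how CL-WHO implements it.

```lisp
;; Hypothetical compile-time switch; it must be set before the code that
;; uses it is compiled, hence the EVAL-WHEN.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (defparameter *use-hex-entities* t))

;; Expands to a literal format string, so FORMAT sees a constant
;; control string at the call site instead of a run-time conditional.
(defmacro entity-format-string ()
  (if *use-hex-entities* "&#x~x;" "&#~d;"))

(defun numeric-entity (char)
  "Return the numeric character reference for CHAR as a string."
  (format nil (entity-format-string) (char-code char)))
```

The trade-off is that changing the setting requires recompiling the functions that use the macro, which is the price of removing the run-time check.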
Thanks, Edi.
Edi Weitz edi@agharta.de writes:
But the patch for the HTML documentation is missing.
I completely forgot to diff index.html. Here it is.
Regards.
On Mon, 16 Jul 2007 16:35:40 +0300, Volkan YAZICI yazicivo@ttnet.net.tr wrote:
I attached the related patch with the post.
The patch contained TAB characters and wrong links amongst other things, so I had to go through it manually anyway. That's why it took me a while. Here's the new release.
Thanks, Edi.