Dear cl-interpol developer(s),
I would like to use cl-interpol to parse strings containing Perl subpattern variables like $1 and Makefile special variables like $<, $^ etc. I also want statements like #?"$(f 5)" when *optional-delimiters-p* is t to be expanded into something like
(with-output-to-string (#:string) (princ (f 5) #:string))
rather than
(with-output-to-string (#:string) (princ (progn f 5) #:string))
as they do now. (However, for better compatibility with Makefiles, I need to treat a list consisting of a single element specially, expanding it to the value of the corresponding variable.) Overall, I would like better control over the read-form function, perhaps the ability to supply my own. Could you install a hook there? I'd gladly do it myself and send you a patch, but I'm not sure which way is best and whether you would accept such a change at all. If you can suggest a workaround that doesn't involve modifying cl-interpol, that would also be great. I just don't want to fork or recreate the whole library because of a single function.
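To make the difference between the two expansions concrete, here is a minimal sketch in plain Common Lisp that does not use cl-interpol at all. The function f, the local variable also named f, and the two helper names are assumptions made up for this illustration; the bodies simply mirror the two expansions quoted above.

```lisp
;; Hypothetical function, defined only for this illustration.
(defun f (x)
  (* 2 x))

(defun progn-style-expansion ()
  ;; Mirrors (princ (progn f 5) ...): F is evaluated as a variable and its
  ;; value is discarded, then 5 is returned. The function F is never called.
  (let ((f :ignored))
    (with-output-to-string (out)
      (princ (progn f 5) out))))

(defun call-style-expansion ()
  ;; Mirrors (princ (f 5) ...): the function F is called with argument 5.
  (with-output-to-string (out)
    (princ (f 5) out)))
```

With these definitions, (progn-style-expansion) yields the string "5", while (call-style-expansion) yields "10", which is the behaviour being asked for.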
Hi Evgeniy,
I haven't used cl-interpol for quite some time, so I'm generally cool with extensions or changes as long as a few criteria are met:
1. The code should try to be backwards-compatible. People who don't want to use the new features should be able, if possible, to use the new version like they used the old one.
2. Everything that's added should be documented with the same attention to detail that is currently used. I'm not willing to accept patches where I have to go over the patch and spend a lot of time correcting and re-formatting. The general patch guidelines are here: http://weitz.de/patches.html
Judging from what you described, I agree that the best solution might be to add one or two hooks that enable people to extend the syntax if they want to. I'm not sure about things like $1 and so on. That'd require tight integration with CL-PPCRE, wouldn't it?
Maybe the best idea is to provide a rough sketch of what you want to do and present it to the mailing list before you do the actual work. I'd be more interested in a detailed description of the new features than in the implementation.
Thanks, Edi.
On Wed, Dec 16, 2009 at 4:00 PM, Evgeniy Zhemchugov jini.zh@gmail.com wrote:
cl-interpol-devel site list cl-interpol-devel@common-lisp.net http://common-lisp.net/mailman/listinfo/cl-interpol-devel
> I'm not sure about things like $1 and so on. That'd require tight integration with CL-PPCRE, wouldn't it?
I was thinking about something like this:
(defvar *groups*)
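(defmacro =~ (scanner string &body body)
  `(let (($_ ,string))
     (multiple-value-bind ($- $+ @- @+) (cl-ppcre:scan ,scanner $_)
       (declare (ignorable $- $+))
       (let ((*groups* (map 'vector
                            ; A register that did not take part in the match
                            ; has NIL bounds, so guard the call to SUBSEQ.
                            (lambda (start end)
                              (and start (subseq $_ start end)))
                            @- @+)))
         ,@body))))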
; Just as in cl-interpol
(defvar *readtable-copy* (copy-readtable))
; Readtable modification for $1, $2, etc.
(set-macro-character #\$
  (lambda (stream char)
    (declare (ignore char))
    (if (digit-char-p (peek-char nil stream nil #\Newline t))
        ; Doesn't work well with symbols like $1:a. Also, interns unnecessary
        ; symbols. The alternative is to parse the stream until a terminating
        ; character occurs (with some check for package designators), but
        ; how to determine whether a given character is a terminating one?
        (let ((symbol (read stream nil nil t)))
          (if (integerp symbol)
              `($ ,symbol)
              (values (intern (concatenate 'string "$" (symbol-name symbol))))))
        (let ((*readtable* *readtable-copy*))
          (read (make-concatenated-stream (make-string-input-stream "$") stream)
                t nil t)))))
; Accessor for *groups* (indices are 1-based, as in Perl)
(defun $ (index)
  (when (<= 1 index (length *groups*))
    (aref *groups* (1- index))))
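As a self-contained illustration of the reader side of this idea, the sketch below installs a $ macro character on a private copy of the standard readtable instead of the global one, and only handles the digit case. *groups* and $ follow the names used above; read-dollar-group, make-dollar-readtable and read-with-dollar-syntax are hypothetical helper names, and cl-ppcre is deliberately left out so that the example runs on its own.

```lisp
(defvar *groups* (vector)
  "Strings captured by the most recent match (bound by hand in this sketch).")

(defun $ (index)
  "Return the INDEXth captured group (1-based), or NIL if there is none."
  (when (<= 1 index (length *groups*))
    (aref *groups* (1- index))))

(defun read-dollar-group (stream char)
  "Hypothetical reader function: turn $<digits> into the form ($ <integer>)."
  (declare (ignore char))
  (let ((digits (loop for c = (peek-char nil stream nil nil t)
                      while (and c (digit-char-p c))
                      collect (read-char stream t nil t))))
    (if digits
        (list '$ (parse-integer (coerce digits 'string)))
        (error "Expected a digit after the dollar sign"))))

(defun make-dollar-readtable ()
  "Return a fresh copy of the standard readtable in which $ is a macro character."
  (let ((table (copy-readtable nil)))
    ;; Non-terminating, so a $ in the middle of a token stays part of that token.
    (set-macro-character #\$ #'read-dollar-group t table)
    table))

(defun read-with-dollar-syntax (string)
  "Read one form from STRING with the $ syntax enabled."
  (let ((*readtable* (make-dollar-readtable)))
    (values (read-from-string string))))
```

For example, (read-with-dollar-syntax "(list $1 $12)") returns the form (list ($ 1) ($ 12)), and with *groups* bound to (vector "foo" "bar") the call ($ 2) returns "bar". Working on a copy of the readtable avoids changing how the rest of the image reads $.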
I haven't thought the expansion of the #\$ macro character through yet. Anyway, that is user-side code. As for cl-interpol, I want $1 to be expanded into ($ 1) in interpolated strings as well. I see several possible approaches here. The one with the most features seems to be an ad hoc implementation of something similar to the Lisp readtable. The idea is to provide a function via the *inner-delimiters* variable that does the reading and expansion for cl-interpol, in the same way that functions installed with set-macro-character do for the Lisp reader. If we redefine read-form as follows:
(defun read-form ()
  (let* ((start-delimiter (peek-char*))
         (end-delimiter (get-end-delimiter start-delimiter *inner-delimiters*)))
    (if (consp end-delimiter)
        (progn
          (read-char*)
          (funcall (cadr end-delimiter) *stream* start-delimiter (car end-delimiter)))
        (cond ((null end-delimiter)
               (if *optional-delimiters-p*
                   (read-optional-delimited)
                   nil))
              (t
               `(progn
                  ,@(progn
                      (read-char*)
                      (let ((*readtable* (copy-readtable*)))
                        ;; temporarily change the readtable
                        (set-syntax-from-char end-delimiter #\))
                        (read-delimited-list end-delimiter *stream* t)))))))))
then I can do what I want to do in this way:
(defun digit-reader (stream start end)
  (declare (ignore end))
  (loop for d = start then (read-char stream nil nil t)
        while (and d (digit-char-p d))
        collect d into r
        finally (progn
                  (when d (unread-char d stream))
                  (return `($ ,(parse-integer (coerce r 'string)))))))
(setf *inner-delimiters*
      (list
       ; $(A B) should call the function A with argument B
       (list #\( #\)
             (lambda (stream start end)
               (declare (ignore start end))
               (read-delimited-list #\) stream)))
       ; ${a} should expand to the value of variable a (it's for cases like
       ; "ab${cd}e")
       (list #\{ #\}
             (lambda (stream start end)
               (declare (ignore start end))
               (let ((*readtable* (copy-readtable*)))
                 (set-syntax-from-char #\} #\))
                 (prog1 (read stream)
                   (when (char/= (read-char stream) #\})
                     (error "Interpolation error"))))))))
(loop for d from 1 to 9
      do (push (list (digit-char d) nil #'digit-reader) *inner-delimiters*))
This solution seems to be backward compatible.
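The dispatch idea can be tried in isolation with a toy version that does not touch cl-interpol. Everything below is a hypothetical sketch: *toy-inner-readers* is an alist standing in for the extended *inner-delimiters*, toy-read-form stands in for read-form, and the stream is an ordinary string stream positioned just after the $.

```lisp
(defvar *toy-inner-readers* '()
  "Hypothetical alist mapping a start character to a reader function.")

(defun toy-read-form (stream)
  "Read the form that follows a $ which has already been consumed from STREAM.
Returns NIL when no reader is registered for the next character."
  (let* ((start (peek-char nil stream nil nil))
         (entry (and start (assoc start *toy-inner-readers*))))
    (when entry
      (read-char stream)                ; consume the start character
      (funcall (cdr entry) stream start))))

(defun toy-list-reader (stream start)
  "$(a b) reads as the list (a b), i.e. a call to A with argument B."
  (declare (ignore start))
  (read-delimited-list #\) stream))

(defun toy-digit-reader (stream start)
  "$12 reads as ($ 12). START is the first digit, which was already consumed."
  (let ((digits (list start)))
    (loop for c = (peek-char nil stream nil nil)
          while (and c (digit-char-p c))
          do (push (read-char stream) digits))
    (list '$ (parse-integer (coerce (nreverse digits) 'string)))))

(defun make-toy-readers ()
  "Build the dispatch table: one entry for ( and one for each digit 1-9."
  (cons (cons #\( #'toy-list-reader)
        (loop for d from 1 to 9
              collect (cons (digit-char d) #'toy-digit-reader))))

(defun toy-read-from-string (string)
  "Read one $-form from STRING, which holds the text that follows the $."
  (let ((*toy-inner-readers* (make-toy-readers)))
    (with-input-from-string (stream string)
      (toy-read-form stream))))
```

Here (toy-read-from-string "(f 5) and more") returns (f 5), (toy-read-from-string "12 apples") returns ($ 12), and input that starts with an unregistered character returns NIL, leaving the caller free to fall back to its existing behaviour.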
If you want to incorporate this or similar machinery into cl-ppcre, then we can create even better support for Perl variables by declaring dynamic variables (or #\$ macro character expansions) $_, $`, $&, $', $+, $-, @+, @-, etc. and binding them on every call to cl-ppcre:scan if some flag is set. I'm not sure the Perl approach to regular expressions fits well with Lisp conventions, though. What do you think?
I'm not sure I really like this, especially as I think it is a very specific addition whose only purpose is Perl compatibility. I'd be happier with something like a very thin layer atop cl-interpol that enables users to extend it (for example in the way you want) but doesn't prescribe an actual syntax.
But I'm pretty sure I don't want to change CL-PPCRE for this goal.
Anyway, I won't have more time to think about this before the next year begins. Too busy, sorry.
Thanks and Happy Holidays, Edi.
On Thu, Dec 17, 2009 at 8:58 PM, Evgeniy Zhemchugov jini.zh@gmail.com wrote:
I'm not sure you understood me correctly. Perhaps I didn't make myself clear enough, sorry. The =~ macro is what I'm going to use in my own code; I don't think it belongs in the library, at least not until I have tested it in one of my projects. The actual modification I propose is the change to the read-form function, so that users can supply their own reader to be executed during expansion of a $-form, in the same way that we supply readers to Lisp's read via set-macro-character. No change to CL-PPCRE is required in this case: everything can be done on the user side.
I can suggest another solution though. We can declare the *read-form* special variable like this:
(defvar *read-form* #'read-form)
and replace the calls to read-form with funcalls of *read-form* in the inner-reader function. Users can then set this variable to whatever function they want, thus controlling the way cl-interpol expands $-forms. However, I think the former solution is more elegant.
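The special-variable hook can be sketched on its own as follows. All names here are hypothetical stand-ins (toy-default-read-form for read-form, *toy-read-form* for *read-form*, toy-inner-reader for inner-reader), and the default reader simply calls the standard read rather than doing anything cl-interpol does.

```lisp
(defun toy-default-read-form (stream)
  "Hypothetical default: read one Lisp form from STREAM."
  (read stream))

(defvar *toy-read-form* #'toy-default-read-form
  "Hypothetical hook: the function called to read a $-form.")

(defun toy-inner-reader (stream)
  "Stands in for the code that would otherwise call the reader function directly."
  (funcall *toy-read-form* stream))

(defun toy-read-with-hook (string &optional (hook *toy-read-form*))
  "Read a form from STRING, with HOOK bound as the reader for this call only."
  (let ((*toy-read-form* hook))
    (with-input-from-string (stream string)
      (toy-inner-reader stream))))
```

Calling (toy-read-with-hook "(f 5)") returns (f 5), whereas passing (lambda (stream) (list 'custom (read stream))) as the hook returns (custom (f 5)). One design detail worth keeping in mind: initializing the variable with #'name captures the function object at load time, so a later redefinition of the default function is not picked up, while storing the symbol instead would follow redefinitions.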
Don't bother to answer soon. I'm also busy these days and probably won't be able to work with cl-interpol until the middle of January. Happy new year!
OK, I see. Sorry, I only took a quick glance. That's why I asked for a description instead of the implementation... :)
If you can provide in January what would be added to the cl-interpol documentation, then we'll have something to talk about. We can discuss the implementation afterwards.
Thanks again, Edi.
On Mon, Dec 21, 2009 at 7:40 PM, Evgeniy Zhemchugov jini.zh@gmail.com wrote: