Hi, Edi,
I just downloaded cl-ppcre to see if it would help with some parsing I'm working on. I thought I'd drop you a line just to alert you to the issues I ran into. I don't think they're worth a lot of attention, but just so you know they're there.
I'm doing development on Genera, and there were several issues with compiling cl-ppcre, mostly due to Genera not quite being ANSI compliant. I finally got it to compile and ran the test. Some (21) of the tests failed. Most of the failures, I've noted, are based on assumptions about character codes that do not hold for Genera (e.g. test 432 -- line termination (char-code #\Return) -> #o215, not #o12.) The others I haven't thoroughly investigated. I believe that I'll be able to use the limited features I need, and if I run into trouble, I'll let you know.
The feature that I'd like to see relates to your s-expression parse tree capability. I'd like to abstract sub-parse-trees. A small change to the end of convert-aux,
  (otherwise
   (let ((translation (get parse-tree 'parse-tree-synonym)))
     (if translation
         (convert-aux translation)
         (signal-ppcre-syntax-error "Unknown token ~A in parse-tree"
                                    parse-tree))))
and a quick macro,
  (defmacro DEFINE-PARSE-TREE-SYNONYM (name parse-tree)
    `(setf (get ',name 'ppcre::parse-tree-synonym) ',parse-tree))
and I can do something like:
  (define-parse-tree-synonym A
    (:CHAR-CLASS (:RANGE #\a #\z) (:RANGE #\A #\Z)))
  (define-parse-tree-synonym X
    (:CHAR-CLASS (:RANGE #\a #\z) (:RANGE #\A #\Z) :DIGIT-CLASS))
  (define-parse-tree-synonym N
    :DIGIT-CLASS)
  (define-parse-tree-synonym SMA-DATE
    (:SEQUENCE n n a a a))
  (define-parse-tree-synonym AIRLINE-DESIGNATOR
    (:SEQUENCE x x (:GREEDY-REPETITION 0 1 a)))
  (define-parse-tree-synonym FLIGHT-NUMBER
    (:GREEDY-REPETITION 3 4 n))
  (define-parse-tree-synonym OPERATIONAL-SUFFIX
    (:GREEDY-REPETITION 0 1 a))

  (defparameter *flight-scanner*
    (ppcre:create-scanner
     '(:sequence airline-designator flight-number operational-suffix
                 "/" sma-date)))
Much more perspicuous, especially in more complex parse trees where the abstracted elements are repeated.
Just a thought.
- Patrick O'Donnell pao@ascent.com
Date: Wed, 15 Sep 2004 13:27:23 -0400 (EDT)
From: "Patrick O'Donnell" pao@ascent.com
> The feature that I'd like to see relates to your s-expression parse tree capability. I'd like to abstract sub-parse-trees. A small change to the end of convert-aux,
>   (otherwise
>    (let ((translation (get parse-tree 'parse-tree-synonym)))
>      (if translation
>          (convert-aux translation)
>          (signal-ppcre-syntax-error "Unknown token ~A in parse-tree"
>                                     parse-tree))))
Except that it has to be:
  (otherwise
   (let ((translation (and (symbolp parse-tree)
                           (get parse-tree 'parse-tree-synonym))))
     (if translation
         (convert-aux (copy-tree translation))
         (signal-ppcre-syntax-error "Unknown token ~A in parse-tree"
                                    parse-tree))))))))
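The two additions in the corrected clause, the SYMBOLP guard and the COPY-TREE call, can be tried in isolation. The following is only a sketch: LOOKUP-SYNONYM is a hypothetical helper, not part of cl-ppcre, and it uses a property indicator in whatever package it is loaded into. GET requires a symbol as its first argument, so the guard keeps strings, characters and lists that reach the clause from causing an error; the copy is presumably there so that nothing done to the returned tree can alter the stored definition.

```lisp
;; Sketch only: LOOKUP-SYNONYM is a hypothetical stand-in for the lookup
;; done in the OTHERWISE clause; it is not a cl-ppcre function.
(defun lookup-synonym (token)
  "Return a fresh copy of the parse tree registered for TOKEN, or NIL."
  (let ((translation (and (symbolp token)            ; GET needs a symbol
                          (get token 'parse-tree-synonym))))
    (and translation
         (copy-tree translation))))                  ; never hand out the stored tree

;; Register two synonyms directly on the property list.
(setf (get 'n 'parse-tree-synonym) :digit-class)
(setf (get 'sma-date 'parse-tree-synonym) (list :sequence 'n 'n 'a 'a 'a))

(lookup-synonym 'n)         ; a symbol with a synonym
(lookup-synonym "abc")      ; not a symbol: NIL rather than an error from GET
(lookup-synonym 'sma-date)  ; a fresh list, not EQ to the stored one
```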
- Pat
Hi Patrick!
On Wed, 15 Sep 2004 13:27:23 -0400 (EDT), "Patrick O'Donnell" pao@ascent.com wrote:
> I'm doing development on Genera, and there were several issues with compiling cl-ppcre, mostly due to Genera not quite being ANSI compliant.
If you send a #+/#- patch to make compilation on Genera work I'll gladly integrate it.
> I finally got it to compile and ran the test. Some (21) of the tests failed. Most of the failures, I've noted, are based on assumptions about character codes that do not hold for Genera (e.g. test 432 -- line termination (char-code #\Return) -> #o215, not #o12.) The others I haven't thoroughly investigated. I believe that I'll be able to use the limited features I need, and if I run into trouble, I'll let you know.
I can't do anything about these assumptions because CL-PPCRE purports to be Perl-compatible but I'd be glad to add a note to the docs or the README file about these (expected) failures on Genera if you send details - preferably as a patch to the README file. (I've bought an Alpha and a copy of Open Genera some months ago but haven't yet had the time to install it - let alone to play with it. Sigh...) Let me know if there are any other problems.
> The feature that I'd like to see relates to your s-expression parse tree capability. I'd like to abstract sub-parse-trees.
Thanks. I've just released a new version which incorporates your patch. I've changed the API a little bit such that you also have a functional interface to these synonyms - see the docs. I've also wrapped the macro with EVAL-WHEN because it might break compiler macros otherwise.
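A minimal sketch of what a functional interface plus an EVAL-WHEN-wrapped macro could look like. It is an illustration under assumptions, not the released source: the definitions are written so that they load into any package, and the macro expands to a plain property-list SETF so that the sketch also works when the file is compiled. The EVAL-WHEN makes a synonym known at compile time as well as at load time, which is presumably what code that processes constant parse trees during compilation needs.

```lisp
;; Sketch only, not the released cl-ppcre code.
(defun parse-tree-synonym (symbol)
  "Return the parse tree SYMBOL stands for, or NIL if there is none."
  (get symbol 'parse-tree-synonym))

(defun (setf parse-tree-synonym) (new-parse-tree symbol)
  "Make SYMBOL a synonym for NEW-PARSE-TREE."
  (setf (get symbol 'parse-tree-synonym) new-parse-tree))

(defmacro define-parse-tree-synonym (name parse-tree)
  "Define NAME as a synonym for PARSE-TREE. Neither argument is evaluated.
The definition takes effect at compile time as well as at load time."
  `(eval-when (:compile-toplevel :load-toplevel :execute)
     (setf (get ',name 'parse-tree-synonym) ',parse-tree)))

;; Macro interface: arguments are quoted for you.
(define-parse-tree-synonym n :digit-class)

;; Functional interface: arguments are ordinary, evaluated forms.
(setf (parse-tree-synonym 'flight-number)
      (list :greedy-repetition 3 4 'n))
```

The functional form is the one to reach for when the synonym name or the tree is computed at run time; the macro is a convenience for top-level definitions.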
Cheers, Edi.
From: Edi Weitz edi@agharta.de
Date: Thu, 16 Sep 2004 10:53:04 +0200
> On Wed, 15 Sep 2004 13:27:23 -0400 (EDT), "Patrick O'Donnell" pao@ascent.com wrote:
> > I'm doing development on Genera, and there were several issues with
> > compiling cl-ppcre, mostly due to Genera not quite being ANSI
> > compliant.
> If you send a #+/#- patch to make compilation on Genera work I'll gladly integrate it.
OK. I'll have to go over it again to clean it up; some of the earlier conditionalizations I did were superseded by more comprehensive changes later. Quick precis:
user::in-package is still a function, so (in-package #:cl-ppcre) doesn't work. I just changed all those to :cl-ppcre, the keyword clutter being less ugly to me than #+/#- for all those.
In the package declaration, I :use'd FUTURE-COMMON-LISP instead of CL, which almost worked. I also had to shadowing import LAMBDA from CL. (Weird.)
Either Genera doesn't handle the simple-string type right or future-common-lisp:simple-string isn't fully implemented. I'll want to investigate that better to determine the best solution. As it is, I just conditionalized all the simple-string usages to string.
There were a couple other minor things.
I'll send a cleaned-up diff sometime when deadline pressure is relieved.
> I can't do anything about these assumptions because CL-PPCRE purports to be Perl-compatible but I'd be glad to add a note to the docs or the README file about these (expected) failures on Genera if you send details ...
OK. (For some of the tests, I'll still have to wrap my brain around the Perl syntax, to figure out what's going wrong, to see whether they are the char-code issue or something else!)
> (I've bought an Alpha and a copy of Open Genera some months ago but haven't yet had the time to install it - let alone to play with it. Sigh...)
I understand. I've had a 3650 in my basement for some years, now, and I still haven't time to set it up.
- Pat
On Thu, 16 Sep 2004 09:17:26 -0400 (EDT), "Patrick O'Donnell" pao@ascent.com wrote:
> I'll send a cleaned-up diff sometime when deadline pressure is relieved.
Cool, that'd be nice. Take your time.
Thanks, Edi.
Edi,
Date: Thu, 16 Sep 2004 22:49:51 +0200
From: Edi Weitz edi@agharta.de
> On Thu, 16 Sep 2004 09:17:26 -0400 (EDT), "Patrick O'Donnell" pao@ascent.com wrote:
> > I'll send a cleaned-up diff sometime when deadline pressure is
> > relieved.
> Cool, that'd be nice. Take your time.
I took a few moments to clean things up a bit. The diff is included, below.
Most of the porting problems were taken care of by judicious package manipulation.
The change in convert.lisp was because Genera had problems failing to grow the hash table when the rehash threshold was 1.0 for certain sizes of table.
In errors.lisp, Symbolics didn't get around to adding the :default-initargs option. Just commenting this out causes the errors to not print, but that didn't bother me, so I haven't spent time fixing it.
In optimize.lisp and lexer.lisp, Genera had problems with the string type declarations. I just diked them.
I moved the defpackage of cl-ppcre-test from ppcre-tests.lisp into packages.lisp. That way, Genera could correctly utilize the package specification in the file attribute list. I could see no downside.
- Pat
diff -r cl-ppcre-0.8.0/convert.lisp cl-ppcre/convert.lisp
118c118
< :rehash-threshold 1.0)
---
> :rehash-threshold #-genera 1.0 #+genera 0.99)
diff -r cl-ppcre-0.8.0/errors.lisp cl-ppcre/errors.lisp
1c1
< ;;; -*- Mode: LISP; Syntax: COMMON-LISP; Package: CL-PPCRE-LISP; Base: 10 -*-
---
> ;;; -*- Mode: LISP; Syntax: COMMON-LISP; Package: CL-PPCRE; Base: 10 -*-
44a45
> #-genera
diff -r cl-ppcre-0.8.0/lexer.lisp cl-ppcre/lexer.lisp
Warning: missing newline at end of file cl-ppcre-0.8.0/lexer.lisp
Warning: missing newline at end of file cl-ppcre/lexer.lisp
89c89
< (type string string))
---
> #-genera (type string string))
diff -r cl-ppcre-0.8.0/load.lisp cl-ppcre/load.lisp
30c30
< (in-package #:cl-user)
---
> (in-package :cl-user)
diff -r cl-ppcre-0.8.0/optimize.lisp cl-ppcre/optimize.lisp
Warning: missing newline at end of file cl-ppcre-0.8.0/optimize.lisp
Warning: missing newline at end of file cl-ppcre/optimize.lisp
48c48
< (declare (type string string))
---
> #-genera (declare (type string string))
54c54
< (declare (type string string))
---
> #-genera (declare (type string string))
diff -r cl-ppcre-0.8.0/packages.lisp cl-ppcre/packages.lisp
30c30
< (in-package #:cl-user)
---
> (in-package :cl-user)
35c35,36
<   (:use #:cl)
---
>   #+genera (:shadowing-import-from #:common-lisp #:lambda #:simple-string #:string)
>   (:use #-genera #:cl #+genera #:future-common-lisp)
92a94,105
>
> #-:cormanlisp
> (defpackage #:cl-ppcre-test
>   #+genera (:shadowing-import-from #:common-lisp #:lambda)
>   (:use #-genera #:cl #+genera #:future-common-lisp #:cl-ppcre)
>   (:export #:test))
>
> #+:cormanlisp
> (defpackage "CL-PPCRE-TEST"
>   (:use "CL" "CL-PPCRE")
>   (:export "TEST"))
>
diff -r cl-ppcre-0.8.0/ppcre-tests.lisp cl-ppcre/ppcre-tests.lisp
30,41d29
< (in-package #:cl-user)
<
< #-:cormanlisp
< (defpackage #:cl-ppcre-test
<   (:use #:cl #:cl-ppcre)
<   (:export #:test))
<
< #+:cormanlisp
< (defpackage "CL-PPCRE-TEST"
<   (:use "CL" "CL-PPCRE")
<   (:export "TEST"))
<
154c142
< :type nil :version nil
---
> :type #-genera nil #+genera :unspecific :version nil
diff -r cl-ppcre-0.8.0/regex-class.lisp cl-ppcre/regex-class.lisp
Warning: missing newline at end of file cl-ppcre-0.8.0/regex-class.lisp
Warning: missing newline at end of file cl-ppcre/regex-class.lisp
35a36,40
>
> ;;; Genera need the eval-when, here, or the types created by the
> ;;; class definitions aren't seen by the typep calls later in the
> ;;; file.
> (eval-when (:compile-toplevel :load-toplevel :execute)
238a244,245
>
> );;; End eval-when
Hi Patrick!
On Wed, 29 Sep 2004 13:03:14 -0400 (EDT), "Patrick O'Donnell" pao@ascent.com wrote:
> I took a few moments to clean things up a bit. The diff is included, below.
Cool, thanks! I'll release a new version as soon as possible. Two little questions:
1. The abstract lists the implementations CL-PPCRE is known to work with. What am I supposed to add in this case? Something like "Genera (version x.y on Symbolics LispMachine ZZZ)", i.e. what's the official name and version number of the Lisp implementation and the OS you're using.
2. You mentioned there are a couple of issues with the tests due to different character encodings. Is there a short sentence I could add to the README file like: "Note that some tests will fail on Genera because characters like ... have encodings which differ from Perl's expectations?"
Thanks again, Edi.
From: Edi Weitz edi@agharta.de
Date: Wed, 29 Sep 2004 20:04:04 +0200
> Two little questions:
> 1. The abstract lists the implementations CL-PPCRE is known to work with. What am I supposed to add in this case? Something like "Genera (version x.y on Symbolics LispMachine ZZZ)", i.e. what's the official name and version number of the Lisp implementation and the OS you're using.
Genera 8.5.
> 2. You mentioned there are a couple of issues with the tests due to different character encodings. Is there a short sentence I could add to the README file like: "Note that some tests will fail on Genera because characters like ... have encodings which differ from Perl's expectations?"
Return, Linefeed, and Tab. There are others, such as Back-Space and Page, but I don't think they appeared in the tests.
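For anyone wanting to check a given Lisp before running the tests, something along these lines prints the codes in question in octal. It is only a sketch: CHAR-CODE-REPORT is a made-up helper name, and it goes through NAME-CHAR so that a character name an implementation doesn't know is skipped rather than breaking the reader.

```lisp
;; Sketch only: CHAR-CODE-REPORT is a hypothetical helper, not part of cl-ppcre.
(defun char-code-report (names)
  "Return a list of (NAME . CODE) pairs for each name in NAMES that this
Lisp recognizes as a character name."
  (loop for name in names
        for char = (name-char name)   ; NIL if the name is unknown here
        when char
          collect (cons name (char-code char))))

;; Print each code in octal, the notation used in the test failures.
(dolist (entry (char-code-report '("Return" "Linefeed" "Tab" "Backspace" "Page")))
  (format t "~A -> #o~O~%" (car entry) (cdr entry)))
```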
You also should mention the issue with ppcre's errors -- that incomplete ANSI compatibility in Genera means that attempts to print the errors will fail.
- Pat
cl-ppcre-devel@common-lisp.net