==> Here is the trouble: how do I make the match abort when position 17 is reached? From there on, the filter will always return nil, so the last 30 calls are wasted time.
Well, this is Common Lisp...
CL-USER> (defvar *my-string* "line1 word1 word2 line2 word1 word2 line3 word1 word2")
*MY-STRING*
CL-USER> (defvar *my-scanner*
           '(:sequence
             (:filter my-filter 0)
             :word-boundary
             (:greedy-repetition 1 nil :word-char-class)
             :word-boundary))
*MY-SCANNER*
CL-USER> (let ((end-of-first-line 17))
           (defun my-filter (pos)
             (format t "Called at: ~A~%" pos)
             (cond ((< pos end-of-first-line) pos)
                   (t (throw 'stop-it nil)))))
; Converted MY-FILTER.
MY-FILTER
CL-USER> (catch 'stop-it
           (scan *my-scanner* *my-string*))
Called at: 0
0
5
#()
#()
CL-USER> (setf *my-scanner*
               '(:sequence
                 (:filter my-filter 0)
                 :word-boundary
                 "line2"
                 (:greedy-repetition 1 nil :word-char-class)
                 :word-boundary))
(:SEQUENCE (:FILTER MY-FILTER 0) :WORD-BOUNDARY "line2" (:GREEDY-REPETITION 1 NIL :WORD-CHAR-CLASS) :WORD-BOUNDARY)
CL-USER> (catch 'stop-it
           (scan *my-scanner* *my-string*))
Called at: 0
Called at: 1
Called at: 2
Called at: 3
Called at: 4
Called at: 5
Called at: 6
Called at: 7
Called at: 8
Called at: 9
Called at: 10
Called at: 11
Called at: 12
Called at: 13
Called at: 14
Called at: 15
Called at: 16
Called at: 17
NIL
Throw and catch, of course. I'm just not very familiar with this kind of big jump. I should be!
I think the loop I'm speaking about is created by "insert-advance-fn".
Yes. It's the normal loop that advances through the regular expression.
Last point: I can't access the position where the match actually started (the first of the four values returned by scan), so I have no way to extract the current global match without using a register.
Sure you can:
CL-USER> (let (match-start)
           (defun set-match-start (pos)
             (setq match-start pos))
           (defun show-match-start (pos)
             (format t "Match start is ~A, pos is ~A~%" match-start pos)
             pos))
; Converted SET-MATCH-START.
; Converted SHOW-MATCH-START.
SHOW-MATCH-START
CL-USER> (setf *my-scanner*
               '(:sequence
                 (:filter set-match-start 0)
                 "abc"
                 (:filter show-match-start 0)
                 (:alternation #\x #\y)))
(:SEQUENCE (:FILTER SET-MATCH-START 0) "abc" (:FILTER SHOW-MATCH-START 0) (:ALTERNATION #\x #\y))
CL-USER> (scan *my-scanner* "abczabcabcx")
Match start is 0, pos is 3
Match start is 4, pos is 7
Match start is 7, pos is 10
7
11
#()
#()
Just make sure SET-MATCH-START is at the very beginning of your regular expression and not within a group or alternation or somesuch.
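Building on the same idea, here is a minimal sketch of how the text matched so far could be extracted once the match start is known. MAKE-MATCH-TRACKER is a hypothetical helper, not part of CL-PPCRE; it closes over the target string so that the second filter can call SUBSEQ on it. The filters are called by hand here so that the sketch runs without the library.

```lisp
;; Hypothetical helper (not part of CL-PPCRE): returns two filter
;; functions that share the match start and the target string.
(defun make-match-tracker (target)
  (let ((match-start 0))
    (values
     ;; First filter: remember where this match attempt started.
     ;; SETQ returns POS, so the filter succeeds without consuming input.
     (lambda (pos)
       (setq match-start pos))
     ;; Second filter: print the text matched so far, then succeed.
     (lambda (pos)
       (format t "Match so far: ~S~%" (subseq target match-start pos))
       pos))))

;; Simulate what the scanner would do for an attempt starting at 4
;; that has consumed "abc" and now stands at position 7.
(multiple-value-bind (set-start show-match)
    (make-match-tracker "abczabcabcx")
  (funcall set-start 4)
  (funcall show-match 7))
```

To use this in a real scan, the two closures would have to be spliced into the parse tree with backquote, for example `(:filter ,set-start 0)` in place of `(:filter set-match-start 0)`. That assumes the :FILTER form accepts function objects as well as symbols, so check this against your version of the library first.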
It just adds a little work to craft the parse tree, but that's OK. It seems that filters are really powerful!
I've got everything I need for now. I will try all that and give you some feedback when it's done in a few days.
Finally, I just want to thank you very much, Edi, for all your help & work.
Cheers, Sebastien.