
Chris Dean <ctdean@sokitomi.com> writes:
(cl-ppcre:scan "^\\[([^ ]{1,})+[ ]*(.{1,})?\\]" "foo [[Main]] [http://baz]''bold'''''''bar''''" :start 13)
This is a regular expression that does lots of backtracking when it fails. If you change that you'll most likely see a large performance improvement.
A small change is to simplify the first grouping:
"^\\[([^ ]{1,})[ ]*(.{1,})?\\]"
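The reason that having :start is so much slower is that the regex is then matched against a different part of the string. Without :start, the ^ anchor is tried at the beginning of the string, where the first character is `f' rather than `[', so the match fails almost immediately and needs far less backtracking than with :start.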
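To see the difference for yourself, something like the following should work at the REPL. It is only a sketch: it assumes CL-PPCRE is already loaded (for instance through Quicklisp), and the names *text* and *link-regex* are made up for this example. It uses the simplified pattern from above.

```lisp
;; Assumption: CL-PPCRE is already loaded, e.g. (ql:quickload :cl-ppcre).
;; *TEXT* and *LINK-REGEX* are hypothetical names used only in this sketch.
(defparameter *text* "foo [[Main]] [http://baz]''bold'''''''bar''''")

;; The simplified pattern: the first group is no longer wrapped in an outer +.
(defparameter *link-regex* "^\\[([^ ]{1,})[ ]*(.{1,})?\\]")

;; Without :start the anchor is tried at index 0, where the character
;; is #\f, so the scan gives up right away and returns NIL.
(cl-ppcre:scan *link-regex* *text*)

;; With :start 13 the scan begins at the #\[ of "[http://baz]", so the
;; greedy groups first run to the end of the string and then back off
;; until the closing #\] can match.
(cl-ppcre:scan-to-strings *link-regex* *text* :start 13)
```

SCAN-TO-STRINGS returns the matched substring and a vector of register contents as two values, so the second call also shows what each group captured.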
Next time, how can I tell when a regex will need that much backtracking? I'd really appreciate it if you could explain the pattern a little more. By the way, what I'm trying to do is parse string patterns like `[href]' and `[href text]'. As you can tell from the :START 13 keyword, I've already determined that there is a `[' at position 13. Can you suggest any other method to parse such strings more efficiently? Regards.