Re: [cl-ppcre-devel] report a bug

On Jun 25, 9:21 pm, Edi Weitz <e...@agharta.de> wrote:
The question is - what do you want to achieve with this regular expression? Can't you write it in a simpler way?
Isn't this pattern pretty useful in general:
A@B
where A and B are word characters and @ is a specific non-word character?
How else could we specify it? [a-zA-Z0-9] doesn't seem acceptable to me since it relies on the Latin alphabet...
Leslie
--
http://www.linkedin.com/in/polzer

On Fri, Jun 26, 2009 at 15:10, Leslie P. Polzer<sky@viridian-project.de> wrote:
On Jun 25, 9:21 pm, Edi Weitz <e...@agharta.de> wrote:
The question is - what do you want to achieve with this regular expression? Can't you write it in a simpler way?
Isn't this pattern pretty useful in general:
A@B
where A and B are word characters and @ is a specific non-word character?
Sure, but the original bug report was about this:
(\\w+)*\\@\\w+
I can't make any sense of this regular expression, but maybe it is because I am lacking some skills. Maybe Wu can explain what he wants to achieve with it?
-Hans

Very sorry, it was a typo. :(
It should be:
(cl-ppcre:scan (cl-ppcre:create-scanner "(_\\w+)*\\@\\w+") "______________________________________" :start 0)
but the other examples show the idea accurately.
On 6/26/09, Hans Hübner <hans.huebner@gmail.com> wrote:
On Fri, Jun 26, 2009 at 15:10, Leslie P. Polzer<sky@viridian-project.de> wrote:
On Jun 25, 9:21 pm, Edi Weitz <e...@agharta.de> wrote:
The question is - what do you want to achieve with this regular expression? Can't you write it in a simpler way?
Isn't this pattern pretty useful in general:
A@B
where A and B are word characters and @ is a specific non-word character?
Sure, but the original bug report was about this:
(\\w+)*\\@\\w+
I can't make any sense of this regular expression, but maybe it is because I am lacking some skills. Maybe Wu can explain what he wants to achieve with it?
-Hans
_______________________________________________ cl-ppcre-devel site list cl-ppcre-devel@common-lisp.net http://common-lisp.net/mailman/listinfo/cl-ppcre-devel
-- 片云天共远永夜月同孤

Xiangjun Wu wrote:
Very sorry, it is a typo, :(
It should be:
(cl-ppcre:scan (cl-ppcre:create-scanner "(_\\w+)*\\@\\w+") "______________________________________" :start 0)
but other examples indicate the accurate idea.
Looking at this I'm not sure what this is good for. Why would we want to match strings of the form _xxx@xxx in a full-text indexer? Perhaps it would be best to get rid of the whole messy regex (of which this is only a small part) and write a new documented one from scratch. Or use a custom state-based tokenizer.

Xiangjun Wu <netawater@gmail.com> writes:
(cl-ppcre:scan (cl-ppcre:create-scanner "(_\\w+)*\\@\\w+") "______________________________________" :start 0)
Perhaps
(cl-ppcre:create-scanner "(_[_\\w]+)?@\\w+")
will work for your app? The problem in the original expression is that the "+" followed by the "*" can lead to a combinatorial explosion.
If you loosen the requirement that all non-zero matches in the first expression must begin with an "_" you could have:
(cl-ppcre:create-scanner "[_\\w]*@\\w+")
Cheers, Chris Dean
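
A minimal sketch of the difference between the patterns, assuming CL-PPCRE is already loaded (for example through Quicklisp) and using a 30-character run of underscores as a stand-in for the string in the report. Because \w also matches "_", the original "(_\\w+)*" can divide such a run among the repetitions of the group in exponentially many ways before the missing "@" makes the whole match fail. The two suggested scanners have no repetition nested inside another repetition, so they should give up after a polynomial amount of backtracking. The variable names are illustrative only and do not come from the thread.

```lisp
;; Assumption: CL-PPCRE is loaded, e.g. with (ql:quickload :cl-ppcre).
;; The variable names below are illustrative, not from the thread.
(defparameter *strict-scanner*
  (cl-ppcre:create-scanner "(_[_\\w]+)?@\\w+"))
(defparameter *loose-scanner*
  (cl-ppcre:create-scanner "[_\\w]*@\\w+"))

;; Stand-in for the input in the report: underscores only, no "@".
(defparameter *underscores*
  (make-string 30 :initial-element #\_))

;; Neither scanner can match without an "@", so both calls return NIL,
;; and neither has the nested "+" inside "*" of "(_\\w+)*".
(cl-ppcre:scan *strict-scanner* *underscores*)
(cl-ppcre:scan *loose-scanner* *underscores*)

;; With an "@" present, the first two return values are the start and
;; end positions of the match.
(cl-ppcre:scan *strict-scanner* "_foo@bar")
(cl-ppcre:scan *loose-scanner* "foo@bar")
```

Building each scanner once with create-scanner and reusing it avoids re-parsing the regex string on every call, which matters if the pattern is applied to many tokens.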

On Sat, Jun 27, 2009 at 2:17 AM, Chris Dean <ctdean@sokitomi.com> wrote:
Xiangjun Wu <netawater@gmail.com> writes:
(cl-ppcre:scan (cl-ppcre:create-scanner "(_\\w+)*\\@\\w+") "______________________________________" :start 0)
Perhaps
(cl-ppcre:create-scanner "(_[_\\w]+)?@\\w+")
will work for your app? The problem in the original expression is the "+" followed by the "*" can lead to a combinatorial explosion.
If you loosen the requirement that all non-zero matches in the first expression must begin with an "_" you could have:
(cl-ppcre:create-scanner "[_\\w]*@\\w+")
Cheers, Chris Dean
Thank you, it works for our application.
participants (4)
- Chris Dean
- Hans Hübner
- Leslie P. Polzer
- Xiangjun Wu