[thelist] Reg Ex - everything except a phrase

Elankath, Tarun (Cognizant) ETarun at blr.cognizant.com
Thu Sep 11 02:36:36 CDT 2003


Hi,

How about the piece of Python code below, which uses a negative lookahead?

import re
r = re.compile(r"^(?:(?!\bFred\b).)*$")
a = "Fred is a nice guy"
b = "Freddy is a nice guy"
c = "a nice guy is Fred"
d = "who ? Fred ? Oh he's a nice guy"
e = "Who ? AFreddy ? On he's a sissy"
r.search(a) # doesn't match [search is similar to Perl's =~ operator, I believe]
r.search(b) # matches
r.search(c) # doesn't match
r.search(d) # doesn't match
r.search(e) # matches.


I use word boundaries (\b) around Fred so that Freddy/AFreddy are not rejected.
A negative lookahead doesn't consume any of the input string, so we need to step through the string one character at a time, re-checking the lookahead before each character, hence the "."
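
To make the character-by-character behaviour easier to see, here is a small sketch that writes the same pattern with re.VERBOSE and wraps it in a helper. The names NO_FRED and lacks_word_fred are made up for illustration; they are not part of the code above.

```python
import re

# Same pattern as above, written with re.VERBOSE so each piece can be commented.
NO_FRED = re.compile(
    r"""
    ^                  # start of the string
    (?:                # group that is repeated once per character
        (?!\bFred\b)   # fail if the whole word Fred starts here (consumes nothing)
        .              # otherwise consume one character (never a newline)
    )*
    $                  # the repetition must reach the end of the string
    """,
    re.VERBOSE,
)


def lacks_word_fred(text):
    """Return True when the whole word 'Fred' does not occur in text (hypothetical helper)."""
    return NO_FRED.search(text) is not None


for s in ["Fred is a nice guy", "Freddy is a nice guy", "a nice guy is Fred"]:
    print(repr(s), lacks_word_fred(s))
```

The helper only turns the match object into a boolean; the regex itself is unchanged.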

I do think, however, that this regex is _very_ slow. It would be better to run a plain match for the phrase and negate the result with the ! operator (or !~ in Perl).
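
One reading of the negation idea is to search for the word normally and invert the result. A minimal Python sketch under that assumption follows; HAS_FRED and lacks_fred are hypothetical names, and Python spells the negation as "is None" or "not" rather than "!".

```python
import re

# Plain search for the whole word; no lookahead is needed.
HAS_FRED = re.compile(r"\bFred\b")


def lacks_fred(text):
    """Return True when the whole word 'Fred' is absent from text (hypothetical helper)."""
    return HAS_FRED.search(text) is None


samples = [
    "Fred is a nice guy",
    "Freddy is a nice guy",
    "a nice guy is Fred",
]
for s in samples:
    print(repr(s), lacks_fred(s))
```

This gives the same answers as the lookahead version on the five test strings, and the pattern is much easier to read.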

Will this work for all cases?
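
Two cases that seem worth checking are multi-line input and letter case. The sketch below is only a probe, not a complete answer: by default "." does not match a newline, so a multi-line string without Fred fails to match unless re.DOTALL is used, and the pattern is case-sensitive. The variable names are illustrative.

```python
import re

pattern = r"^(?:(?!\bFred\b).)*$"
default = re.compile(pattern)
dotall = re.compile(pattern, re.DOTALL)  # lets "." cross newlines

two_lines = "a nice guy\nno one else here"  # no Fred on either line

# Without DOTALL the repetition stops at the newline and "$" cannot be reached.
no_flag_result = default.search(two_lines) is not None
# With DOTALL the whole string is scanned.
dotall_result = dotall.search(two_lines) is not None
# Fred on the second line is still rejected with DOTALL.
dotall_with_fred = dotall.search("a nice guy\nis Fred") is not None
# Lowercase "fred" is not the same word to this pattern.
lowercase_result = default.search("fred is a nice guy") is not None

print(no_flag_result, dotall_result, dotall_with_fred, lowercase_result)
```

If case should not matter, re.IGNORECASE can be combined with re.DOTALL.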

Regards,
Tarun




-----Original Message-----
From: Kelly Hallman [mailto:khallman at wrack.org]
Sent: Wednesday, September 10, 2003 11:41 PM
To: thelist at lists.evolt.org
Subject: Re: [thelist] Reg Ex - everything except a phrase


On Wed, 10 Sep 2003, Joshua Olson wrote:
> You da man!  I'm not familiar with Perl syntax... could you do me a favor
> and explain what this line does:
> 
> $d =~ s/^(.*)(?:fred)(.*)$/$1$2/ while $d =~ /fred/; print $d, "\n";

(?:xxxxx) is a non-capturing group: it groups like ordinary parentheses
but does not store what it matched. (A positive lookahead assertion is
written (?=xxxxx).) These are extended regex features that you may not
find in regex engines that aren't Perl-compatible (PCRE, as used by the
preg_ functions in PHP).

Originally I tried to use a lookahead assertion, but I couldn't figure out one
that did what was requested, which was to match anything that did not
contain a certain sequence of characters. Indeed, the above regex matches
the sequence of characters, which was not the problem. No need for a
lookahead assertion if you simply want to remove a sequence of characters.
As someone said, the regex above is equivalent to $d =~ s/fred//g;
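
For readers who don't use Perl, here is a rough Python rendering of the two forms, as a sketch only. The function names are hypothetical, and the early exit in the loop is an addition so the example cannot spin forever on input containing newlines.

```python
import re


def remove_by_loop(text):
    """Mimic the Perl while-loop: strip one 'fred' per pass until none is left."""
    while "fred" in text:
        text, n = re.subn(r"^(.*)(?:fred)(.*)$", r"\1\2", text, count=1)
        if n == 0:  # "." cannot cross a newline, so stop rather than loop forever
            break
    return text


def remove_globally(text):
    """Mimic s/fred//g: remove every non-overlapping 'fred' in a single pass."""
    return re.sub(r"fred", "", text)


sample = "fred and fred went home"
print(repr(remove_by_loop(sample)))
print(repr(remove_globally(sample)))
```

On ordinary input such as the sample the two agree. They can differ when removing one occurrence creates a new one: for "frfreded" the single global pass leaves "fred" behind, while the loop keeps going until nothing is left.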

There is also a negative lookahead assertion (?!xxxxx) which "matches if
the expression wouldn't match xxxxx next". I also tried that, and couldn't
get the results I wanted...but it may contain a key to this problem...
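
As a small illustration of what a negative lookahead does at a single position, here is a Python sketch (Python's re module uses the same (?!...) syntax). The pattern and sample string are made up for this example.

```python
import re

# Match "Fred" only when it is NOT immediately followed by "dy".
not_freddy = re.compile(r"Fred(?!dy)")

text = "Freddy and Fred"
m = not_freddy.search(text)

# The first "Fred" (inside "Freddy") is skipped; the second one matches.
# The lookahead consumes nothing, so the match is just the four letters.
print(m.start(), m.end(), m.group())
```

Because the assertion only checks one position, rejecting a phrase anywhere in a string means applying it at every position, or simply negating an ordinary match.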

-- 
Kelly Hallman
http://wrack.org/
