[thelist] Reg Ex - everything except a phrase

Sam sam at sam-i-am.com
Wed Sep 10 10:27:05 CDT 2003


> $d =~ s/^(.*)(?:fred)(.*)$/$1$2/ while $d =~ /fred/; print $d, "\n";

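As long as the string $d contains fred,
	in $d, match everything up to the last fred (the leading .* is 
greedy), then fred itself, then whatever follows.
	Replace the whole string with the text before and after that fred, 
which drops the fred.
then print out $d.

I've not tried this, but I think it can still get stuck in a loop. The 
substitution itself is fine: (?:fred) is non-capturing, so $1 and $2 are 
the text before and after fred, fred is not put back, and there is no $3 
to use instead. The catch is that . does not match a newline without the 
s modifier, so if $d spans more than one line the anchored pattern can 
fail to match while $d =~ /fred/ stays true, and then the loop never ends.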
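
A quick way to check what the two groups actually hold is to run the 
match once on a sample string. This is only a sketch; the sub name 
split_on_fred and the sample string are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns ($1, $2) from the pattern in the quoted
# one-liner, or an empty list if the pattern does not match.
sub split_on_fred {
    my ($d) = @_;
    return $d =~ /^(.*)(?:fred)(.*)$/ ? ($1, $2) : ();
}

# (?:fred) is non-capturing, so $2 is the text after fred, and the
# greedy leading .* means the LAST fred is the one that gets consumed.
my @parts = split_on_fred("a fred b fred c");
print "[$parts[0]] [$parts[1]]\n";    # [a fred b ] [ c]
```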

Translating the perlisms, you get

while($d matches in /fred/) {
   replace $d with 1st and 2nd sub matches from /^(.*)(?:fred)(.*)$/
}
print "$d\n"
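
For reference, the pseudocode can be written as runnable Perl roughly as 
follows. This is a sketch: the sub name remove_fred and the $max pass cap 
are my own additions, the cap being there so that a string on which the 
substitution cannot match (one with an embedded newline, for instance) 
does not spin forever.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the quoted loop. $max caps the number of
# passes; returns the cleaned string, or undef if the cap was reached.
sub remove_fred {
    my ($d, $max) = @_;
    $max = 100 unless defined $max;
    while ($d =~ /fred/) {
        return undef if $max-- <= 0;
        $d =~ s/^(.*)(?:fred)(.*)$/$1$2/;
    }
    return $d;
}

print remove_fred("a fred b fred c"), "\n";    # a  b  c

# . does not match "\n" here, so the s/// never matches and the cap
# is what stops the loop.
my $r = remove_fred("a fred\nb");
print defined $r ? "done" : "gave up", "\n";   # gave up
```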

I still think the iterator might be wrong here. This should do it?
# start perl to remove fred
$str = "My name is fred.\nFred for short.";

while ($str =~ /fred/i) {
	# no 'g' modifier on the test: the s/// below modifies $str, which
	# resets the match position, so every test starts from the beginning

	$str =~ s/fred//i;	# remove the first remaining fred (or Fred)
}
print $str;
# end
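
For comparison, here is a sketch of the loop next to a single global 
substitution. The sub names strip_loop and strip_global are made up for 
illustration.

```perl
use strict;
use warnings;

# Loop form: test for fred, remove one occurrence, repeat.
sub strip_loop {
    my ($str) = @_;
    while ($str =~ /fred/i) {
        $str =~ s/fred//i;
    }
    return $str;
}

# Single-pass form: the 'g' modifier replaces every match in one go.
sub strip_global {
    my ($str) = @_;
    $str =~ s/fred//gi;
    return $str;
}

my $in = "My name is fred.\nFred for short.";
print strip_loop($in) eq strip_global($in) ? "same\n" : "different\n";
print strip_global($in), "\n";
```

The two agree on this input. They can differ when removing one fred 
creates another: on "frfreded" the loop keeps going until nothing is 
left, while the single pass stops at "fred".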


This reminds me of the problem of matching and stripping block 
comments (e.g. /* comment here
over 2 lines */)

For which Jeffrey Friedl recommends:
# strip C style comments

undef $/;
$_ = join('', <>);
s{
	# first we'll list things we want to match, but not throw away
	(
		  [^"'/]+			# other stuff
		|					# or
		  " (\\.|[^"\\])* "	# double quoted string
		|					# or
		  ' (\\.|[^'\\])* '	# single quoted string
	)
| 	# OR...
	# we'll match a comment. Since it's not in the $1 parentheses above,
	# the comments will disappear when we use $1 as the replacement text
	/\* .*? \*/				# Traditional C comments
	| 						# or
	//[^\n]*				# C++ // style comments
}{$1}gsx;
print;

From Mastering Regular Expressions, p293 (1st edition).
This handles cases where /* and */ sequences appear inside quoted 
strings and must be left alone. (I rekeyed it from the book, but I tested 
it and it seems to work.)
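
To try it on a string rather than on <>, something like the sketch below 
should do. The sub name strip_comments and the sample input are made up. 
I have also made two changes of my own so it runs cleanly under use 
warnings: the inner groups are non-capturing, and the replacement uses 
the e modifier to substitute an empty string when $1 is undefined (that 
is, when a comment matched).

```perl
use strict;
use warnings;

# Hypothetical wrapper: the same idea as the pattern above, applied to
# a string passed in instead of to everything read from <>.
sub strip_comments {
    my ($src) = @_;
    $src =~ s{
        (
              [^"'/]+               # other stuff
            | " (?:\\.|[^"\\])* "   # double quoted string
            | ' (?:\\.|[^'\\])* '   # single quoted string
        )
      |
        /\* .*? \*/                 # traditional C comment
      |
        //[^\n]*                    # C++ style comment
    }{ defined $1 ? $1 : '' }gsex;
    return $src;
}

# Sample input: one real block comment, one comment-like sequence
# inside a quoted string, and one // comment.
my $code = qq{int x = 1; /* set x */\nchar *s = "/* not a comment */"; // trailing\n};
print strip_comments($code);
```

The block comment and the // comment go away; the quoted string keeps 
its /* and */ untouched.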

Sam


