[thelist] Regular Expression Question

Beau Hartshorne beau at hartshorne.ca
Sat Nov 8 20:19:39 CST 2003


I am trying to write a regular expression (using PHP's preg_replace())
that will take a string like this:

She said &quot;hello, world&quot; [<a target=&quot;_new&quot;
href=&quot;link.html&quot;>1</a>]

And turn it into a string like this:

She said &quot;hello, world&quot; [<a target="_new"
href="link.html">1</a>]

Basically, I only want to convert the &quot; entities that sit between
< and >. One expression I've come up with would work if there were only
ever one attribute per tag, but sometimes there are several, as in an
img tag. The closest I've come is this:

$pattern = '/=&quot;(((?!&quot;).)*)&quot;/i';
$replacement = '="\\1"';
$string = preg_replace($pattern,$replacement,$string);

This will only match =&quot;anytext&quot; (note the equals sign). It
will match all of the HTML attributes but will probably not match
anything else. I guess I have two questions:

1. Is an unencoded quote (") OK to use in HTML text outside of a tag?
(If so, I'll just use
$string = str_replace('&quot;', '"', $string); instead.)

2. Can anyone tell me how to rewrite the regex so that it *only*
changes the &quot;s that sit inside <>?
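
For question 2, one possible approach is to match each whole tag first
and then replace the entities only inside that match. The following is
a minimal sketch, not a tested solution: the function name
decode_quotes_in_tags is made up for illustration, it assumes a PHP
version that supports anonymous functions (5.3 or later), and it treats
a "tag" as anything from < up to the next >, so a literal > inside an
attribute value would cut the match short.

```php
<?php
// Hypothetical helper: decode &quot; only inside <...>, leaving the
// text between tags untouched.
function decode_quotes_in_tags($string)
{
    return preg_replace_callback(
        '/<[^>]*>/',                 // assumed tag shape: '<', non-'>' chars, '>'
        function ($match) {
            // $match[0] is the whole tag, e.g. <a target=&quot;_new&quot; ...>
            return str_replace('&quot;', '"', $match[0]);
        },
        $string
    );
}

$string = 'She said &quot;hello, world&quot; [<a target=&quot;_new&quot; href=&quot;link.html&quot;>1</a>]';
echo decode_quotes_in_tags($string), "\n";
// prints: She said &quot;hello, world&quot; [<a target="_new" href="link.html">1</a>]
```

Because the replacement runs on the whole tag, it should not matter how
many attributes the tag has.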

Thanks!

Beau


