[thelist] howto RegularExpression in PHP
Geoffrey Sneddon
foolistbar at googlemail.com
Mon Jan 8 10:14:41 CST 2007
On 8 Jan 2007, at 07:12, Max Schwanekamp wrote:
> S. F. Alim wrote:
>> I need help with this function `eregi_replace()` in PHP 4.4.4. Well,
>> actually I need help forming a proper regex so I can remove the
>> `<img . />` image tag from HTML which is coming from a database. All
>> I want is to remove this tag and all its attributes.
>
> If you're using regex in PHP, you're better off using the PCRE library
> (preg_match() and friends). The POSIX Extended regex functions
> (eregi...()) are slower and less useful.
>
> Using preg_replace(), this should do it:
>
> $str = 'text text <img src="evil.gif" /> text text';
> echo preg_replace('/<img[^>]*>/iU','',$str);
> //outputs text text text text
There's a bug in that just shouting at me. Try running it on:
// and yes, that is valid HTML
$str = 'text text <img title=" /> " src="evil.gif" /> text text';
echo preg_replace('/<img[^>]*>/iU','',$str);
// outputs text text " src="evil.gif" /> text text
For HTML, you need something like…
echo preg_replace('/<img((\s*(([^\s:]+:)?[^\s:]+)(\s*=\s*("([^"]*)"|'
    . '\'([^\']*)\'|([a-z0-9\-._:]*)))?)*)\s*(\/)?>/i', '', $str);
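One caveat with the HTML pattern above: the attribute-name class [^\s:]+ also accepts >, = and quote characters, so when more markup follows the tag with no whitespace in between, the match can run on past the closing >. The sketch below shows that on one input and compares it with a different, simpler pattern that only skips over quoted strings and does not try to validate attribute syntax. The helper name strip_img_tags() and the test strings are made up for illustration; treat this as a rough sketch rather than a vetted replacement.

```php
<?php
// HTML pattern from the message above, split only for line length.
$original = '/<img((\s*(([^\s:]+:)?[^\s:]+)(\s*=\s*("([^"]*)"|'
    . '\'([^\']*)\'|([a-z0-9\-._:]*)))?)*)\s*(\/)?>/i';

// Hypothetical helper (not from the original message). After "<img" and a
// word boundary it consumes, one step at a time, either a character that
// is neither a quote nor ">", or a whole quoted string, and it stops at
// the first ">" that sits outside quotes.
function strip_img_tags($html)
{
    $simple = '/<img\b(?:[^>"\']|"[^"]*"|\'[^\']*\')*>/i';
    return preg_replace($simple, '', $html);
}

$tricky   = 'text text <img title=" /> " src="evil.gif" /> text text';
$adjacent = '<img src="a.gif"><b>bold</b>';

// Both patterns cope with ">" inside a quoted attribute value.
var_dump(preg_replace($original, '', $tricky) === strip_img_tags($tricky));

// With no whitespace after the tag, the original pattern also swallows
// the <b> element that follows; the simpler one leaves it in place.
var_dump(preg_replace($original, '', $adjacent));
var_dump(strip_img_tags($adjacent));
```

The simpler pattern trades strictness for predictability: it will happily remove an <img> tag with nonsense attributes, but each character can only be consumed one way, so it cannot overshoot the closing > or backtrack heavily on long input.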
For XML, you can go with something shorter like…
echo preg_replace('/<img((\s*(([^\s:]+:)?[^\s:]+)\s*=\s*("([^"]*)"|'
    . '\'([^\']*)\'))*)\s*\/>/', '', $str);
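To make the difference concrete, here is a small sketch of what the XML pattern above accepts and what it leaves alone: it is case-sensitive, it insists on quoted attribute values, and it insists on the closing />. The sample strings are made up for illustration.

```php
<?php
// XML pattern from the message above, split only for line length.
$xml_pattern = '/<img((\s*(([^\s:]+:)?[^\s:]+)\s*=\s*("([^"]*)"|'
    . '\'([^\']*)\'))*)\s*\/>/';

$samples = array(
    '<img src="evil.gif" />',              // well-formed: removed
    "<img xml:lang='en' src=\"a.png\"/>",  // prefixed name, mixed quotes: removed
    '<IMG SRC="evil.gif" />',              // upper case: kept
    '<img src=evil.gif />',                // unquoted value: kept
    '<img src="evil.gif">',                // not self-closed: kept
);

foreach ($samples as $sample) {
    // Brackets make an empty result visible in the output.
    echo '[', preg_replace($xml_pattern, '', $sample), "]\n";
}
```

The first two lines of output are empty brackets; the last three samples come back unchanged, which is why the shorter pattern is only safe on input already known to be well-formed XML.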
If anyone finds any bugs in either, please don't hesitate to let me
know.
- Geoffrey Sneddon