[thelist] php preg_replace

Robert Vreeland vreeland at studioframework.com
Thu Oct 2 16:06:42 CDT 2003


I'm working on a PHP-based HTML parser that takes IE JScript outerHTML as its input. Regardless of how well formed the HTML document is, IE drops a lot of the 'optional' closing tags. I'm using preg_replace to re-insert the closing tags, but for some reason it only replaces every other instance. My current workaround is to run preg_replace twice. Can anyone figure out what is wrong with my regex? I've included some example output.
The preg_replace code:

$liPattern = "|(<li>)(.+[^(</li>)])(<li>)|i";
$bufferString = preg_replace($liPattern, "\\1\\2</li>\\3", $bufferString,-1);

IE's output:
<TD rowSpan=3>
<UL>
<LI><A>HTML</A> 
<LI><A>CSS - DHTML</A> 
<LI>JavaScript 
<LI>PHP 
<LI>PERL 
<LI>ASP.NET 
<LI>C# 
<LI>XML</LI></UL></TD>

First pass of preg_replace:
<TD rowSpan=3> 
<UL> 
<LI><A>HTML</A></li>
<LI><A>CSS - DHTML</A>
<LI>JavaScript </li>
<LI>PHP 
<LI>PERL</li>
<LI>ASP.NET 
<LI>C# </li>
<LI>XML</LI></UL></TD> 

Second pass of preg_replace:
<TD rowSpan=3> 
<UL> 
<LI><A>HTML</A></li>
<LI><A>CSS - DHTML</A></li>
<LI>JavaScript </li>
<LI>PHP </li>
<LI>PERL</li>
<LI>ASP.NET </li>
<LI>C# </li>
<LI>C# </li>
<LI>XML</LI></UL></TD> 
Thanks!
Robert
vreeland at studioframework.com
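
A possible explanation for the every-other-instance behaviour, offered as a sketch rather than a confirmed diagnosis: the trailing (<li>) group is consumed as part of each match, so the search resumes after it and that list item can never start a match of its own. Separately, [^(</li>)] is a character class (any single character other than "(", "<", "/", "l", "i", ">" or ")"), not a negated group; here it most likely just matches the line break. The snippet below compares the original pattern with one alternative that uses a lookahead so the next <li> is checked but not consumed. The sample string, the variable names $firstPass and $fixed, and the alternative pattern are assumptions made for illustration, and the alternative only handles flat lists where an unclosed item is followed by another <li>.

```php
<?php
// Hypothetical, shortened sample modelled on the IE output above.
$bufferString = "<UL>\n<LI><A>HTML</A> \n<LI>JavaScript \n<LI>PHP \n<LI>XML</LI></UL>";

// Original pattern: each match swallows the following <li>, so the item
// that starts there is skipped when scanning continues.
$firstPass = preg_replace("|(<li>)(.+[^(</li>)])(<li>)|i", "\\1\\2</li>\\3", $bufferString);

// Alternative sketch:
//   (?:(?!</?li>).)*  consumes the item body, stopping before any <li> or </li>
//   (?=<li>)          requires the next <li> without consuming it
//   i = case-insensitive, s = let "." match newlines
$liPattern = '~(<li>(?:(?!</?li>).)*)(?=<li>)~is';
$fixed = preg_replace($liPattern, '$1</li>', $bufferString);

echo $fixed, "\n";
```

With this sample, the original pattern closes only the first and third items in one pass, while the lookahead version closes all three unclosed items and leaves the already closed last item alone. The inserted </li> lands after the line break that follows each item, which browsers treat the same as placing it before the break.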

