[thelist] php regexp help (or some other solution maybe..)

Kelly Hallman khallman at wrack.org
Tue Dec 31 12:31:01 CST 2002


To remove the unwanted header information, this actually seems to be the
easiest, fastest, least error-prone method*:

This regex uses a positive lookahead:
list($head,$body) = preg_split("/(?=<\?xml)/",$inputdata,2);

That is, it doesn't consume any of the characters; it only checks that
<?xml comes next in the stream.  So it produces a zero-width match directly
before <?xml (versus a regex like /<\?xml/ which, when used as a split
pattern, would remove the <?xml since it would be treated as the delimiter).
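
To make the difference concrete, here is a small sketch. The sample
$inputdata is made up for illustration (I'm assuming some header text
followed by the XML), and $keep/$lose are just names I picked:

```php
<?php
// Hypothetical input: unwanted header text followed by the XML document.
$inputdata = "Content-Type: text/xml\n\n<?xml version=\"1.0\"?>\n<root/>";

// Lookahead: zero-width match just before "<?xml", so nothing is consumed.
$keep = preg_split('/(?=<\?xml)/', $inputdata, 2);

// Plain pattern: "<?xml" itself is the delimiter and is dropped.
$lose = preg_split('/<\?xml/', $inputdata, 2);

echo $keep[1], "\n";   // second part still begins with "<?xml"
echo $lose[1], "\n";   // second part begins with " version=", "<?xml" is gone
```

Note that if the input contains no <?xml at all, preg_split returns a
one-element array, so it's worth checking count() on the result before
relying on $body.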

You could omit the limit argument to preg_split if you wanted to split at
every <?xml declaration, but this example just assumes two parts.
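
A rough sketch of the no-limit variant, assuming (hypothetically) that the
input holds more than one XML document after the header. The sample data
and the $parts/$docs names are mine, not from the original file:

```php
<?php
// Hypothetical input: a header line followed by two XML documents.
$inputdata = "junk header\n"
           . "<?xml version=\"1.0\"?><a/>\n"
           . "<?xml version=\"1.0\"?><b/>";

// Limit of -1 means "no limit": split before every "<?xml".
// PREG_SPLIT_NO_EMPTY drops any empty piece from the result.
$parts = preg_split('/(?=<\?xml)/', $inputdata, -1, PREG_SPLIT_NO_EMPTY);

// Keep only the pieces that actually begin with "<?xml"; whatever
// preceded the first one (the header, if any) is discarded.
$docs = array_values(array_filter($parts, function ($piece) {
    return strncmp($piece, '<?xml', 5) === 0;
}));

echo count($docs), " document(s) found\n";
```

Filtering on the prefix, rather than always dropping the first element,
means this also behaves sensibly when there is no header at all.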


* Let's see if that improves response :)

On Tue, 31 Dec 2002, Tom Dell'Aringa wrote:
> I've got this continuing problem with this XML file that Carl gave me
> some valuable help on. Anyway, we think the solution is to remove the
> unwanted information that is probably causing the error before the
> info is written to our file on the server. I assume regexp can be
> used to do this, but maybe there is another way..anyway - here again
> is what is being written to the file:

--
Kelly Hallman
http://wrack.org/



