[thelist] XSLT Challenge

Carol Stein techwatcher at accesswriters.com
Thu Apr 3 11:11:42 CST 2003


Hi, Mark --

_______You wrote:
I have the following:

<p>
	Mixed content Paragraph 1
	<br/>
	<br/>
	Paragraph 2 <b>Some more content</b>
	<br/>
	<br/>
	<br/>
	Paragraph 3
	<br/>
	... and so on...
</p>

And I want to split this up, using XSLT, into

<p>
	Mixed content Paragraph 1
</p>
<p>
	Paragraph 2 <b>Some more content</b>
</p>

... and so on, using 2 or more <br/>s as a paragraph break.  Basically, I wanna
deserialise this bad HTML into something more structural.  It would be the
equivalent of inserting </p><p> every time you find a <br/><br/> (or more),
except XSLT doesn't really do this sort of thing.  The paragraphs are actually
mixed content, so I need to preserve the HTML inside.

I have had a go, and my understanding of functional programming extends to
using a recursive template to take a node tree, process the first few nodes,
write them to the result tree, then call myself with the remainder of the
tree.  It sounds like it should be simple, but it's just complex enough that
I get thoroughly lost trying to manage it all.  Is there another way?  Or has
anybody already written code to do this that they wouldn't mind sharing?
_______________________
A couple points... First, if it would help to get those double <br/> tags
replaced, look into the wonderful freeware (or shareware, I forget) BackUp
& Replace'Em program. It does all the regular expression work for you.

Second, a note on the <br/> tags themselves: <br/> is well-formed XML and
valid XHTML as it stands, so an XSLT processor will read it without trouble.
The <br /> form, with a space before the closing /, is only a compatibility
recommendation so that older HTML browsers cope with the tag. I don't know
enough to say whether XSLT can handle your job by itself, but trying my
first suggestion couldn't hurt, especially given how easy it would be.
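
For what it's worth, the search-and-replace idea from my first point could
be sketched in a few lines of Python instead of a separate tool. This is only
a rough sketch, not XSLT: the names BR_RUN and split_paragraphs are made up
for illustration, and it assumes the runs of <br/> tags sit directly inside
the <p> (not nested inside a <b> or similar), since a regular expression
knows nothing about element nesting.

```python
import re

# A run of two or more <br/> tags (also <br /> or <br>), with any
# whitespace around and between them. Hypothetical name, for illustration.
BR_RUN = re.compile(r"\s*(?:<br\s*/?>\s*){2,}", re.IGNORECASE)


def split_paragraphs(html: str) -> str:
    """Replace each run of 2+ <br/> tags with a </p><p> paragraph break."""
    return BR_RUN.sub("\n</p>\n<p>\n\t", html)


sample = """<p>
	Mixed content Paragraph 1
	<br/>
	<br/>
	Paragraph 2 <b>Some more content</b>
	<br/>
	<br/>
	<br/>
	Paragraph 3
</p>"""

print(split_paragraphs(sample))
```

A single <br/> is left alone, so ordinary line breaks inside a paragraph
survive, and inline markup such as <b> passes through untouched because only
the <br/> runs are rewritten.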

Cheers --
Carol


