[thelist] SGML files - no idea where to start

Peter Van Dijck peter.vandijck at vardus.com
Tue Jan 8 11:13:15 CST 2002


> supposed to make the web site display these files. The problem is, I HAVE NO
> IDEA WHAT TO DO.
>
> <!DOCTYPE SERIAL PUBLIC "-//Publisher//DTD for Online Journals//EN">
> <SERIAL><SERFRONT><SERPUBFR><VOLID>2</VOLID>
> <ISSUEID><ISSUENO>2</ISSUENO></ISSUEID>
> <FPAGE>14</FPAGE><LPAGE>17</LPAGE>
> <DATE>198004</DATE>
> <CPYRT><DATE>1980</DATE></CPYRT>
> <DOI>10.1043/0199-610X(1980)002<0014:WTFWS>2.3.CO;2</DOI></SERPUBFR></SERFRONT>
> ...etc.

Hi Rebecca,
I can see two basic options:
- fudge your way into turning these into HTML (with search-and-replace or
regular expressions). Disadvantage: it may break on input you didn't expect.
- do it the proper way: read them in as XML and convert them with a
stylesheet. Advantage: it is far less likely to break, and it's fantastic to
have on your CV.
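
To give a feel for the first option, here is a rough sketch in Python (picked only for illustration; the same idea works in PHP or ASP). The tag-to-HTML mapping is hypothetical, and what VOLID, FPAGE and LPAGE mean is guessed from the sample, not taken from the DTD.

```python
import re

# Hypothetical mapping from the journal DTD's tag names to HTML tags.
# Anything not listed here is simply dropped (its text is kept).
TAG_MAP = {
    "VOLID": "b",
    "FPAGE": "span",
    "LPAGE": "span",
}


def fudge_to_html(sgml):
    """Rename known tags and strip every other tag, keeping the text."""
    def swap(match):
        slash, name = match.group(1), match.group(2)
        html_name = TAG_MAP.get(name)
        return f"<{slash}{html_name}>" if html_name else ""

    # Only matches simple tags such as <VOLID> or </VOLID>.
    return re.sub(r"<(/?)([A-Z]+)>", swap, sgml)


sample = "<SERPUBFR><VOLID>2</VOLID><FPAGE>14</FPAGE><LPAGE>17</LPAGE></SERPUBFR>"
print(fudge_to_html(sample))
# prints: <b>2</b><span>14</span><span>17</span>
```

This also shows how the approach can break: the `<0014:WTFWS>` inside the DOI in your sample is not matched by the pattern, so it would pass through unescaped and a browser would treat it as a tag.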

I don't know how much SGML can differ from generic XML, or whether the
difference matters in your case for practical purposes. If the SGML documents
you're getting are well-formed XML, all you need to do is convert them to HTML
with your favourite method. You could load the doc into a language like PHP or
ASP and convert it with search/replace or regular expressions; you could run
it through an XML parser and then convert it; or you could learn XSLT, write a
stylesheet that converts the doc into HTML, and then write the code that
applies it to each doc.
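
The parser route could look something like the sketch below, again in Python purely for illustration. It makes several assumptions: the sample has been trimmed and its tags closed by hand, the DOCTYPE line is left out (XML requires a system identifier after a PUBLIC identifier, which the SGML declaration lacks), and the `<` and `>` inside the DOI have been escaped as `&lt;` and `&gt;`, since a literal `<` in text is not well-formed XML. The function name and the HTML layout are made up.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hand-cleaned version of the quoted sample (see assumptions above).
xml_text = """<SERIAL><SERFRONT><SERPUBFR><VOLID>2</VOLID>
<ISSUEID><ISSUENO>2</ISSUENO></ISSUEID>
<FPAGE>14</FPAGE><LPAGE>17</LPAGE>
<DATE>198004</DATE>
<DOI>10.1043/0199-610X(1980)002&lt;0014:WTFWS&gt;2.3.CO;2</DOI>
</SERPUBFR></SERFRONT></SERIAL>"""


def front_matter_to_html(text):
    """Parse the front matter and build a small HTML fragment from it."""
    root = ET.fromstring(text)
    front = root.find("./SERFRONT/SERPUBFR")

    volume = front.findtext("VOLID")
    issue = front.findtext("ISSUEID/ISSUENO")
    first_page = front.findtext("FPAGE")
    last_page = front.findtext("LPAGE")
    doi = front.findtext("DOI")

    # escape() re-encodes <, > and & so the DOI is safe to put in HTML.
    return (
        f"<p>Volume {escape(volume)}, issue {escape(issue)}, "
        f"pages {escape(first_page)}-{escape(last_page)}</p>\n"
        f"<p>DOI: {escape(doi)}</p>"
    )


print(front_matter_to_html(xml_text))
```

If the real files need that much hand-cleaning before a parser accepts them, that is a sign they are SGML rather than XML, and a cleanup or conversion step would have to come first.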

Good luck
Peter
Portfolio http://petervandijck.net




