[thelist] Converting MS Word to text, preserving entities

martin.p.burns at uk.pwcglobal.com
Mon Apr 8 11:51:00 CDT 2002


Memo from Martin P Burns of PricewaterhouseCoopers

-------------------- Start of message text --------------------

Hi Michael

How easy do you think this would be to integrate into Zope?

What I'm after is a setup where content editors can upload an
RTF file to Zope and have it nicely drop into the standard template.

Cheers
Martin



Francois Jordaan wrote:

> A week ago, Michael Mell wrote,
> > I've already written the basics of a simple tool in Python to convert
> > rtf
>
> To get to the point, I'm looking for a simple conversion tool that'll take
> Word docs or RTF and convert them to text with all extended characters
> correctly converted to numeric entities. Does such a tool already exist?
> Mike, does your Python tool do that?

Yes. http://www.nthwave.net/rtf2HTML/
I have not yet read the rtf spec or incorporated the plethora of rtf codes
into the script. However, what is there works for me and is easily
extendable.
To include codes that your authors use, simply edit the two dictionaries at
the top of the script. The script contains further documentation.
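
As a rough illustration of the dictionary-driven approach described above,
here is a minimal sketch. It is not the rtf2HTML code: the table
CONTROL_WORDS, the TOKEN pattern, and the convert() function are all
hypothetical names, and the sketch assumes the RTF uses the Windows-1252
code page for its \'xx hex escapes. It maps a few RTF control words to
numeric entities, drops any control word it does not know, and turns every
remaining non-ASCII character into a numeric entity.

```python
import re

# Hypothetical lookup table (an assumption, not taken from rtf2HTML).
# Add the control words your authors use.
CONTROL_WORDS = {
    "par": "\n",             # paragraph break
    "tab": "\t",
    "emdash": "&#8212;",
    "endash": "&#8211;",
    "lquote": "&#8216;",
    "rquote": "&#8217;",
    "ldblquote": "&#8220;",
    "rdblquote": "&#8221;",
    "bullet": "&#8226;",
}

# Matches a \'xx hex escape, a control word with an optional numeric
# parameter and one optional delimiting space, or a group brace.
TOKEN = re.compile(r"\\'([0-9a-fA-F]{2})|\\([a-z]+)(-?\d+)? ?|([{}])")


def convert(rtf_body, codepage="cp1252"):
    """Return rtf_body as plain text with extended characters as &#N;."""

    def replace(match):
        hex_byte, word, _param, _brace = match.groups()
        if hex_byte:
            # Decode the single byte with the assumed code page.
            char = bytes.fromhex(hex_byte).decode(codepage, errors="replace")
            return "&#%d;" % ord(char)
        if word:
            # Unknown control words are silently dropped.
            return CONTROL_WORDS.get(word, "")
        return ""  # group braces carry no text

    text = TOKEN.sub(replace, rtf_body)
    # Anything still outside ASCII becomes a numeric entity.
    return "".join(c if ord(c) < 128 else "&#%d;" % ord(c) for c in text)


print(convert(r"caf\'e9\emdash done\par"))
```

The sketch does not handle escaped backslashes or braces, \u Unicode
escapes, or the header groups at the start of a real RTF file, so it is a
starting point only.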

The script will create a new file with a .txt extension. At the top of this
new file, there will be about a page full of rtf junk that you can delete.
The rest of the file will be your converted document.

Please let me know how you would like this to be further improved (aside
from the obvious one of including all the codes). I can't always read all of
[thelist], so a private message will be more certain to get my attention.


--------------------- End of message text --------------------

This e-mail is sent by the above named in their
individual, non-business capacity and is not on
behalf of PricewaterhouseCoopers.




