[thelist] Extracting plain text without line breaks from an Adobe PDF file ANSWER

John C Bullas jcbullas at nildram.co.uk
Thu Oct 16 06:19:35 CDT 2003


At 10:14 AM 16-10-03, John C Bullas wrote:
>Ok, I am writing a technical paper and sadly can only find up to v2 in 
>Word but have v4.4 as a PDF.
>The figures are the same but the text is way different. Using the "export 
>document to text" route strips out the
>images and generates some "swarf" from my line numbers, BUT the problem is 
>that the line returns from the original PDF
>are retained

Physician heal thyself?

Went into Word: with view all formatting switched on, the line breaks 
"appeared" as paragraph marks (gothic "P"s)

I marked up any line and paragraph breaks I needed with "$$$" in the text

Did Replace >>>> MORE >>>> SPECIAL >>>> Paragraph Mark (this puts ^p in the Find box)

replace with nothing (i.e. remove them)

that killed all the paragraph marks (i.e. the line breaks) to produce a big wodge 
of words.....

then did a Replace All of $$$ with 2 paragraph marks (^p^p in the Replace box)

this produced a reasonable rendition of the text in the original
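
The same steps can be roughed out as a script for anyone who would rather not click through Word. This is only a sketch under assumptions, not what was done above: the function name unwrap is made up for illustration, the text is assumed to have been exported to a plain string already, and each line break is swapped for a single space rather than nothing, so that words at the ends of lines do not run together.

```python
import re

MARKER = "$$$"  # placeholder typed in by hand wherever a break should survive


def unwrap(text: str, marker: str = MARKER) -> str:
    """Strip hard line breaks, then restore a paragraph break at each marker."""
    # Normalise Windows and old Mac line endings so only "\n" remains.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Remove every line break. Assumption: a space is used instead of
    # nothing, so the last word of one line does not fuse with the next.
    text = text.replace("\n", " ")
    # Split at the markers and collapse runs of whitespace in each piece.
    paragraphs = [re.sub(r"\s+", " ", p).strip() for p in text.split(marker)]
    # Rejoin with two line breaks (the "2 paragraph marks"), skipping
    # empty pieces left by doubled or trailing markers.
    return "\n\n".join(p for p in paragraphs if p)


if __name__ == "__main__":
    sample = "first line\nsecond line$$$\nnew para\ncontinues\n"
    print(unwrap(sample))
```

Run on the sample string, this prints "first line second line", a blank line, then "new para continues".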

FB 


