[thelist] Re: Google and PDF's
Laura Carlson
lcarlson at d.umn.edu
Thu Feb 19 10:28:52 CST 2004
>> you can select all text from a pdf and paste it into another file
>> (edit->select all or ctrl+a) - is this what you mean?
> No, dynamically. I need to strip text from over 1,000 articles.
Would this technique from the University of Minnesota Web Accessibility
Standards help?
A link can be created that passes the URL of a PDF document, as a query string, to an Adobe Acrobat conversion utility script on the access.adobe.com server. An HTML document is returned, which approximates the logical reading order of the text in the PDF document and is formatted as a single column of text.
All existing hypertext links are converted into HTML links. This includes intra-document links as well as links to other documents on the Internet. Extra HTML links are also created to enable easy navigation between pages.
Link for the PDF version:
<a href="http://www.domain.com/example.pdf">Example Document</a>
Link for the HTML version:
<a href="http://access.adobe.com/perl/convertPDF.pl?url=http://www.domain.com/example.pdf">Convert "Example Document" to HTML</a>
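For a large batch of articles, the conversion links could be generated by a script rather than written by hand. The following is only a rough Python sketch: the endpoint and the `url` parameter come from the example link, while the function names, the article list, and the choice to percent-encode special characters in the PDF address are assumptions for illustration. Whether the conversion script accepts encoded addresses, or is reachable at all, has not been checked.

```python
from html import escape
from urllib.parse import quote

# Endpoint taken from the example link; whether it still responds is an assumption.
CONVERTER = "http://access.adobe.com/perl/convertPDF.pl"


def conversion_url(pdf_url, converter=CONVERTER):
    """Return the converter URL with the PDF's address passed as the `url` query parameter."""
    # Keep ':' and '/' readable, as in the example link, but percent-encode
    # characters such as spaces, '?' and '&' that would otherwise break the query string.
    return converter + "?url=" + quote(pdf_url, safe=":/")


def conversion_link(pdf_url, title):
    """Return an HTML anchor pointing at the HTML version of the PDF."""
    href = escape(conversion_url(pdf_url), quote=True)
    text = escape('Convert "%s" to HTML' % title, quote=False)
    return '<a href="%s">%s</a>' % (href, text)


if __name__ == "__main__":
    # Hypothetical article list; in practice this might be read from a file.
    articles = [("http://www.domain.com/example.pdf", "Example Document")]
    for url, title in articles:
        print(conversion_link(url, title))
```

The same `conversion_url` result could also be handed to an HTTP client to fetch the returned HTML for each article, if the server permits that kind of use.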
For more info see:
http://cap.umn.edu/ait/Web/Downloads.html
Laura
___________________________________________
Laura L. Carlson
Information Technology Systems and Services
University of Minnesota Duluth
Duluth, MN 55812-3009
http://www.d.umn.edu/goto/webdesign/