For what Shawn is suggesting, there is always Adobe Writer, which I hear
can do this exact task, although I've never done it myself. I'd imagine
Adobe products (other than the free Reader) can get pricey.

On Tue, Dec 16, 2008 at 9:25 PM, Shawn Fumo <programming at shawnfumo.com> wrote:

> ** Be sure to fill out the survey/skills inventory in the member's area.
> ** If you did, we all thank you.
>
> As an FYI, from my understanding the problem with pdf conversion is
> that it isn't a normal document format. Unlike HTML, it isn't bunches
> of text with some formatting applied. Instead, each character is
> treated like a separate entity. So it'll store a letter "a" at a
> particular pixel location, an "x" at some location, etc. They may not
> even be stored in the same order they're displayed in the resulting
> document. So any program that extracts text from a PDF basically has
> to act like an OCR program to try to reconstruct the document.
>
> There's a variety of programs out there for extracting text/converting
> from pdfs, both open source and commercial. One of the commercial ones
> is from a local company, Snowtide Informatics.
>
> I'd agree with many of the responders that if there's some way of
> getting at the text and images before it's actually made into a pdf,
> that'd probably be the easiest way to go.
>
> Shawn
>
> Hi H-Ters,
>
> I'm editor at 2 quarterly business magazines. We publish in print and
> online.
>
> (Yes, I'm an H-T member, since the beginning, and live here in the
> Happy Valley.)
>
> We're looking for the simplest, most automated (if possible) way to
> convert the final PDF files we send our printer into MS-Word so our
> webmaster can post the upcoming issue online ASAP. Our art director is
> doing it manually now, not the best use of her time.
> Here's what our webmaster wrote:
>
> "I need articles in MS Word, plus I need a PDF copy of the magazine so
> that we can use it as a guide when posting the articles, as well as
> extract the images from the PDFs for use in the online articles. We
> will not pull content from the PDF copy as Acrobat does nasty things to
> text when you pull it out of a PDF. It's a nightmare to work with
> PDF-extracted text."
>
> Questions:
>
> 1) Any help with converting PDFs to Word?
>
> 2) Is there a better way to do this?
>
> Thanks for any ideas,
>
> Eddy
>
> Eddy Goldberg, Managing Editor
> Franchise Update Media Group
> 413-256-6616
> eddyg at franchiseupdatemedia.com
> www.franchiseupdatemedia.com
> _______________________________________________
> Hidden-discuss mailing list - home page: http://www.hidden-tech.net
> Hidden-discuss at lists.hidden-tech.net
>
> You are receiving this because you are on the Hidden-Tech Discussion list.
> If you would like to change your list preferences, Go to the Members
> page on the Hidden Tech Web site.
> http://www.hidden-tech.net/members
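
To illustrate Shawn's point that a PDF stores each character at its own position, possibly out of reading order, here is a minimal Python sketch of the reconstruction step an extractor has to perform. Everything in it is an assumption made for illustration: the `Glyph` record, the `reconstruct_text` function, the tolerance values, and a top-down coordinate system. It does not read a real PDF, and real extraction tools are considerably more involved.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    """One positioned character (hypothetical record, not a real PDF API)."""
    char: str
    x: float      # left edge of the character
    y: float      # baseline; assumed to increase down the page
    width: float  # advance width of the character


def reconstruct_text(glyphs, line_tolerance=2.0, space_gap=1.5):
    """Rebuild reading order from glyphs stored in arbitrary order.

    Glyphs whose baselines are within line_tolerance are treated as one
    line; a horizontal gap wider than space_gap becomes a space.
    Both thresholds are illustrative guesses.
    """
    lines = []  # each entry: [baseline_y, [glyphs on that line]]
    for g in sorted(glyphs, key=lambda g: g.y):
        if lines and abs(g.y - lines[-1][0]) <= line_tolerance:
            lines[-1][1].append(g)
        else:
            lines.append([g.y, [g]])

    out = []
    for _, row in lines:
        row.sort(key=lambda g: g.x)  # left to right within the line
        text = row[0].char
        for prev, cur in zip(row, row[1:]):
            if cur.x - (prev.x + prev.width) > space_gap:
                text += " "
            text += cur.char
        out.append(text)
    return "\n".join(out)


# Glyphs for "an ax" / "ok", deliberately stored out of reading order.
sample = [
    Glyph("x", 24, 10, 6),
    Glyph("k", 6, 24, 6),
    Glyph("a", 0, 10, 6),
    Glyph("o", 0, 24, 6),
    Glyph("a", 18, 10, 6),
    Glyph("n", 6, 10, 6),
]
print(reconstruct_text(sample))
```

Even in this toy form, the word breaks and line breaks are guessed from spacing, which is why text pulled out of a PDF often comes back with odd spaces and broken paragraphs.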