There are numerous products on the market that will extract text from PDF files, including Adobe Acrobat itself, which allows you to save a text-based PDF file as an MS Word or plain text file.

Both Nuance (formerly ScanSoft) and ABBYY, the two leading purveyors of OCR software, sell software that converts to and from PDF files and also uses OCR to extract text from PDF files that consist of scanned images. Both claim to be able to output to Excel files directly. Both cost $100, and both use activation over the Internet to restrict their usage to one computer. That makes them unacceptable to me, since I work on more than one computer, but others may not object. Nuance also sells a $50 version that can only extract from, but not create, PDF files.

A friend who has published prolifically in academic publications over many decades has successfully used ABBYY's Transformer to extract papers he had written from JSTOR's scanned-image PDF files into Microsoft Word.

Jan Werner
__________

Bill Bither wrote:
>> I'm looking for software that can convert PDF to text, and ideally that
>> has some options for reformatting or global replacement, since what I
>> most often need to do is to break a PDF file down into fields and spit
>> out tab-delimited text. I've been using a freeware product that just
>> isn't reliable when big files are involved. Does anyone have something
>> they like for this?
>
> I noticed that some of the responses have directed you to an OCR
> product. Most PDF files have text already stored in them, so what you
> really need is a product that will extract the text out of the PDF.
> This is much more reliable than OCR. There is actually a local software
> company (www.snowtide.com) that does this, but the product is a developer
> toolkit aimed more at the enterprise market.
> Ask for Chas; he might give a local software company a deal.
>
> OCR would be required only if the PDF contained an image, without text.
> In that case I'm unaware of an off-the-shelf product that would
> accomplish this. We offer OCR and PDF rasterization technology that can
> do this for the developer.
>
> While we're on the topic of OCR and PDFs, we will be releasing a beta
> of an application that will generate searchable PDFs from scanned image
> documents. Send me an email if you're interested in testing it out.
>
> Best Regards,
>
> Bill Bither
> Atalasoft, Inc.
> www.atalasoft.com
> www.billbither.com
>
> _______________________________________________
> Hidden-discuss mailing list - home page: http://www.hidden-tech.net
> Hidden-discuss at lists.hidden-tech.net
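
P.S. For the second half of the original question (global replacement, then breaking the text into fields and writing tab-delimited output), a short script may be enough once the text is out of the PDF. What follows is only a minimal sketch in Python using the standard library. It assumes the text has already been extracted by one of the tools above. The sample text, the "Label: value" layout, the blank-line record separator, and the names SAMPLE_TEXT, REPLACEMENTS, FIELDS, clean, parse_records and to_tab_delimited are all made up for illustration and would need adjusting to the real files.

```python
import csv
import io
import re

# Hypothetical sample of text already extracted from a PDF.
# The "Label: value" layout and blank-line record separator are assumptions.
SAMPLE_TEXT = """Name: Ada  Lovelace
City: London
Phone: 555-0100

Name: Alan Turing
City: Wilmslow
Phone: 555-0101
"""

# Global replacements applied before splitting into fields (illustrative only).
REPLACEMENTS = [
    (re.compile(r"[ \t]+"), " "),   # collapse runs of spaces and tabs
    (re.compile(r"\r\n?"), "\n"),   # normalize line endings
]

# Column order for the tab-delimited output (assumed field names).
FIELDS = ["Name", "City", "Phone"]


def clean(text):
    """Apply every global replacement in order."""
    for pattern, replacement in REPLACEMENTS:
        text = pattern.sub(replacement, text)
    return text


def parse_records(text):
    """Split cleaned text into records, one dict of label -> value each."""
    records = []
    for block in re.split(r"\n\s*\n", clean(text).strip()):
        record = {}
        for line in block.splitlines():
            label, sep, value = line.partition(":")
            if sep:  # skip lines that have no "Label:" prefix
                record[label.strip()] = value.strip()
        if record:
            records.append(record)
    return records


def to_tab_delimited(records, fields=FIELDS):
    """Return the records as tab-delimited text with a header row."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(fields)
    for record in records:
        writer.writerow([record.get(name, "") for name in fields])
    return out.getvalue()


if __name__ == "__main__":
    print(to_tab_delimited(parse_records(SAMPLE_TEXT)), end="")
```

Using the csv module with a tab delimiter, rather than joining strings by hand, means any value that happens to contain a tab or a quote is still written out safely. For big files, the same functions could be fed one record at a time instead of the whole text at once.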