I'll second the vote for Tesseract - all the work we do (e.g.
https://compass.fivecolleges.edu/) uses Tesseract to generate OCR and it
works very well. Not perfect, especially for handwritten stuff, but nothing
will be.

--
Noah Smith
Founder + CEO
Pronouns: he/him/his.
Scheduling a meeting with me? View my availability
<https://calendar.google.com/calendar/u/0/embed?src=noah@born-digital.com&ctz=America/New_York&mode=WEEK>
Born-Digital | 84 Russell St, Hadley MA | (413) 259-6777 | born-digital.com

On Wed, May 5, 2021 at 4:37 PM Rich at tnr via Hidden-discuss <
hidden-discuss at lists.hidden-tech.net> wrote:

> Funny you should ask - as I am just finishing digitizing 40 years of
> journals from Watervliet (NY) Shaker Village and have digitized a large
> number of resources as part of Shakerpedia and other projects.
>
> So this is not just a pick-software question -- it's more about the
> overall project design. Just a few parts:
>
>     Most higher-level scanners include good OCR systems -- ABBYY is
>     included with many PC/Mac systems.
>
>     Are the journals sheet-feedable, or can they be, even if that means
>     cutting the bindings?
>
>     Or even better, has someone else digitized them, or will someone
>     help digitize - for example Archive.org, which has both a major
>     archive and infrastructure for digitizing.
>
>     Is there the staff to handle this, or what cost has to be covered?
>
>     What is the most effective platform? There are reasons to get into
>     Linux systems - such as Tesseract.
>
>     Once it's digitized, how will it be searched? The most common
>     system for such online use is Elasticsearch, which you can run on
>     AWS or almost any cloud platform.
>
> As you can tell, there is a lot more to that question than just software
> -- those are a few comments above - if you want to discuss this more,
> email off-list.
>
> Stay well - Rich
>
> On 5/5/2021 3:53 PM, Joanna Campe via Hidden-discuss wrote:
>
> Hi everyone,
>
> I hope you are all safe and well.
>
> We would like to digitize our archival hardcopy magazines, and we are
> looking for the best option. Does anyone have experience with this and can
> make a recommendation for OCR software?
>
> We have tried Adobe Acrobat Pro and a couple others, but are having some
> difficulty recognizing text that is printed over images.
>
> Important features are searchable PDF creation in a magazine format. We
> are using an Epson Perfection V500 Plus scanner, if that matters.
>
> Your recommendations are much appreciated!
>
> My best,
>
> Joanna
>
> Joanna Campe
> Executive Director
> Remineralize the Earth
> 152 South Street
> Northampton, MA 01060 USA
>
> Tel: 413-563-9938
> Email: jcampe at remineralize.org
> http://www.remineralize.org
>
> *Book*
> Geotherapy: Innovative Methods of Soil Fertility Restoration, Carbon
> Sequestration, and Reversing CO2 Increase
> http://www.crcpress.com/product/isbn/9781466595392
>
> Please join and support us on *Patreon*
> https://www.patreon.com/RTE
>
> _______________________________________________
> Hidden-discuss mailing list - home page: http://www.hidden-tech.net
> Hidden-discuss@lists.hidden-tech.net
>
> You are receiving this because you are on the Hidden-Tech Discussion list.
> If you would like to change your list preferences, go to the Members
> page on the Hidden Tech Web site.
> http://www.hidden-tech.net/members
>
> --
> Rich Roth
> CEO TnR Global
>
> Bio and personal blog: http://rizbang.com
> Building the really big sites: http://www.tnrglobal.com
> Small/Soho business in the PV: http://www.hidden-tech.net
> Places to meet for business: http://www.meetmewhere.com
> And for Arts and relaxation: http://TarotMuertos.com - Artistic Tarot Deck
> http://www.welovemuseums.com
> http://www.artonmytv.com/
> Helping move the world: http://www.earththrives.com
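
For anyone who wants to try the Tesseract route recommended at the top of this thread, here is a minimal sketch of producing searchable PDFs from scanned pages. It is not how any of the projects mentioned above actually run their OCR; it only assumes that the `tesseract` command-line tool is installed and that the scans are saved as TIFF files, one page per file. The function names and folder names are made up for illustration. The script builds the standard `tesseract <image> <outputbase> -l <lang> pdf` command for each page, and only executes it if `tesseract` is found on the PATH.

```python
import shutil
import subprocess
from pathlib import Path


def tesseract_pdf_command(image_path, output_base, lang="eng"):
    """Build the Tesseract CLI call that writes <output_base>.pdf.

    The trailing "pdf" selects Tesseract's PDF output, which keeps the
    page image and adds an invisible, searchable text layer.
    """
    return ["tesseract", str(image_path), str(output_base), "-l", lang, "pdf"]


def ocr_scans_to_pdf(scan_dir, out_dir, lang="eng"):
    """OCR every .tif in scan_dir into a one-page searchable PDF in out_dir.

    Returns the list of commands. If the tesseract binary is not
    installed, nothing is executed (a dry run).
    """
    scan_dir, out_dir = Path(scan_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    commands = [
        tesseract_pdf_command(page, out_dir / page.stem, lang)
        for page in sorted(scan_dir.glob("*.tif"))
    ]

    if shutil.which("tesseract") is None:
        return commands  # dry run: tesseract is not on the PATH

    for cmd in commands:
        subprocess.run(cmd, check=True)  # raises if Tesseract reports an error
    return commands


if __name__ == "__main__":
    # "scans" and "ocr_output" are hypothetical folder names.
    for cmd in ocr_scans_to_pdf("scans", "ocr_output"):
        print(" ".join(cmd))
```

This produces one PDF per page, so the pages would still need to be merged into a single issue afterwards. Text printed over images, the problem raised in the original question, may still need some image cleanup before OCR; this sketch does not attempt that.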