[Hidden-tech] Recommendation for OCR software for digitizing magazines

Rich@tnr rich at tnrglobal.com
Wed May 5 20:36:52 UTC 2021


Funny you should ask: I am just finishing digitizing 40 years of 
journals from Watervliet (NY) Shaker Village, and I have digitized a 
large number of resources as part of Shakerpedia and other projects.

So this is not just a pick-the-software question; it's more about the 
overall project design.
    Just a few parts:
         Most higher-end scanners include good OCR systems; ABBYY is 
         included with many PC/Mac systems.
         Are the journals sheet-feedable, or can they be made so, even 
         if that means cutting the bindings?
         Or even better, has someone else already digitized them, or 
         will someone help digitize them? One example is Archive.org, 
         which has both a major archive and infrastructure for 
         digitizing.
         Is there staff to handle this, and what costs have to be 
         covered?
         What is the most effective platform? There are reasons to get 
         into Linux systems and open-source OCR tools such as Tesseract.
         Once it's digitized, how will it be searched? One of the most 
         common systems for such online use is Elasticsearch, which you 
         can run on AWS or almost any cloud platform.
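
To make the last two points a bit more concrete, here is a rough Python 
sketch of how the Tesseract and Elasticsearch pieces could fit together. 
It only builds the Tesseract command line (the stock "pdf" config asks 
for a searchable PDF) and a request body in the newline-delimited format 
that the Elasticsearch _bulk endpoint expects; it does not run OCR or 
contact a server. The file paths, the index name "magazine-pages", and 
the field names (issue, page, text) are assumptions made up for 
illustration, not anything from an actual project.

```python
import json
import shlex
from pathlib import Path


def tesseract_command(image_path, output_base, lang="eng"):
    """Build the Tesseract CLI call that writes <output_base>.pdf with a text layer."""
    # "pdf" is Tesseract's stock config name for searchable-PDF output.
    return ["tesseract", str(image_path), str(output_base), "-l", lang, "pdf"]


def bulk_payload(index_name, pages):
    """Build an NDJSON body for the Elasticsearch _bulk endpoint.

    `pages` is an iterable of dicts with hypothetical keys: issue, page, text.
    """
    lines = []
    for page in pages:
        # Hypothetical ID scheme: issue label plus zero-padded page number.
        doc_id = f"{page['issue']}-p{page['page']:04d}"
        # Each document takes two lines: an action line, then the document itself.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        lines.append(json.dumps(page))
    return "\n".join(lines) + "\n"  # the _bulk body must end with a newline


if __name__ == "__main__":
    # Hypothetical scan file and output location.
    cmd = tesseract_command(Path("scans/1995-03_p0001.tif"), "out/1995-03_p0001")
    print(shlex.join(cmd))

    # Placeholder text standing in for real OCR output.
    sample = [{"issue": "1995-03", "page": 1, "text": "OCR text goes here"}]
    print(bulk_payload("magazine-pages", sample), end="")
```

The command list can be handed to subprocess.run once Tesseract is 
installed, and the payload can be POSTed to a cluster's _bulk endpoint 
with the Content-Type set to application/x-ndjson. Indexing one document 
per page, rather than per issue, is just one possible design choice; it 
lets a search hit point to a specific page.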

As you can tell, there is a lot more to that question than just 
software. The above are just a few comments; if you want to discuss this 
more, email me off-list.

Stay well - Rich

On 5/5/2021 3:53 PM, Joanna Campe via Hidden-discuss wrote:
> Hi everyone,
>
> I hope you are all safe and well.
>
> We would like to digitize our archival hardcopy magazines, and we are 
> looking for the best option. Does anyone have experience with this and 
> can make a recommendation for OCR software?
>
> We have tried Adobe Acrobat Pro and a couple others, but are having 
> some difficulty recognizing text that is printed over images.
>
> Important features are searchable PDF creation in a magazine format. 
> We are using an Epson Perfection V500 Plus scanner, if that matters.
>
> Your recommendations are much appreciated!
>
> My best,
>
> Joanna
>
> Joanna Campe
> Executive Director
> Remineralize the Earth
> 152 South Street
> Northampton, MA 01060 USA
>
> Tel: 413-563-9938
> Email: jcampe at remineralize.org
> http://www.remineralize.org
>
> *Book*
> Geotherapy: Innovative Methods of Soil Fertility Restoration, Carbon 
> Sequestration, and Reversing CO2 Increase
> http://www.crcpress.com/product/isbn/9781466595392
>
> Please join and support us on *Patreon*:
> https://www.patreon.com/RTE
>
> _______________________________________________
> Hidden-discuss mailing list - home page: http://www.hidden-tech.net
> Hidden-discuss at lists.hidden-tech.net
>
> You are receiving this because you are on the Hidden-Tech Discussion list.
> If you would like to change your list preferences, Go to the Members
> page on the Hidden Tech Web site.
> http://www.hidden-tech.net/members

-- 
Rich Roth
CEO TnR Global

Bio and personal blog: http://rizbang.com
Building the really big sites:      http://www.tnrglobal.com
Small/Soho business in the PV:        http://www.hidden-tech.net
Places to meet for business:        http://www.meetmewhere.com
And for Arts and relaxation:
    http://TarotMuertos.com - Artistic Tarot Deck
    http://www.welovemuseums.com
    http://www.artonmytv.com/
Helping move the world:             http://www.earththrives.com
