<div dir="ltr">I'll second the vote for Tesseract: all the work we do (e.g. <a href="https://compass.fivecolleges.edu/">https://compass.fivecolleges.edu/</a>) uses Tesseract to generate OCR, and it works very well. It's not perfect, especially for handwritten material, but nothing will be.<br clear="all"><div><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div dir="ltr"></div><div dir="ltr"><br>--<br><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:8pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">Noah Smith</span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:8pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">Founder + CEO</span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:8pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">Pronouns: he/him/his. Scheduling a meeting with me? 
<a href="https://calendar.google.com/calendar/u/0/embed?src=noah@born-digital.com&ctz=America/New_York&mode=WEEK" target="_blank">View my availability</a></span></p><span style="font-size:8pt;font-family:Arial;font-weight:700;vertical-align:baseline;white-space:pre-wrap"><font color="#0b5394">Born-Digital</font></span><span style="font-size:8pt;font-family:Arial;color:rgb(255,102,0);font-weight:700;vertical-align:baseline;white-space:pre-wrap"> </span><span style="font-size:8pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">| 84 Russell St, Hadley MA | (413) 259-6777 | </span><span style="font-size:8pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap"> </span><span style="font-size:8pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap"><font color="#0b5394"><a href="http://born-digital.com/" target="_blank">born-digital.com</a></font></span></div></div></div></div></div><div dir="ltr"><div style="font-family:arial"><font size="1"></font></div></div></div></div></div></div></div></div></div><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, May 5, 2021 at 4:37 PM Rich@tnr via Hidden-discuss <<a href="mailto:hidden-discuss@lists.hidden-tech.net">hidden-discuss@lists.hidden-tech.net</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
  
    
  
  <div>
    <p>Funny you should ask, as I am just finishing digitizing 40 years
      of journals from Watervliet (NY) Shaker Village and have digitized
      a large number of resources<br>
      as part of Shakerpedia and other projects.<br>
      <br>
      So this is not just a "pick the software" question; it's more about
      the overall project design.<br>
         Just a few parts:<br>
              Most higher-end scanners include good OCR systems;
      ABBYY is included with many PC/Mac systems.<br>
              Can the journals be sheet-fed, or could they be if you
      cut the bindings?<br>
              Or even better, has someone else already digitized them, or will someone help
      digitize? For example, Archive.org has both a major archive and
      infrastructure for digitizing.<br>
              Is there staff to handle this, and what costs have to be
      covered?<br>
              What is the most effective platform? There are reasons to
      get into Linux systems, such as Tesseract.<br>
               Once it's digitized, how will it be searched? The most
      common system for such online use is Elasticsearch, which you can
      run on AWS or almost any cloud platform.<br>
    </p>
    <p>As you can tell, there is a lot more to that question than just
      software. Those are just a few comments; if you want to discuss
      this more, email me off-list.</p>
    <p>Stay well - Rich<br>
    </p>
    <div>On 5/5/2021 3:53 PM, Joanna Campe via
      Hidden-discuss wrote:<br>
    </div>
    <blockquote type="cite">
      
      <div dir="ltr">
        <div class="gmail_default" style="font-family:garamond,serif;font-size:large">Hi
          everyone,<br>
          <br>
          I hope you are all safe and well.<br>
          <br>
          We would like to digitize our archival hardcopy magazines, and
          we are looking for the best option. Does anyone have
          experience with this and can make a recommendation for OCR
          software? </div>
        <div class="gmail_default" style="font-family:garamond,serif;font-size:large"><br>
        </div>
        <div class="gmail_default" style="font-family:garamond,serif;font-size:large">We have
          tried Adobe Acrobat Pro and a couple others, but are having
          some difficulty recognizing text that is printed over images.</div>
        <div class="gmail_default" style="font-family:garamond,serif;font-size:large"><br>
        </div>
        <div class="gmail_default" style="font-family:garamond,serif;font-size:large">Important
          features are searchable PDF creation in a magazine format. We
          are using an Epson Perfection V500 Plus scanner, if that
          matters.</div>
        <div class="gmail_default" style="font-family:garamond,serif;font-size:large"><br>
        </div>
        <div class="gmail_default" style="font-family:garamond,serif;font-size:large">Your
          recommendations are much appreciated!<br>
          <br>
          My best,<br>
          <br>
          Joanna</div>
        <div class="gmail_default" style="font-family:garamond,serif;font-size:large"><br>
        </div>
        <div>
          <div dir="ltr">
            <div dir="ltr">
              <div>
                <div dir="ltr">
                  <div>
                    <div dir="ltr">
                      <div dir="ltr">
                        <div dir="ltr">
                          <div dir="ltr"><font size="4" face="garamond,
                              serif">Joanna Campe<br>
                              Executive Director<br>
                              Remineralize the Earth<br>
                              152 South Street<br>
                              Northampton, MA 01060 USA </font></div>
                          <div dir="ltr"><br>
                          </div>
                          <div dir="ltr"><font size="4" face="garamond,
                              serif">Tel: 413-563-9938</font>
                            <div><font size="4" face="garamond, serif">Email: </font><a href="mailto:jcampe@remineralize.org" style="font-family:Times;font-size:18px" target="_blank">jcampe@remineralize.org</a><font size="4" face="garamond, serif"><br>
                              </font><font size="4" face="garamond,
                                serif"><a href="http://www.remineralize.org/" target="_blank">http://www.remineralize.org</a>
                              </font></div>
                            <div><font size="4" face="garamond, serif" color="#000000"><b>Book</b></font></div>
                            <div><font size="4" face="garamond, serif" color="#000000">Geotherapy: Innovative
                                Methods of Soil Fertility Restoration,
                                Carbon Sequestration, and Reversing CO2
                                Increase</font></div>
                            <div><font size="4" face="garamond, serif"><a href="http://www.crcpress.com/product/isbn/9781466595392" style="color:rgb(17,85,204)" target="_blank">http://www.crcpress.com/product/isbn/9781466595392</a> </font></div>
                            <div><font size="4" face="garamond, serif"><br>
                              </font></div>
                            <div><font size="4" face="garamond, serif">Please
                                join and support us on <b><a href="https://www.patreon.com/RTE" target="_blank">Patreon</a></b></font></div>
                            <div><font size="4" face="garamond, times
                                new roman, serif"><a href="https://www.patreon.com/RTE" target="_blank">https://www.patreon.com/RTE</a></font><font size="4" face="garamond, serif"><br>
                              </font></div>
                          </div>
                        </div>
                      </div>
                    </div>
                  </div>
                </div>
              </div>
            </div>
          </div>
        </div>
      </div>
      <br>
      <fieldset></fieldset>
      <pre>_______________________________________________
Hidden-discuss mailing list - home page: <a href="http://www.hidden-tech.net" target="_blank">http://www.hidden-tech.net</a>
<a href="mailto:Hidden-discuss@lists.hidden-tech.net" target="_blank">Hidden-discuss@lists.hidden-tech.net</a>

You are receiving this because you are on the Hidden-Tech Discussion list.
If you would like to change your list preferences, Go to the Members
page on the Hidden Tech Web site.
<a href="http://www.hidden-tech.net/members" target="_blank">http://www.hidden-tech.net/members</a>
</pre>
    </blockquote>
    <pre cols="72">-- 
Rich Roth
CEO TnR Global

Bio and personal blog: <a href="http://rizbang.com" target="_blank">http://rizbang.com</a>
Building the really big sites:      <a href="http://www.tnrglobal.com" target="_blank">http://www.tnrglobal.com</a>
Small/Soho business in the PV:        <a href="http://www.hidden-tech.net" target="_blank">http://www.hidden-tech.net</a>
Places to meet for business:        <a href="http://www.meetmewhere.com" target="_blank">http://www.meetmewhere.com</a>
And for Arts and relaxation:
<a href="http://TarotMuertos.com" target="_blank">http://TarotMuertos.com</a> - Artistic Tarot Deck
   <a href="http://www.welovemuseums.com" target="_blank">http://www.welovemuseums.com</a>
   <a href="http://www.artonmytv.com/" target="_blank">http://www.artonmytv.com/</a>
Helping move the world:             <a href="http://www.earththrives.com" target="_blank">http://www.earththrives.com</a></pre>
  </div>

</blockquote></div>