OCR?

biofrequencies

13 Feb, 2020 12:28 AM

Does Anki offer any OCR capabilities?
I have a deck with tons of images and very clear text in the images. It would be so helpful to be able to search through that text.

Thanks.

  1. 1 Posted by ijg on 15 Feb, 2020 12:44 PM

    No, this is not in Anki.

    To do it manually, there is software that runs OCR on part of your screen and puts the output on your clipboard so that you can paste it.

    You write that you have tons of images, so you probably want something automated, which means an add-on. You should search this forum, the Anki subreddit, and Google. But I think such an add-on does not exist, though multiple people have asked for something like this over the years. Did I miss something?

    You need external OCR software. Windows, Mac, and Linux differ here. At the moment I can only think of one program that runs on all three OSs: the free and open source Tesseract. Downside: on Macs you need the command line to install it, which is unusual and more complicated than the usual drag and drop. But it shouldn't be too hard to wrap this software in an add-on and write the OCR output for all images into a special field.

    Maybe wait and hope? Some people have paid add-on developers to create custom add-ons, see here. If you really care about this and have some money for it, I think the reddit user /u/arthurmilchior (who has about 65 add-ons published on AnkiWeb) was recently hired a few times.
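For anyone who wants to try the Tesseract idea above outside of Anki first, here is a rough sketch. It is not an add-on and not an existing tool. It assumes the `tesseract` command is installed and on your PATH with English language data, and the helper names (`tesseract_command`, `run_tesseract`, `ocr_folder`) are made up for illustration. It only collects the recognized text per image file; writing that text into a note field is left out.

```python
import subprocess
from pathlib import Path
from typing import Callable, Dict, List

# File types to look at; adjust to whatever your media folder contains.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff"}


def tesseract_command(image_path: Path, lang: str = "eng") -> List[str]:
    # "stdout" as the output base makes tesseract print the text
    # instead of writing a .txt file next to the image.
    return ["tesseract", str(image_path), "stdout", "-l", lang]


def run_tesseract(image_path: Path) -> str:
    # Assumption: the tesseract binary is installed and on PATH.
    result = subprocess.run(
        tesseract_command(image_path),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def ocr_folder(
    folder: Path, ocr: Callable[[Path], str] = run_tesseract
) -> Dict[str, str]:
    # Map each image file name to the text recognized in it.
    texts: Dict[str, str] = {}
    for path in sorted(folder.iterdir()):
        if path.suffix.lower() in IMAGE_SUFFIXES:
            texts[path.name] = ocr(path)
    return texts
```

You would call `ocr_folder(Path("path/to/your/images"))` on a copy of the images (the path here is a placeholder), then search the resulting dictionary for the words you need.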

  2. 2 Posted by biofrequencies on 16 Feb, 2020 06:27 PM

    OK, I found an alternative solution.

    I copied all the pictures in my folder for the deck I wanted, added them to Microsoft Word (LibreOffice Writer can't handle large documents), then saved it as a PDF file. I had to split the PDF files because there's a limit to what my computer can handle. I had 2000-3500 pages per PDF file of pictures.

    Then I used ABBYY FineReader to do OCR on each of these giant documents, because Adobe Acrobat can't handle OCR on documents with more than about 60 pages of images.

    Then I had ABBYY FineReader save the resulting PDF in Searchable PDF format.

    Then I used Adobe Acrobat Pro to build a catalog index of all the giant PDF documents together.

    Now I can instantly search what I'm looking for among these thousands of images, in all three documents, with one search bar.

    I attached an image example. Had to save it as zip to upload here.
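The splitting step in the workflow above (a few thousand pictures per PDF) can also be planned with a few lines of code before building the documents. This is only a sketch of the batching idea, not what was actually used in this thread; the function name `split_into_batches` and the file names are hypothetical.

```python
from typing import List, Sequence


def split_into_batches(items: Sequence[str], max_per_batch: int) -> List[List[str]]:
    # Cut the list into consecutive chunks of at most max_per_batch items,
    # one chunk per PDF, keeping the original order.
    if max_per_batch < 1:
        raise ValueError("max_per_batch must be at least 1")
    return [
        list(items[start:start + max_per_batch])
        for start in range(0, len(items), max_per_batch)
    ]


# Example: 8000 image names, at most 3500 pages per document.
images = [f"img{n:05d}.png" for n in range(8000)]
batches = split_into_batches(images, 3500)
print([len(batch) for batch in batches])  # [3500, 3500, 1000]
```

Each batch would then go into its own document, so no single file goes over what the computer can handle.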

  3. 3 Posted by ijg on 17 Feb, 2020 12:58 AM

    Thanks for sharing this write-up.
