AlfredOCR - Optical Character Recognition

zeitlings · April 29, 2023

15 minutes ago, xilopaint said:

When it comes to Alfred's workflows, transparency is important for security reasons, of course.

Sure, I completely agree and understand the concern. The binaries are signed with my personal credentials, which at least means that if something sinister were to go on, I could hardly escape the blame. I would prefer to have them notarized, but alas, Apple charges for that, and since this is just a hobby, I've decided not to do so for now. If that isn't enough to inspire confidence, I won't blame anyone, but ... 🤷‍♂️

As for the MIT license being a problem here, I don't think it's a problem at all: The software is provided "as is".

xilopaint · April 29, 2023

6 hours ago, zeitlings said:

I would prefer to have them notarized, but alas, Apple charges for that

Authority=Apple Development: pat****@*****.com (W2VCU4XR6K)
Authority=Apple Worldwide Developer Relations Certification Authority
Authority=Apple Root CA

Doesn’t the codesign output above indicate that you’re already enrolled in the Apple Developer Program, which grants you the privilege of notarizing the app without additional costs?

zeitlings · April 29, 2023

12 minutes ago, xilopaint said:

Doesn’t the codesign output above indicate that you’re already enrolled in the Apple Developer Program, which grants you the privilege of notarizing the app without additional costs?

Nope

xilopaint · April 29, 2023

2 minutes ago, zeitlings said:

Nope

I know there’s an annual cost for the enrollment. My question was about the codesign output for your binary:

Authority=Apple Development: pat****@*****.com (W2VCU4XR6K)
Authority=Apple Worldwide Developer Relations Certification Authority
Authority=Apple Root CA

I thought this kind of output implied that you were already enrolled, which means you have already paid for the annual membership and are able to notarize binaries without any additional costs. If it's not true, what kind of certificate is this?

zeitlings · April 29, 2023

Whatever kind of certificate Apple grants you for development when you log in with your Apple ID, I guess ¯\_(ツ)_/¯
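
For anyone wondering how to read the output quoted above: the first `Authority=` line names the leaf certificate that signed the binary, and the later lines are its issuers. Notarization expects a "Developer ID" signature, which is tied to the paid program, whereas an "Apple Development" certificate can also be issued to a free Apple ID. Below is a minimal sketch in Python that classifies the leaf entry. It assumes the text comes from `codesign -dvv`; the helper names `leaf_authority` and `is_developer_id` are made up for illustration.

```python
from typing import Optional

# Sample copied from the codesign output quoted in this thread.
SAMPLE = """\
Authority=Apple Development: pat****@*****.com (W2VCU4XR6K)
Authority=Apple Worldwide Developer Relations Certification Authority
Authority=Apple Root CA
"""


def leaf_authority(codesign_output: str) -> Optional[str]:
    """Return the first Authority entry, i.e. the signing (leaf) certificate."""
    for line in codesign_output.splitlines():
        line = line.strip()
        if line.startswith("Authority="):
            return line[len("Authority="):]
    return None


def is_developer_id(leaf: Optional[str]) -> bool:
    """True if the leaf certificate name carries the Developer ID prefix."""
    return leaf is not None and leaf.startswith("Developer ID Application")


if __name__ == "__main__":
    leaf = leaf_authority(SAMPLE)
    print(leaf)
    print("Developer ID signature:", is_developer_id(leaf))
```

With the sample from this thread, the leaf is an "Apple Development" certificate, so the check reports that it is not a Developer ID signature.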

xilopaint · April 30, 2023

6 hours ago, zeitlings said:

Whatever kind of certificate Apple grants you for development when you log in with your Apple ID, I guess ¯\_(ツ)_/¯

Could you guide me on how to sign a binary the way you did?

Edited April 30, 2023 by xilopaint

xilopaint · April 30, 2023

5 hours ago, xilopaint said:

Could you guide me on how to sign a binary the way you did?

Never mind, I just managed to do it by configuring the "Signing and Capabilities" section in Xcode.

Edited April 30, 2023 by xilopaint

zeitlings · April 30, 2023

1 hour ago, xilopaint said:

Never mind, I just managed to do it by configuring the "Signing and Capabilities" section in Xcode.

Ah, yes, for the bare minimum, checking "Automatically manage signing" and selecting a Team should be enough. To have a Command Line Tool type project signed and populated with some metadata, I manually create an Info.plist file for the project and make sure it is embedded in the binary via Build Settings.

[Attached image: infoplist.png]

Faris Najem · June 14, 2023

On 3/2/2023 at 7:27 PM, zeitlings said:

Alfred OCR

I noticed that Apple's Vision framework finally produces some usable results.

This means: No external dependencies are required to perform the OCR.

Alfred OCR Light

The workflow allows you to copy text from images using optical character recognition.

Take a snapshot with your mouse or trackpad to automatically copy the recognized text to the clipboard.

Alfred OCR+

The workflow allows you to copy text from images, or to convert PDF files into searchable PDF documents
using optical character recognition, and to apply compression to PDF documents.

1 / Snapshot
Take a snapshot with your mouse or trackpad to automatically copy the recognized text to the clipboard.

Default shortcut: ⌘+⇧+6

Default keyword: ocr

2 / PDF Document

To convert a PDF into a searchable PDF document, pass it to the workflow’s Universal Action.
To compress the resulting PDF, pass the source document on while pressing the ⌘+⇧ keys.

To open the resulting PDF, pass the source document on while pressing the ⌥+⇧ keys.

To force the replacement of a source document, pass it on while pressing the ⌥+⌘ keys.

To compress a PDF without performing OCR, pass it to the Compress PDF Document File Action.

To view the progress tracker, invoke the workflow again with the keyword (default: ocr).

Configuration

To open the OCR Workflow Configuration, type the keyword preceded by a colon (default: :ocr).

Languages

Specify the languages you want the OCR process to consider by adding the appropriate RFC 5646 language tag. The following languages (and regions) are currently supported: en-US, fr-FR, it-IT, de-DE, es-ES, pt-BR, zh-Hans, zh-Hant, yue-Hans, yue-Hant, ko-KR, ja-JP, ru-RU, uk-UA

Explanations:

en-US: (English as used in the United States)

de-DE: (German as used in Germany)

fr-FR: (French as used in France)

it-IT: (Italian as used in Italy)

es-ES: (Spanish as used in Spain)

pt-BR: (Portuguese as used in Brazil)

ko-KR: (Korean as used in South Korea)

uk-UA: (Ukrainian as used in Ukraine)

ja-JP: (Japanese as used in Japan)

ru-RU: (Russian as used in Russia)

yue-Hant: (Traditional Cantonese)

yue-Hans: (Simplified Cantonese)

zh-Hant: (Traditional Chinese)

zh-Hans: (Simplified Chinese)
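
To illustrate how a comma-separated language setting such as `en-US, de-DE, fr-FR` could be checked against the supported tags, here is a minimal sketch in Python. It is not the workflow's actual code: the function name `split_language_tags` is hypothetical, `SUPPORTED_SUBSET` is only an illustrative subset of the list above, and `ar-SA` stands in for any tag that is not supported.

```python
from typing import Iterable, List, Tuple

# Illustrative subset of the tags listed above, not the full list.
SUPPORTED_SUBSET = {"en-US", "fr-FR", "de-DE", "zh-Hans", "zh-Hant"}


def split_language_tags(raw: str, supported: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split a comma-separated setting into (accepted, rejected) tags.

    Order is preserved; empty entries and duplicates are skipped.
    Matching is exact, so the tags are case-sensitive in this sketch.
    """
    supported_set = set(supported)
    accepted: List[str] = []
    rejected: List[str] = []
    for tag in (part.strip() for part in raw.split(",")):
        if not tag or tag in accepted or tag in rejected:
            continue
        (accepted if tag in supported_set else rejected).append(tag)
    return accepted, rejected


if __name__ == "__main__":
    print(split_language_tags("en-US, de-DE, fr-FR", SUPPORTED_SUBSET))
    print(split_language_tags("en-US, ar-SA", SUPPORTED_SUBSET))
```

Reporting the rejected tags separately makes it easy to tell a user which entries in the configuration are being ignored.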

Change Log

v1.3.0 (OCR+)

Added PDF compression

Added a keyword for quick access to the workflow configuration (Alfred 5.1+)

Added Universal Action modifier option to apply compression to PDFs (⇧⌘)

Added Universal Action modifier option to open converted PDFs in the default application (⌥⇧)

Added a configuration option to open converted PDFs in the default application

Added a configuration option to specify how text should be joined when taking a snapshot

Added a File Action to compress PDF documents

Changed the modifier keys to replace a PDF and added noticeable visual cues (⌥⌘)

Changed the way an export strategy is specified by using a pop-up selection box

Improved performance

v1.2.3 (OCR+)

Fixed an error thrown due to missing workflow cache directory

Fixed snapshot tasks queuing up if they are started before the previous task has finished

Added explicit opt-out of Snapshot tasks while PDF conversions are running

v1.2.2 (OCR+)

Fixed low contrast output images produced for some PDF documents

Added progress tracker for the document recognition process

Added three options to handle document output: export to location, copy to same location and replace. Priority behavior: Replace > Copy > Export.

Added new icons

Improved output file size for PDF documents that do not already contain text
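
The priority rule from the v1.2.2 notes (Replace > Copy > Export) can be read as "if several output options are enabled, the highest-ranked one wins". A minimal sketch of that reading in Python follows; the `Output` enum and `resolve_output` are hypothetical names, and the workflow's real implementation may differ.

```python
from enum import Enum
from typing import Optional, Set


class Output(Enum):
    REPLACE = "replace"  # overwrite the source document
    COPY = "copy"        # write next to the source document
    EXPORT = "export"    # write to the configured export location


# Highest priority first, as stated in the change log.
PRIORITY = (Output.REPLACE, Output.COPY, Output.EXPORT)


def resolve_output(enabled: Set[Output]) -> Optional[Output]:
    """Return the highest-priority enabled option, or None if none is enabled."""
    for option in PRIORITY:
        if option in enabled:
            return option
    return None


if __name__ == "__main__":
    print(resolve_output({Output.EXPORT, Output.COPY}))
```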

v1.1.0 (OCR Light)

Updated configuration and documentation

Added new icon

Hi zeitlings, could you please add Arabic OCR? It would help me so much. 💐

zeitlings · June 17, 2023

Hey @Faris Najem, Arabic will be available as soon as Apple adds it as a supported language. Beyond that, there is nothing I could do 🤷‍♂️

Faris Najem · June 17, 2023

1 hour ago, zeitlings said:

Hey @Faris Najem, Arabic will be available as soon as Apple adds it as a supported language. Beyond that, there is nothing I could do 🤷‍♂️

OK, I will wait. Thank you so much 💐.

xilopaint · July 18, 2023

@zeitlings, the workflow is broken for me:

[21:00:20.969] OCR[Universal Action] Processing complete
[21:00:20.987] OCR[Universal Action] Passing output '/Users/xxx/Desktop/sample.pdf' to Arg and Vars
[21:00:20.989] OCR[Arg and Vars] Processing complete
[21:00:20.990] OCR[Arg and Vars] Passing output '' to Run Script
[21:00:21.014] OCR[Run Script] Processing complete
[21:00:21.016] OCR[Run Script] Passing output 'OCR Failure: Recognizer init failure (ocr)
' to Conditional
[21:00:21.017] OCR[Conditional] Processing complete
[21:00:21.017] OCR[Conditional] Passing output 'OCR Failure: Recognizer init failure (ocr)
' to Debug
[21:00:21.018] OCR[Debug] 'OCR Failure: Recognizer init failure (ocr)
', {
  compress = "0"
  DEV = ""
  export_path = "/Users/xxx/Desktop"
  export_strategy = "exp.same"
  gristle = "nl"
  key = "ocr"
  languages = "en-US, de-DE, fr-FR"
  open_pdf = "1"
  pdf_path = "/Users/xxx/Desktop/sample.pdf"
  revision = "3"
}
[21:00:21.019] OCR[Debug] Processing complete
[21:00:21.019] OCR[Debug] Passing output 'OCR Failure: Recognizer init failure (ocr)
' to Play Sound

Edited July 18, 2023 by xilopaint

zeitlings · July 19, 2023

Hey, thanks for reporting. I'll have a look at it soon.

xilopaint · August 16, 2023

On 7/19/2023 at 7:05 AM, zeitlings said:

Hey, thanks for reporting. I'll have a look at it soon.

Any news?

Afoan · August 17, 2023

There was a way I could use the Arabic language, but now I can't remember how I did it. Can anyone help?

zeitlings · August 19, 2023

On 8/16/2023 at 12:45 PM, xilopaint said:

Any news?

Hey, yeah. The program has undergone quite a few internal changes and I am not 100% happy with all of the results of the alterations.

However, I'm cobbling together a version that allows falling back to the previous (v1.3.0) state if something unexpected should happen.

On 8/17/2023 at 6:57 AM, Afoan said:

There was a way I could use the Arabic language, but now I can't remember how I did it. Can anyone help?

Hey @Afoan, since Apple doesn't support Arabic right now I don't think it ever worked with this workflow.

Edited August 19, 2023 by zeitlings

zeitlings · August 19, 2023

OCR v1.4.0

Added bitmap compression and compression facets
Added embedding strategy options
- "Word Granularity" attempts to embed the text word by word
- "Line Oriented" is the strategy previously used (use if you encounter unexpected results)
Improved OCR embedding granularity
Fixed 'Recognizer init' error

The new default is an improved embedding strategy that tries to align every word with its underlying picture. This works best with straightforward text documents, but may sometimes yield unexpected results. If that is the case for you, you may want to opt for the "Line Oriented" embedding strategy, which was the default up until now. The compression facet "Aggressive (Quartz)" was the default until now.

NB: Running compression alone, with any bitmap method, on PDF documents that already contain text may result in inaccuracies in the position of embedded words.

xilopaint · August 19, 2023

@zeitlings The software got stuck on my first attempt with the “Word Granularity” strategy, so I had to kill the ocr process and delete the cache file to make it work with the “Line Oriented” strategy; otherwise the workflow stays permanently unusable. To fix the bug, the cache needs to be cleared or rewritten at the start of each run.
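
A minimal sketch in Python of what "cleared or rewritten at the start of each run" could look like. The file name `progress.log`, its contents, and the function `start_run` are assumptions for illustration only; the workflow's actual cache layout is not shown in this thread.

```python
from pathlib import Path

PROGRESS_FILE = "progress.log"  # hypothetical name for the cached progress file


def start_run(cache_dir: Path) -> Path:
    """Prepare the cache for a new run and return the fresh progress file.

    Creates the cache directory if it is missing and discards any progress
    file left behind by a run that was killed or crashed.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    progress = cache_dir / PROGRESS_FILE
    progress.unlink(missing_ok=True)  # requires Python 3.8+
    progress.write_text("0\n")        # start the new run at zero progress
    return progress


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        print(start_run(Path(tmp) / "cache").read_text())
```

Resetting at the start, rather than relying only on cleanup at the end, means a killed process cannot leave the next run blocked by stale state.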

Edited August 19, 2023 by xilopaint

zeitlings · August 19, 2023

@xilopaint, is this happening with all PDFs you are testing? Nothing sticks on my end 🤷‍♂️ The workflow cleans up after itself, which should only fail if there is a proper panic at some point. Is it possible that an old progress log file was still around from before updating?

xilopaint · August 19, 2023

3 hours ago, zeitlings said:

@xilopaint, is this happening with all PDFs you are testing?

I just tried with a second file. This time the process has been completed with the “Word Granularity” strategy but the OCR quality with "Line Oriented" is way better.

3 hours ago, zeitlings said:

Is it possible that an old progress log file was still around from before updating?

It doesn't matter. After clearing the cache I always face the same issue with “Word Granularity” if I run the workflow with the first PDF file.

Edited August 19, 2023 by xilopaint

zeitlings · August 20, 2023

Could you share a - perhaps truncated - version of the PDF file where this happens? Maybe I can trace what happens under the hood and where things go wrong with it. Thanks for making the tests!

May I ask what kind of PDFs you usually scan? (My primary test sources are journal articles.)

Edited August 20, 2023 by zeitlings

xilopaint · August 20, 2023

Unfortunately I can't share the files.

TomBenz · October 15, 2023

https://github.com/zeitlings/alfred-workflows/releases/tag/v1.1.1-ocr

In OCR Light, please consider adding a universal file action that runs OCR on an image file and copies the content to the clipboard. This would be similar to the one for PDFs.

zeitlings · November 10, 2023

On 10/15/2023 at 5:11 AM, TomBenz said:

In OCR Light, please consider adding a universal file action that runs OCR on an image file and copies the content to the clipboard.

Added to version 1.2.0

OCR Light v1.2.0

Add File Action to extract text from images
Fix for macOS Sonoma (Compiles the script en passant to compensate for the failure to link objc symbols on macOS 14).

Edited November 11, 2023 by zeitlings
