nicooprat

OCR: extract text from snapshot


Hi there,

 

Just sharing my first workflow. Some OCR workflows already exist, but they rely on an obscure Chinese API with exposed personal credentials... This one uses your system's own installation of `tesseract`. Just take a snapshot and paste the text. The script usually takes no more than a few seconds.

 

https://github.com/nicooprat/alfred-ocr

 

alfred-ocr.png

 

PRs welcome.

Hope it helps!

Edited by nicooprat


Mega useful for me. Thnx.

P.S. I stripped the warning message using Alfred's "Replace" utility with this regex:

Warning: Invalid resolution.*?\nEstimating resolution.*?\n
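
For anyone who would rather filter inside the script than in Alfred's "Replace" utility, here is a minimal sketch of the same idea in bash. The function name `strip_tesseract_warnings` is made up for illustration, and the two patterns are simply the line prefixes from the regex above, so this assumes the warnings always start their own lines.

```shell
#!/bin/bash
# Hypothetical helper (not part of the workflow): drop the two warning
# lines matched by the regex above and pass everything else through.
strip_tesseract_warnings() {
  # grep -v exits non-zero when nothing is left to print, so "|| true"
  # keeps an all-warning (or empty) input from counting as a failure.
  grep -v -e '^Warning: Invalid resolution' -e '^Estimating resolution' || true
}

# Possible use in the workflow script (left commented out on purpose):
# tesseract /tmp/ocr_snapshot.png stdout 2>&1 | strip_tesseract_warnings
```

This only removes whole lines that begin with those prefixes; recognized text that happens to start the same way would be dropped too.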

 

 

image.png.9956fa8143dd365d0cb4736756b4e829.png

Edited by bikeNik


It doesn't work for me.. 

 

I get the error 

 

Quote

Error, unknown command line argument '-l'

 

Reinstalled tesseract and still have this problem. I am on tesseract 4.0. 


I recently added the `lang` parameter with `-l` flag:

tesseract /tmp/ocr_snapshot.png stdout -l {query} 2>&1

It should allow you to run the command as `OCR fra` for French, for example, but it should still work without any parameter. I tried it just now and it works (macOS High Sierra and tesseract 3.05.01).

 

Possible solutions:

 

* add a check in the bash script that drops the `-l` flag when `{query}` is empty (I don't know how to do this)

* downgrade to tesseract 3, as version 4 seems to behave differently

* remove this part of the bash script if you don't need it (see the attached screenshot)
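
The first option in the list could look roughly like the sketch below. It is not taken from the workflow: the function name `run_ocr` and the `TESSERACT_BIN` override are assumptions added for illustration, and it has not been checked against tesseract 4.0's argument parsing. The idea is only to append `-l` when a language was actually typed.

```shell
#!/bin/bash
# Hypothetical helper: run tesseract on an image, adding "-l <lang>"
# only when a language code was supplied.
run_ocr() {
  local image="$1"   # path to the snapshot
  local lang="$2"    # Alfred's {query}; may be an empty string
  local args=("$image" stdout)

  if [ -n "$lang" ]; then
    args+=(-l "$lang")
  fi

  # TESSERACT_BIN is an assumed override so the command can be swapped
  # out for a dry run; it defaults to the tesseract found on PATH.
  "${TESSERACT_BIN:-tesseract}" "${args[@]}" 2>&1
}

# Possible call from the workflow script (left commented out on purpose):
# run_ocr /tmp/ocr_snapshot.png "{query}"
```

Running it with `TESSERACT_BIN=echo` prints the arguments instead of calling tesseract, which is a quick way to confirm that `-l` disappears when the query is empty.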

 

Hope it helps.

 

 

Capture d’écran 2018-12-06 à 11.27.55.png


I tried solution No. 3. Thanks for sharing. By the way, I am wondering if it is normal to have a "warning" in the clipboard.

1107155509_ScreenShot2018-12-15at22_42_31.png.2d38b068bed050f815e6d8b9bbdd0696.png

 

It's very useful for me now for recognizing English. How can I make this work with Chinese?

 

 

1 hour ago, ReinaSuo said:

I tried the solution No.3. Thanks for sharing. btw I am wondering if it is normal to have "warning" in clipboard.

1107155509_ScreenShot2018-12-15at22_42_31.png.2d38b068bed050f815e6d8b9bbdd0696.png

 

It's very useful for me while recognizing English now. How can I make this work with Chinese?

 

 

I tried `brew install tesseract --with-all-languages`  again.😭

Here is what I got:

Error: An exception occurred within a child process:

  Errno::ENOENT: No such file or directory @ rb_sysopen - 

 

Edited by ReinaSuo

On 12/15/2018 at 3:43 PM, ReinaSuo said:

I tried the solution No.3. Thanks for sharing. btw I am wondering if it is normal to have "warning" in clipboard. 

1107155509_ScreenShot2018-12-15at22_42_31.png.2d38b068bed050f815e6d8b9bbdd0696.png

 

It's very useful for me while recognizing English now. How can I make this  work with Chinese?

 

 

The warning is almost certainly there because you're taking the screenshot on a monitor other than your main one. I can't find anything in Tesseract to avoid this... The simplest workaround is to drag your app window to your main monitor and take the screenshot from there. Results should be much better!

 

As far as I know, Tesseract falls back to English (`eng`) when no language is given, so for Chinese you will probably need to pass one explicitly. In the latest version of my Alfred script (see https://github.com/nicooprat/alfred-ocr/blob/master/README.md#usage), you can type "OCR [lang]" where "[lang]" is in this list: https://github.com/tesseract-ocr/tesseract/blob/b67ea2c1a70c56053e142a5fb7cc18fb29cdc4b8/src/training/language-specific.sh#L21
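
Before typing something like `OCR chi_sim`, it may help to check that the matching language data is actually installed. The sketch below is an illustration only: `lang_installed` and the `TESSERACT_BIN` override are hypothetical names, and it assumes `tesseract --list-langs` prints one language code per line after a header line.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the given language code appears in the
# output of "tesseract --list-langs", fail otherwise.
lang_installed() {
  local lang="$1"
  # 2>&1 because some tesseract versions print the list on stderr.
  # -F: fixed string, -x: whole-line match, -q: quiet.
  "${TESSERACT_BIN:-tesseract}" --list-langs 2>&1 | grep -Fqx -- "$lang"
}

# Possible use (left commented out on purpose):
# if lang_installed chi_sim; then
#   tesseract /tmp/ocr_snapshot.png stdout -l chi_sim 2>&1
# else
#   echo "chi_sim data not found; install it before running OCR"
# fi
```

`chi_sim` (Simplified Chinese) and `chi_tra` (Traditional Chinese) are the codes to look for in the linked list.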

 

On 12/15/2018 at 5:11 PM, ReinaSuo said:

I tried `brew install tesseract --with-all-languages`  again.😭

Here is what I got:


Error: An exception occurred within a child process:

  Errno::ENOENT: No such file or directory @ rb_sysopen - 

 

 

I can't help with this one, you should seek help on the Tesseract project: https://github.com/tesseract-ocr/tesseract/issues
