zeitlings, posted March 2

I noticed that Apple's Vision framework finally produces usable results. This means OCR without external dependencies!

AlfredOCR

Description: The workflow lets you copy text from images using optical character recognition. Take a snapshot with your mouse or trackpad, and the recognized text is copied to the clipboard. No external dependencies are required to perform the OCR.

‣ Download on GitHub
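For readers wondering what text recognition with the Vision framework looks like in code, here is a minimal sketch. It is not taken from the AlfredOCR source. It assumes macOS 12 or later (where `VNRecognizeTextRequest.results` is typed), and the function name `recognizeText(in:)` and the command-line usage are made up for illustration.

```swift
import Foundation
import AppKit
import Vision

/// Recognizes text in a CGImage and returns one line per detected text region.
/// Hypothetical helper, not part of the workflow.
func recognizeText(in image: CGImage) throws -> String {
    let request = VNRecognizeTextRequest()
    request.recognitionLevel = .accurate   // slower than .fast, usually better results
    request.usesLanguageCorrection = true

    // perform(_:) runs synchronously; results are set on the request afterwards.
    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    try handler.perform([request])

    let observations = request.results ?? []
    return observations
        .compactMap { $0.topCandidates(1).first?.string }
        .joined(separator: "\n")
}

// Assumed usage: the image path is passed as the first command-line argument.
if CommandLine.arguments.count > 1,
   let nsImage = NSImage(contentsOfFile: CommandLine.arguments[1]),
   let cgImage = nsImage.cgImage(forProposedRect: nil, context: nil, hints: nil) {
    let text = try recognizeText(in: cgImage)
    // Put the recognized text on the clipboard, as the description above says.
    NSPasteboard.general.clearContents()
    NSPasteboard.general.setString(text, forType: .string)
    print(text)
}
```

The sketch reads an existing image file; how the actual workflow captures the snapshot is not covered here.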
vitor, posted March 2

Nice! This would be a quick and useful one to add to the Gallery. Just two notes:

- The icon is very low resolution; it even looks pixelated in the editor. When exporting from SF Symbols you can pick the size, and the recommended size for workflows is 256x256px.
- Your repo is quite organised, but it has workflows which can go in the Gallery (like this one) and others which cannot (like the dictionary workflow, due to the unsigned binary). That is OK, but because they are shared as releases (as opposed to files in the repo), it becomes harder to check for updates: GitHub only provides a unified releases feed. The more you release, the harder it will become to separate them. To be clear, posting to releases is the preferred method, just not when the repo has many unrelated workflows. Would you consider having them in their own repos, or having some checked-in files which are modified when the corresponding workflow is updated, for example? Basically, the idea is to provide something which can be checked for changes.

Also, may I recommend adding a Hotkey Trigger? I can see myself adding ⌘⇧6 as a natural shortcut to this one.
zeitlings (author), posted March 3 (edited)

Sure, ⌘⇧6 feels like a natural extension. There's already an updated version that also bundles a higher-resolution icon.

As for the dictionary workflow, it works by calling some cryptic API endpoints that are only accessible via Objective-C. Unfortunately, as far as I know, there is no way to do this in plain Swift.

I'll send you a message about the rest.
xilopaint, posted March 5

On 3/2/2023 at 1:27 PM, zeitlings said: "I noticed that Apple's Vision framework finally produces some usable results. This means: OCR without external dependencies! [...]"

Nice workflow. Would it be possible to make it work on PDF files via a File Action?
zeitlings (author), posted March 5

4 hours ago, xilopaint said: "Nice workflow. Would it be possible to make it work in PDF files via a File Action?"

I guess so. My first experiments, from which the workflow is derived, were actually with PDF documents. I'll play around with that sometime.
zeitlings (author), posted March 5

OK, here's a follow-up. I was thinking about converting PDFs to searchable PDFs by embedding a hidden text layer. It turns out PDFKit doesn't provide any access to the underlying PDF content streams, and no alternative way to embed text layers. At best, the information can be inserted as annotations, which are not embedded statically but as objects that anyone can change at will.

This is rather annoying, because the Preview app shows that PDFKit is very much capable of embedding text layers. Example: when you open a PDF with no text, or an image, in the Preview app, the "Live Text" feature lets you select and copy recognized text as if OCR had been fully performed. When exporting the PDF you can even enable "Embed Text", which does exactly what we're trying to accomplish here. (And they do sell it as a feature of PDFKit.)

Anyway, as it stands now, it's a convoluted process I haven't made sense of yet. Pulling the plain text out of PDFs without an OCR layer isn't a problem, though. But I'm not sure how useful that is ¯\_(ツ)_/¯
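The "pulling the plain text out" part can be sketched roughly as follows. This is an illustration under assumptions, not the author's code: it assumes macOS 12 or later, the function name `extractText(fromPDFAt:scale:)` and the `scale` parameter are hypothetical, and it only returns plain text. It does not embed a text layer, which is the part described above as unsolved.

```swift
import Foundation
import AppKit
import PDFKit
import Vision

/// Returns the text of each page: the embedded text layer if one exists,
/// otherwise the result of running OCR on a rendered bitmap of the page.
func extractText(fromPDFAt url: URL, scale: CGFloat = 2.0) throws -> [String] {
    guard let document = PDFDocument(url: url) else { return [] }
    var pages: [String] = []

    for index in 0..<document.pageCount {
        guard let page = document.page(at: index) else { continue }

        // Pages that already carry text need no OCR.
        if let embedded = page.string,
           !embedded.trimmingCharacters(in: .whitespacesAndNewlines).isEmpty {
            pages.append(embedded)
            continue
        }

        // Render the page to a bitmap. `scale` is a guess that trades speed for accuracy.
        let bounds = page.bounds(for: .mediaBox)
        let size = CGSize(width: bounds.width * scale, height: bounds.height * scale)
        let image = page.thumbnail(of: size, for: .mediaBox)
        guard let cgImage = image.cgImage(forProposedRect: nil, context: nil, hints: nil) else {
            pages.append("")   // keep page indices aligned even if rendering fails
            continue
        }

        // Synchronous OCR of the rendered page.
        let request = VNRecognizeTextRequest()
        request.recognitionLevel = .accurate
        try VNImageRequestHandler(cgImage: cgImage, options: [:]).perform([request])
        let lines = (request.results ?? []).compactMap { $0.topCandidates(1).first?.string }
        pages.append(lines.joined(separator: "\n"))
    }
    return pages
}
```

Because every page is processed synchronously, a long document will block until the last page is done, so small test files are the safer starting point.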
sepulchra, posted March 6

This would be a great addition if it were possible, and thank you for the super useful workflow in the meantime. I've found this command line tool really useful for OCR on existing PDFs and have Alfred set up to trigger it with a workflow, but obviously it would be far more convenient if PDFKit were able to do the work instead.
zeitlings (author), posted March 7

😱 Try this! I managed to get some acceptable results. The internal font handling and bounding box scaling work with some heuristics for now, though. Also, since there's no progress tracking and the code is completely synchronous, it's best to test the workflow on small documents. Still, the debugger will log some landmarks that you can review after the fact.

@sepulchra You're welcome 🤗 "OCRmyPDF" will most likely give you better results, and should probably remain your go-to if it is already set up. But at least here are a few steps towards a native solution 😁.
sepulchra, posted March 7 (edited)

Hey, this is great. Would it be possible to add a modifier key that gives the option of overwriting the existing file instead of exporting to another location?
xilopaint, posted March 10 (edited)

On 3/7/2023 at 2:34 PM, zeitlings said: "😱 Try this!"

Wow! This is impressive and promising! Would you consider adding a suffix to the name of the OCR'd document? I think it could be an option in the User Configuration.

Also, I think /tmp is not a good export location. I'd suggest using ~/Desktop. That's what ~/Desktop is for, so the user can decide later where to put the file.
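The two naming requests in this thread (a configurable suffix with ~/Desktop as the export location, and an overwrite option) could be combined along these lines. This is only a sketch: `outputURL(for:suffix:overwrite:exportDirectory:)`, the `_ocr` default, and the example paths are hypothetical and do not reflect the workflow's actual settings.

```swift
import Foundation

/// Builds the destination for an OCR'd document.
/// `suffix`, `overwrite`, and `exportDirectory` are hypothetical settings.
func outputURL(for source: URL,
               suffix: String = "_ocr",
               overwrite: Bool = false,
               exportDirectory: URL? = nil) -> URL {
    // Overwriting means writing back to the original location.
    if overwrite { return source }

    // Default to ~/Desktop when no export directory is configured.
    let directory = exportDirectory
        ?? FileManager.default.homeDirectoryForCurrentUser.appendingPathComponent("Desktop")

    // Insert the suffix between the base name and the extension.
    let baseName = source.deletingPathExtension().lastPathComponent
    let ext = source.pathExtension
    let fileName = ext.isEmpty ? baseName + suffix : baseName + suffix + "." + ext
    return directory.appendingPathComponent(fileName)
}

let source = URL(fileURLWithPath: "/Users/me/Documents/scan.pdf")
let exports = URL(fileURLWithPath: "/Users/me/Exports")
print(outputURL(for: source, exportDirectory: exports).path)
// prints "/Users/me/Exports/scan_ocr.pdf"
```

In a workflow, the suffix would presumably come from a User Configuration variable and the overwrite flag from the modifier key held during the File Action.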