Skimmer: PDF actions for Skim

DrLulz · January 19, 2015

I’ve been trying to clean up my fork of Skimmer. I’ve added options to export to OmniOutliner or Evernote, extract images, clean up text, count words, etc.

Description & Screenshots

GitHub Repository

If you don’t mind I have a few questions.

I’m having difficulty adding images sequentially in Evernote. It seems that I’m only able to append to a note if the append directly follows note creation. This makes it nearly impossible to loop through the annotations, adding them in sequence. Instead, I have to make a list of images (by grabbing the boundaries of each box note), create the note with HTML, and then loop through the list of images, adding them all at the end of the note. I found this post which describes this limitation. The relevant line in the script is 373. Do you know of a better way?

Although I advise everyone to try Alfred and Skimmer, some don’t want to install Alfred (I know, crazy). Is it possible to use the _skimmer.app here without the Skimmer Workflow? Can they just run the app once to use it for linking back to the page?

Edited January 19, 2015 by DrLulz

smarg19 · January 19, 2015

Dude, can I just say that your website is gorgeous and there are some seriously cool and clever things added to Skimmer in your fork. I would def be interested in merging them at some point. Again, really great stuff.

On to the questions. I haven't been around this codebase in a while, so I may need to re-acquaint myself, but here are my initial thoughts.

First, you can use the _skimmer.app without the Alfred Workflow. The only requirement is that you run it once for it to register the URL scheme with OS X.

As for the images problem, that required some research on my part, but I do think I have a solution. In short, you need to save the images to a temporary directory (as you do with your write_temp sub-routine), generate a URL to that file, and create an HTML image link with that URL. Then, you simply insert that image link in the appropriate place in the note HTML. As an example, here's some sample code:

set posix_path_to_image to my en_import(page_index)
set image_url to my encode_text(posix_path_to_image, false, false)
set image_html to "<div><img src=\"" & image_url & "\"/></div>"

Now you have plain HTML to add into the HTML note you are creating. It can go right after the annotation or wherever. This should give you enough to fix it up.
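
The approach described above, building the whole note body in one pass so that each image sits right after its annotation instead of at the end, can also be sketched outside AppleScript. What follows is a minimal Python 3 illustration, not Skimmer's actual code: `build_note_html` and the shape of the `annotations` list are hypothetical names chosen for the example.

```python
from html import escape
from pathlib import PurePosixPath


def build_note_html(annotations):
    """Return one HTML string with text and images in document order.

    `annotations` is a hypothetical list of (kind, value) pairs, where
    kind is "text" (value is the annotation text) or "image" (value is
    an absolute POSIX path to an image file saved earlier).
    """
    parts = []
    for kind, value in annotations:
        if kind == "text":
            # Escape &, < and > so annotation text cannot break the markup
            parts.append("<div>" + escape(value) + "</div>")
        elif kind == "image":
            # as_uri() percent-encodes the path and prefixes it with file://
            uri = PurePosixPath(value).as_uri()
            parts.append('<div><img src="' + uri + '"/></div>')
        else:
            raise ValueError("unknown annotation kind: " + kind)
    return "\n".join(parts)
```

The resulting string would then be handed to Evernote in a single create-note call, so no appending after creation is needed.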

For the sake of testing, here's the code I used to test that this image html link would work:


tell application "Finder" to set sel to the selection as text
set image_url to "file://localhost" & my path2url(POSIX path of sel)
set image_html to "<div><img src=\"" & image_url & "\"/></div>"
tell application "Evernote" to create note with html image_html title "IMAGE TESTING" notebook my en_inbox()

on path2url(thepath)
	return do shell script "python -c \"import urllib, sys; print (urllib.quote(sys.argv[1]))\" " & quoted form of thepath
end path2url

on en_inbox()
	tell application "Evernote" to return name of (every notebook whose default is true)
end en_inbox

NOTE You should use the en_inbox() subroutine, since not everyone uses a notebook entitled "Inbox" as their default notebook.
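
One caveat for anyone reusing the path2url handler today: it shells out to `python` and calls `urllib.quote`, which exists only in Python 2. In Python 3 the same function lives in `urllib.parse`. A minimal Python 3 sketch of the same encoding step might look like this (the `host` parameter is an addition for illustration, not something in the original handler):

```python
from urllib.parse import quote


def path2url(posix_path, host="localhost"):
    """Percent-encode a POSIX path and return a file:// URL for it."""
    # quote() leaves "/" unescaped by default, so path separators survive
    return "file://" + host + quote(posix_path)


if __name__ == "__main__":
    print(path2url("/Users/drlulz/Desktop/en_image/my image.png"))
```

Spaces and other unsafe characters are percent-encoded, while the slashes in the path are left alone.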

Hope this helps, and hopefully we can integrate all this cool stuff together at some point.

DrLulz · January 21, 2015

Thanks for the information and the kind words.

Apparently Yosemite has an issue with getting files via file://localhost. The link is a bit off the mark, but describes my problem.

Evernote just crashes when I try to add anything with that scheme. OmniOutliner gives the message below when trying to get the image (that smiling Finder icon makes me want to slap my computer). If I configure Apache I can do http://localhost/~DrLulz/image.png, but I don't want to go that route because users may get stuck. Thoughts?

Edited January 21, 2015 by DrLulz

smarg19 · January 21, 2015

Hmm... I'm on Yosemite, and the test code I posted above worked fine for me. Did you try that test script, and did it fail? Or did that simple example work and a more real-world test fail? And what version of Evernote do you have?

DrLulz · January 21, 2015

I was hoping you wouldn't say that... Yep, the script crashes Evernote every time.

Edit: EN v6.0.5

I can do this:

set image_html to "<div><img src=\"http://cdn.searchenginejournal.com/wp-content/uploads/2013/09/google-panda-penguin.jpg\"/></div>"

tell application "Evernote" to create note with html image_html title "IMAGE" notebook my en_inbox()

on en_inbox()
	tell application "Evernote" to return name of (every notebook whose default is true)
end en_inbox

But not:

set image_html to "<div><img src=\"file://localhost/Users/drlulz/Desktop/en_image/image.png\"/></div>"

Edited January 21, 2015 by DrLulz

smarg19 · January 21, 2015

Odd. I'm on the beta channel for Evernote, so I have version 6.0.6 Beta 1 (451237 Direct). What version exactly do you have?

I've re-tested and it still works on my machine.

DrLulz · January 21, 2015

6.0.5 (451190 App Store)

Did you do a clean install when you upgraded? I did; would that matter?

Edit: The scheme works in Safari.

Edited January 21, 2015 by DrLulz

DrLulz · January 21, 2015

I updated to 6.0.6 Beta 1 (451237 Direct)

Works like a charm. <_<

DrLulz · January 24, 2015

I thought I'd put this here; if it's not the right place, let me know. The Dropbox link below is a somewhat cohesive version of my fork.

A few more questions.

If I want to include _skimmer.app in the workflow do I have your permission, and if yes what is the correct way to credit you?

In your experience, how often do new app releases break workflows? Initially I had everything set up correctly in the above posts, but I never thought Evernote itself was the issue, and I still think it's odd that file://localhost will not run from OmniOutliner after Yosemite.

Dropbox

Edited January 26, 2015 by DrLulz

derekvan · January 24, 2015

Hi Stephen,

I've finally gotten around to experimenting with your export script. It's amazing! With DEVONthink's new support for Markdown, this makes it easy to dump MD into DT.

I've just got a question about customizing the sort in the script. Would it be possible to have the highlights sorted by page number (instead of alpha or timestamp)?

Thanks again for making these tools, and packaging them for easy use.

Derek

smarg19 · January 24, 2015

I guess the best way to credit would be to alter the dialog box that pops up when you first run the app (to register the URL scheme). You could just put "Created by Stephen Margheim" there or something. I'm not really that picky.

As to apps breaking workflows, I'd say very infrequently. Clearly it happens, but I've had very, very few times where a different version of the app was at fault. It is, though, one reason why you should always have people give you the workflow version, app version(s), and debug output when they report an error.
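
As a rough illustration of gathering the app version for such a report, a macOS app bundle records its version strings in Contents/Info.plist under the standard keys CFBundleShortVersionString and CFBundleVersion. The sketch below reads them with Python 3's standard library; `app_version` is a hypothetical helper, and the Evernote path in the usage line is an assumption about where the app is installed.

```python
import plistlib
from pathlib import Path


def app_version(app_path):
    """Return (short_version, build) read from an .app bundle's Info.plist."""
    info_path = Path(app_path) / "Contents" / "Info.plist"
    with open(info_path, "rb") as fh:
        info = plistlib.load(fh)
    # Either key may be missing in an unusual bundle, hence .get()
    return (info.get("CFBundleShortVersionString"), info.get("CFBundleVersion"))


if __name__ == "__main__":
    # Assumed install location; adjust to wherever the app actually lives
    print(app_version("/Applications/Evernote.app"))
```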

smarg19 · January 24, 2015

Derek, I plan on one day (hopefully sooner, rather than later) really revamping this workflow and making it world-class, but I just don't have the time right now. However, this is a feature that I've wanted to implement for a while, and you caught me at a great time. I don't have the time to update the whole workflow and all that jazz, but you can use this version of the export.applescript to have all annotation types sorted by page number, not by timestamp. Just copy the code from this Gist and paste it into that file in the workflow directory. I've tested it some, so hopefully it'll just work, but I can't promise that any future debugging will be as quick as this turnaround was.
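
The Gist itself is not reproduced in this thread. Purely to illustrate the idea of ordering annotations by page rather than by timestamp, here is a minimal Python 3 sketch; the record layout (dicts with a "page" key) and the name `sort_by_page` are hypothetical and do not reflect how export.applescript stores its data.

```python
from operator import itemgetter


def sort_by_page(annotations):
    """Return annotations ordered by page number.

    sorted() is stable, so annotations on the same page keep the
    relative order they had in the input list.
    """
    return sorted(annotations, key=itemgetter("page"))


if __name__ == "__main__":
    notes = [
        {"page": 12, "text": "later highlight"},
        {"page": 3, "text": "early highlight"},
        {"page": 12, "text": "second note on page 12"},
    ]
    for note in sort_by_page(notes):
        print(note["page"], note["text"])
```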

derekvan · January 25, 2015

Tried this out a few times, and it looks like it works great. This workflow works great for me--no worries if you've not time to improve it! Thanks so much for making this small change for me. I was able to customize some elements of the script, but the sort mechanism is over my head.

wendellpbloyd · May 1, 2015

Hi all, is anyone else having problems with the latest version of the script?

Ever since I updated, it no longer recognises the textual category during the highlights export.

I tried reinstalling it, but no luck :\

I do not get any error; when I export a note, it simply says:

[INFO: alfred.workflow.input.keyword] Processing output 'alfred.workflow.action.script' with arg '' 
[INFO: alfred.workflow.action.script] Processing output 'alfred.workflow.output.notification' with arg 'Exported notes to Evernote as HTML

The colors seem to be set correctly, so I have no idea why it is not working.

Alternatively, would it be possible to download a previous version somewhere?

rice.shawn · May 2, 2015

Since Packal uses GitHub as a backend for the repository, you can usually find older versions in the version history for the file. Here's the one for Skimmer, if you want to try to downgrade and see if that fixes your problem.

wendellpbloyd · May 11, 2015

Thanks!

I actually downgraded to 2.2.1 and now it works perfectly.

Cassady · May 13, 2015

Wow.

How I wish I had found your workflow 5 years ago, BEFORE I tried to OCR double page scans!

Thank you. It will prove invaluable going forward, and will save many hours of time!

Moses · November 7, 2015

Same issue (not recognising the textual category). I followed the same steps (downgrading to 2.2.1) and I am still not getting it working. Still super useful, though. Thanks as always, smarg19.

Moses · November 7, 2015

I am trying to tweak the export script, currently working with:

--If user hasn't configured, set defaults

set highlight_rec to {{_title:"#", _color:{193, 255, 193}}, {_title:"##", _color:{152, 245, 255}}, {_title:"###", _color:{255, 222, 173}}, {_title:"Notes", _color:{238, 238, 0}}, {_title:"Support", _color:{139, 139, 0}}, {_title:"Contentious", _color:{255, 193, 193}}}

I could not identify which colour format was in use, so I used RGB, but it is not working out. Any suggestions, please, on which colour format AppleScript uses here? I use this colour picker site: http://www.farb-tabelle.de/en/rgb2hex.htm?q=RosyBrown1

Thanks

P.S. Setting up the skimmer_config_new.pdf did not work for me.

Moses · November 8, 2015

See http://www.organognosi.com/skim-and-highlighting-color-codes/#codesyntax_1 for methods of obtaining the Skim colour codes.

Moses · November 8, 2015

So, changing the defaults as I have just described is working for me. Great!
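
For anyone hitting the same mismatch: AppleScript colour values generally use 16-bit channels (0 to 65535) rather than the 0 to 255 range most colour pickers show. Assuming that is what Skim reports here too, 8-bit values from a picker can be scaled as in the Python 3 sketch below. `rgb8_to_rgb16` is a hypothetical helper, and the converted values are only a starting point, since the colour Skim actually stores for a highlight may differ slightly from the picker's value.

```python
def rgb8_to_rgb16(rgb):
    """Scale 8-bit channels (0-255) to 16-bit channels (0-65535)."""
    for channel in rgb:
        if not 0 <= channel <= 255:
            raise ValueError("channel out of range: %r" % (channel,))
    # 255 * 257 == 65535, so multiplying by 257 maps the full range exactly
    return [channel * 257 for channel in rgb]


if __name__ == "__main__":
    # The first default from the post above, {193, 255, 193}
    print(rgb8_to_rgb16((193, 255, 193)))  # [49601, 65535, 49601]
```

Reading the colour back from an existing highlight in Skim, as the linked page describes, remains the more reliable route.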

Yoyontzin · December 18, 2015

Hi,

Thanks again for this wonderful workflow! I really like it. I haven't used it for a while, but today I was trying to use the split PDF script from this workflow and the result was a PDF with empty odd pages. I ran the debugger and here is the result:

[INFO: alfred.workflow.input.keyword] Processing output 'alfred.workflow.action.script' with arg ''

and that is it!!

I tried with different PDFs and the result was the same... every other page empty. Can you please help me? Thanks a lot!

kithairon · January 26, 2016

After a lapse of some months in my use of Skimmer, I tried splitting a shortish PDF yesterday (v3.2.1 / El Cap / Skim 1.4.17). The script seems to run as usual, but the pages end up simply doubled rather than split. Are there known incompatibilities with the latest OS or Skim versions? Any news on a fix? (Grateful user here: I've successfully tidied my huge PDF library with it.)

dfay · June 29, 2016

This problem arises because smargh's AppleScript action_pdf-splitter.scpt in the workflow bundle calls /System/Library/Automator/Combine PDF Pages.action/Contents/Resources/join.py, an embedded script in one of the stock Automator actions. Unfortunately, this Automator action doesn't seem to work in El Cap.

If you open action_pdf-splitter.scpt in Script Editor and comment out the line

-- tell application "Finder" to delete (every item of (cache_path as alias))

you can look in the workflow cache directory ~/Library/Caches/com.runningwithcrayons.Alfred-2/Workflow Data/com.hackademic.skimmer and you'll find the PDFs for the individual pages there. You can then combine them in Preview or with Automator's New PDF from Images action.

* On further testing, the script in the Combine PDF Pages Automator action works fine with most PDFs, just not the ones generated by the Skim AppleScript. But the New PDF from Images action works. Go figure.

I edited the (*HANDLERS*) section of the script as shown below:

on combinePDFPages(_title, cache_path, temp_path, orig_path)
	--convert AS paths to POSIX paths
	set orig_posix to (POSIX path of (orig_path as alias)) as string
	set temp_posix to (POSIX path of (temp_path as alias)) as string
	
	--prepare final PDF file path
	set _file to orig_posix & _title & "_split.pdf"
	
	(*	--combine ALL individual PDF pages into new, single PDF
	do shell script "\"/System/Library/Automator/Combine PDF Pages.action/Contents/Resources/join.py\" -o " & (quoted form of POSIX path of _file) & space & (quoted form of temp_posix) & "*.pdf"
	
	--delete the individual page PDFs
	tell application "Finder" to delete (every item of (cache_path as alias))
	
	--open new PDF
	set _file_ to ((POSIX file _file) as alias)
	tell application "Skim"
		open _file_
		go front document to page 1 of front document
	end tell *)
	
	tell application "Finder" to open (cache_path as alias)
	
end combinePDFPages

This basically replaces the actual combining of PDFs with opening a Finder window on the folder where all the individual pages are stored. I use this infrequently enough that I'll just proceed manually from there.
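
When combining the per-page files by hand, they need to be picked up in page order. The thread does not say how the files in the cache folder are named; if they carry unpadded numbers, a plain alphabetical listing would put page 10 before page 2. The Python 3 sketch below lists a folder's PDFs in numeric-aware order. `natural_key` and `page_files` are hypothetical helpers, and the naming scheme is an assumption.

```python
import re
from pathlib import Path


def natural_key(name):
    """Split a file name into text and integer chunks for numeric-aware sorting."""
    # re.split with a capture group alternates text, digits, text, ...
    # so chunks at the same index always have the same type
    return [int(chunk) if chunk.isdigit() else chunk.lower()
            for chunk in re.split(r"(\d+)", name)]


def page_files(cache_dir):
    """Return the .pdf files in cache_dir, page 2 before page 10."""
    return sorted(Path(cache_dir).glob("*.pdf"), key=lambda p: natural_key(p.name))
```

The returned list can then be dragged into Preview in that order or fed to whatever tool does the merge.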

Edited June 29, 2016 by dfay

FrancescoCaviglia · August 12, 2016

This solution works great, thanks Dfay!

A minor problem (which may be unrelated to Skimmer) is that the resulting PDF files may sometimes contain wrong page-orientation information.

I OCRed a large file made with PDFScanner, which recognized the text well but saved the searchable text with a different orientation on a few pages.

(I had not experienced this problem with other scans, but I have not tried this with many files.)

As a workaround (thanks to the people at PDFScanner) you can:

Select the pages and run Edit -> Convert to black and white before performing OCR. That will re-render the pages and therefore correct the orientation information.
