Skimmer Actions for PDF Viewer Skim

To download, visit the Packal page

Version: 2.2.1

Description

This is a fairly simple workflow that works with the free Mac PDF app Skim. Skim is a fantastic app with great Applescript support. This workflow provides quick, easy access to a few custom Applescripts that I've written to deal with certain pesky problems I come across when dealing with PDFs.

There are currently only 3 actions:

- Crop and Split PDF
- Extract Data and Search Google Scholar
- Search your PDFs

First, Skimmer allows you to properly format those darned scanned PDFs. You know the ones I'm talking about: two book pages scanned into one landscape-oriented PDF page. I want all of my PDFs in pretty, proper format, with one PDF page corresponding to one portrait-oriented book/article page. In the past, it was quite the ordeal to crop the PDF so that the right- and left-hand margins were equal, then to split each individual page, and finally to reconstruct the entire PDF. Skimmer makes this whole process as simple as π. You can use either a Hotkey or the Keyword split to activate this feature. Skimmer then does 3 things:

- Crop the PDF using a user-inserted Line Annotation (if necessary) (see image below)
- Split the two-page PDF into individual pages
- Re-assemble everything and clean up

Let me walk you through the process. To begin, you will need to ensure that the two scanned book pages have equal margins. Skimmer will split the PDF page right down the middle, so we want the middle of the PDF to be the middle of the two pages. If the margins are unequal, you only need to use Skim's Line Annotation to create a border for Skimmer. Here's an example:

Note the small, vertical line at the bottom of the page. Skimmer will crop off everything to the left of this line. You could put the line anywhere on the page. If the right-hand margin were too big, you could put the line on the right, and Skimmer would automatically crop the excess to the right of that line. If both margins are too big, you can put a line on each side and Skimmer will take care of the rest. Note that Skimmer will crop every page at this point, so find the farthest extremity on any page and use that as your guide. Skimmer can tell which page you are looking at and uses it as the cropping template (in the image above, one of the middle pages is being used). Skimmer does not crop Top or Bottom Margins, so you will need to manually crop PDFs with wacky top and/or bottom margins.

Once Skimmer has cropped the PDF, it will go through and split each page into two separate pages. Depending on the length of the PDF, this can take a bit (approx. 0.67 seconds per original PDF page). This is all done invisibly though, so that's a bonus. To ensure that Skimmer splits the PDF properly, regardless of orientation, the script will split the first page and ask you what portion of the page you are seeing (left-hand, right-hand, top-half, or bottom-half). Your choice will ensure that Skimmer does the splitting just so. After it splits all the pages, Skimmer will save a copy of your original PDF and then close it as it opens the new, split PDF. This new PDF will be properly formatted and saved in the same folder as the original PDF. Here's an example of the PDF above after it was automatically cropped and split:

For anyone who deals with lots of scanned PDFs, I can promise you, this is a godsend.

The second feature will take OCR'd PDFs, try to extract relevant search information, and then search Google Scholar (which makes it easy to then add citation information to your citation manager of choice. Users of ZotQuery will immediately see where I'm going with this...). This feature can be activated by a user-assigned Hotkey or by the Keyword extract when the desired PDF is open in Skim. It will look for three possible things in the currently viewed page:

- a DOI (Digital Object Identifier)
- an ISBN (for books)
- a JSTOR title page

If it cannot find any of these things, it will present you with a list of Capitalized Words from the currently viewed page. You then select whichever words you want to be the Google Scholar query. Once the query is chosen (whether automatically as one of the 3 types above, or from user-chosen keywords), Skimmer will automatically launch your default browser to Google Scholar using the query. What you do from there is up to you.

Finally, you can also search through all of your PDFs and open any one of them right in Skim. Use either the keyword `skimmer` or the shorter `sk` to begin the query. Then enter your query term. The results will update as you type. You can hit `return` to open any item directly in Skim, or you can `right-arrow` to enter Alfred's file browser for that item.

As I said, these are the only three functions for Skimmer currently, but I will be adding at least one more (for exporting notes) soon enough. If you have any killer Applescripts for the Skim app, let me know and maybe we can add them in.

Here's to PDF management,
stephen
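For readers curious how the extract step might work, here is a minimal Python sketch of the lookup order described above: DOI first, then ISBN, then a fallback list of capitalized words. It is not the workflow's actual Applescript. The function names and regular expressions are assumptions made for illustration, the patterns are deliberately loose, and JSTOR title-page detection is left out because the post does not say how it is recognised.

```python
import re
from urllib.parse import urlencode

# Loose patterns, assumed for illustration; real-world DOIs and ISBNs vary.
DOI_RE = re.compile(r"\b10\.\d{4,9}/[^\s\"<>]+")
ISBN_RE = re.compile(r"\bISBN(?:-1[03])?:?\s*([0-9Xx][0-9Xx\- ]{8,16}[0-9Xx])")


def choose_query(page_text):
    """Return a DOI or ISBN found in the page text, or None if neither is present."""
    doi = DOI_RE.search(page_text)
    if doi:
        # Trailing sentence punctuation is not part of the identifier.
        return doi.group(0).rstrip(".,;)")
    isbn = ISBN_RE.search(page_text)
    if isbn:
        # Drop hyphens and spaces so only the digits (and X) remain.
        return re.sub(r"[\s-]", "", isbn.group(1))
    return None


def capitalized_words(page_text):
    """Fallback: unique capitalized words, in order of first appearance."""
    seen = []
    for word in re.findall(r"\b[A-Z][a-z]+\b", page_text):
        if word not in seen:
            seen.append(word)
    return seen


def scholar_url(query):
    """Build a Google Scholar search URL for the chosen query."""
    return "https://scholar.google.com/scholar?" + urlencode({"q": query})


text = "Journal of Examples. doi:10.1000/xyz123. All rights reserved."
query = choose_query(text) or " ".join(capitalized_words(text))
print(scholar_url(query))
```

The resulting URL could then be handed to the default browser, for example with Python's `webbrowser.open`.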
First off, I am new to programming. My interest in such things was sparked by the discovery of Alfred, and the Skimmer workflow which I use daily. This project began as a simple format change of the Evernote export Skimmer offers, to suit outlines. I’ve looked at code from some workflows here, and I’m not fooling myself: in many ways what I have is still simple and is still very much based on Skimmer.

What I have thus far is a functioning Applescript which builds off of the Skim export to include exporting to OmniOutliner, as well as Evernote. I take my school PowerPoint lectures, convert them to PDF, and mark them up with annotations. When I’m finished I export with options specific to the lecture PDF. For example, I’ve included image extraction, text correction, word frequency, etc.

Description & Screenshots | GitHub Repository

Short Term Goals:

1. Move the Applescript into an Alfred Workflow where I can pass option selections directly.

Examples:
`export -oo` = Export to OmniOutliner
`export -en` = Export to Evernote
`export -oo -i` = Export to OmniOutliner & Extract Images

2. Shorten / Clean up the Applescript

I can’t help feeling that there is a better way to accomplish some of the ideas in the script. The problem is that I’m blind to these areas. I don’t know whether there is a better way to do xxx.

3. Evernote Images

I’m having difficulty adding images sequentially in Evernote. It seems that I’m only able to append to a note if the append directly follows note creation. This makes it nearly impossible to loop through the annotations and add them in sequence. Instead, I have to make a list of images (from grabbing the boundaries of a Skim box note), create the note with HTML, and then loop through the list of images, adding them all at the end of the note. I found this post which describes this limitation. The line is 373 in skim-2-oo-n-en.scpt (repository). Does anyone know a way around this?

Long Term Goals:

1. Speed up the option “Find Spaces”

Some PDFs, converted from .pptx, have mangled text. In some instances, when text is copied & pasted from the PDF, all the spaces are removed. I’ve tried many different ways of converting the .pptx (Mac & PC PowerPoint, Online Converters, Office Online, etc.), but the issue persists. I’ve implemented this python code in Applescript with a few additions, but I imagine that it would be faster if I had the .py in its own file and passed arguments to it from the script. Is there a simple example, using Alfred, that I can study?

The whole process depends on a word list sorted by frequency. Leaving computer specs aside, the speed is a function of the number of words in the list and the amount of annotated text, and the quality depends on the type of words. I’ve pieced together a medical word list using Corpora, whereby I made individual searches of nouns, verbs, etc. filtered by Medical, Speech and Academics. The results output a list with each word and its overall frequency. I combined all the searches in a spreadsheet, sorted by frequency, and then removed duplicates. This list is good for PDFs with medical terms, but not so good if the text is not medical. For those I used these extensive lists here. I would like to call up Alfred, type in `export -oo -fs`, and then in the Alfred dropdown be able to select which list I want to use.

Also, the .py function breaks the string if numbers are contained in the annotation. Every character after a number is separated by a space (There are 7 d e a d l y s i n s). I need to figure out a way to pass over numbers and resume after the number.

2. Find a new way to determine word frequency in the PDF

This frequency is not related to the above. This is the top 50 words contained in the entire PDF presentation. I use it to get the gist of the lecture. Currently I’m using Applescript to accomplish this, but even though my journey began with Applescript, I’m finding it less appealing every day. Good for some things, but unnecessary for most. I would like to use something different, preferably python, because in my uneducated state I seem to think it is intuitive. However, I still need to incorporate a list of words to ignore, so that I don’t get “The word “the” appeared 147 times in document X.”

3. Change export format via Alfred

Similar to my short term goal, yet different, and not needed immediately. I’ve noticed many workflows allow the user to set default options by doing something like `export -d`, which brings up a different menu. In this menu I could see Define Default Font or Define Default Font Size, then select my option and type in the font I’d like to use.

I’m looking for any ideas, suggestions, examples, documentation, or forums for python similar to macscripter.net. Alfred has really changed the way I work and study, and I’m still surprised more people don’t use it or know about it.
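Two of the long-term goals above, passing over numbers in “Find Spaces” and counting top words while ignoring common ones, could be sketched in Python roughly as follows. This is not the script from the repository. The rank-based word cost, the `UNKNOWN_COST` penalty, the tiny word list, and all function names are assumptions made for illustration. The idea for numbers is to cut the text into runs of letters, runs of digits, and everything else, and to segment only the letter runs.

```python
import re
from collections import Counter
from math import log

UNKNOWN_COST = 50.0  # arbitrary penalty for text that is not in the word list


def build_costs(words_by_frequency):
    """Map each word to a cost that grows with its rank (most frequent first)."""
    n = len(words_by_frequency)
    return {w: log((rank + 1) * log(n + 1)) for rank, w in enumerate(words_by_frequency)}


def split_letters(text, costs, max_len):
    """Split one run of letters into the cheapest sequence of words."""
    lowered = text.lower()
    best = [(0.0, 0)]  # best[i] = (total cost, length of last word) for text[:i]
    for i in range(1, len(text) + 1):
        candidates = []
        for k in range(1, min(i, max_len) + 1):
            word = lowered[i - k:i]
            # Unknown chunks are allowed, but cost far more than any listed word.
            cost = costs.get(word, UNKNOWN_COST * (k + 1))
            candidates.append((best[i - k][0] + cost, k))
        best.append(min(candidates))
    words = []
    i = len(text)
    while i > 0:  # walk back through the recorded word lengths
        k = best[i][1]
        words.append(text[i - k:i])
        i -= k
    return " ".join(reversed(words))


def find_spaces(text, costs):
    """Re-insert spaces into letter runs; digits and punctuation pass through."""
    max_len = max(map(len, costs), default=1)
    out = ""
    for tok in re.findall(r"[A-Za-z]+|[0-9]+|[^A-Za-z0-9]+", text):
        if re.fullmatch(r"[A-Za-z]+", tok):
            tok = split_letters(tok, costs, max_len)
        if out and re.match(r"[A-Za-z0-9]", tok):
            out += " "  # separate words and numbers; punctuation stays attached
        out += tok
    return re.sub(r"\s+", " ", out).strip()


def top_words(text, ignore, limit=50):
    """Most common words in the text, skipping any word in the ignore set."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in ignore).most_common(limit)


costs = build_costs(["the", "are", "there", "deadly", "sins"])
print(find_spaces("Thereare7deadlysins", costs))  # There are 7 deadly sins
print(top_words("The cell and the nucleus. The cell divides.", {"the", "and"}, limit=2))
```

A real word list would be loaded from a file, most frequent word first, and the ignore set could come from a second file so that it can be swapped per subject.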