
Reference Importer: search for an article/book from a variety of sources and import the corresponding reference data (BibTeX, PDF) into BibDesk, copy BibTeX to clipboard, go to the landing page for the article, or copy a formatted reference. Also supports reference lookup from a PDF file. (This workflow was formerly known as "Citation Search" and "AIAA Search")

 

(follow link to GitHub for a more detailed README and download)

 

Update 5/22/2013: added conversion of non-Latin characters to proper LaTeX.  Also removed the default cite key so that BibDesk can populate it using the user-defined style.

Update 5/28/2013: fix due to slight change of crossref.org API

Update 12/4/2013: major update including improved parsing, getting PDFs in addition to BibTeX, Google Books, Google Scholar, reverse PDF lookup.  more details in comments below and in README

Update 12/6/2013: improvement to DOI parsing from PDF text

Update 3/15/2014: minor fixes and improvements

Update 4/2/2014: unicode fix

Update 4/3/2014: another unicode fix, this one for Google Books

Edited by andrewning


what database(s) does this search?

 

never mind...just saw the read me.  This is pretty sweet.

Edited by dfay


This is a real time-saver, thanks again.

 

Would it be possible to add an option to harness BibDesk to produce a formatted citation on the clipboard? (through a modifier key or separate command).

 

This should just involve adding a few lines to the AppleScript that calls BibDesk:

 

 

set thePub to selection of document 1

 

export document 1 using template "template name" to clipboard for thePub


Good suggestion.  The API I'm using through doi.org can actually return formatted citations instead of BibTeX, so I thought about including that functionality.  However, to use it you have to know the exact name of the citation style, and although some names were abbreviated, many were ridiculously long (and it wasn't a feature I would use very often), so I left it out.  I didn't think about accomplishing it through BibDesk.  I'll look into it later this week.
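For anyone curious what asking doi.org for a formatted citation might look like, here is a minimal sketch. It assumes the DOI content negotiation convention of sending an `Accept: text/x-bibliography; style=<name>` header; the function names `build_citation_request` and `fetch_citation` are made up for illustration and are not taken from the workflow's code.

```python
from urllib.parse import quote
from urllib.request import Request, urlopen


def build_citation_request(doi, style="apa"):
    """Build a doi.org request that asks for a formatted citation.

    Assumption: the registration agency behind the DOI honours the
    'text/x-bibliography' content type and knows the given style name.
    """
    url = "https://doi.org/" + quote(doi, safe="/")
    headers = {"Accept": "text/x-bibliography; style=" + style}
    return Request(url, headers=headers)


def fetch_citation(doi, style="apa"):
    """Return the formatted citation as text (needs network access)."""
    with urlopen(build_citation_request(doi, style)) as response:
        return response.read().decode("utf-8")
```

The style name has to match one the service recognises exactly, which is the inconvenience described above.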


@dfay I've started moving in the direction you suggested.  I like your idea of going through BibDesk, but I was hoping to come up with a more generally useful solution for people who don't use BibDesk, and I'm not sure I like opening BibDesk just to format a reference for the clipboard.  I added a feature to optionally copy a formatted reference using the API through doi.org, but there are some limitations I'm still looking into (read more here).  I'll continue to investigate.  I also added an option to go directly to the article's landing page.

Edited by andrewning


I updated this workflow and broadened its scope beyond BibTeX-related actions, so I've renamed the workflow and moved the discussion here (because it doesn't look like I can change the title of this discussion).

 

For future reference, after you choose “edit” on the post, you have to pick “use full editor”, to change the title.

Ah, thanks.  I'll stay in this thread then and remove the new one I just created.


Excellent, thanks.  I will (probably) mostly use this to grab a quick reference for an email, so the restriction to APA format is not a big deal.


Update to convert non-latin characters to proper LaTeX.  If you come across a reference that doesn't work let me know.

I also removed the default citekey so that BibDesk can populate the citekey using whatever style you've defined.
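To give a rough idea of what converting accented characters to LaTeX involves, here is a minimal sketch. It is not the workflow's actual code: the names `COMBINING_TO_LATEX` and `to_latex` are hypothetical, and it only handles single-accent letters for a handful of accents, passing everything else through unchanged.

```python
import unicodedata

# Hypothetical mapping from Unicode combining marks to LaTeX accent commands.
COMBINING_TO_LATEX = {
    "\u0300": "`",   # grave
    "\u0301": "'",   # acute
    "\u0302": "^",   # circumflex
    "\u0303": "~",   # tilde
    "\u0308": '"',   # diaeresis
}


def to_latex(text):
    """Replace accented letters with LaTeX accent commands where known."""
    out = []
    for ch in text:
        # NFD splits an accented letter into base letter + combining mark.
        decomposed = unicodedata.normalize("NFD", ch)
        if len(decomposed) == 2 and decomposed[1] in COMBINING_TO_LATEX:
            accent = COMBINING_TO_LATEX[decomposed[1]]
            out.append("\\%s{%s}" % (accent, decomposed[0]))
        else:
            out.append(ch)
    return "".join(out)
```

A real converter would need a much larger table (ligatures, cedillas, non-Latin scripts), so treat this only as an illustration of the idea.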


Hmmm....this was working great until this morning...now every search is resulting in an error - but the error message includes a doi URL that works fine when copied and pasted.  I'm wondering if there was a change in the doi API that broke the workflow?

 

 Here's an example of what's returned if I use the "copy formatted reference" command:

 

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
 
<html>
<head>
<title>DOI Naming Authority [http:] Not Found</title>
</head>
<body bgcolor="#ffffff">
 
<table border="0" width="680" cellspacing="2" cellpadding="2">
<colgroup> <col width="10%"> <col width="90%"> </colgroup>
<tr>
<td align="center" colspan="2">
<img src="http://www.doi.org/images/banner2.gif" alt="The DOI Logo" width="680" height="53" border="0" align="middle">
<h3 align="center">Error - DOI Naming Authority [http:] Not Found</h3>
</td>
</tr>
 
<tr>
<td align="left" valign="top">     </td>
 
<td align="left" valign="top">
<hr noshade width="80%" size="1">
 
<p>The DOI you requested -- </p>
<p><b>http://dx.doi.org/10.1080/13504639851663</b></p> <p> -- cannot be found in the Handle System.</p>
<p>Possible reasons for the error are:</p>
<ul>
<li>the DOI has not been created </li>
<li>the DOI is cited incorrectly in your source</li>
<li>the DOI does not resolve due to a system problem</li>
</ul>
 
 
 
 
 
</td>
</tr>
 
<tr>
<td align="left" valign="top">    </td>
<td><hr noshade width="80%" size="1">
<p>If you believe you have requested a DOI that should be found, you may report this error by filling out the form below:</p>
<form action="http://notfound.doi.org/DoiError/servlet" method="POST" enctype="application/x-www-form-urlencoded" name="notFoundForm">
 
<table width="100%" border="0" cellspacing="2" cellpadding="2">
<tr>
<td>
<table width="100%" border="0" align="center" cellpadding="2" cellspacing="2">
<tr>
<th width="35%" align="right" scope="row"><label>Missing DOI:</label></th>
<td width="65%"><input name="missingHandle" type="text" value="http://dx.doi.org/10.1080/13504639851663" size="35" readonly="true"></td>
</tr>
<tr>
<th align="right" scope="row"><label>Referring Page:</label></th>
<td><input name="referringPage" type="text" value="" size="35" readonly="true"></td>
</tr>
<tr>
<th align="right" scope="row">E-mail address:</th>
<td><input name="userEmailAddress" type="text" value="Please enter your email address" size="35"></td>
</tr>
<tr>
<th align="right" scope="row">Comments:</th>
<td><textarea name="comments" cols="25" rows="3"></textarea></td>
</tr>
</table>
</td>
</tr>
<tr>
<td align="center"><p><input name="send" type="submit" value="Submit"></p></td>
</tr>
</table>
 
</form>
 
</td>
</tr>
<tr><td align="left" valign="top">
    </td> <td>
<hr noshade width="80%" size="1"> <p align="center"> </td>
</tr>
</table>
 
<p><strong><em><small><a href="http://www.doi.org/">DOI Web Site</a><br></small></em></strong></p>
 
</body></html>


Thanks for the report.  You're right, it looks like there was a minor change in the JSON results returned by the CrossRef API.

 

I've fixed it, but won't be able to update the repository until tonight.  

If you want a fix right now, in doi.py change the following 2 lines

line 33: doi = j['doi'].split('dx.doi.org/')[1]  
line 36: attributes={'uid': doi, 'arg': doi},

That should do the trick.  I'll push out the update tonight.
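For reference, a slightly more defensive take on the same idea is sketched below. It assumes, as in the fix above, that the API now returns the DOI as a full `dx.doi.org` URL, and it falls back to the raw value if the format changes back. The helper name `doi_from_value` is made up for this example and is not part of doi.py.

```python
def doi_from_value(value):
    """Return the bare DOI whether given a bare DOI or a doi.org URL."""
    marker = "doi.org/"
    if marker in value:
        # Keep everything after the first 'doi.org/' (covers dx.doi.org too).
        return value.split(marker, 1)[1]
    return value
```

With this, `doi_from_value(j['doi'])` would yield the bare DOI in either case, so the full URL is no longer passed on as if it were the DOI, which is what the error page quoted earlier shows happening.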


Yep, that fixed it!  Thanks a bunch.  It was only when it broke that I realized how much this has become an essential part of my research and reading.


Glad to hear it.  I've pushed the fixed version to github.


I've just pushed a major update to GitHub (and changed the name of the workflow).  Some of the changes include:

  • in addition to getting the BibTeX, it can automatically download PDFs and link them to the BibDesk entry where possible
  • Google Books -- can now get BibTeX for textbooks
  • Google Scholar -- get BibTeX and PDFs from Google Scholar
  • parsing improvements for the main CrossRef search
  • Reverse PDF Search -- starting with a PDF on your computer you can use a keyboard shortcut or a File Action to scan the PDF for a DOI (or some relevant search terms if it can't find a DOI) and start a reference lookup search.  It will then import the BibTeX and link the PDF in BibDesk.
  • AIAA Search -- A search for papers from the American Institute of Aeronautics and Astronautics.  Can import BibTeX and PDFs (AIAA subscription required for PDFs).  Formerly this was a separate workflow but since it shares so much of the same code I've put it in here.
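
The reverse PDF search in the list above depends on spotting a DOI in text extracted from the PDF. A minimal sketch of that step is below, assuming a commonly used DOI regular expression; it is not the workflow's own parser, and `DOI_PATTERN` and `find_doi` are hypothetical names.

```python
import re

# Assumed pattern: '10.', a 4-9 digit registrant code, '/', then a suffix
# that runs until whitespace, a quote, or an angle bracket.
DOI_PATTERN = re.compile(r'\b10\.\d{4,9}/[^\s"<>]+')


def find_doi(text):
    """Return the first DOI-like string in text, or None if there is none."""
    match = DOI_PATTERN.search(text)
    if match is None:
        return None
    # Drop punctuation that often trails a DOI at the end of a sentence.
    return match.group(0).rstrip(".,;)")
```

When `find_doi` returns `None`, a tool would have to fall back to other search terms, as the list item describes.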

 

More detailed README and downloads on GitHub

Edited by andrewning


A couple of minor updates and fixes

- fix for crossref extra spaces (thanks @axidio)

- remove unnecessary crossref italics tags

- check aiaa server status

- more reliable unicode handling


This workflow is so great!

But for the past few weeks it has not copied the BibTeX entry. Last week it always said "Bibtex not available". Today I updated to the latest version and now nothing is copied. BibDesk is also no longer opened. Or could this be related to my update to OS X 10.9 over the weekend?

Edited by manzel


I love this workflow so much. I'm using it to build my reference section as I write (I know, manually building it is not the most efficient method but I'm too close to the end to change my habits now :) ). I'm not sure if it's possible or not, but the following would make my life a little easier:

 

1. When I grab a formatted reference, it comes out like this:

 

Lehman, D. R., Chiu, C., & Schaller, M. (2004). Psychology and Culture. Annual Review of Psychology, 55(1), 689â714. doi:10.1146/annurev.psych.55.090902.141927
 
Is there some way to allow rich text formatting so that italicized text gets italicized, and to get the dash between page numbers to come out properly? 
 
2. APA formatted citations for the other methods... I know this is probably too complicated.
 
And these things don't really matter, I'm already saving time by using this workflow. 
 
Thank you so much for sharing!!
 
Katie


Without getting too deep into the details, the best way to get RTF text is to use the Mac's built-in textutil command. This is what I use in ZotQuery to get rich text citations. I've posted below a snippet of code from ZotQuery which takes HTML (the uref variable) and sets the clipboard to the RTF version of it. 

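import applescript  # AppleScript helper module used by ZotQuery

# `uref` holds the HTML reference and `wf` is the workflow object;
# both are defined elsewhere in ZotQuery.

# Convert to ASCII-only HTML; non-ASCII characters become character references
html_ref = uref.encode('ascii', 'xmlcharrefreplace')

# Write HTML to a temporary file ('wb' because encode() returns bytes)
html_path = wf.cachefile(u"temp.html")
with open(html_path, 'wb') as f:
    f.write(html_ref)

# Convert HTML to RTF and copy to clipboard
a_script = """
do shell script "textutil -convert rtf " & quoted form of "{0}" & " -stdout | pbcopy"
""".format(html_path)
applescript.asrun(a_script)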

Hope this helps,

stephen


Hi Katie, I'm glad to hear that you like the workflow!

 

The en-dash between the page numbers looks like something I may need to fix.  I'll look into that.  Regarding the rich text formatting, I could probably add that also---thanks for the tip on that Stephen.
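
As a guess at the cause (not a confirmed diagnosis of the workflow), the stray "â" in the formatted reference quoted earlier in the thread is what you would see if the UTF-8 bytes of an en-dash were decoded as Latin-1. The sketch below reproduces that symptom and reverses it; `garble` and `repair` are illustrative names only.

```python
def garble(text):
    """Simulate the bug: UTF-8 bytes misread as Latin-1."""
    return text.encode("utf-8").decode("latin-1")


def repair(text):
    """Undo the misreading by re-encoding as Latin-1 and decoding as UTF-8."""
    return text.encode("latin-1").decode("utf-8")


pages = "689\u2013714"      # page range with a real en-dash
garbled = garble(pages)     # 'â' followed by two invisible control characters
restored = repair(garbled)  # back to the original page range
```

If this is what is happening, the cleaner fix is to decode the API response as UTF-8 in the first place rather than repairing the text afterwards.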

 

For your second question, my suggestion would be to use BibDesk (http://bibdesk.sourceforge.net/).  It's free and open-source, and this workflow is designed to work well with BibDesk.  If you use BibDesk, you won't have to keep re-downloading references you have already used, and if you fix up a reference (sometimes the data is not exactly correct) your corrections will be retained for future papers.  The benefit for your question is that in BibDesk's preferences you can set the output to "apalike", and then you can select the publications you want and copy them as rich text.  This may not be ideal for your setup because it requires an extra step, but it would be the easiest and most reliable way forward.

 

The formatted reference feature was something I just added on at the end, because crossref happened to have an API for it.  It's not something I ever use, but I'm glad to hear it's useful to you.  I could probably add the ability to convert the BibTeX to a formatted reference for any of the methods, but for now I'd suggest using BibDesk.

 

- Andrew


Thanks, Andrew!

 

I keep all of my papers in Papers2 but the reference and citation information is so bad and unreliable that it's pretty much useless so I don't use their magic manuscript (citation) functions. They don't support major Psychology databases so I had to rely on their Google Scholar import, which had some problems (e.g. the last author of every reference is not imported). I've "been meaning to" go back and make sure the citation information is correct but the truth is that I will probably never get around to it: I have a large library and I'm close to the end. I've been thinking about moving my papers to another program, so thanks for the tip! I'll check it out to see how much work/time it would take to move my library over. 

 

Even though some references I get via this workflow require some fixing at times, it's never been outright wrong about the paper. So I love it! I've already started showing it off to my supervisor  ;) 

 

Thank you for sharing it!!

 

Katie


Katie,

 
Glad to hear it works well for you.  I've thought about adding integration with Papers and/or Mendeley down the road.  For now though, my suggestion was just that you could use BibTeX to handle the conversion to APA format, regardless of where you end up storing (or not storing) your database.  I'm referring to converting to APA for the other methods that don't use crossref, which already does that.
 
Even still, I will plan to fix your issue with en- and em-dashes, and look into adding rich text copying.
