Reference Importer

andrewning · April 27, 2013

Reference Importer: search for an article/book from a variety of sources and import the corresponding reference data (BibTeX, PDF) into BibDesk, copy BibTeX to clipboard, go to the landing page for the article, or copy a formatted reference. Also supports reference lookup from a PDF file. (This workflow was formerly known as "Citation Search" and "AIAA Search".)

(follow link to GitHub for a more detailed README and download)

Update 5/22/2013: added conversion of non-Latin characters to proper LaTeX. Also removed the default cite key so that BibDesk can populate it with the user-defined style.

Update 5/28/2013: fix due to slight change of crossref.org API

Update 12/4/2013: major update including improved parsing, getting PDFs in addition to BibTeX, Google Books, Google Scholar, and reverse PDF lookup. More details in the comments below and in the README.

Update 12/6/2013: improvement to DOI parsing from PDF text

Update 3/15/2014: minor fixes and improvements

Update 4/2/2014: unicode fix

Update 4/3/2014: another unicode fix, this one for google books

Edited April 4, 2014 by andrewning

dfay · April 27, 2013

what database(s) does this search?

never mind...just saw the read me. This is pretty sweet.

Edited April 27, 2013 by dfay

dfay · April 30, 2013

This is a real time-saver, thanks again.

Would it be possible to add an option to harness BibDesk to produce a formatted citation on the clipboard? (through a modifier key or separate command).

This should just involve adding a few lines to the applescript that calls BD

set thePub to selection of document 1

export document 1 using template "template name" to clipboard for thePub

andrewning · April 30, 2013

Good suggestion. The API I'm using through doi.org can actually return formatted citations instead of BibTeX, so I thought about including that functionality. However, to use it you have to know the exact name of the citation style, and although some were abbreviated, many were ridiculously long (and it wasn't a feature I would use very often), so I left it out. I didn't think about accomplishing it through BibDesk. I'll look into it later this week.
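For readers curious how a doi.org lookup can return either BibTeX or a formatted citation, here is a minimal sketch using DOI content negotiation, where the Accept header names the desired format. It is not the workflow's actual code: the function names are hypothetical, and the style name is assumed to be one the service recognizes (for example apa).

```python
import urllib.request


def build_citation_request(doi, style="apa"):
    """Build (but do not send) a request for a formatted citation.

    `style` must be the exact identifier of a citation style known to the
    service; "apa" is used here as an example.
    """
    accept = "text/x-bibliography; style=" + style
    return urllib.request.Request("https://doi.org/" + doi,
                                  headers={"Accept": accept})


def build_bibtex_request(doi):
    """Build (but do not send) a request for the BibTeX record of a DOI."""
    return urllib.request.Request("https://doi.org/" + doi,
                                  headers={"Accept": "application/x-bibtex"})


def fetch(request, timeout=10):
    """Send the request (redirects are followed) and return the body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling fetch(build_citation_request(doi)) with a valid DOI would return the formatted reference as plain text; network access is required, so the sketch keeps request building separate from sending.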

andrewning · May 5, 2013

@dfay I've started moving in the direction you suggested. I like your idea of going through BibDesk, but I was hoping to come up with a more generally useful solution for people who don't use BibDesk, and I'm not sure I like opening BibDesk just to format a reference for the clipboard. I added a feature to optionally copy a formatted reference using the API through doi.org, but there are some limitations I'm still looking into (read more here). I'll continue to investigate. I also added an option to go directly to the article's landing page.

Edited May 5, 2013 by andrewning

vitor · May 5, 2013

I updated this workflow and changed the scope to more than just BibTeX related actions so I've renamed the workflow and moved the discussion to here (because it doesn't look like I can change the title of this discussion).

For future reference, after you choose “edit” on the post, you have to pick “use full editor”, to change the title.

andrewning · May 5, 2013

Ah, thanks. I'll stay in this thread then and remove the new one I just created.

dfay · May 5, 2013

Excellent, thanks. I will (probably) mostly use this to grab a quick reference for an email, so the restriction to APA format is not a big deal.

andrewning · May 23, 2013

Update to convert non-latin characters to proper LaTeX. If you come across a reference that doesn't work let me know.

I also removed the default citekey so that BibDesk can populate citekey with what style you've defined.

dfay · May 28, 2013

Hmmm....this was working great until this morning...now every search is resulting in an error - but the error message includes a doi URL that works fine when copied and pasted. I'm wondering if there was a change in the doi API that broke the workflow?

Here's an example of what's returned if I use the "copy formatted reference" command:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<html>

<head>

<title>DOI Naming Authority [http:] Not Found</title>

</head>

<tr>

<h3 align="center">Error - DOI Naming Authority [http:] Not Found</h3>

</td>

</tr>

<tr>

<p>The DOI you requested -- </p>

<p><b>http://dx.doi.org/10.1080/13504639851663</b></p> <p> -- cannot be found in the Handle System.</p>

<p>Possible reasons for the error are:</p>

<ul>

<li>the DOI has not been created </li>

<li>the DOI is cited incorrectly in your source</li>

<li>the DOI does not resolve due to a system problem</li>

</ul>

</td>

</tr>

<tr>

<p>If you believe you have requested a DOI that should be found, you may report this error by filling out the form below:</p>

<tr>

<td>

<tr>

<th width="35%" align="right" scope="row"><label>Missing DOI:</label></th>

</tr>

<tr>

<th align="right" scope="row"><label>Referring Page:</label></th>

</tr>

<tr>

<th align="right" scope="row">E-mail address:</th>

</tr>

<tr>

<th align="right" scope="row">Comments:</th>

</tr>

</table>

</td>

</tr>

<tr>

</tr>

</table>

</form>

</td>

</tr>

</td> <td>

</tr>

</table>

</body></html>

andrewning · May 28, 2013

Thanks for the report. You're right, looks like there is a minor change in the json results that are being returned from the crossref API.

I've fixed it, but won't be able to update the repository until tonight.

If you want a fix right now, in doi.py change the following 2 lines

line 33: doi = j['doi'].split('dx.doi.org/')[1]  
line 36: attributes={'uid': doi, 'arg': doi},

That should do the trick. I'll push out the update tonight.
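The idea behind the fix on line 33 can be sketched as a small helper: if the record's doi field now holds a full dx.doi.org URL, keep only the part after the host. This is an illustrative sketch, not the contents of doi.py; the function name and the fallback for bare DOIs are assumptions.

```python
def extract_doi(record):
    """Return the bare DOI from a record whose 'doi' field may be a URL.

    Handles both 'http://dx.doi.org/10.xxxx/yyy' and a bare '10.xxxx/yyy'.
    """
    value = record['doi']
    marker = 'dx.doi.org/'
    if marker in value:
        # Keep everything after the first occurrence of the host part.
        return value.split(marker, 1)[1]
    return value


# Example with the DOI from the error page quoted earlier in the thread.
result = extract_doi({'doi': 'http://dx.doi.org/10.1080/13504639851663'})
```

Passing the bare DOI on, rather than the whole URL, avoids the "DOI Naming Authority [http:] Not Found" error shown in the report.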

dfay · May 29, 2013

Yep, that fixed it! Thanks a bunch. It was only when it broke that I realized how much this has become an essential part of my research and reading.

andrewning · May 29, 2013

Glad to hear it. I've pushed the fixed version to github.

andrewning · December 5, 2013

I've just pushed a major update to GitHub (and changed the name of the workflow). Some of the changes include:

- In addition to getting the BibTeX, it can automatically download PDFs and link them to the BibDesk entry where possible.
- Google Books -- can now get BibTeX for textbooks.
- Google Scholar -- get BibTeX and PDFs from Google Scholar.
- Parsing improvements for the main CrossRef search.
- Reverse PDF Search -- starting with a PDF on your computer, you can use a keyboard shortcut or a File Action to scan the PDF for a DOI (or some relevant search terms if it can't find a DOI) and start a reference lookup search. It will then import the BibTeX and link the PDF in BibDesk.
- AIAA Search -- a search for papers from the American Institute of Aeronautics and Astronautics. Can import BibTeX and PDFs (AIAA subscription required for PDFs). Formerly this was a separate workflow, but since it shares so much of the same code I've put it in here.
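As a rough illustration of the DOI-scanning step in the reverse PDF search, the sketch below looks for a DOI in text that has already been extracted from a PDF. The regular expression and the function name are assumptions for illustration, not the workflow's actual parser, and the pattern will not match every DOI ever issued.

```python
import re

# Assumed pattern: "10.", a 4-9 digit registrant code, "/", then a suffix.
DOI_PATTERN = re.compile(r'\b10\.\d{4,9}/[-._;()/:A-Za-z0-9]+')


def find_doi(text):
    """Return the first DOI-like string found in `text`, or None."""
    match = DOI_PATTERN.search(text)
    if match is None:
        return None
    # Drop sentence punctuation that often trails a DOI in running text.
    return match.group(0).rstrip('.,;')


sample = "Annual Review of Psychology, 55(1). doi:10.1146/annurev.psych.55.090902.141927"
found = find_doi(sample)
```

When find_doi returns None, a caller could fall back to searching on title words, as the post describes.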

More detailed README and downloads on Github

Edited December 5, 2013 by andrewning

andrewning · December 6, 2013

small improvement for pulling out the DOI of a PDF automatically

andrewning · March 16, 2014

A couple of minor updates and fixes

- fix for crossref extra spaces (thanks @axidio)

- remove unnecessary crossref italics tags

- check aiaa server status

- more reliable unicode handling

manzel · March 21, 2014

This workflow is so great!

But for the past few weeks it has not copied the BibTeX entry. Last week it always said "Bibtex not available". Today I updated to the latest version and now nothing is copied, and BibDesk no longer opens. Could this be related to my update to OS X 10.9 on the weekend?

Edited March 21, 2014 by manzel

andrewning · March 21, 2014

Thanks for the report, I've fixed the issue. Grab the latest from GitHub.

manzel · March 22, 2014

That was really fast! Thank you very much. Working great again.

katie · March 25, 2014

I love this workflow so much. I'm using it to build my reference section as I write (I know, manually building it is not the most efficient method, but I'm too close to the end to change my habits now). I'm not sure if it's possible or not, but the following would make my life a little easier:

1. When I grab a formatted reference, it comes out like this:

Lehman, D. R., Chiu, C., & Schaller, M. (2004). Psychology and Culture. Annual Review of Psychology, 55(1), 689â714. doi:10.1146/annurev.psych.55.090902.141927

Is there some way to allow rich text formatting so that italicized text gets italicized, and to get the dash between page numbers to come out properly?

2. APA formatted citations for the other methods... I know this is probably too complicated.

And these things don't really matter, I'm already saving time by using this workflow.

Thank you so much for sharing!!

Katie

smarg19 · March 25, 2014

Without getting too deep into the details, the best way to get RTF text is to use the Mac's built-in textutil command. This is what I use in ZotQuery to get rich text citations. I've posted below a snippet of code from ZotQuery which takes HTML (the uref variable) and sets the clipboard to the RTF version of it.

import applescript

# uref (unicode HTML) and wf (the workflow object) are defined elsewhere in ZotQuery
# Convert to ASCII-only HTML; non-ASCII characters become numeric character references
html_ref = uref.encode('ascii', 'xmlcharrefreplace')

# Write HTML to temporary file (the with block closes the file on exit)
with open(wf.cachefile(u"temp.html"), 'wb') as f:
    f.write(html_ref)

# Convert HTML to RTF and copy to clipboard
a_script = """
do shell script "textutil -convert rtf " & quoted form of "{0}" & " -stdout | pbcopy"
""".format(wf.cachefile(u"temp.html"))
applescript.asrun(a_script)
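For anyone who wants to try the same idea outside ZotQuery, here is a self-contained sketch that pipes the HTML straight through textutil and pbcopy with no temporary file or AppleScript. It is an adaptation under assumptions, not ZotQuery's code: the function names are hypothetical, it targets Python 3, and the clipboard step only works on a Mac.

```python
import subprocess


def to_ascii_html(html):
    """Encode HTML as ASCII bytes, replacing non-ASCII characters
    with numeric character references (e.g. an en dash becomes &#8211;)."""
    return html.encode("ascii", "xmlcharrefreplace")


def copy_html_as_rtf(html):
    """Convert HTML to RTF with textutil and put it on the clipboard.

    macOS only: relies on the textutil and pbcopy command-line tools.
    """
    rtf = subprocess.run(
        ["textutil", "-stdin", "-stdout", "-format", "html", "-convert", "rtf"],
        input=to_ascii_html(html),
        stdout=subprocess.PIPE,
        check=True,
    ).stdout
    # pbcopy stores input that starts with an RTF header as rich text.
    subprocess.run(["pbcopy"], input=rtf, check=True)


encoded = to_ascii_html("<i>Annual Review of Psychology</i>, 689\u2013714")
```

On a Mac, copy_html_as_rtf("<i>Title</i>") should leave an italicized "Title" on the clipboard, ready to paste into a rich text editor.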

Hope this helps,

stephen

katie · March 26, 2014

Thanks, Stephen! I'll look more closely at it when I have more time!

Katie

andrewning · March 26, 2014

Hi Katie, I'm glad to hear that you like the workflow!

The en-dash between the page numbers looks like something I may need to fix. I'll look into that. Regarding the rich text formatting, I could probably add that also---thanks for the tip on that Stephen.
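One plausible explanation for the garbled dash, offered as an assumption rather than a confirmed diagnosis, is that the reference arrives as UTF-8 bytes but is decoded as Latin-1 somewhere along the way. The sketch below reproduces that effect and shows the round trip that undoes it; the function names are hypothetical.

```python
def simulate_mojibake(text):
    """Encode as UTF-8, then wrongly decode the bytes as Latin-1."""
    return text.encode("utf-8").decode("latin-1")


def repair_mojibake(garbled):
    """Undo the wrong decoding by reversing the two steps."""
    return garbled.encode("latin-1").decode("utf-8")


pages = "689\u2013714"            # page range with a real en dash
garbled = simulate_mojibake(pages)
# The en dash becomes "\u00e2" followed by two invisible control characters,
# which displays much like the "689â714" in the reference quoted above.
repaired = repair_mojibake(garbled)
```

If that is indeed the cause, decoding the API response explicitly as UTF-8 before copying it would avoid the problem at the source.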

For your second question, my suggestion would be to use BibDesk (http://bibdesk.sourceforge.net/). It's free and open-source, and this workflow is designed to work well with it. If you use BibDesk, you won't have to keep re-downloading references you have already used, and if you fix up a reference (sometimes the data is not exactly correct) your corrections will be retained for future papers. The benefit for your question is that in BibDesk's preferences you can set the output to "apalike", then select the publications you want and copy them as rich text. This may not be ideal for your setup because it requires an extra step, but it would be the easiest and most reliable way forward.

The formatted reference feature was something I just added on at the end, because crossref happened to have an API for it. It's not something I ever use, but I'm glad to hear it's useful to you. I could probably add the ability to convert the BibTeX to a formatted reference for any of the methods, but for now I'd suggest using BibDesk.

- Andrew

katie · March 27, 2014

Thanks, Andrew!

I keep all of my papers in Papers2 but the reference and citation information is so bad and unreliable that it's pretty much useless so I don't use their magic manuscript (citation) functions. They don't support major Psychology databases so I had to rely on their Google Scholar import, which had some problems (e.g. the last author of every reference is not imported). I've "been meaning to" go back and make sure the citation information is correct but the truth is that I will probably never get around to it: I have a large library and I'm close to the end. I've been thinking about moving my papers to another program, so thanks for the tip! I'll check it out to see how much work/time it would take to move my library over.

Even though some references I get via this workflow require some fixing at times, it's never been outright wrong about the paper. So I love it! I've already started showing it off to my supervisor

Thank you for sharing it!!

Katie

andrewning · March 27, 2014

Katie,

Glad to hear it works well for you. I've thought about adding integration with Papers and/or Mendeley down the road. For now though, my suggestion was just that you could use BibTeX to handle the conversion to APA format, regardless of where you end up storing (or not storing) your database. I'm referring to converting to APA for the other methods, the ones that don't use crossref, which already does that.

Even still, I will plan to fix your issue with en- and em-dashes, and look into adding rich text copying.
