ZotQuery: an Alfred workflow for Zotero


Posted
2 minutes ago, deanishe said:

I give up, tbh.

[...]

 

As before, I'm prepared to help, but I don't want this to be "my" workflow. I don't use Zotero, don't know how to use Zotero, and have no need for it. So somebody else will have to take ownership of and responsibility for the workflow. You up for that @lutefish?

 

 

I don't know whether to laugh or cry!

 

I don't think there's any point me trying to take responsibility. Hopefully @lutefish will see this as a great opportunity to get guidance coding from @deanishe ! What an opportunity!

Seriously though, thanks to both of you for trying. If I was confident of making my rent this month I would buy you beers. But I'm not. :(

 

let me know if there's anything I can do, otherwise, we just wait for an academic coder who uses zotero!

Posted

If somebody can sketch out how the workflow is supposed to work, I can write a "stub" version that's at least sanely structured.

 

Like, what do you do with it?

 

  • Search for entries (by title, author, tag, all fields)
  • Open entries in Zotero
  • Copy bibliography entries

 

Anything else?

Posted

That's largely all I've ever done with it.

A typical use for me: I'm writing in, say, Scrivener, and I want to cite a book. I load Alfred, type zot and a few letters, the book comes up, and Enter copies it to the pasteboard in the citation style of my choice. Then I paste it in.

If I recall correctly the citation style is set in Zotero because there are thousands of different ways it could be done, including personalised styles.

 

Another use I could think of is accessing an attachment, but you can do that by opening an entry in Zotero. That's it really. No inputting of entries, no writing notes, just accessing a massive library of entries.

 

I think it may treat 'citations' differently than 'bibliography', that may be important.

Posted
57 minutes ago, Damoeire said:

I think it may treat 'citations' differently than 'bibliography', that may be important.

 

That's just my ignorance showing through. "Citations" is what I meant.

 

58 minutes ago, Damoeire said:

the citation style is set in Zotero

 

I don't think so. The workflow asks you to choose a style when you first run it. These are the choices it offers:

  • chicago-author-date
  • apa
  • modern-language-association
  • rtf-scan
  • bibtex
  • odt-scannable-cites
Posted
15 minutes ago, deanishe said:

"Citations" is what I meant.

Well, Zotero does treat them differently: for example, you can add citations to a Word doc and have it build the bibliography. But I don't think any of that affects the workflow.

17 minutes ago, deanishe said:

The workflow asks you to choose a style when you first run it.

That's interesting. I do remember that.

In Zotero, under preferences/export, there are a lot more styles, including personalised ones, that can be used with quick copy. I must have used the workflow to open the entry and then done a quick copy back into Scrivener, as I'm sure I used my own style at one point. Eventually I settled on the scannable cite style.

 

Anyway, I am probably just confusing issues.  

Posted

Hi, all - I wish I had the time and the expertise to do more with this, but I don't. My hand-hacked version of the script works for me (for now), so I've been very reluctant to change anything, and my tech skills are shaky enough that I worry about even fiddling around with the subsequent versions that @deanishe has very generously assembled, in case I break what's not broken. 

 

All I use zotquery for is to search the Zotero database (not even by field - just the "general query" default works fine, and the rest is just added and rather unnecessary complexity), and then either 1) opening the linked attachment, 2) copying the citation  (either full or short) to the clipboard, and very occasionally 3) opening Zotero to that particular item.  

 

Many thanks to everyone who has spent time and effort on this (and beers all around sounds like a solid plan), but this may be it for this workflow for a while.

Posted

Anybody else, then?

 

I can't do it on my own because I don't know the first thing about citations.

 

The rest is all pretty straightforward (search, open attachments, open in Zotero etc.).

Posted

If you're going to start over, refactor, clean up, etc., I'd suggest the following:

  • Search for entries (by title, author, tag, all fields)
  • Open entries in Zotero
  • open the attachment(s) for an entry
  • select a CSL file for export format - see http://citationstyles.org
  • output a single citation in the selected format's in-text style ( see http://docs.citationstyles.org/en/stable/primer.html )
  • output a single citation in the selected format's note style (ditto)
  • same for selected entries in Zotero and/or contents of a group in Zotero

The advantage to using CSL is that the workflow then becomes extensible to practically any citation format anyone can throw at it, with no intervention required by the developer.  There are 1000s of CSL templates out there.

 

Having reworked quite a few of smargh's AppleScripts for Skim, I suspect there is a lot that can be trimmed down.  

 

 

 

Posted

Thanks for the input.

 

I have the first bits working (search, open in Zotero, open attachments).

 

Citations are totally the sticking point. I can integrate a CSL library (which seems straightforward enough), but I know literally nothing about citing things.

 

I can't do the citation stuff when I'm not in any position to tell whether it's working correctly.

 

1 hour ago, dfay said:

I suspect there is a lot that can be trimmed down

 

So, so much. There are epic amounts of crap in there. Hundreds and hundreds of lines of code to search the database on different columns, when SQLite supports that natively. You just have to do column:<term> in the query…
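For anyone following along, here is a minimal sketch of the `column:<term>` idea. It assumes the search index is an SQLite full-text (FTS5) table and that Python's bundled SQLite was built with FTS5; the table name `entries`, its columns, and the sample rows are made up for illustration and are not taken from the workflow.

```python
import sqlite3

# Hypothetical, simplified index: one row per Zotero entry.
con = sqlite3.connect(":memory:")
con.execute("CREATE VIRTUAL TABLE entries USING fts5(item_key, title, author, tags)")
con.executemany(
    "INSERT INTO entries (item_key, title, author, tags) VALUES (?, ?, ?, ?)",
    [
        ("AAAA1111", "Hegemony: The New Shape of Global Power", "John A Agnew", "geopolitics"),
        ("BBBB2222", "An adaptable metric shapes perceptual space", "Hisakata", "perception"),
    ],
)


def search(query):
    """Return the keys of entries matching an FTS query, best match first."""
    rows = con.execute(
        "SELECT item_key FROM entries WHERE entries MATCH ? ORDER BY rank", (query,)
    )
    return [row[0] for row in rows]


all_fields = search("shape*")        # no column filter: any column may match
by_author = search("author:agnew")   # restricted to the author column
by_title = search("title:metric")    # restricted to the title column
```

The same `search()` function serves the "all fields" and per-field cases; only the query string changes. User input containing FTS operators or punctuation would need quoting before being passed through.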

 

I honestly can't figure out what half of the code is even supposed to do. There are barely any comments, the structure is insane, and it relies heavily on global state and import-time side effects.

 

I threw the lot in the bin.

Posted

Yeah he had written his own export template scheme for Skim even though there’s a native one that can be implemented in a few lines of AppleScript...

 

Basically you should just need to pipe into the CSL library and let it do its thing.  Once you get it to that point post a draft & I can look at it.

 

Posted
5 hours ago, dfay said:

Basically you should just need to pipe into the CSL library and let it do its thing.

 

Pipe what into the CSL library? So far, I've learnt that there are long and short citations. Are there also multiple citations? If so, how do they differ?

 

And what needs to come out? Markdown? HTML? RTF? Plaintext?

Posted

Sorry I've been away but should be able to answer some of these questions either much later tonight or tomorrow at the latest. It's always more complicated than I imagine. 

Posted (edited)

I haven’t dug into the code to see what’s coming out of the queries, but I think the easiest would be to map it into csl-json https://github.com/citation-style-language/schema/blob/master/csl-data.json based on the zotero field mappings here: https://aurimasv.github.io/z2csl/typeMap.xml

 

The library should handle csl-json & spit out plain text or html - user can choose to use a markdown csl format like this https://github.com/philipbelesky/Markdown-Citation-Style-Languages but the elegance of this approach is that that will be up to the user.

Edited by dfay
Posted
3 hours ago, dfay said:

I haven’t dug into the code to see what’s coming out of the queries, but I think the easiest would be to map it into csl-json https://github.com/citation-style-language/schema/blob/master/csl-data.json based on the zotero field mappings here: https://aurimasv.github.io/z2csl/typeMap.xml

 

Thanks. I'd come to the same conclusion, but it's great to have confirmation that I'm not barking up completely the wrong tree. I found that typeMap.xml, too, but haven't figured out how best to extract and use the data from it.


I guess I'll filter the Zotero data on import and add a csl member to the Entry class and a method to generate csl-json, which the formatter can pass to the CSL library along with the style.
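A rough sketch of what that could look like, purely as an illustration. The `Entry` class here is a hypothetical stand-in, and `TYPE_MAP` and `FIELD_MAP` are a small hand-picked subset of the Zotero-to-CSL mappings; the complete tables would have to be derived from typeMap.xml.

```python
import json
from dataclasses import dataclass, field

# Illustrative subset only; the full mappings live in typeMap.xml.
TYPE_MAP = {"book": "book", "journalArticle": "article-journal"}
FIELD_MAP = {
    "title": "title",
    "publisher": "publisher",
    "place": "publisher-place",
    "publicationTitle": "container-title",
}


@dataclass
class Entry:
    """Hypothetical stand-in for the workflow's Entry class."""

    key: str
    type: str
    fields: dict = field(default_factory=dict)
    creators: list = field(default_factory=list)  # (family, given) pairs
    year: int = None

    def csl(self):
        """Return this entry as a CSL-JSON-style dict."""
        data = {"id": self.key, "type": TYPE_MAP.get(self.type, "article")}
        for zotero_name, csl_name in FIELD_MAP.items():
            if self.fields.get(zotero_name):
                data[csl_name] = self.fields[zotero_name]
        if self.creators:
            data["author"] = [
                {"family": family, "given": given} for family, given in self.creators
            ]
        if self.year:
            data["issued"] = {"date-parts": [[self.year]]}
        return data

    def csl_json(self):
        """Serialise as a one-item CSL-JSON list for the formatter."""
        return json.dumps([self.csl()], ensure_ascii=False)


entry = Entry(
    key="WQVBH98K",
    type="book",
    fields={
        "title": "Hegemony: The New Shape of Global Power",
        "publisher": "Temple University Press",
        "place": "Philadelphia",
    },
    creators=[("Agnew", "John A")],
    year=2005,
)
```

Keeping the mapping in plain dicts means the Entry class doesn't need to know anything about the CSL library; the formatter just receives the JSON and a style.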

 

I found the repo for the style definitions, but that's ~50MB and there are, exactly as you said, many thousands of them. So I'm going to read the styles from Zotero's styles directory instead. Presumably, that contains the styles the user is interested in, and they can use Zotero's UI and search engine to fetch new ones.
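As a sketch of reading styles from a directory: CSL files are XML, so the title can be pulled from each file's `<info><title>` element. The function name `list_styles` is invented for this example, and the location of Zotero's styles directory (something like `~/Zotero/styles`) is an assumption that would need checking against the user's actual data directory.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# XML namespace used by CSL style files.
CSL_NS = {"csl": "http://purl.org/net/xbiblio/csl"}


def list_styles(styles_dir):
    """Return {style title: path} for each readable .csl file in styles_dir."""
    styles = {}
    for path in sorted(Path(styles_dir).glob("*.csl")):
        try:
            root = ET.parse(path).getroot()
        except ET.ParseError:
            continue  # skip files that aren't well-formed XML
        title = root.findtext("csl:info/csl:title", namespaces=CSL_NS)
        # Fall back to the filename if the style has no title.
        styles[title or path.stem] = path
    return styles
```

The resulting mapping is enough to drive a "choose a style" list in Alfred: titles for display, paths to hand to the CSL library.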

 

I'm currently pondering how to structure the citation-styling infrastructure.

 

In any case, what would be very helpful would be another Zotero database with a small, but varied, selection of entries. Like, something with all the common types of entries with common fields filled in.

Posted

 

On 16/12/2017 at 4:39 PM, deanishe said:

Anybody else, then?

 

I can't do it on my own because I don't know the first thing about citations.

 

The rest is all pretty straightforward (search, open attachments, open in Zotero etc.).

 

I don't even know if this is being done through coding outside of Alfred, or using Alfred's workflow construction. In a way I'd love to take it on, but I'd be more of a hindrance than a help, I'm sure. I might be able to help with citations, but seeing the posts following the one I quoted, you seem to have more of a coder's grasp of that than me too.

 

18 hours ago, deanishe said:

So far, I've learnt that there are long and short citations. Are there also multiple citations? If so, how do they differ?

 

I really hope I'm not giving out inaccurate information but AFAIK the workflow only ever dealt with single citations. The CSL (as you know, but for clarity) deals with how the citation is displayed in the document, so the workflow shouldn't have too much work to do there. That also should mean (as I understand it), that the workflow doesn't need to concern itself with long and short citations. Each citation style has its own rules - for example, my home institution dictates that the first citation of a work in a chapter should be full, and following that short. I used scannable cite in the end as that meant I could export my document into LibreOffice and it would then deal with what was long and what was short, scannable cite was just a reference back to my Zotero library.

The workflow has never dealt with long and short or multiple entries, which again are dealt with by CSL. However, a short citation is usually a reduction of a long, say:

John A Agnew, Hegemony: The New Shape of Global Power (Philadelphia: Temple University Press, 2005).

Turns into 

Agnew, 2005.

and although to have that facility within the workflow would be great, it's also a secondary concern for me.

Multiple citations are usually separated by a semicolon but again, I don't think that needs to come within the concern of the workflow.

 

18 hours ago, deanishe said:

And what needs to come out? Markdown? HTML? RTF? Plaintext?

 

 

I presume it's RTF. Not plaintext, because getting the italics right is one of the important elements that it's easy to mess up doing manually if you're not careful. Some people may have need for conversion to markdown, not me. Seems like another secondary concern, but that's because I don't use it; maybe others feel strongly about it.

 

13 hours ago, deanishe said:

In any case, what would be very helpful would be another Zotero database with a small, but varied, selection of entries. Like, something with all the common types of entries with common fields filled in.

Was my first one helpful? I'll get one.

Posted
13 hours ago, deanishe said:

In any case, what would be very helpful would be another Zotero database with a small, but varied, selection of entries. Like, something with all the common types of entries with common fields filled in.

@deanishe

I downloaded some random collections. I could only get 20 or so at a time. So they are in different RDF_Zotero files but can be imported in using zotero's import function.

 

Here's the link:

https://www.dropbox.com/sh/mk5tghsxvk5ojcw/AAAKBSBhMVOlbDCrNUl_JrV9a?dl=0

Posted (edited)

Thanks for the input guys. I'm close to having something functionally useful, I think. (I have the workflow generating CSL data for the Zotero entries, but I haven't hooked it up to the citation generation library yet).

 

Thanks for the extra data, @Damoeire. Non-ASCII data are always great for testing stuff.

 

On 18/12/2017 at 12:30 PM, Damoeire said:

I presume it's RTF. Not plaintext, because getting the italics right is one of the important elements that it's easy to mess up doing manually if you're not careful.

 

That makes a lot of sense. It looks like I'll have to write my own RTF formatter, but I've worked a bit with RTF before, and it shouldn't be an issue. ZotQuery uses textutil to convert HTML to RTF, but I've found the results of that to be suboptimal, as it adds styling to the RTF.
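To give an idea of how small such a formatter can be, here is a sketch that only understands `<i>` and `<b>` tags. It is an assumption that the CSL library's HTML output can be reduced to those tags; the function names are invented for the example, and anything else (links, superscripts, line breaks) is passed through as literal text.

```python
import re
from html import unescape

# Only italics and bold are translated; anything else is treated as text.
RTF_TAGS = {"<i>": "{\\i ", "</i>": "}", "<b>": "{\\b ", "</b>": "}"}


def rtf_escape(text):
    """Escape plain text for RTF, writing non-ASCII as \\uN? sequences."""
    out = []
    for ch in text:
        if ch in "\\{}":
            out.append("\\" + ch)
        elif ord(ch) < 128:
            out.append(ch)
        else:
            # RTF takes signed 16-bit UTF-16 code units, so a character
            # outside the BMP is written as two escapes.
            raw = ch.encode("utf-16-le")
            for i in range(0, len(raw), 2):
                unit = int.from_bytes(raw[i:i + 2], "little")
                out.append("\\u%d?" % (unit - 65536 if unit > 32767 else unit))
    return "".join(out)


def html_to_rtf(html):
    """Convert HTML containing only <i>/<b> markup to a bare RTF document."""
    parts = re.split(r"(</?[ib]>)", html)
    body = "".join(
        RTF_TAGS[part] if part in RTF_TAGS else rtf_escape(unescape(part))
        for part in parts
    )
    return "{\\rtf1\\ansi " + body + "}"


rtf = html_to_rtf("Agnew, <i>Hegemony</i> (2005).")
```

Because no font table or other styling is emitted, the pasted text should pick up the formatting of the target document, which is the point of avoiding textutil's output.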

 

On 18/12/2017 at 12:30 PM, Damoeire said:

Some people may have need for conversion to markdown

 

Not really a priority, I think: HTML is perfectly valid in Markdown.

 

What I'm going to try is copying multiple formats to the pasteboard at the same time (with the plaintext type also set to the HTML version). Hopefully, that should provide each app with the format it wants.
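Sketched as data, the idea is one payload per pasteboard type, with the plaintext slot carrying the HTML. The type identifiers below are the standard macOS ones; `pasteboard_payload` is a hypothetical helper, and the step that actually hands the data to the system pasteboard (via PyObjC or similar) is deliberately left out.

```python
def pasteboard_payload(html, rtf):
    """Map pasteboard type identifiers to the bytes to publish for each."""
    html_bytes = html.encode("utf-8")
    return {
        "public.html": html_bytes,
        # RTF is expected to be pure ASCII once non-ASCII text is escaped.
        "public.rtf": rtf.encode("ascii"),
        # Plaintext slot deliberately reuses the HTML version.
        "public.utf8-plain-text": html_bytes,
    }


payload = pasteboard_payload(
    "Agnew, <i>Hegemony</i> (2005).",
    "{\\rtf1\\ansi Agnew, {\\i Hegemony} (2005).}",
)
```

An app that understands RTF or HTML can pick the rich version, while a plaintext or Markdown editor receives the HTML markup as text.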

 

I currently have the default item action set to "Open in Zotero" and ⇧↩ set to "List Attachments" (if there are any). My plan is to allow you to assign two "default" citation styles (presumably a short and a long version of the same style) which will be copied on ⌘↩ and ⌥↩. ^↩ would show you a list of all available citation styles to choose from.

 

Does that sound reasonable?

 

I have a few questions, however:

 

ZotQuery only accepts a very limited number of filetypes (DOC, DOCX, PDF and EPUB) as attachments and ignores the rest. Is that a desirable behaviour, or should I accept all filetypes (which seems more sensible to me), or ignore certain filetypes?

 

What would be good things to assign to hitting ⌘C and ⌘L on an item (copy and show in Alfred's Large Type window respectively)? They're both limited to plaintext, so copying a default citation (at least in the above multi-format sense) is right out, and is likely not realistic in any case due to the time it takes to generate a citation (unless I cache them with the rest of the data—which presents performance and cache-invalidation issues if the user changes the format, though neither is a particularly huge problem, I think). Currently, I have the Large Type set to the item's (concatenated) notes.

 

How to handle bibliographies/multiple citations. Do I need to worry about this, or is it something the user and/or Zotero takes care of? Do I need to implement "Groups" (which appear to be an API, online-only thing) or do people use "Collections" (which can be read from the local database) as their bibliography?


Zotero plugins. There are plugins for Word and LibreOffice. What do they do? How do they relate to the workflow? Do I need to use some special citation format for them to work?


I keep hearing about "scannable" citations. What are they?

 

Also, I'll upload the code in the next day or so when I have something worth showing. Currently, all it actually does is cache and search the Zotero database and open items in Zotero or attachments in the default application. Not entirely useless, but also not worth soliciting feedback on…

 

As compared to ZotQuery (the working name of my workflow is ZotHero with Zorro for the icon), the advances are:

  1. A sane, understandable, well-commented structure. No import-time side-effects, and the Zotero library has no idea it's part of an Alfred workflow. It might even make sense to distribute the Zotero interface as a separate library.
  2. Increased speed (much smarter caching, and SQL joins instead of dozens of queries).
  3. Support for attachments that are URLs, not just files.
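To illustrate the "joins instead of dozens of queries" point, here is a self-contained sketch against a toy database. The table and column names are modelled on my understanding of Zotero's normalised schema (items, itemData, itemDataValues, fields) and are heavily simplified; treat them as assumptions rather than the real layout.

```python
import sqlite3

# Toy database imitating a normalised layout: one value per row,
# linked to its item and field by ID.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE items (itemID INTEGER PRIMARY KEY, key TEXT);
CREATE TABLE fields (fieldID INTEGER PRIMARY KEY, fieldName TEXT);
CREATE TABLE itemDataValues (valueID INTEGER PRIMARY KEY, value TEXT);
CREATE TABLE itemData (itemID INT, fieldID INT, valueID INT);
INSERT INTO items VALUES (1, 'WQVBH98K');
INSERT INTO fields VALUES (1, 'title'), (2, 'date');
INSERT INTO itemDataValues VALUES (1, 'Hegemony'), (2, '2005');
INSERT INTO itemData VALUES (1, 1, 1), (1, 2, 2);
""")

# One query returns every (item, field, value) triple.
QUERY = """
SELECT items.key, fields.fieldName, itemDataValues.value
  FROM items
  JOIN itemData ON itemData.itemID = items.itemID
  JOIN fields ON fields.fieldID = itemData.fieldID
  JOIN itemDataValues ON itemDataValues.valueID = itemData.valueID
 ORDER BY items.itemID, fields.fieldID
"""

# Reassemble the rows into one dict of fields per entry.
entries = {}
for key, name, value in con.execute(QUERY):
    entries.setdefault(key, {})[name] = value
```

A single pass over the joined rows rebuilds every entry, instead of issuing a separate query per item and per field.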

Hopefully, plenty more to come, the integration with CSL styles being the biggest win, imo.

 

Edited by deanishe
Posted
On 18/12/2017 at 12:30 PM, Damoeire said:

Was my first one helpful? I'll get one.

 

To answer this question, both yes and no. A large-ish dataset like you provided is awesome for testing performance (Yay! My workflow loads it from Zotero way faster than Zotero could import it), but for testing functionality (which is where I'm now at), you rather want a small, but extremely varied dataset. One that covers as many potential inputs as possible (empty items, all manner of partial items, complete items, different alphabets etc.), but without much duplication.

Posted
43 minutes ago, deanishe said:

 

To answer this question, both yes and no. A large-ish dataset like you provided is awesome for testing performance (Yay! My workflow loads it from Zotero way faster than Zotero could import it), but for testing functionality (which is where I'm now at), you rather want a small, but extremely varied dataset. One that covers as many potential inputs as possible (empty items, all manner of partial items, complete items, different alphabets etc.), but without much duplication.

I've posted a request on the Zotero forums for something that covers this.

 

51 minutes ago, deanishe said:

I currently have the default item action set to "Open in Zotero" and ⇧↩ set to "List Attachments" (if there are any). My plan is to allow you to assign two "default" citation styles (presumably a short and a long version of the same style) which will be copied on ⌘↩ and ⌥↩. ^↩ would show you a list of all available citation styles to choose from.

Sounds amazing. I hadn't thought of an approach like that. Makes great sense.

 

57 minutes ago, deanishe said:

How to handle bibliographies/multiple citations. Do I need to worry about this, or is it something the user and/or Zotero takes care of? Do I need to implement "Groups" (which appear to be an API, online-only thing) or do people use "Collections" (which can be read from the local database) as their bibliography?

You don't need to worry about this. Unless your workflow is going to keep track of the entire document the user is writing in (which it obviously isn't), then this is something for the user or the program being used to deal with.

 

1 hour ago, deanishe said:

Do I need to implement "Groups" (which appear to be an API, online-only thing) or do people use "Collections" (which can be read from the local database) as their bibliography?

You don't need to worry about groups. People use collections. Users can join groups, download the group library and add it to their collection. 

 

1 hour ago, deanishe said:

Zotero plugins. There are plugins for Word and LibreOffice. What do they do? How do they relate to the workflow? Do I need to use some special citation format for them to work?


I keep hearing about "scannable" citations. What are they?

The Word and LibreOffice plugins basically do a similar thing to the workflow, plus a little more. In Word, using a shortcut, the user brings up a search dialogue, finds the article needing citation, chooses it, and it is added either as an in-text citation or as a footnote (the user chooses). The plugin will (should? I wrote in Scrivener for the most part) keep track of entries, so if the same document is cited twice in a row, the second comes up as Ibid. (if that matches the CSL style). It always followed a CSL style. I believe it also dealt with long and short.

 

Scannable citations were a plugin that allowed you to write in another program (e.g. Scrivener) and then import your citations into LibreOffice as 'live' citations, that is, they related back to your Zotero library. If you updated your Zotero library, then the citation was updated in LibreOffice. I believe the above plugins worked the same way: they were connected to the Zotero library, so if you fixed the spelling of something in Zotero, that would be reflected in your document. I'm not an expert, but that's how I understood it. Here's a link to scannable cites:

https://zoteromusings.wordpress.com/2013/05/06/announcing-rtfodf-scan-for-zotero/

For the purposes of the workflow, it's just another style.

 

1 hour ago, deanishe said:

Also, I'll upload the code in the next day or so when I have something worth showing. Currently, all it actually does is cache and search the Zotero database and open items in Zotero or attachments in the default application. Not entirely useless, but also not worth soliciting feedback on…

 

Can't wait. Actually, that does a lot more than my current version, which does nothing! Thanks! I've just realised I missed two of your questions. Will get to them soon I hope. I have pressing family issues right now ;)

Posted
57 minutes ago, Damoeire said:

You don't need to worry about groups.

 

But do I need to worry about collections? 

 

59 minutes ago, Damoeire said:

For the purposes of the workflow, it's just another style.

 

But is it a CSL style, i.e. does it have a CSL definition to sit alongside the others? Or is it something special that I have to define separately?

 

1 hour ago, Damoeire said:

I missed two of your questions.

 

No problem. Input on the above follow-up questions is much more important, as they impact the overall structure of the program more than the other stuff.

Posted
20 minutes ago, deanishe said:

But do I need to worry about collections? 

I don't believe so. According to zotero documentation: 

 

Collections allow hierarchical organization of items into groups and subgroups. The same item can belong to multiple collections and subcollections in your library at the same time. Collections are useful for filing items in meaningful groups (e.g., items for a particular project, from a specific source, on a specific topic, or for a particular course). You can import items directly to a specific collection or add them to collections after they are already in your library.

 

So as long as the workflow can access the user's library, it can access the collections. No need for the workflow to concern itself with them.

 

22 minutes ago, deanishe said:

But is it a CSL style, i.e. does it have a CSL definition to sit alongside the others? Or is it something special that I have to define separately?

 

Hmm. My lack of understanding of the actual workings of Zotero and CSL makes me worry that I'm giving wrong information. In Zotero's preferences/export there is an option for choosing your default quick copy. My understanding is that is how the workflow worked before, though it has been mentioned that there was an option to choose your style through z:config 

I loaded my own CSL style into Zotero via preferences/cite - add style and then chose that as default - and as far as I remember that was an option I also used when not using scannable cite. Either way, scannable cite is an option in the export menu. Whether that means it is categorically a CSL style, I can't say for sure. But the Zotero forum responses are very quick and would be able to answer that question. I'm happy to ask, but I'm not sure I'd do a good job of explaining it exactly, or be able to respond to any additional queries to my question!

 

There is an RTF Scan CSL style here: https://www.zotero.org/styles?q=RTF Scan

But it appears to give this as an output:

{Hisakata et al., "An adaptable metric shapes perceptual space", 2016}

whereas scannable cite has an id like this:

{ | Smith, (2012) | | |zu:2433:WQVBH98K}
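Going only by the single example above, the marker looks like pipe-separated slots ending in a `zu:<library id>:<item key>` reference. The sketch below reproduces that one layout and nothing more; what the empty slots are for, and the function and parameter names, are guesses for illustration, not something taken from the plugin.

```python
def scannable_cite(author, year, library_id, item_key):
    """Build a marker in the layout of the example above (assumed format)."""
    return "{ | %s, (%s) | | |zu:%s:%s}" % (author, year, library_id, item_key)


marker = scannable_cite("Smith", 2012, 2433, "WQVBH98K")
```

If the format really is this regular, generating it needs no CSL processing at all: only the author, year and item key from the database.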

 

I see now that scannable cite needs to be installed as an export option; it's not built in, as noted here:

https://zoteromusings.wordpress.com/2013/05/06/announcing-rtfodf-scan-for-zotero/

(Same link as last post)

 

I don't know if zotero's export options are all CSL styles for a fact, though I can't see how they wouldn't be. That's how zotero works AFAIK, using CSL styles.

 

The scannable cite plugin forum thread is here:

https://forums.zotero.org/discussion/29308/announcing-rtfodf-scan-for-zotero/

 

Where you'll find me asking a bunch of questions as per usual.

 

The more I look, the more it seems like perhaps scannable cite isn't a CSL style, despite the fact that it is in Zotero's quick copy export options. Would there be a way for the workflow to just access the quick copy option? I wish I could be more help. 

 

3 hours ago, deanishe said:

ZotQuery only accepts a very limited number of filetypes (DOC, DOCX, PDF and EPUB) as attachments and ignores the rest. Is that a desirable behaviour, or should I accept all filetypes (which seems more sensible to me), or ignore certain filetypes?

 

 

image files would be very useful.

 

3 hours ago, deanishe said:

What would be good things to assign to hitting ⌘C and ⌘L on an item (copy and show in Alfred's Large Type window respectively)? They're both limited to plaintext, so copying a default citation (at least in the above multi-format sense) is right out, and is likely not realistic in any case due to the time it takes to generate a citation (unless I cache them with the rest of the data—which presents performance and cache-invalidation issues if the user changes the format, though neither is a particularly huge problem, I think). Currently, I have the Large Type set to the item's (concatenated) notes.

 

Unless you needed Ctl-C for something else, then yes, why not, because often this is used in sharing data in emails and ctl-C is intuitive, and plain text wouldn't matter for that sort of sharing, but if it's going to be easy to share it as formatted, then perhaps why bother confusing things. I think your call as coder.

 

I hope the above is some help. I can try to expand on any of it, but I have limited understanding of the internal workings. If you have a question for the zotero forum that you want asked, I'm happy to do that. I'm not doing it straight away because I'm thinking it probably needs context.

Posted

Perhaps the RTF/ODF-Scan issue is just confusing things. If the workflow opens Zotero to the correct article, then the user can use the quick copy shortcut to copy the scannable cite to the clipboard and then paste it. Also, RTF Scan is a different thing; I conflated the two above. It might be better to ignore scannable cites.

Posted (edited)
31 minutes ago, Damoeire said:

So as long as the workflow can access the user's library, it can access the collections. No need for the workflow to concern itself with them.

 

You're kind of missing the point. Collections aren't the same thing as regular entries any more than a recipe is the same as a single ingredient. If you want to see collections in the workflow, they need to be explicitly coded for.

 

31 minutes ago, Damoeire said:

Would there be a way for the workflow to just access the quick copy option?

 

No. The workflow can't access anything but the raw data stored in Zotero's database, which are extremely fragmented, as is the way for normalised databases.

 

31 minutes ago, Damoeire said:

image files would be very useful.

 

This is trivial to do: I can just turn off the filtering of file extensions.

 

31 minutes ago, Damoeire said:

Unless you needed Ctl-C for something else, then yes, why not, because often this is used in sharing data in emails and ctl-C is intuitive, and plain text wouldn't matter for that sort of sharing, but if it's going to be easy to share it as formatted, then perhaps why bother confusing things. I think your call as coder.

 

I honestly don't understand what you're saying here. What do you think should be copied by ⌘C?

 

31 minutes ago, Damoeire said:

I have limited understanding of the internal workings.

 

I'm not looking for input regarding the internal workings, but rather concrete input regarding what should come out of the workflow.

 

In particular, "that shouldn't concern the workflow" comments aren't constructive: they're based on assumptions that the workflow can just grab Zotero output, which isn't the case. 

 

Edited by deanishe
Posted
17 minutes ago, deanishe said:

You're kind of missing the point. Collections aren't the same thing as regular entries any more than a recipe is the same as a single ingredient. If you want to see collections in the workflow, they need to be explicitly coded for.

 

I don't need to see collections in the workflow.

 

18 minutes ago, deanishe said:

I honestly don't understand what you're saying here. What do you think should be copied by ⌘C?

 

I have no preference. Sorry for being confusing. If the workflow gets the citation in the preferred style to the clipboard, I'm happy.

 

18 minutes ago, deanishe said:

In particular, "that shouldn't concern the workflow" comments aren't constructive: they're based on assumptions that the workflow can just grab Zotero output, which isn't the case. 

 

Again, sorry. What I meant was that I didn't need that in the workflow, collections specifically.

Posted (edited)
46 minutes ago, Damoeire said:

Again, sorry.

 

No apology wanted or necessary.

 

It's just that I need concrete suggestions regarding exactly what the workflow should output.

 

The workflow uses the same database and styles as Zotero, but it doesn't use Zotero itself: what Zotero does/can do has no bearing on the workflow.

Edited by deanishe
