
Workflow development



Ok, I am in the process of re-tooling my workflow development ...erm... workflow. I use Sublime Text 3 and write all my workflows in Python (with some AppleScript occasionally thrown in). My basic question to the community is this: How is your environment set up for Alfred workflow development?

But, as that question is perhaps overly broad, here are some specific sub-questions:

 

  • How do you do version control?
    • do you make the workflow directory the repo?
    • do you do this in Alfred's auto-generated workflow dir (~/[parents]/Alfred.alfredpreferences/workflows/XXXXX-11111/)?
  • How do you structure complex workflows?
    • is the structure similar to the standard Python package structure, with the workflow dir acting as the package dir?
    • if you have multiple scripts, how do you organize them?
  • How do you run/test the workflow as you are building it?
    • in Sublime, with Python, I have problems with imports as soon as I create any structure beyond a flat collection of scripts.
    • I run the scripts from Sublime, not the command line, so I need a Custom Build System. Does anyone use one for Python Alfred workflow development?

There are lots of other specific questions, but my general aim is just to hear from people. Right now, I am re-writing my ZotQuery workflow, which, if anyone uses it or has looked at it, you will know is big and gnarly. I have almost all of the code written, but the workflow's organization is shit. It's far too big to all sit in one script (and still allow me to navigate it). I need some organizational inspiration. So hit me with your setups.

stephen

Link to comment

I start off with a project directory in ~/Code (where I keep my, well, code), usually named alfred-<workflowname>.
 
This directory is a git repo, which will probably go on GitHub.
 
The actual source code goes in a src subdirectory. That way I can keep things that don't belong in the distributed workflow in the project (README.md, the compiled .alfredworkflow file, helper scripts or source graphics).
 
I create an empty workflow in Alfred and add the icon. Then I copy its contents over to the src subdirectory I made. Then I delete the workflow in Alfred.
 
I have two scripts to help with writing workflows. workflow-build.py creates the .alfredworkflow file from the specified directory. It's aimed at Python workflows and ignores .pyc files. workflow-install.py installs the workflow to Alfred's workflow directory (using the bundle ID as its name). Most importantly, workflow-install.py -s symlinks the source directory instead of copying it, so I can keep the source code with my other code, not in Alfred's settings dir.
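
Roughly, the idea behind the symlink install looks like this (a simplified sketch, not the actual script; the preferences path below is an assumption and moves if you sync your settings):

```python
#!/usr/bin/env python
"""Sketch of a symlink-based workflow installer. NOT the real script."""
import os
import plistlib
import shutil
import sys

# Assumed default location of Alfred 2's workflow directories
WORKFLOWS_DIR = os.path.expanduser(
    '~/Library/Application Support/Alfred 2/'
    'Alfred.alfredpreferences/workflows')


def install(src, symlink=False):
    """Copy (or, with symlink=True, symlink) `src` into Alfred's
    workflow directory, named after the bundle ID in info.plist."""
    info = plistlib.readPlist(os.path.join(src, 'info.plist'))
    dest = os.path.join(WORKFLOWS_DIR, info['bundleid'])
    # Clear out any previous install
    if os.path.islink(dest):
        os.remove(dest)
    elif os.path.isdir(dest):
        shutil.rmtree(dest)
    if symlink:
        # -s behaviour: Alfred sees the live source directory
        os.symlink(os.path.abspath(src), dest)
    else:
        shutil.copytree(src, dest)


if __name__ == '__main__':
    args = sys.argv[1:]
    install([a for a in args if a != '-s'][0], symlink='-s' in args)
```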
 
I also write workflows in Sublime Text 3 (usually). I run them in iTerm, though, not in ST's terminal. I find ST's terminal isn't very reliable and chokes on a lot of output. It's also not great when the script needs arguments, which is often.
 
The downside of using iTerm is that it uses my shell environment, not Alfred's, but I'm careful about keeping my system Python fairly pristine.
 
I generally try to write workflow scripts as command-line apps, i.e. one main script that takes options and arguments, rather than a bunch of separate scripts. I find this helps me design the app in a more sensible way. If the workflow needs a whole bunch of action scripts, I also prefer to keep these all in the same script and Run Script action (I add the options to {query}). That way I don't have dozens of elements in Alfred's UI, which is a PITA to manage.
 
The thing to watch out for when doing this is your imports. If everything is in one script, you'll likely end up importing a bunch of libraries that won't be needed, which slows the script down. If I'm using libraries that are slow to import (OS X-native ones, like Foundation or AddressBook are the worst), I import them in functions that use them, not at the top of the script.
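
For example (illustrative only; the function is made up):

```python
def all_contacts():
    """Return everyone in the OS X address book.

    AddressBook (PyObjC) is slow to import, so pay that cost only
    when this function is actually called, not on every script run.
    """
    from AddressBook import ABAddressBook
    return ABAddressBook.sharedAddressBook().people()
```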
 
With something as big as ZotQuery, I'd still go with one main script and one main application class, but add sub-applications behind keywords, much like git or pip works.
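
Sketched out, that shape might look like this (hypothetical sub-commands; argparse sub-parsers are one way to do it, docopt is another):

```python
#!/usr/bin/env python
"""Sketch of a single-script workflow with git-style sub-commands."""
import argparse


def do_search(args):
    print('searching for %r' % args.query)


def do_update(args):
    print('updating the local cache')


def main():
    parser = argparse.ArgumentParser(description='ZotQuery-style CLI')
    subparsers = parser.add_subparsers()

    # `script.py search <query>` -- a Script Filter feeds this
    search = subparsers.add_parser('search')
    search.add_argument('query')
    search.set_defaults(func=do_search)

    # `script.py update` -- run from a keyword or a background job
    update = subparsers.add_parser('update')
    update.set_defaults(func=do_update)

    args = parser.parse_args()
    args.func(args)


if __name__ == '__main__':
    main()
```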

Link to comment

Can you put the workflow-build.py script up as a Gist? I see that the workflow-install.py script is already one.

I think that workflow makes a lot of sense. Fiddling with Alfred's preferences directory and the random UID folder names is a PITA. Can I ask why you don't use virtualenv? I know that we talked a bit about this in the past, but it seems that building it all in the ~/Code/ directory as CLI tools and then building the .alfredworkflow file after the fact could take advantage of virtualenv. Specifically, I'm thinking that using virtualenv (and virtualenvwrapper) with Sublime Text 3 would allow me to create a Custom Build System within a Sublime Text project to run the code within the Sublime Text console (for simplicity and speed) while still ensuring control of the environment. However, as you are much wiser than I am, I'm interested to hear your thoughts on this possible approach.

Also (this is a general Python question), I'd love to hear your thoughts on organizing code when taking a functional (clean architecture) approach. I confess that I am a bit OCD about my code. I like things to be cleanly organized. My issue is that Python really only allows for classes as containers for related functions. But, AFAIK, best practices define classes as objects with state. I want to use them as "function buckets": objects without state. For example, I have a class in ZotQuery called ResultsFormatter; this class contains all the methods used to format a Python dictionary containing a Zotero item's information into an Alfred-ready dictionary to be plugged directly into Workflow.add_item(). This object has no state, really. It is really acting as one big pure function (input -> output) with a bunch of sub-functions to organize/decouple the function bits. Is this best practice? Is there another way to do this cleanly?

As another example, I like to use classes to represent collections of information. So, I have a class right now called ZotQueryBackend; this contains all of the information for the data files ZotQuery uses (a clone of Zotero's sqlite database, a JSON representation of that data, a sqlite FTS database of that data, and an ASCII-folded FTS database of that data), as well as the methods for updating those data files. Should each of these data files have its own class, with a state of update-needed or update-not-needed? If so, how could I organize these classes cleanly? And I will state at the end that part of what I mean by "cleanly" is being able to use Sublime Text's code-folding functionality, which requires meaningful indentation.

I know this final bit is slightly outside the purview of the original question, but my Googling isn't really helping me get at some of this "tribal knowledge", and you're the best Python programmer I actually know...

Link to comment

I usually just create a directory in my workspace directory (~/workspace/alfred-<workflow-name> for me) and move the files from the Alfred Workflows directory to the workspace directory. After that I replace the Alfred Workflow with a symlink (ln -Fs ~/workspace/alfred-<workflow-name> alfred-workflow-dir-with-uuid).

 

Works perfectly :)

 

 

Can I ask why you don't use virtualenv? I know that we talked a bit about this in the past, but it seems that building it all in the ~/Code/ directory as CLI tools and then building the .alfredworkflow file after the fact could take advantage of virtualenv. Specifically, I'm thinking that using virtualenv (and virtualenvwrapper) with Sublime Text 3 would allow me to create a Custom Build System within a Sublime Text project to run the code within the Sublime Text console (for simplicity and speed) while still ensuring control of the environment. However, as you are much wiser than I am, I'm interested to hear your thoughts on this possible approach.

 

Virtualenvs are a great advantage when you're using libraries, but within Alfred there's no clean method (yet) to install Python libraries, so a virtualenv has very little advantage. It's still not a bad idea, but Alfred won't use it anyhow. The only advantage I can think of is that you can't accidentally use locally installed libraries (if you install libraries outside of your virtualenvs).

 

 

Also (this is a general Python question), I'd love to hear your thoughts on organizing code when taking a functional (clean architecture) approach. I confess that I am a bit OCD about my code. I like things to be cleanly organized. My issue is that Python really only allows for classes as containers for related functions. But, AFAIK, best practices define classes as objects with state. I want to use them as "function buckets": objects without state. For example, I have a class in ZotQuery called ResultsFormatter; this class contains all the methods used to format a Python dictionary containing a Zotero item's information into an Alfred-ready dictionary to be plugged directly into Workflow.add_item(). This object has no state, really. It is really acting as one big pure function (input -> output) with a bunch of sub-functions to organize/decouple the function bits. Is this best practice? Is there another way to do this cleanly?

As another example, I like to use classes to represent collections of information. So, I have a class right now called ZotQueryBackend; this contains all of the information for the data files ZotQuery uses (a clone of Zotero's sqlite database, a JSON representation of that data, a sqlite FTS database of that data, and an ASCII-folded FTS database of that data), as well as the methods for updating those data files. Should each of these data files have its own class, with a state of update-needed or update-not-needed? If so, how could I organize these classes cleanly? And I will state at the end that part of what I mean by "cleanly" is being able to use Sublime Text's code-folding functionality, which requires meaningful indentation.

I know this final bit is slightly outside the purview of the original question, but my Googling isn't really helping me get at some of this "tribal knowledge", and you're the best Python programmer I actually know...

 

 

What gave you that idea?

You can store functions in modules, files, classes or even within a function :)

 

Giving a class state is generally recommended, but because of the possibility of inheritance it can be a good idea to use classes even when you're not using instances. I wonder, though: why aren't you using an instance?

Take a look at my alfred-convert project for example: https://github.com/WoLpH/alfred-converter

 

From what you are describing, I would recommend making a "result = Result()" type of thing, where that result object has the methods for storing the results until you output them.
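
Something like this, for example (an illustrative sketch; the method names are made up, and `workflow` is assumed to be an Alfred-Workflow `Workflow` instance):

```python
class Results(object):
    """Accumulate formatted items until they're sent to Alfred.

    Illustrative only; the method names are made up.
    """

    def __init__(self):
        self._items = []

    def add(self, title, subtitle='', arg=None):
        """Store one result for later output."""
        self._items.append(dict(title=title, subtitle=subtitle, arg=arg))

    def send(self, workflow):
        """Hand everything to an Alfred-Workflow `Workflow` instance."""
        for item in self._items:
            workflow.add_item(**item)
        workflow.send_feedback()
```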

 

As for having separate classes for every data file... it all depends on your case, of course. If the objects are virtually identical in definition and needed methods, then I would just create a single class with multiple instances. But if they contain separate methods (or are expected to have separate methods in the future), then I would recommend having a base class with a simple inherited class for the specifics (i.e. "class SpecialFoo(Foo): pass").
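
In code, that pattern might look like this (hypothetical names, loosely modelled on the ZotQuery data files):

```python
import os


class DataFile(object):
    """Base class: behaviour shared by all of the backend's data files."""

    def __init__(self, path):
        self.path = path

    def needs_update(self, source_mtime):
        """True if the file is missing or older than its source."""
        return (not os.path.exists(self.path)
                or os.path.getmtime(self.path) < source_mtime)

    def update(self):  # the specifics live in the subclasses
        raise NotImplementedError


class JSONDataFile(DataFile):
    def update(self):
        pass  # regenerate the JSON representation here


class FTSDataFile(DataFile):
    def update(self):
        pass  # rebuild the sqlite FTS index here
```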

Edited by wolph
Link to comment

Like wolph says, there's no real benefit to using `virtualenv` with workflows. Its purpose is to isolate your app's dependencies, and seeing as your workflow's dependencies should probably be installed in the workflow itself, there's little point to `virtualenv`.

 

There's generally no need to use classes to group together related functions. In Python, you can do that with modules. It still makes sense sometimes, though (Alfred-Workflow uses stateless classes for its default serialisers).
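
For example, a ResultsFormatter class could just as well be a formatters.py module of plain functions (a sketch; the Zotero item fields are assumptions):

```python
# formatters.py -- a "function bucket" as a module, no class needed
def title(item):
    """Alfred result title from a Zotero item dict."""
    return item.get('title', 'Untitled')


def subtitle(item):
    creators = ', '.join(c.get('family', '')
                         for c in item.get('creators', []))
    return '{0} ({1})'.format(creators, item.get('date', 'n.d.'))


def to_alfred(item):
    """Pure function: Zotero item dict -> kwargs for Workflow.add_item()."""
    return {'title': title(item), 'subtitle': subtitle(item),
            'arg': item.get('key')}
```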

 

 

I usually just create a directory in my workspace directory (~/workspace/alfred-<workflow-name> for me) and move the files from the Alfred Workflows directory to the workspace directory. After that I replace the Alfred Workflow with a symlink (ln -Fs ~/workspace/alfred-<workflow-name> alfred-workflow-dir-with-uuid).

 

Works perfectly :)

I generally disagree with doing it this way. You end up with files in the workflow that shouldn't be in the workflow, often including another copy of the workflow.

I think all the files that belong in the finished workflow are better kept in a subdirectory. It makes it easier to compile and distribute the .alfredworkflow file without including unnecessary stuff, and it keeps the .git directory out of Dropbox, which is not a great place for repos.

I've seen quite a few workflows that include an additional copy of themselves because they were organised this way.

Link to comment

I generally disagree with doing it this way. You end up with files in the workflow that shouldn't be in the workflow, often including another copy of the workflow.

I think all the files that belong in the finished workflow are better kept in a subdirectory. It makes it easier to compile and distribute the .alfredworkflow file without including unnecessary stuff, and it keeps the .git directory out of Dropbox, which is not a great place for repos.

I've seen quite a few workflows that include an additional copy of themselves because they were organised this way.

 

That's a good point; I didn't really think about that. But I don't think just moving the source files helps enough (at least, not in my case). For me there are other files like tests, test coverage files, gitignore files, temp stuff and more. Having a proper build command which filters out the unneeded files is always a good idea :)

 

Guess I'll start making one for my workflows.

Link to comment

That reminds me. I uploaded my build script as smargh requested. It's fairly basic and only designed to work with workflows structured the same way as mine, i.e. all the files that should go in the distribution are in a single directory. It will automatically name the generated .alfredworkflow with the workflow's name grabbed from info.plist.
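
The core of such a script is small. A stripped-down sketch of the approach (not the actual script):

```python
#!/usr/bin/env python
"""Sketch of a workflow build script. Simplified, not the real one."""
import os
import plistlib
import sys
import zipfile


def build(src, exclude_exts=('.pyc',)):
    """Zip `src` into <Workflow Name>.alfredworkflow, skipping junk."""
    info = plistlib.readPlist(os.path.join(src, 'info.plist'))
    outname = '%s.alfredworkflow' % info['name']
    with zipfile.ZipFile(outname, 'w', zipfile.ZIP_DEFLATED) as zf:
        for root, _, files in os.walk(src):
            for filename in files:
                if filename.endswith(exclude_exts):
                    continue
                path = os.path.join(root, filename)
                # Store paths relative to src, so the zip root is
                # the workflow root (where info.plist lives)
                zf.write(path, os.path.relpath(path, src))
    return outname


if __name__ == '__main__':
    print(build(sys.argv[1]))
```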
 
Regarding other types of files:
 
You can keep all the non-workflow files in your project root directory. For testing purposes, you can add a test runner to the same directory that creates the requisite test environment (e.g. adjusting sys.path, simulating Alfred's execution environment).
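
A minimal sketch of such a runner (the alfred_* environment variables are an assumption; set whatever your scripts actually read):

```python
#!/usr/bin/env python
"""run_tests.py -- sketch of a runner that fakes Alfred's environment."""
import os
import sys
import tempfile
import unittest

# Make the workflow's code importable from the project root
sys.path.insert(0, os.path.join(os.path.dirname(__file__), 'src'))

# Simulate the environment Alfred provides to workflows
# (an assumption; match whatever your scripts actually read)
os.environ.setdefault('alfred_workflow_bundleid', 'net.example.test')
os.environ.setdefault('alfred_workflow_data', tempfile.mkdtemp())
os.environ.setdefault('alfred_workflow_cache', tempfile.mkdtemp())

if __name__ == '__main__':
    # Equivalent to `python -m unittest discover -s tests`
    unittest.main(module=None,
                  argv=[sys.argv[0], 'discover', '-s', 'tests'])
```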
 
I've found that the problem with relying on your build system to filter out any files that shouldn't end up in the workflow distribution is that it's one more thing that you have to remember to update when you change something. It's very easy to forget to exclude some big file you've added or include an essential one (at least, it's something I find very easy to forget).
 
As a result, I try to organise my workflows so that there's a subdirectory that corresponds precisely to the distributed workflow. That way, it doesn't require me to work actively to stop things breaking.

Link to comment

That approach would mostly work for me already; all of my source is in a separate directory anyway, because I use Python modules.
 
That said, I understand your rationale, but I don't see much benefit. Simply ignoring some extensions and everything in the .gitignore should do the trick already. Having looked through half a dozen of your repositories ( https://github.com/deanishe?tab=repositories ), I haven't found a single one that wouldn't work with a single list of ignored files/extensions shared between all of them.
 
And if you're afraid of filtering out files that should be there, blacklisting instead of whitelisting usually does the trick :) Just removing all hidden files takes care of most of it; excluding the .gitignore and .alfredworkflow files should take care of the rest.
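
In code, such a blacklist is tiny (a sketch; the patterns are illustrative):

```python
import fnmatch

# Anything matching these never goes into the .alfredworkflow
BLACKLIST = ['.*', '*.pyc', '*.alfredworkflow', 'tests']


def excluded(relpath):
    """True if any component of the relative path is blacklisted."""
    return any(fnmatch.fnmatch(part, pattern)
               for part in relpath.split('/')
               for pattern in BLACKLIST)
```

So `excluded('.git/config')` and `excluded('icon.pyc')` are both true, while `excluded('icon.png')` is not.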
 
Just in case you are still afraid of missing files: if you hook up Travis or something, it would be pretty easy to test the build automatically. You would have Travis run the tests on the .alfredworkflow file instead of your normal repo, of course, but at least you'd have fully automated testing.

Link to comment

But I don't have to bother messing around with blacklists or whitelists or Travis because I keep all the workflow files in a separate directory…

 

I certainly wouldn't want to ignore the files in .gitignore. That's for files that don't belong in the repo, which is a subset of files that don't belong in the compiled workflow.

 

It all sounds very much like a solution in search of a problem to me.

 

The reason a simple list of ignored files/extensions would work for my workflows is precisely because I keep them organised into subdirectories. If I dumped everything in the root, it wouldn't work because a bunch of scripts designed to help with development would need special-casing for each workflow.

 

On top of all that, the src subdirectory makes my repos much more understandable. You don't need to mentally parse scripts and dotfiles to figure out which bits actually belong to the workflow.

 

I just don't see any benefit in whipping up some all-singing, all-dancing build system to let me make my repos less organised.

Link to comment

I just don't see any benefit in whipping up some all-singing, all-dancing build system to let me make my repos less organised.

 

You've always been pretty good about keeping things organized from the get-go, which is, really, the best practice that anyone could have.

 

While it is fun to build convoluted scripts for problems that exist either barely or theoretically (or is that just me?), it is probably a better use of your time to work on something more substantial than the build scripts.

 

---

 

As for me, I'm not a Python guy, and my organization isn't nearly as good as Dean's. But, here's what I'd do if I had a standard way of doing things:

 

Each of my workflows would have two main directories:

1. lib

2. scripts

 

And each would (usually) have two scripts in the root:

1. filter.{EXT}

2. action.{EXT}

 

`filter` is just whatever the Script Filter runs, and `action` should probably be called `router`, because it actually just parses the action query and calls the right script (in the scripts directory) from there. All library files go in `lib`.
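
Sketched in Python for brevity (the idea is the same in PHP/Bash/Ruby; the names and the .sh extension are hypothetical):

```python
#!/usr/bin/env python
# action.py -- the "router": the first word of {query} picks the
# script; the rest becomes that script's arguments.
import os
import subprocess
import sys

SCRIPTS = os.path.join(os.path.dirname(os.path.abspath(__file__)),
                       'scripts')

if __name__ == '__main__':
    action, _, rest = sys.argv[1].partition(' ')
    script = os.path.join(SCRIPTS, action + '.sh')  # extension assumed
    sys.exit(subprocess.call([script] + rest.split()))
```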

 

That works well enough for workflows written in PHP, Bash, and Ruby.

 

Ideally, I'd keep all the code in a separate folder and use some symlinks like Dean does, but I'm not well-organized enough to do that. It'd also help me keep all the ST3 projects laid out well.

 

Damn. What this thread says is that I need to re-organize my computer (take the "re-" off that).

Link to comment

Yeah, I'm actually working on a system to nuke and pave my comp every 6 months or so. My thinking is that I can automate the parts of my setup that I have organized, and then reset and try to organize the parts that aren't, on a clean install. So, instead of organizing only the mess itself, I am constructing an automatic system to rebuild the organized parts so I can try again to build up an organized system for some other part of my setup.

So, I just nuked my Mac a week ago. I used Homebrew and Homebrew Cask to install binaries and apps, and I'm now trying to organize my development setup. That was the impetus for this thread. Once I get it organized, I'm going to use cookiecutter to automate building new projects and add my organized old projects to my automatic build system.

It's a bit extreme, but it makes my head feel good, so I'm doing it :)

Link to comment

You've always been pretty good about keeping things organized from the get-go, which is, really, the best practice that anyone could have.

My guiding principle is to try to keep things as foolproof as possible, because when I don't, I always end up as the confused fool who broke stuff.

Simple > complex.

 

Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.

— Brian Kernighan

While it is fun to build convoluted scripts for problems that exist either barely or theoretically (or is that just me?), it is probably a better use of your time to work on something more substantial than the build scripts.

 

Exactly. I appreciate a challenge as much as the next man, but I'd rather there were a point to it.

 

Yeah, I'm actually working on a system to nuke and pave my comp every 6 months or so.

I have a modified version of that on my Mac. Some clever stuff there. Unfortunately, I'm not good about keeping track of what I've installed on my Mac, so nixing it and reinstalling always takes far too long.

It'd be awesome to have a single script to set up a Mac just the way I like it.

Link to comment

I have a modified version of that on my Mac. Some clever stuff there. Unfortunately, I'm not good about keeping track of what I've installed on my Mac, so nixing it and reinstalling always takes far too long.

It'd be awesome to have a single script to set up a Mac just the way I like it.

One other reason for my change is that it forces me to keep track of what I have installed. If I add an app or binary or do something, I add it to the git repo.

As for setting up a Mac just the way I want automatically, I'm currently trying to add custom icon support, but dots and mackup get me pretty far along.

Link to comment

But I don't have to bother messing around with blacklists or whitelists or Travis because I keep all the workflow files in a separate directory…

 

I certainly wouldn't want to ignore the files in .gitignore. That's for files that don't belong in the repo, which is a subset of files that don't belong in the compiled workflow.

 

It all sounds very much like a solution in search of a problem to me.

 

The reason a simple list of ignored files/extensions would work for my workflows is precisely because I keep them organised into subdirectories. If I dumped everything in the root, it wouldn't work because a bunch of scripts designed to help with development would need special-casing for each workflow.

 

On top of all that, the src subdirectory makes my repos much more understandable. You don't need to mentally parse scripts and dotfiles to figure out which bits actually belong to the workflow.

 

I just don't see any benefit in whipping up some all-singing, all-dancing build system to let me make my repos less organised.

Git is by definition a source control management system; hiding your source in a separate directory seems a bit backwards to me. It's actually a bit weird to store binary files in your git repository. Admittedly, I do the same since I was too lazy to upload it as a release on GitHub, but that should be the proper flow.

As I said previously, though, you don't have to bother with blacklists or whitelists to make it work. I looked at your repositories and there wasn't any need for fancy stuff. No need to agree with me, but that still doesn't make the structure necessary.

Perhaps I don't understand what you are saying, though, because I don't really understand what a list of ignored filenames and extensions has to do with a directory structure.

Link to comment

Git is by definition a source control management system; hiding your source in a separate directory seems a bit backwards to me. It's actually a bit weird to store binary files in your git repository. Admittedly, I do the same since I was too lazy to upload it as a release on GitHub, but that should be the proper flow.

As I said previously, though, you don't have to bother with blacklists or whitelists to make it work. I looked at your repositories and there wasn't any need for fancy stuff. No need to agree with me, but that still doesn't make the structure necessary.

Perhaps I don't understand what you are saying, though, because I don't really understand what a list of ignored filenames and extensions has to do with a directory structure.

 

There is a big difference between hiding your source and organizing your source. Neither of these is needed, and this does come down to preference, but organizing the source well via a directory structure starts with organization rather than organizing after the fact; it's a more planned, deliberate approach.

 

The upshot of organizing the source well is that it's far more understandable for others to read, and that becomes especially important when you look back at older projects and have to figure out what the hell you were thinking.

 

Basically, you could use a .gitignore file, but that sort of thing promotes bad coding practices (which I'm guilty of myself; I often find myself trying to figure out what the fuck I was doing with something).

Link to comment

True, perhaps hiding is not the right word. It's indeed just a matter of preference; I generally prefer to keep things simple. Source straight up in the source control; if it's not source, it shouldn't be in the source control imho :)

Apologies if I appear a bit hostile; I've had a bad couple of days. I was just looking for an interesting discussion :)

Link to comment

True, perhaps hiding is not the right word. It's indeed just a matter of preference; I generally prefer to keep things simple. Source straight up in the source control; if it's not source, it shouldn't be in the source control imho :)

Apologies if I appear a bit hostile; I've had a bad couple of days. I was just looking for an interesting discussion :)

You must not have seen enough of @deanishe's comments :) Otherwise, you would know that Shawn and I would never consider any of that "hostile". Dean likes to get a bit drunk and speak his mind. I'm quite fine with it.
Link to comment

True, perhaps hiding is not the right word. It's indeed just a matter of preference; I generally prefer to keep things simple.

It's not a question of source code vs non-source code. It's a question of files that belong in the distributed workflow vs files that don't belong in the distributed workflow.

I have plenty of source code that belongs in the repo and belongs to the workflow project, but has no place in the distributed workflow (SVG files used to generate PNG icons, test scripts, scripts to generate data files for the workflow, README.md).

If you don't keep the two kinds of files (source code or not) separate, you're just making work for yourself.

Source straight up in the source control; if it's not source, it shouldn't be in the source control imho :)

So you don't put your workflows' icons in the repo? So your repos don't contain complete, working copies of your workflows?

Link to comment

It's not a question of source code vs non-source code. It's a question of files that belong in the distributed workflow vs files that don't belong in the distributed workflow.

I have plenty of source code that belongs in the repo and belongs to the workflow project, but has no place in the distributed workflow (SVG files used to generate PNG icons, test scripts, scripts to generate data files for the workflow, README.md).

For the SVG files, I would argue that you simply put ".svg" in the blacklist for workflows; there's no point in having them in the distributed workflow.

Tests are usually in a tests directory, so that's just a simple blacklist entry as well. Scripts to generate data files... that's a gray area; for the units library, I actually generate the data file on the fly at the first run of the extension. So the actual data file is included, but the generated binary version (which is pickled, so it can differ per system) is not.
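
The pattern is simple enough (a sketch; the file names and the "name = factor" format are made up, and the real units library differs):

```python
import os
try:
    import cPickle as pickle  # Python 2, as in the era of this thread
except ImportError:
    import pickle

DATA_FILE = 'units.txt'      # plain-text source, shipped with the workflow
CACHE_FILE = 'units.pickle'  # generated on first run; never in the repo


def load_units():
    """Parse DATA_FILE, caching the parsed form as a pickle."""
    if (os.path.exists(CACHE_FILE)
            and os.path.getmtime(CACHE_FILE) >= os.path.getmtime(DATA_FILE)):
        with open(CACHE_FILE, 'rb') as fp:
            return pickle.load(fp)
    units = {}
    with open(DATA_FILE) as fp:
        for line in fp:
            name, _, factor = line.partition('=')
            if factor:
                units[name.strip()] = float(factor)
    with open(CACHE_FILE, 'wb') as fp:
        pickle.dump(units, fp, pickle.HIGHEST_PROTOCOL)
    return units
```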

If the PNG files are generated from the SVG files, you could argue that the PNG files shouldn't be in the source control either and should simply be generated when building the package.

So you don't put your workflows' icons in the repo? So your repos don't contain complete, working copies of your workflows?

Currently (and mostly for testing) I keep the PNG versions of the icons in the repo. But generally I wouldn't keep any file that's generated from another file in the repository.

That would mean the PNG files would be generated from the SVG files upon building the .alfredworkflow file.

Perhaps I'll start working on a simple build script to help a little with the initial setup as well. Yours looks really nice, so that would be a great start :)

Link to comment
