rice.shawn

Alfred Dependency Downloader Framework

Do you want to have versioned libraries and utilities live side-by-side? Do you want to make your workflows smaller but keep or expand the functionality with helper apps? Do you want to make sure that everything works regardless of whether Gatekeeper is active on other users' computers? Then you might consider implementing the Alfred Bundler* dependency framework in your workflows.

 

To use it, all you have to do is include a small file in your workflow and make, usually, one call per utility/library you want to use, and that's it. The bundler will automatically download any utility that you attempt to use that the user doesn't have installed, and, even better, it will keep everything in the Alfred 2 data directory instead of a workflow directory or a user's Applications directory.

 

If you want to see a _very_ basic implementation, then download an example workflow from Packal and open it up to see the workflow's anatomy.

 

The libraries / utilities mentioned below are just pre-defined assets that you can load with no additional work. You can actually use this framework for any asset** by including a small JSON file with your workflow.

 

Take a look at the documentation page on Github for more information.

 

* The bundler is intentionally poorly named because it makes it so that you don't have to bundle dependencies.

** Don't try this with python or ruby libraries/modules/gems yet. A future version of the framework might support those languages, but, for now the idiosyncratic nature of package management with those languages makes this sort of approach difficult.

 

---

 

So, for a variety of reasons, I wrote a framework to use with Alfred to manage dependencies for some types of libraries as well as commonly used "utilities" like terminal notifier.

 

I'll push the release shortly, but I want to make sure that I include some good defaults with the initial version.

 

Right now, I've included:

 

PHP

  1. Workflows.php (v0.3.3)
  2. CFPropertyList (v2.0)

Bash

  1. BashWorkflowHandler

Utilities

  1. SetupIconsForTheme
  2. GlyphManager
  3. Terminal-Notifier (v1.5.0)
  4. CocoaDialog
    • v3.0.0-beta7
    • v2.1.1
  5. Pashua
  6. PHP5.5.5-CLI
  7. Viewer (an automator application that shows a pop-up HUD)

I can't include Python or Ruby libraries (modules/gems) yet because they work entirely differently and call for a much more complex and intense setup. But, are there any other utilities or libraries that I should include?

 

Let me know, and when you do, post a link.

Edited by Shawn Rice

---

Stephen, let's move this conversation about Python libs in the bundler to this thread: http://www.alfredforum.com/topic/4255-alfred-dependency-downloader-framework/.

Ok. Here we are. pip actually has an "ignore installed" flag (-I), which installs a package/module even if a conflicting version is already installed. But this would probably be unnecessary, since I am thinking that the simplest course of action would be to install packages into the python directory and then afterward move them into the proper subdirectory based on the package's version number. So there would never be a bare package in the install directory to conflict with.

I'm toying with something now...

---

Check out my initial script in my fork of the project: https://github.com/smargh/alfred-bundler/tree/master

Right now, I only have the __load() function written. For now, you can pass only the package name and (optionally) a version number. The bundler first checks whether the module is in the bundler directory tree (~/Library/Application Support/Alfred 2/Workflow Data/alfred.bundler-aries/assets/python/). If you pass a version number, it looks for that specific version; otherwise it looks for the most up-to-date version. If the package isn't there, or that particular version isn't there, the bundler uses pip to install the package into the bundler directory. If you didn't pass a version number, the bundler installs into a tmp directory, reads the version number, then renames the subdirectory (note: I had to use sudo mv, since you need root access to fiddle with Python packages, at least on my machine). If you did pass a version number, the bundler creates the requisite subdirectories and installs the package there.

In the main() function, you can see an example of how this would work for a package.

Thoughts?

---

Stephen, this looks like a great start.

 

Does every Python package on PyPI have an egg-info directory? I thought there was another format or two that might exist.

 

Sudo might be necessary because of the way that pip is installed or where it might be trying to install to. But we might be able to get around that if we can get a standalone copy of pip. This is possible, right? Basically, can we treat it the way that the bundler treats a regular utility?

 

Besides that, we need to add a method to download libraries from places other than PyPI. Well, you could either rewrite what I've done in Python, or you could just hook into this script: https://github.com/shawnrice/alfred-bundler/blob/aries/includes/installAsset.php. To use it, you just send it a path to a JSON file and a version. Having this allows people to include their own utilities/libraries for re-use. There is a decent explanation of the JSON on the documentation page, and there are a bunch of different examples in the "meta/defaults" folder. If this gets working well enough, then we should create the JSON for Dean's library and the other Alfred python ones.

 

We'll want to move things around a bit more, hopefully, to have more separation between the wrapper and the bundler.

 

So, if you look at the Bash wrapper (https://github.com/shawnrice/alfred-bundler/blob/aries/wrappers/alfred.bundler.sh), then you'll see that the __load function, which takes up to four arguments as each load function should, calls another function called __loadAsset, which is defined in bundler.sh in the bundler directory. This separation makes it so that we can alter how the backend of the bundler works while still keeping the same wrapper.

 

The __installBundler function is simple in that it first makes the cache, data, and cache/installer directories, downloads the installer from Github into the cache/installer directory, and, lastly, executes the shell script. 

 

Here's the bash function.



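# This just downloads the install script and starts it up.
# It relies on $bundler_version, $__cache and $__data being set by the wrapper,
# and on `dir`, presumably a wrapper helper that creates a directory if missing.
function __installBundler {
  local installer="https://raw.githubusercontent.com/shawnrice/alfred-bundler/$bundler_version/meta/installer.sh"
  dir "$__cache"
  dir "$__cache/installer"
  dir "$__data"
  # Fetch the installer quietly, following redirects, then run it.
  curl -sL "$installer" > "$__cache/installer/installer.sh"
  sh "$__cache/installer/installer.sh"
}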


There are two other minor scripts that you'll want to tap into with the overall wrapper:

 


This ensures that an executable isn't denied by OS X's Gatekeeper. Basically, you need to run it whenever you load a utility.

 


The registry doesn't have any functionality yet, but the framework is there. Basically, when a workflow uses an asset, any asset, it registers it with the bundler. The idea, later, is to have more of a "utility" workflow that will keep your data directory and caches clean (and maybe do other stuff), so people can uninstall "orphaned" assets. So, if the asset is registered only to a workflow that no longer exists, then you could uninstall it.

 

Registry.php is actually just more of a wrapper around the registry functions, but the nice thing is that you can call it from the command line with "php path/to/registry.php 'bundleid' 'assetname' 'version'" — so you don't need to rewrite that for Python. I decided to separate it out because the registry is JSON, and bash just doesn't work well with JSON.

 

You can either duplicate these scripts in Python, or you could just call them. The latter seems to be the more efficient way.
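Calling the existing registry script from Python, rather than rewriting it, could look roughly like the sketch below. The function names and the registry_script path are hypothetical; only the argument order (bundle ID, asset name, version) comes from the command line quoted above.

```python
import subprocess


def registry_command(registry_script, bundle_id, asset, version):
    """Build the argv list for the registry call quoted above.

    `registry_script` is the path to registry.php; where exactly it
    lives inside the bundler directory is an assumption here.
    """
    return ["php", registry_script, bundle_id, asset, version]


def register_asset(registry_script, bundle_id, asset, version):
    """Run registry.php and return its exit code (0 means success)."""
    return subprocess.call(
        registry_command(registry_script, bundle_id, asset, version)
    )
```

Passing the arguments as a list avoids any shell quoting problems with bundle IDs or asset names.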

 

For the bash and php versions of the bundler (bundler.sh and bundler.php), I just put them in as part of the __loadAsset() function.

---

Hi guys,
 
Couple of points wrt the Python version.
 
sudo should never be necessary, as you aren't (well, shouldn't be) installing to any system directories. pip is designed to keep everything self-contained if you use --target.

You really shouldn't install anything to Python's site-packages directory: pip does a lot more stuff than just creating a subdirectory and installing the code in it. It might be creating .pth files, installing wrapper scripts to the appropriate bin directory etc. --target is the way to go.

As a result, subprocess would be a cleaner way to call pip, rather than going via AppleScript.

You mustn't use any system pip that is installed, as you have no way of knowing which version of Python it's associated with. If pip's installed, it's quite likely that it belongs to a Python 3 installation, so calling that will quite often install incompatible versions of packages. Even if it belongs to a different 2.7 install, it's possible pip will skip dependencies that are already installed in that version, but not the system Python that Alfred uses. What's more, pip is usually installed in /usr/local/bin, which isn't in PATH anyway when code is run from within Alfred, so it wouldn't even get found.

Use /usr/bin/python get-pip.py for any installs.
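A minimal sketch of calling pip through subprocess with --target, as suggested above. It assumes the bundler keeps its own copy of the pip package in a directory (the hypothetical pip_dir) and makes it importable through PYTHONPATH; the function names are illustrative only.

```python
import os
import subprocess

SYSTEM_PYTHON = "/usr/bin/python"  # the interpreter named above


def pip_install_command(python, target_dir, requirements):
    """Build a pip call that installs only into `target_dir`."""
    return [python, "-m", "pip", "install",
            "--target", target_dir] + list(requirements)


def pip_install(target_dir, requirements, pip_dir, python=SYSTEM_PYTHON):
    """Run the bundler's private pip and return its exit code.

    `pip_dir` is a hypothetical directory that contains the pip
    package, so that `python -m pip` finds it without a system pip.
    """
    env = dict(os.environ, PYTHONPATH=pip_dir)
    command = pip_install_command(python, target_dir, requirements)
    return subprocess.call(command, env=env)
```

Because the interpreter is named explicitly, the packages are resolved for that Python rather than for whichever pip happens to be on PATH.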

Don't call the main function __load. Double underscores are for "private" functions/methods that aren't meant to be called from outside the same module/class. You don't need to mangle the name of the function, as Python has great namespace support. Just name it load and use:
 

import bundler
bundler.load('package')

import package.thingy
...

bundler.load should also take care of the sys.path.insert(0, 'dirpath') itself.
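As a rough illustration of that last point, the path handling inside bundler.load could be as small as this. The helper name add_to_path is hypothetical.

```python
import sys


def add_to_path(dirpath):
    """Put `dirpath` at the front of sys.path, without duplicates."""
    if dirpath in sys.path:
        sys.path.remove(dirpath)
    sys.path.insert(0, dirpath)
    return dirpath
```

Inserting at position 0 means the bundler's copy of a library wins over any copy found elsewhere on the path.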

We don't seem to be getting any closer to solving the problem of pip automatically installing dependencies. The requirements.txt and requires section of setup.py are complex beasts. They contain top-level dependencies, but not necessarily their dependencies in turn.

Versioning is also a PITA. The formats are not necessarily package==3.0 or package>=3.0, but can also be something like -e git://git.myproject.org/MyProject.git@v1.0#egg=MyProject. How do you name the subdirectory for that? :(

That's why my recommendation was (is) to look into hooking into/modifying pip to perform the installations. There's an awful lot of complexity to installing Python libraries, and to cover even the most common cases, you're going to end up duplicating an awful lot of pip's functionality. Seems to make sense to start with pip's own, well-tested code.

Personally, I'd just skip the whole versioned-subdirectories business, and go with a thin wrapper around pip. Python workflow authors can create their own requirements.txt and pass that to bundler, which uses pip to install everything in a workflow-specific directory and then adds the directory to sys.path.
 

import bundler

bundler.install('./requirements.txt')
# ~/Library/Application Support/Alfred 2/Workflow Data/alfred.bundler-aries/assets/python/com.myworkflow/ is now in sys.path

import requests
import feedparser
...

There'd still be loads of duplicate libraries, but they'd be on the big, fat harddrive, not cluttering up users' Dropboxes or bloating Packal's repo, and they wouldn't have to be redownloaded every time a workflow is updated.

If a workflow needs a newer version of a dependency, the author can update the requirements.txt file and bundler would update the library.

bundler would have to cache the modification time or MD5 hash of the requirements.txt file, so it isn't running pip every time the workflow is run (which would massively hit performance).

For additional flexibility (e.g. installing stuff that can't be done via pip), perhaps add a bundler.run function that would run a shell script once and again if it changes.

And, of course, bundler.util for installing/getting the path of utilities. That could be a simple wrapper around the bash version.

Edited by deanishe

---

Aesthetically, having multiple copies of the same library on the system is somewhat displeasing, but it might have to be a tradeoff.

 

I guess, then, the question would be whether or not to store those in the workflow's directory or in a subdirectory of the bundler. There is a good argument for each.

 

Instead of having to cache anything, by default the bundler could just look to see if the directory for the package is there, and, if so, not call pip. At least that's how I've been doing it so far. Otherwise, you could do a bunch of try statements and call pip on the errors. That might be easier than doing those sorts of hashes, or it might not.

 

So far, I've tried to prefix each function with the double underscore to make sure that it didn't clash with a workflow author's own function naming scheme. Since .load would always be called with bundler in front of it, it shouldn't affect how a python version would work.

---

> Aesthetically, having multiple copies of the same library on the system is somewhat displeasing, but it might have to be a tradeoff.

There really isn't a simple way around multiple copies with Python. As noted in our emails on the subject, the super-smart folks in the Python community who write pip and do packaging etc. haven't gone near versioned libraries and have chosen virtualenvs and multiple copies instead (which is now included in Python 3).

 

> I guess, then, the question would be whether or not to store those in the workflow's directory or in a subdirectory of the bundler. There is a good argument for each.

I think on balance the libraries should be installed in the bundler's directory. They're provided by the bundler, and workflow authors shouldn't really have to worry about what bundler is doing behind the scenes, which they would if it were putting stuff in their workflows' directories.

 

> Instead of having to cache anything, by default the bundler could just look to see if the directory for the package is there, and, if so, not call pip. At least that's how I've been doing it so far. Otherwise, you could do a bunch of try statements and call pip on the errors. That might be easier than doing those sorts of hashes, or it might not.

That would require the bundler to understand Python packaging, however (Python libraries aren't necessarily stored in a directory with the same name as the library), and as we've discussed, there simply isn't a standard way to get the version of an installed library.

The bundler could write its metadata to a .bundler file in the directory where the libraries are installed.

Using try ... except would add a lot of clutter to workflow scripts. Asking authors to write a requirements.txt file is not ideal, but I think it'd be less work—and less error prone—than having to wrap all imports in try ... except clauses with bundler.install statements. In particular, you'd only have to specify the library and version once (in requirements.txt), not in each Python file, so there's less chance of forgetting to update the version number in one of them.

I suppose the API could work in different ways, depending on the complexity of the workflow.

If you only have one Python script, it would be simpler to do:

 

import bundler

bundler.install('requests==1.1')

import requests

And for larger workflows you could just do:

 

import bundler

bundler.install()

...

and it would use the requirements.txt in the workflow root directory (or throw an error if it's missing).
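A sketch of how the two calling styles above might be told apart inside the bundler. The function name pip_arguments and the choice of IOError are assumptions made for illustration.

```python
import os


def pip_arguments(spec=None, workflow_root="."):
    """Return the requirement arguments to hand to pip.

    With a spec string such as 'requests==1.1', pass it through.
    With no argument, fall back to requirements.txt in the workflow
    root and raise an error if that file is missing.
    """
    if spec:
        return spec.split()
    path = os.path.join(workflow_root, "requirements.txt")
    if not os.path.isfile(path):
        raise IOError("No requirements.txt found in %s" % workflow_root)
    return ["-r", path]
```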

 

> So far, I've tried to prefix each function with the double underscore to make sure that it didn't clash with a workflow author's own function naming scheme. Since .load would always be called with bundler in front of it, it shouldn't affect how a python version would work.

That makes complete sense with PHP/bash where namespaces are not really a thing, but it's un-Pythonic, especially using __ as the prefix, which is Python for "Don't use this function/method, it's private".

The API should be in line with the conventions of the language it's aimed at, not standardised across implementations, IMO.

Edited by deanishe

---

I'll cede the aesthetics, but I'll keep them as a pipe dream. Maybe this problem is the motivation that I need to actually learn Python and figure out a good way to do it.... or maybe I'll just write my dissertation.

 

> I think on balance the libraries should be installed in the bundler's directory. They're provided by the bundler, and workflow authors shouldn't really have to worry about what bundler is doing behind the scenes, which they would if it were putting stuff in their workflows' directories.

 

That does make the most sense.

 

> Using try ... except would add a lot of clutter to workflow scripts. Asking authors to write a requirements.txt file is not ideal, but I think it'd be less work (and less error prone) than having to wrap all imports in try ... except clauses with bundler.install statements. In particular, you'd only have to specify the library and version once (in requirements.txt), not in each Python file, so there's less chance of forgetting to update the version number in one of them.

 

 

What if all of those try/except statements were in the bundler logic rather than in the workflow? Then, the bundler could import the sys.path and try to import each package, and, on fail, install them. Or would that cause the script to crawl? Or, worse, would that screw with the script?

 

> The API should be in line with the conventions of the language it's aimed at, not standardised across implementations, IMO.

 

I do agree. Or, at the very least, they shouldn't violate the language. The namespacing in Python makes the prefix unnecessary.

---

Running the try ... except clauses in the bundler probably wouldn't hit performance noticeably (it's very fast in Python). The importing would take much more time, but that has to be done anyway.

The problem I see with having the bundler try to import libraries itself and install them if they're missing is that the "install name" of libraries is often different from the "import name". For example, for the standard MySQL package, you use import MySQLdb but pip install MySQL-python. So you'd have to specify both names in the call to the bundler:
 

import bundler

bundler.load('requests==2.1', # "normal" package
             ('MySQL-python>=1.4', 'MySQLdb'), # tell bundler install/import names differ
             ('git+https://github.com/deanishe/flask-login.git', 'flask.ext.login') # GitHub fork with additional/fixed features
            )

import requests
import MySQLdb as mdb
from flask.ext.login import LoginManager
…

The equivalent using requirements.txt would be:

requirements.txt

requests==2.1
MySQL-python>=1.4
-e git+https://github.com/deanishe/flask-login.git

Python:

import bundler

bundler.load()

import requests
import MySQLdb as mdb
from flask.ext.login import LoginManager
…

I think that's a cleaner API and it would make bundler easier to implement.
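To make the install-name/import-name problem concrete, here is a rough sketch of what the tuple form would force the bundler to do. Both function names are hypothetical, and stripping the version specifier from a plain string is a simplifying assumption that would not cover VCS URLs.

```python
import importlib


def split_spec(spec):
    """Return (install_spec, import_name) for one load() argument.

    A plain string is assumed to import under its own name with any
    version specifier stripped; a tuple gives both names explicitly.
    """
    if isinstance(spec, tuple):
        return spec
    import_name = spec
    for operator in ("==", ">=", "<=", ">", "<"):
        import_name = import_name.split(operator)[0]
    return spec, import_name.strip()


def missing_specs(*specs):
    """List the install specs whose import name cannot be imported."""
    missing = []
    for spec in specs:
        install_spec, import_name = split_spec(spec)
        try:
            importlib.import_module(import_name)
        except ImportError:
            missing.append(install_spec)
    return missing
```

Note that this only detects whether something importable exists under that name; it says nothing about whether the installed version satisfies the specifier.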

 

---

Ok. I have a lot of catching up to do on this front, but I'm still interested in helping to make this work. While I am digesting everything above, I thought it might prove helpful to share this Gist, which I found when perusing the Pythonista forums. This is a single Python script that mimics the functionality of pip. I think it might provide a good starting point for thinking about integrating similar functionality in bundler.

 

This is the original script. And this is a working 2.0 version. Finally, this is a fork that adds auto-unpacking of packages.

---

It doesn't handle dependencies.

That's kind of a big thing: Python libraries very often build on others and are made with the assumption that pip (or equivalent) will be used to install it and any dependencies.

If you want to throw dependency support in the bin, then that gist seems a good place to start. But that's a whole other can of worms to be opening.

It may well work—workflows generally don't tend to rely on libraries with a lot of dependencies—but you'll probably have a lot of support requests to field from people who don't understand why it doesn't work like they expect (Python libraries often aren't very upfront about their dependencies, as it's assumed pip will be taking care of all that), and you're basically offloading the parsing of setup.py or requirements.txt and dependency handling to the user (and both those files will have been hidden away somewhere deep in the bundler's data directory or deleted upon installation).

---

So I'm back considering this issue. I agree with deanishe on the organization of the python sub-directory within the bundler's data directory: it should just have all the packages at one level of nesting in that directory (mirroring the structure of site-packages directories for Python instances):

> Personally, I'd just skip the whole versioned-subdirectories business, and go with a thin wrapper around pip. Python workflow authors can create their own requirements.txt and pass that to bundler, which uses pip to install everything in a workflow-specific directory and then adds the directory to sys.path.

I've also found that it is fairly simple to use pip programmatically. This avoids having to download pip; instead, the bundler can come bundled with pip inside it. By this I mean, we can have the bundler install pip from GitHub on first run and place it in the python sub-directory, then the bundler script imports that pip package and runs it programmatically. This allows us to leverage the power and stability of pip without "installing" it in the normal sense.

To go forward, I think it would be helpful to write up exactly what we want the bundler to do, how we structure the API, and what is expected to happen at each point (e.g. first run of bundler, first import of new package, second et al. imports of old package, etc). Clearly defining what we want it to look like and function will help with writing the code. In general, I like deanishe's suggestions, but let's expand and be explicit. I'm willing to help with this and even take point on writing the code, but I think 3 heads are better than one at thinking through the setup and execution.

So, thoughts?

---

I've had a wee think about this myself. I'm taking the usage of pip and a single, per-workflow directory in which all dependencies can be installed with pip install --target=/path/to/dir as a given. The bundler would add that directory to sys.path.

I still see a few potential problems.

Obviously, you can't be calling pip every time the workflow is run: far too slow. So you need to do something else.

My current preference would be to require the author to create a requirements.txt file. bundler can hash this and store the hash (see below). If the hash is different to the stored one (or there is no stored one), it calls pip on requirements.txt. The workflow code might look something like this:
 

import bundler; bundler.init()

import this
import antigravity
…

(BTW, you should try both those imports if you haven't already.)

bundler.init() would automatically look for a requirements.txt file. (You could possibly skip the init() call and run the code at the module level, but that's frowned upon.) Alternately, a string argument could be passed that can be passed straight to pip, e.g. bundler.init('requests>=1 pytz') (which I don't think is always a wise option—see below).

It would throw up a dialog via AppleScript to ask for the user's permission/inform them what's going on.

If pip succeeds, we store the hash of requirements.txt (in the workflow's designated bundler library directory) so we can not only tell that the packages have been installed, but also if requirements.txt has been altered, i.e. the workflow has been updated.
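The hash check described above might be sketched as follows. The file name .requirements-hash and the function names are hypothetical, and MD5 is used only because it was mentioned earlier in the thread as one option.

```python
import hashlib
import os

HASH_FILE = ".requirements-hash"  # hypothetical name for the stored hash


def file_hash(path):
    """Return the MD5 hex digest of a file's contents."""
    with open(path, "rb") as handle:
        return hashlib.md5(handle.read()).hexdigest()


def needs_install(requirements_path, lib_dir):
    """True if no hash is stored or requirements.txt has changed."""
    stored = os.path.join(lib_dir, HASH_FILE)
    if not os.path.isfile(stored):
        return True
    with open(stored) as handle:
        return handle.read().strip() != file_hash(requirements_path)


def record_install(requirements_path, lib_dir):
    """Store the hash; call this only after pip has succeeded."""
    with open(os.path.join(lib_dir, HASH_FILE), "w") as handle:
        handle.write(file_hash(requirements_path))
```

On a normal run this costs one small file read and one hash, so pip is only started when something has actually changed.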

I don't particularly like the alternative of passing a string to bundler.init(), as that would mean duplicating the call in every script that's an entry point to the workflow. Duplication is bad: it leads to errors. Still, we could offer it as an option (with the necessary caveat utilitor) for simpler workflows.

If requirements.txt has changed, we deep-six the existing library dir and run pip again, notifying the user etc. Running pip in update mode is possible, but the directory might end up accumulating a lot of cruft. Best to keep everything clean.

I see one final potential problem compared to authors' bundling packages with workflows: Some packages include C/C++ code, which needs to be compiled, and thus can't be installed if Xcode/the Xcode command-line tools aren't installed.

Asking users to install either of those is a bit much, imo, so we'd have to wash our hands of such packages and tell developers they need to include 'em directly in their workflows if they really, truly have to use them, 'cos we ain't gonna field support requests for that kind of crap.

I've seen a couple of workflows trying to install/bundle lxml, and that's a huge can of worms veiled in tears.

I think this model could allow us to create a fairly streamlined and sane Python bundler.

Whaddya reckon?

---

Addendum: bundler would treat pip as a utility to be installed in the same way as CocoaDialog etc. Including it with the bundler library would largely defeat the purpose of the whole exercise (get-pip.py is ~1.5MB).

---

I agree wrt bundler installing pip as a utility. I already have a script that does so from the GitHub repo. It doesn't install pip into the user's environment (as my previous script did); it merely installs the bare Python package into the bundler directory.

I'm a bit confused by your comments tho. I was never supposing we would call pip every run of the workflow. The bundler is (to my mind) a more robust thing than a wrapper around pip. Instead, pip is only called on the first run of the workflow (or an individual script within the workflow) to download the necessary dependencies. After that, the bundler will simply add the workflow dependency directory to sys.path without ever touching pip.

I agree that an external requirements.txt file is probably best and simplest. One place to read (for bundler), one place to make changes (for workflow authors). We will need to establish a basic syntax. I confess that I don't exactly know what "hashing" means. I know the term from studying basic concepts of search algorithms, but what and how exactly one hashes a text doc is a bit over my head (related: what and how to "deep six"?).

> I see one final potential problem compared to authors' bundling packages with workflows: Some packages include C/C++ code, which needs to be compiled, and thus can't be installed if Xcode/the Xcode command-line tools aren't installed.

Also, for clarification, how would a package pip-installed via bundler be uncompiled C, when the same package bundled with the workflow wouldn't be? I know nothing about C in Python, so this is just for my edification.


So, the user-facing API would only include init? This will primarily take no args and look for a requirements.txt file, but can accept a string.

---

> I agree wrt bundler installing pip as a utility. I already have a script that does so from the GitHub repo. It doesn't install pip into the user's environment (as my previous script did); it merely installs the bare Python package into the bundler directory.
>
> I'm a bit confused by your comments tho. I was never supposing we would call pip every run of the workflow.

Nobody is, AFAIK, but just to be clear.

 

> The bundler is (to my mind) a more robust thing than a wrapper around pip.

What do you mean by "robust"?

 

> Instead, pip is only called on the first run of the workflow (or an individual script within the workflow) to download the necessary dependencies. After that, the bundler will simply add the workflow dependency directory to sys.path without ever touching pip.
>
> I agree that an external requirements.txt file is probably best and simplest. One place to read (for bundler), one place to make changes (for workflow authors). We will need to establish a basic syntax. I confess that I don't exactly know what "hashing" means. I know the term from studying basic concepts of search algorithms, but what and how exactly one hashes a text doc is a bit over my head (related: what and how to "deep six"?).

requirements.txt is a standard pip file. The syntax is fixed. If you develop your workflow using a virtual env, pip can generate requirements.txt for you.

"Hashing" means generating a short, effectively unique "hash" from some data. The same data always produces the same hash, but you can't recreate the data from the hash. It's a way of checking if data is correct and/or has been changed. It's trivially easy to do.

By "deep six", I just mean delete.

> Also, for clarification, how would a package pip-installed via bundler be uncompiled C, when the same package bundled with the workflow wouldn't be? I know nothing about C in Python, so this is just for my edification.

Packages that use C code include the source code and are compiled on installation (whether by pip or using setup.py). A package that's bundled with a workflow was installed (and thus compiled) by the workflow author.

 

> So, the user-facing API would only include init? This will primarily take no args and look for a requirements.txt file, but can accept a string.

That's how I think the dependency installer could work, yes. I presume there'd also be an API for using utilities.

---

The overall logic of the bundler is:

(1) it just tries to use the files, assuming that they're there;

(2) if not there, it does a fallback to downloading them.

 

The same logic would go for any language. So, to invoke the python bundler, we'd first assume that the files were there (a simple test), and, on failure, we'd install them.

 

The same goes for pip (treating pip as a utility: yes): we assume the bundler has already installed it, and, failing to find it, the bundler installs it itself. This check could be a double fallback in that it does a check for pip system-wide (which pip), and, receiving no output (or an exit code other than 0), it then tries to invoke the bundler pip, and, failing that, it downloads the bundler pip and tries again.
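The double fallback for pip could be sketched like this. The function name, the return values and the bundler_pip path are hypothetical. An earlier post in this thread argues against trusting a system-wide pip at all, in which case the first branch would simply be dropped.

```python
import os
import shutil


def locate_pip(bundler_pip, search_path=None):
    """Return ('system', path), ('bundler', path) or ('download', None).

    `bundler_pip` is a hypothetical path to the bundler's own pip;
    `search_path` overrides PATH for the system-wide lookup.
    """
    system_pip = shutil.which("pip", path=search_path)
    if system_pip:
        return "system", system_pip
    if os.path.exists(bundler_pip):
        return "bundler", bundler_pip
    # Nothing found: the caller downloads the bundler pip and retries.
    return "download", None
```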

 

Hence, the approach with a requirements.txt could work very well, but the hash for requirements would also have to somehow include the directory in case the user screwed with the files that were there. (re: smarg — there are functions that create the hash for you; an md5 hash, for example, looks like: 0fbe71c455f5d34c7193797563b63b3f).
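One way to fold the directory into the hash, as a sketch only: hash requirements.txt together with the relative paths of the installed files. The function name is hypothetical, and the result would have to be stored outside lib_dir, since a file written inside it would change the fingerprint.

```python
import hashlib
import os


def install_fingerprint(requirements_path, lib_dir):
    """Hash requirements.txt plus the names of the installed files."""
    digest = hashlib.md5()
    with open(requirements_path, "rb") as handle:
        digest.update(handle.read())
    for root, dirs, files in os.walk(lib_dir):
        dirs.sort()  # make the walk order deterministic
        for name in sorted(files):
            relative = os.path.relpath(os.path.join(root, name), lib_dir)
            digest.update(relative.encode("utf-8"))
    return digest.hexdigest()
```

This notices added, removed or renamed files but not edits to file contents; hashing the contents as well would be more thorough and slower.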

 

We can model the API for using utilities off of the bash and php scripts that are there now. Basically, the logic is the same: use Python's functions to check if the utility file exists (as defined in the utility's JSON metafile); if so, grab the invocation information (usually stored in the directory's 'invoke') to get the base call, and then do something like:

from subprocess import call
call(["/full/path/to/utility", "args for utility"])
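Fleshing that out a little, a utility wrapper might read the invocation information and build the call as below. This is a sketch under a stated assumption: that the 'invoke' file holds the executable's path relative to the utility's directory on a single line. The function names are hypothetical.

```python
import os
import subprocess


def utility_command(utility_dir, args):
    """Build the argv list for a utility installed by the bundler.

    Assumes the 'invoke' file contains the executable's path,
    relative to `utility_dir`, on one line.
    """
    with open(os.path.join(utility_dir, "invoke")) as handle:
        relative = handle.read().strip()
    return [os.path.join(utility_dir, relative)] + list(args)


def run_utility(utility_dir, args):
    """Run the utility and return its exit code."""
    return subprocess.call(utility_command(utility_dir, args))
```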

The actual file that should be included with the workflows should be as simple and lightweight as possible. The bash and php versions each have two functions: the first loads the asset, and the second installs the bundler. These are simpler than they sound, because the install function just downloads the install script from GH and executes it, and the load function just calls the bundler's load function in a file in the bundler directory. The reason to abstract it this way is that we can update the way things work in the background without breaking people's workflows. That's the logic behind the major/minor updates: minor updates (checked on calls, randomly or once a week or so) update the bundler files in the bundler directory. Major updates require a new file to be included in the workflow and, hence, a completely new bundler folder so that things don't clash.

 

I like the ideas here.


The overall logic of the bundler is:

(1) it just tries to use the files, assuming that they're there;

(2) if not there, it does a fallback to downloading them.

 

The same logic would go for any language. So, to invoke the python bundler, we'd first assume that the files were there (a simple test), and, on failure, we'd install them.

I'm not entirely sure what you're referring to there. That would work for utilities, but is essentially impossible for importable libraries on account of the way Python imports/package installations work.

WRT utilities, would it be better to re-implement it in Python or should there just be a simple wrapper around the bash version?


Ok. Soaking this all in, let me take a stab at outlining full functionality.

alfred_bundler.py (or just bundler.py?) is a single script that the user bundles with their workflow and calls in their other scripts.

The only user-facing API method is bundler.load(), which takes no necessary args but a few possible args:

  • Asset(s):
    • name of module/package as string
    • list of names of modules/packages
    • path to requirements.txt as string
  • Version:
    • if simple name, then simple string version
    • if list of names, then sub-list with [name, version]
       

 

So, some examples:

import bundler

# Simplest possible call
bundler.load()

# Simple name
bundler.load('xmltodict')

# Simple list of names
bundler.load(['xmltodict', 'requests', 'feedparser'])

# Simple path to requirements.txt
bundler.load('path/to/requirements.txt')

# Single name with version
bundler.load('xmltodict', '1.0')

# List of names with versions
bundler.load([['xmltodict', '1.0'], ['requests', '>=2.0'], ['feedparser', None]])
# None means version doesn't matter for that item

I think all the explicit examples speak for themselves. The only one to expand on is the simplest one. If you don't pass any args, then bundler looks for requirements.txt in the workflow directory.
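To make the argument handling concrete, here is a sketch of how those call shapes might be reduced to one internal form. Everything in it is an assumption for discussion: the function names, telling a requirements file from a package name by its ".txt" ending, and reading a bare version such as '1.0' as an exact pin.

```python
def _specifier(version):
    """Turn a version argument into a pip-style suffix."""
    if not version:
        return ""  # None means the version doesn't matter
    if version[0].isdigit():
        return "==" + version  # bare '1.0' is treated as an exact pin
    return version  # already has an operator, e.g. '>=2.0'


def normalize_assets(assets=None, version=None):
    """Return ('file', path) or ('packages', [requirement strings])."""
    if assets is None:
        return ("file", "requirements.txt")
    if isinstance(assets, str):
        if assets.endswith(".txt"):
            return ("file", assets)
        assets = [[assets, version]]
    packages = []
    for item in assets:
        # A list item is either a plain name or a [name, version] pair.
        name, wanted = (item, None) if isinstance(item, str) else item
        packages.append(name + _specifier(wanted))
    return ("packages", packages)
```

With this shape, load() itself only has two paths to handle (read a file, or install a list), whichever way it was called.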

The requirements.txt file follows the simple syntax used by pip. Each line is a dependency:

MyApp
Framework==0.9.4
Library>=0.2

Note that each line needs to be exactly what you would type if you were installing this item via pip on the command line. (Correct me if I'm wrong here Dean...) This means that you (the workflow author) need to know the name under which the item is installed via pip (Dean mentioned above the differences between pip install names and import names). In essence, take whatever comes after $ pip install.

This is the recommended use-case for bundler.
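Reading that file could be as small as the following sketch. The function name is hypothetical, and skipping blank lines and "#" comment lines is my assumption about what we'd want; pip's own format allows more than this handles.

```python
def read_requirements(path):
    """Return the non-empty, non-comment lines of a requirements file."""
    requirements = []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line and not line.startswith("#"):
                requirements.append(line)
    return requirements
```

Each returned string is left untouched, so it can be handed to pip exactly as the workflow author wrote it.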

 


 

Now, under the hood, what happens?

bundler will attempt to import pip (this effectively means that the pip stuff happens on import, not on the load() call). If it fails, it will download pip and then import.

Assuming there are no args, bundler.load() will look for and read the requirements.txt file in the workflow directory (i.e. bundler's own directory). Next, load() will check to see if the storage directory tree has been created. This directory tree would look like:

~/Library/Application Support/Alfred 2/Workflow Data/alfred.bundler-aries/assets/python/
  pip
  [workflow bundle id]
    [dependency]
    [package]
    [library]
    [etc]
  [other wf id]
  [etc]

If this workflow's sub-directory in the python assets directory doesn't exist, it is created.

For each line of the text file, load() will first look in this workflow's bundler storage directory for that package/module. This can be (will need to be?) a bit fuzzy. For example, if you were installing a package, there would be a folder named after that package; but, if it were a simple single-script module, there would be module.py in the workflow's directory. Also, there may be some minor discrepancies between the name given to pip and the name of the dir/file downloaded (I haven't seen any yet in my testing, but who knows). If the package/module is found, then pass; else, use the pip module (which is now in the python assets directory) to download that item.
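A sketch of what the fuzzy lookup might be, assuming the only variations worth trying are lower-casing and swapping "-" for "_". Both function names are hypothetical, and as noted elsewhere in the thread a pip name can differ completely from the import name, so this would still miss some packages.

```python
import os
import re


def requirement_name(requirement):
    """Strip version operators, extras and markers: 'Framework==0.9.4' -> 'Framework'."""
    return re.split(r"[<>=!~\[;\s]", requirement.strip(), maxsplit=1)[0]


def is_installed(storage_dir, requirement):
    """True if a package folder or single-file module matching the name exists."""
    name = requirement_name(requirement)
    if not name:
        return False
    candidates = {
        name,
        name.lower(),
        name.replace("-", "_"),
        name.lower().replace("-", "_"),
    }
    for candidate in candidates:
        if os.path.isdir(os.path.join(storage_dir, candidate)):
            return True  # a package: folder named after it
        if os.path.isfile(os.path.join(storage_dir, candidate + ".py")):
            return True  # a single-script module: name.py
    return False
```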

If load() needs to construct the proper pip call, it will essentially create this pip call:

$ pip install --target "$HOME/Library/Application Support/Alfred 2/Workflow Data/alfred.bundler-aries/assets/python/[workflow id]" package

NOTE: package will be exactly what is printed on that line in the requirements.txt file.
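From Python, the same call could be built as an argument list rather than a shell string, which sidesteps quoting the spaces in "Application Support". This is a sketch: the function name and parameters are hypothetical, and only `pip install --target` itself comes from the call above.

```python
import os


def build_pip_command(python_assets_dir, bundle_id, requirement, pip="pip"):
    """Build the argument list for installing one requirement for one workflow."""
    target = os.path.join(python_assets_dir, bundle_id)
    # The requirement is passed through exactly as written in requirements.txt.
    return [pip, "install", "--target", target, requirement]
```

The returned list can be handed straight to subprocess.call(), the same way the utility example earlier in the thread does.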

Once (if) everything is downloaded to the workflow's bundler storage directory, that path will be added to sys.path and load() will return True.

This means that the code at the top of the workflow's Python script will look something like:
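A minimal sketch of that top-of-script code, with a stand-in load() defined inline so that it runs on its own. In a real workflow the function would come from the bundler module and would also do the download step; the bundle ID in the path is a placeholder.

```python
import os
import sys


def load(storage_dir):
    """Stand-in for bundler.load(): make the workflow's packages importable."""
    if not os.path.isdir(storage_dir):
        return False  # the real version would download the packages here
    if storage_dir not in sys.path:
        sys.path.insert(0, storage_dir)
    return True


# Top of the workflow's script (placeholder bundle ID).
STORAGE = os.path.expanduser(
    "~/Library/Application Support/Alfred 2/Workflow Data/"
    "alfred.bundler-aries/assets/python/com.example.workflow"
)
ready = load(STORAGE)
# Third-party imports (xmltodict, requests, ...) would follow once ready is True.
```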

 


 

Now, this is as far as my brain has gotten. Obviously, I have said nothing about hashing the requirements.txt file, or reading that hash to check for updates. As it stands (in my vision), this seems superfluous. load() will read the requirements file each time; if it's changed, then load() will function differently on that call.

Also, I think we could make a mirror method init() for bundler that only functions without any args and so reads the requirements.txt file. I think that load() without any args looks a bit odd, but init('xmltodict') also looks odd, so I say have them both. Maybe even make load() require args.

Also, I have a pretty good bit of the code for this version of the bundler written. You can look at it (it's not really organized, nor is any individual part done) here.

Hopefully this is all explicit enough to set the foundation for very clear further discussion. If you have suggestions or disagreements, let me know.

stephen

Edited by smarg19


I'm not entirely sure what you're referring to there. That would work for utilities, but is essentially impossible for importable libraries on account of the way Python imports/package installations work.

WRT utilities, would it be better to re-implement it in Python or should there just be a simple wrapper around the bash version?

 

I think I asked you this before, but how much performance loss would try/fail statements be for the imports? I've seen them enough in setup.py files, so I know doing that is possible. Otherwise, we might just check for the existence of a folder for the package inside the custom directory that the bundler has set up for the workflow.
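For reference, the try/fail version is only a few lines. This is a sketch with a hypothetical helper name; it says nothing about the speed question, which would need measuring (with timeit, for example) on a real workflow.

```python
import importlib


def try_import(name):
    """Return the imported module, or None if it is not available."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None
```

A None result would be the signal to fall back to installing the package and trying once more.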

 

Utilities: I'm not sure what would be faster, honestly. My inclination is that wrapping a few commands in Python would be faster, but I'm not sure. Using the generic bash version inside a wrapper would be the easiest way to go, but, again, I'm not sure it would be the best. The only thing that the script should return is, basically, the path to the utility (so, for terminal notifier, it leads to ~/Library/Application Support/Alfred 2/Workflow Data/alfred.bundler-aries/assets/utility/terminal-notifier/default/terminal-notifier.app/MacOS/terminal-notifier).

 

Ultimately, the bash script already handles it, so it might just be easiest to pass it to that script with the arguments in the proper order.
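As a sketch of how thin that wrapper could be: run the bash script and hand back whatever path it prints. The function names are hypothetical, and the argument order (name, version, type) is a guess for illustration, not the bash bundler's documented interface.

```python
import subprocess


def build_bundler_command(bundler_script, name, version="default", asset_type="utility"):
    """Argument list for the bash bundler; the order here is an assumption."""
    return ["bash", bundler_script, name, version, asset_type]


def utility_path(bundler_script, name, version="default"):
    """Run the bash bundler and return the utility path it prints."""
    output = subprocess.check_output(build_bundler_command(bundler_script, name, version))
    return output.decode("utf-8").strip()
```

The returned path can then go into subprocess.call() together with the utility's own arguments.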


WRT utilities, would it be better to re-implement it in Python or should there just be a simple wrapper around the bash version?

I vote yes to the Python wrapper around the bash. And being a bash newbie, I vote nose-goes (hand-to-nose immediately) :)


We could make the syntax a bit more restrictive and require the requirements.txt file. So, bundler.load() looks for a requirements.txt file in its own directory; if it isn't found and an info.plist isn't found in its directory then it looks for ../requirements.txt. After that, it just freaks out and throws an error. If you already have written a lot of the code to handle all of those arguments, however, let's go with what you laid out.
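That lookup order could be sketched like this. The function name is hypothetical, and raising IOError is just one way to "freak out".

```python
import os


def find_requirements(bundler_dir):
    """Locate requirements.txt next to the bundler file, or one level up."""
    own = os.path.join(bundler_dir, "requirements.txt")
    if os.path.isfile(own):
        return own
    # No info.plist here means this is not the workflow root, so try the parent.
    if not os.path.isfile(os.path.join(bundler_dir, "info.plist")):
        parent = os.path.normpath(
            os.path.join(bundler_dir, os.pardir, "requirements.txt")
        )
        if os.path.isfile(parent):
            return parent
    raise IOError("requirements.txt not found for " + bundler_dir)
```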

 

If we're treating pip as a utility, then we should put it in the assets/utility folder. It just makes sense to treat it as a duck if we call it a duck.

 

If you're worried about the naming of directories, then we have a couple of options:

(1) rename them all to something regular for the bundler to understand easily (not sure how well this plays with Python);

(2) generate a smaller 'init.py'-like file that goes in the assets/python/workflowbundleid directory. Then, we need only have the bundler read that file and execute that code/load those modules, which would save a lot of time. The bundler could generate this file when it sets up the directory.


The package would be called bundler, and the PyPI name, if there is one, would be alfred-bundler, i.e. pip install alfred-bundler but import bundler.

@Stephen: you're overthinking things. Our options are very limited, as workflows need to execute fast. When a script calls bundler.init() (or whatever we decide to call the entry point), it can parse info.plist for the bundle ID, check for a requirements.txt, hash it, and compare the hash to the one stored in the workflow's bundler directory (probably, we'd first compare the modtime of the files, and only hash it if that had changed). If there's no cached hash, or it has changed, pip gets (installed and) called with requirements.txt or the string passed to init().
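Roughly, as a sketch: the function names and the JSON cache file are made up for illustration, and I'm assuming the bundle ID sits under the "bundleid" key of info.plist.

```python
import hashlib
import json
import os
import plistlib


def read_bundle_id(info_plist_path):
    """Read the workflow's bundle ID (assumes the key is 'bundleid')."""
    with open(info_plist_path, "rb") as handle:
        return plistlib.load(handle)["bundleid"]


def _md5(path):
    with open(path, "rb") as handle:
        return hashlib.md5(handle.read()).hexdigest()


def needs_install(requirements_path, cache_path):
    """Cheap modtime check first; hash only if the modtime has changed."""
    if not os.path.isfile(cache_path):
        return True  # no cached hash yet
    with open(cache_path) as handle:
        cached = json.load(handle)
    if cached.get("mtime") == os.path.getmtime(requirements_path):
        return False
    return cached.get("md5") != _md5(requirements_path)


def record_install(requirements_path, cache_path):
    """Store modtime and hash; call this only after pip has succeeded."""
    state = {
        "mtime": os.path.getmtime(requirements_path),
        "md5": _md5(requirements_path),
    }
    with open(cache_path, "w") as handle:
        json.dump(state, handle)
```

Writing the cache in a separate step means a failed pip run is retried on the next call instead of being recorded as done.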

We can't try to intercept import calls or implement an alternative interface because it's very complicated with Python, and as noted, there is no guaranteed correspondence between the name of the library you import and the name of the package that provides it.

Installations would be to ~/Library/Application Support/Alfred 2/Workflow Data/alfred-bundler/bundle-id-of-package. (It's so simple, there's little reason to create a new install dir if/when the bundler is updated.)

It definitely makes sense to look for/expect requirements.txt to be in the same directory as info.plist, i.e. the workflow root.

Edited by deanishe
