How do I speed up or cache the results of this workflow?



I have a script filter that looks like this:

#!/bin/bash
# Find every directory under ~/Documents that contains a .git folder
# and matches the query, and emit Alfred script-filter XML for each.
echo "<?xml version=\"1.0\"?>"
echo "<items>"
find ~/Documents -name '.git' -path '*{query}*' -exec dirname {} \; | while read -r file
do
    base=$(basename "$file")
    echo "<item type=\"file\" arg=\"${file}\" uuid=\"${file}\" autocomplete=\"${base}\">"
    echo "<title>${base}</title>"
    echo "<subtitle>${file}</subtitle>"
    echo "<icon type=\"fileicon\">${file}</icon>"
    echo "</item>"
done
echo "</items>"

The purpose of this script is to search all of my projects so that I can open them in Sublime. It looks for .git folders to accomplish this. This works great, but it can be a bit slow because it fires on every keystroke. The results either show up immediately or take 1-3 seconds to appear, most likely because I'm calling find over and over. The results don't change very often, so I figure that caching them would help a great deal.

 

Are there any existing solutions to this, or do I have to roll my own? If the latter, is there any way to create a filter that makes an HTTP request? I was thinking I could write a daemon in Go that keeps a cache of the file listing and periodically updates itself.
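Something like this is what I have in mind on the Alfred side (hypothetical; the port and endpoint are made up, and the daemon would return one matching path per line):

#!/bin/bash
# Hypothetical: ask a local daemon for the cached, pre-filtered list
# of project paths. The port and /projects endpoint are made up.
echo "<?xml version=\"1.0\"?>"
echo "<items>"
curl -s "http://localhost:8080/projects?q={query}" | while read -r file
do
    base=$(basename "$file")
    echo "<item type=\"file\" arg=\"${file}\" uuid=\"${file}\" autocomplete=\"${base}\">"
    echo "<title>${base}</title>"
    echo "<subtitle>${file}</subtitle>"
    echo "<icon type=\"fileicon\">${file}</icon>"
    echo "</item>"
done
echo "</items>"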

 


If they don't change very often, you may be better served by making a workflow that has those paths saved and just updating it when necessary. That saves you from generating the XML from a live search every time: just save each project's name, description, and path in an array and loop over that, as in the sketch below.
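For instance (a rough sketch; the project entries are made-up placeholders):

#!/bin/bash
# Rough sketch: serve a hand-maintained list instead of running find.
# Entries are "name|description|path"; these are placeholders.
projects=(
    "my-app|Main application|$HOME/Documents/Projects/my-app"
    "dotfiles|Configuration files|$HOME/Documents/Projects/dotfiles"
)

echo "<?xml version=\"1.0\"?>"
echo "<items>"
for entry in "${projects[@]}"
do
    IFS='|' read -r name desc path <<< "$entry"
    echo "<item type=\"file\" arg=\"${path}\" uuid=\"${path}\" autocomplete=\"${name}\">"
    echo "<title>${name}</title>"
    echo "<subtitle>${desc}</subtitle>"
    echo "<icon type=\"fileicon\">${path}</icon>"
    echo "</item>"
done
echo "</items>"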


I realized that I had a lot of files in my ~/Documents directory, especially since installing Parallels. So I made sure all my projects were in ~/Documents/Projects, then limited find's depth and restricted it to directories. Much, much faster!
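Something along these lines (the depth value here is illustrative):

# Only descend a few levels and only match directories named .git.
find ~/Documents/Projects -maxdepth 3 -type d -name '.git' -path '*{query}*' -exec dirname {} \;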

Edited by Luke

Yeah, I cache the results of calls to find (or mdfind; the two workflows work in different ways). They're both too slow to run directly if you want acceptable workflow performance. Also, what Luke just said: find recurses into every subdirectory by default, so limiting the depth it searches makes a big difference.
 
The Git Repos workflow is configurable, so you can add different directories to search and specify the depth find should search. It caches the list of repos for 3 hours by default, but you can force an update at any time. The update runs in the background, so the workflow remains super-responsive. Also, the background update script runs 4 threads simultaneously to maximise update speed.
 
In my experience, that works just fine. I rarely have to force a manual update, as I almost never close a newly-created repo within 3 hours.
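The basic pattern is easy to sketch in shell if you want to roll your own (this is not the actual workflow code; the cache path and lifetime are illustrative, and the XML generation is omitted):

#!/bin/bash
# Sketch of cache-plus-background-update; not the actual workflow code.
cache=~/.cache/repo-list.txt
mkdir -p "$(dirname "$cache")"

# If the cache is missing or older than 3 hours (180 minutes),
# rebuild it in the background so this invocation stays fast.
if [ ! -f "$cache" ] || [ -n "$(find "$cache" -mmin +180)" ]
then
    ( find ~/Documents/Projects -maxdepth 3 -type d -name '.git' \
          -exec dirname {} \; > "$cache.tmp" && mv "$cache.tmp" "$cache" ) &
fi

# Serve whatever is cached right now, filtered by the query.
[ -f "$cache" ] && grep -i -- "{query}" "$cache"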

Edited by deanishe

How about locate? As long as you are not dealing with newly created files, locate is more convenient once you've done a little preparation work...

 

Here is how I search files in my directory.

 

In my crontab, the following line rebuilds a locate database of my personal files at five past every hour.

5   *   *   *   *   SEARCHPATHS=$HOME FCODES=~/.locate.db updatedb

In my ~/.zshrc, the following defines a function so that a search like "locate_private A B" returns anything matching "*A*B*". So if I type "l apple pdf" on the command line, anything with apple and pdf somewhere in its full path is returned immediately.

# Join the arguments with wildcards, so "A B" becomes the glob "*A*B*".
function locate_private() {
    pattern="*${*// /*}*"
    # Search the personal database, ignoring case.
    locate -i -d ~/.locate.db "$pattern"
}
alias l=' locate_private'

If your workflow is just for yourself, I think adding one extra line to your crontab is not too bad.
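Then the find call in the script filter above could become a database lookup (an untested sketch):

# Untested sketch: look up repo paths in the pre-built database instead
# of running find live, then strip the trailing /.git component.
locate -i -d ~/.locate.db "*{query}*/.git" | sed 's|/\.git$||'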


I've tried locate in workflows like this and it's a mixed bag.

 

The update is slow and infrequent. New items don't show up in your results for quite a while. Even if you trigger an update manually, it can take a couple of minutes for it to complete.

 

Also, depending on what you're searching for, locate can return a huge number of unwanted results that need filtering, such as files from inside app bundles or ~/Library.


I agree the update is infrequent and slow, but if the directory structure isn't expected to change too often, I don't think that's a big issue. As for the second problem, you can always add "document" to your search pattern to limit the results to paths that contain "document".
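For example, with the function above:

# Limit matches to paths containing "documents" (locate -i ignores case):
l documents apple pdf    # searches with the pattern "*documents*apple*pdf*"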


Finding git repos is one of those situations where locate fails fairly hard, unfortunately.

locate finds all the repos everywhere, including submodules and anything you've installed using git, which, if you're anything like me, will more than double the number of results you get.

locate was the first thing I tried when I wrote a workflow to find and work with git repos, but in the end it was faster, and less insane to configure, with find than with locate.

 

I couldn't figure out a good way to configure a locate-based solution to, say, ignore all of my vim plugins (git repos) except one.

