
Exclude folders and filetypes both globally and for file filters


deanishe


Alfred should have the ability to exclude specific folders and filetypes from global and file-filter results, much like the file-filter include functionality, but in reverse.

 

Alfred uses Apple's search API, which is include-only. This works well most of the time, but as a result, the only way to exclude files/folders from Alfred's results is to use Spotlight's privacy settings to exclude them from OS X's index, which means they're unavailable in Spotlight and any other app that also relies on the search system (HoudahSpot etc.). In many cases, this is not a viable option.

 

There are many situations where Alfred's whitelist approach is a PITA or useless compared to a blacklist approach.

 

For example, I have a lot of source code, which I regularly search using Spotlight, but don't want showing up in Alfred. It would be pretty simple to fix that with a global blacklist, but it's literally impossible without.

 

Perhaps you want to include a directory in Alfred's global search, but exclude it from a specific file filter.

 

Perhaps you want to use a file filter to search all filetypes or subdirectories bar one or two. But without blacklisting, you're forced to explicitly include every folder/filetype but the ones you want to ignore. If that isn't enough work, you need to update your file filter every time you add a new folder or new type of file.

 

The problem is compounded by Alfred only displaying a limited number of results: you don't even get the chance to train it to associate a certain file with a keyword because the file never makes it into the list of search results.

Link to comment
Share on other sites

Completely agree. There are many file/folder patterns that I *never* want in my results, yet they're often there. `node_modules` is a big one, and I'm sure any developer who uses a package manager deals with the same struggle: hundreds of results, none of which they want.

 

I would really appreciate a good way to deal with this :) 

Link to comment
Share on other sites

  • 3 weeks later...

I totally second this request.

 

I'd find it especially useful in the "Default Results" search: I'm always annoyed by the contents of the Microsoft Office folder showing up, even though I never need them (Uninstall, Office Reminders, Solver, etc. and the worst: Microsoft Clip Gallery, which always shows up when using the clipboard function...). It'd be great to be able to exclude this folder (it could even be done by default, since I suppose nobody usually needs it in the Default Results).

Link to comment
Share on other sites

I agree with this, even though it's not a huge problem for me.  It would still be worth having.  (Personally, I get around it by limiting the default search scope to my usual documents folders, and then I have a separate file filter to search the entire system and everywhere on it.)

Link to comment
Share on other sites

  • 1 month later...

Yup, I think this is a must-have.

 

I've got loads of .ico files showing up in the results, which I don't really need.

 

The problem with doing something like this is that, since Alfred uses the internal metadata server for queries, there is no "exclude this location" option for that except for including it in the privacy section of Spotlight preferences, which is currently the recommended method of excluding locations. In order to add this functionality, Alfred would have to query and then go back and parse every result to determine if it matched the user excludes and then remove them. While this may sound like it's not a big deal and seem like something that would be easily done, the downside is that it would affect performance, since Alfred would have to go back and touch every result to determine if it matched the excludes.

 

Alternatively, I would suggest looking at your workflow and see how you can create filters to narrow down things. A lot of people think it's better to just add lower level folders to the scope so that Alfred searches everything but it affects performance and just clutters your results to the point that you have so much in your search results that you can't find what you were actually looking for.

 

I personally don't want EVERY type of file/folder in my search results because I typically know what I'm looking for. I know if I'm looking for a picture, or a folder, or if I want to look in a specific folder that I work in frequently. I use workflows and filters to drastically reduce results to match exactly what I want.  For instance, my defaults are to only show contacts, preferences and applications in results. I have filters for searching for ONLY folders, bookmarks, mail, chat, etc because I know when I'm looking for those things. 

Link to comment
Share on other sites

Hi David, thanks for your reply.

 

The problem with doing something like this is that, since Alfred uses the internal metadata server for queries, there is no "exclude this location" option for that except for including it in the privacy section of Spotlight preferences which is currently the recommended method of excluding locations.

 

OK, that's indeed a problem.

But in fact, even the "exclude this location" (Spotlight preferences) doesn't work in Alfred! I excluded the "Office" folder inside the Apps folder (to avoid all the Office plugins showing up in apps, for ex. "Microsoft Clip Manager" showing when I want the clipboard function), but it doesn't work! Those apps still appear in my default results. (They don't appear in Spotlight searches, nor in Alfred when I do a file search).

 

 

I personally don't want EVERY type of file/folder in my search results because I typically know what I'm looking for. I know if I'm looking for a picture, or a folder, or if I want to look in a specific folder that I work in frequently. I use workflows and filters to drastically reduce results to match exactly what I want.  For instance, my defaults are to only show contacts, preferences and applications in results. I have filters for searching for ONLY folders, bookmarks, mail, chat, etc because I know when I'm looking for those things. 

 

 

I agree with you, and that's the same config as mine. But even then, yes, there's still some noise in the results.

 

What I do find very interesting is the "find specific file type" you're referring to: indeed, when I'm looking for a file, I know what kind of file it is. It's not the same to look for a .DOC, .PDF or an image... I haven't seen specific workflows to do that. And in fact, since it's so logical, why not implement it directly in the file search function in Alfred?

Link to comment
Share on other sites

Alternatively, I would suggest looking at your workflow and see how you can create filters to narrow down things. A lot of people think it's better to just add lower level folders to the scope so that Alfred searches everything but it affects performance and just clutters your results to the point that you have so much in your search results that you can't find what you were actually looking for.

 

I personally don't want EVERY type of file/folder in my search results because I typically know what I'm looking for. I know if I'm looking for a picture, or a folder, or if I want to look in a specific folder that I work in frequently. I use workflows and filters to drastically reduce results to match exactly what I want.  For instance, my defaults are to only show contacts, preferences and applications in results. I have filters for searching for ONLY folders, bookmarks, mail, chat, etc because I know when I'm looking for those things. 

 

The problem is that this isn't necessarily related to just workflows; the search results in general can be overly broad. If I type 'fav' then I'd rather not have every fav.ico file show up in the search. I'd like some way to exclude '*.ico' or 'fav.*' from the default results.

 

I think this is one of the tradeoffs with using Spotlight. Most of the work (updating, indexing etc.) is handled for you, but the cost is flexibility. I know applications similar to Alfred don't share its shortcomings (difficult to control inclusions/exclusions; doesn't deal with networked drives) because they build and maintain their own indexes. On the other hand, these other apps don't look anywhere near as good, and can be a pig to set up  :mellow:

Link to comment
Share on other sites

The problem with doing something like this is that, since Alfred uses the internal metadata server for queries, there is no "exclude this location" option for that except for including it in the privacy section of Spotlight preferences, which is currently the recommended method of excluding locations. In order to add this functionality, Alfred would have to query and then go back and parse every result to determine if it matched the user excludes and then remove them. While this may sound like it's not a big deal and seem like something that would be easily done, the downside is that it would affect performance, since Alfred would have to go back and touch every result to determine if it matched the excludes.

 

Checking every result to see if its path starts with the path of an excluded folder or ends with an excluded extension is a very trivial operation.
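
To illustrate the kind of check meant here, a minimal Python sketch follows. It is not the code that was benchmarked; `make_exclude_filter` and the example paths are made up for illustration, and it assumes results arrive as plain absolute path strings.

```python
import os


def make_exclude_filter(excluded_dirs, excluded_exts):
    """Return a predicate that is True for paths that should be kept."""
    # Normalise each folder to end with a separator, so that excluding
    # "/Users/me/Code" does not also exclude "/Users/me/Code-notes".
    prefixes = tuple(os.path.join(os.path.normpath(d), "") for d in excluded_dirs)
    # Compare extensions case-insensitively (".ICO" vs ".ico").
    suffixes = tuple(e.lower() for e in excluded_exts)

    def keep(path):
        if prefixes and path.startswith(prefixes):
            return False
        if suffixes and path.lower().endswith(suffixes):
            return False
        return True

    return keep
```

Usage would be something like `keep = make_exclude_filter(["/Users/me/Code"], [".ico"])` followed by `[p for p in paths if keep(p)]`. Each result costs one `startswith` and one `endswith` call against a tuple, so the work grows with the number of excludes.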

 

To add some actual numbers to the discussion, I benchmarked such a directory-blacklist filter on my ancient Mac Pro (this is running on one 2.8 GHz core):

| No. of filters | Files filtered per second |
| -------------- | ------------------------- |
|              1 |                   2611658 |
|              5 |                    949018 |
|             10 |                    535346 |
|             20 |                    283730 |
|             50 |                    118983 |
|            100 |                     61502 |

Certainly, that might perceptibly affect performance if there were dozens of excludes, so add a warning.

 

Let me choose my own performance/functionality trade-off. I'm a big boy :)

 

 

 

Alternatively, I would suggest looking at your workflow and see how you can create filters to narrow down things. A lot of people think it's better to just add lower level folders to the scope so that Alfred searches everything but it affects performance and just clutters your results to the point that you have so much in your search results that you can't find what you were actually looking for.

 

Except, as noted in the OP, a whitelist-only approach is often a poor (or no) substitute for blacklisting. Even using file filters, the same applies.

 

 

 

I personally don't want EVERY type of file/folder in my search results because I typically know what I'm looking for. I know if I'm looking for a picture, or a folder, or if I want to look in a specific folder that I work in frequently. I use workflows and filters to drastically reduce results to match exactly what I want.  For instance, my defaults are to only show contacts, preferences and applications in results. I have filters for searching for ONLY folders, bookmarks, mail, chat, etc because I know when I'm looking for those things. 

 

That doesn't help. You're talking about a different set of problems. Sure, sometimes you can get around the lack of blacklisting using whitelisting, but it's often not a satisfactory alternative.

 

What if I want to search a dozen different kinds of documents, but ignore the 13th and 14th kind? Or 20 subdirectories, but not 21 and 22?

 

That's a helluva lot of work to do with whitelists, and brittle: I'd have to remember to update my file filters every time I added a subdirectory or new kind of file.

 

Even if it runs at half the speed the current system does, I don't care. I accept the trade-off. Please let me make that choice for myself.

Link to comment
Share on other sites

To reiterate my stance on excluding folder trees - this is unlikely to happen in Alfred any time soon. I’ll outline the main reasons so you can understand the implications it would have for Alfred.

 

Fundamentally, Alfred uses OS X’s metadata index for file search instead of constantly maintaining an index of the file system himself. This has a huge number of benefits, summed up in allowing Alfred to remain extremely fast and lightweight.

 

OS X's metadata API only supports folder scope inclusions, not exclusions. The idea to bring back all files first and then subsequently filter (by just file name) is not a good solution for the following reasons:

 

1. Performance and Scalability. Alfred needs to work and be extremely fast in the worst case scenario. This means on slow Macs and with a significant number of files. Bringing back all files first and then filtering won’t always be fast, and definitely isn’t scalable. Databases, file indexes and internet search engines have very specific indexes and optimisations to ensure that the data you want is presented almost instantly without having to post-filter results - the same applies to Alfred in this case.

 

2. Memory. While Alfred has [a fair amount of] control over the amount of memory he uses, he doesn’t have control over OS X’s mds. Asking the metadata server to return hundreds of thousands of files for every key you type during a file search will certainly make mds bloat over time while it caches the queries. For Macs with low memory, the impact of this will be significant as it will slow both Alfred and your Mac to a crawl.

 

3. Accuracy. With Alfred using OS X’s Lucene-like index, you get word based, diacritic insensitive, localised, deep metadata searching. The subsequent post-filtering of results would have to be across all this metadata too, which means individually loading the deep metadata for every file you have found. The loading of this additional metadata for thousands of files alone would make the filtering unusably slow.

 

If you have a specific need for file search which doesn’t match the general need of Alfred’s user base, a script-based workflow allows for live feedback into Alfred’s results, so your best bet in this situation is to create your own file search which filters out the items you don’t want. Deanishe - if you have already done the code to benchmark this, then putting this into a live feedback workflow wouldn’t be much more effort, then you have all the flexibility you need in your file filter.
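
As a rough idea of what such a script could emit, here is a minimal Python sketch. It assumes the JSON feedback format of Alfred's Script Filter (an `items` list with `title`, `subtitle`, `arg` and `type` keys), which as far as I know only exists in newer Alfred versions; older versions expected XML instead. The function names and the `excluded_names` parameter are made up for illustration, and how the candidate paths are found is left out.

```python
import json
import os


def is_excluded(path, excluded_names):
    """True if any component of the path is an excluded name, e.g. node_modules."""
    return any(part in excluded_names for part in path.split(os.sep))


def script_filter_feedback(paths, excluded_names):
    """Build Script Filter JSON for the paths that survive the exclusion check."""
    items = []
    for path in paths:
        if is_excluded(path, excluded_names):
            continue
        items.append({
            "title": os.path.basename(path),
            "subtitle": path,
            "arg": path,      # passed on to the next workflow object
            "type": "file",   # lets Alfred treat the result as a file
        })
    return json.dumps({"items": items})
```

The script would `print(script_filter_feedback(paths, {"node_modules"}))` and Alfred would show whatever items are left.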

 

Cheers,

Andrew

Link to comment
Share on other sites

Hi Andrew,

 

Thanks for the answer. It's too bad but totally understandable.

 

Now, is it normal that the "Privacy/Exclude this location" in Spotlight Preferences (which works with Alfred file search) won't work with the default (Apps especially) search?

(I want to exclude for example MS Office subfolder from Apps searching, since a lot of useless things are there, but even excluding it from Spotlight, it still appears in my default results [with only Apps, Contacts and Preferences activated])

Link to comment
Share on other sites

You may just need to clear Alfred's app cache, type 'reload' into Alfred :)

Link to comment
Share on other sites

  • 2 years later...

Sorry to revive such an old thread.

I understand Andrew's points.

But, if it's possible to do the filtering via a script on a workflow for advanced users, why not add a more advanced filter UI in the File Filter workflow object?

Right now it's only possible to filter by file type. It would be very handy to be able to filter with names. For example to exclude "node_modules" folders.

Link to comment
Share on other sites

  • 9 months later...

I'm reviving this thread, sorry. Well, not really. I am a developer and I like being able to get to what I am developing through Alfred but it is searching into my node_modules folders and it really needs not to do that. It is giving me extraneous results when I am not even looking for files. That is also adding thousands of files to the index, if not more, that should not be there.

Edited by jbutz
Link to comment
Share on other sites

@jbutz Spotlight's metadata query doesn't allow you to exclude folders, and Alfred limits the results returned by a query to 100. Let's say the first 1000 results from a specific query were from within these folders: if Alfred returned its results and subsequently filtered them, then apart from being slow and using loads of memory, you still wouldn't see results from other folders.
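
A small Python sketch of that scenario, with made-up paths and function names; the cap of 100 comes from the paragraph above, everything else is illustrative and not how Alfred is actually implemented.

```python
from itertools import islice

LIMIT = 100  # result cap mentioned above


def cap_then_filter(results, keep, limit=LIMIT):
    """Take the first `limit` results, then drop the excluded ones."""
    return [p for p in results[:limit] if keep(p)]


def filter_then_cap(results, keep, limit=LIMIT):
    """Walk the results until `limit` wanted ones are found (or they run out)."""
    return list(islice((p for p in results if keep(p)), limit))


def not_in_node_modules(path):
    return "/node_modules/" not in path


# 1000 unwanted hits come back first, followed by the two files actually wanted.
noise = ["/proj/node_modules/pkg%d/index.js" % i for i in range(1000)]
wanted = ["/proj/src/index.js", "/proj/package.json"]
results = noise + wanted
```

With this data, `cap_then_filter(results, not_in_node_modules)` comes back empty because all 100 capped results are inside `node_modules`. `filter_then_cap` does find the two wanted files, but only after walking all 1002 results, which is the performance and memory cost described above.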

 

The best bet would be to create a workflow with a few file filters which work FOR the files you wanted to find. Even perhaps replace the open / find keywords. Like one which only finds images, text files, pages documents etc... and not the file types within the node_modules folder.

Link to comment
Share on other sites

14 minutes ago, Andrew said:

The best bet would be to create a workflow with a few file filters which work FOR the files you wanted to find

 

That isn't a workable solution, I'm afraid. This case (npm + node_modules) is exactly the kind of situation I had when I started this feature request.

 

It can't be done using the File Filter whitelisting approach.

 

It's impossible to add a project's root directory to the search scope without including node_modules (which is a subdirectory). That's a bummer right there, as a lot of important files sit alongside node_modules. And you also have to remember to update your File Filter every time you add a new subdirectory. 

 

The only things that work satisfactorily well, imo, are adding your node_modules folders to Spotlight's Privacy pane or writing your own workflow.

 

Link to comment
Share on other sites

Mostly hundreds or thousands of JavaScript files (plus markdown, JSON etc.). Which is mostly what your own code will be, too, I think (unless you're writing in TypeScript or just using node for the asset pipeline of your Go webapp).

 

The node_modules folder goes right next to your code, typically alongside package.json in the project root (which is where npm will install stuff by default). If you require a JS module by name and it isn't a built-in, the first place node looks for it is in a node_modules directory next to the file being run.

 

You can get similar situations with Go (vendored libs) and Python (virtual envs) where you have a lot of 3rd-party files and code in a subdirectory of your project. The absolute number of files isn't anywhere near as crazy as with node projects, however.

 

Edited by deanishe
Link to comment
Share on other sites

@deanishe these have content type 'com.netscape.javascript-source'.

 

This is why I try to suggest creating file filters which have the file types you are looking for such as images, plain text etc (even replacing out the default file filters with ones with multiple types for general search). Then the node_modules folder doesn't need to be excluded.

 

I've just spent the last few hours looking at the MDQuery API again, to find if there is a graceful way to exclude the e.g. node_modules folder, and there is still nothing I can do without loading tonnes of files first and subsequently filtering.

 

One possibility could be that I provide an extremely open ended "advanced" file filter object which has e.g. the following fields:

  1. A raw metadata query with {query} placeholders.
  2. A regex to subsequently filter by path, allowing path matches to be rejected.
  3. A "Stop looking when x results have been found".

It would have the caveat that with great power comes great uhhh potential slowness.
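
A rough Python sketch of how those three fields might fit together, using the `mdfind` command-line tool as a stand-in for the metadata query. This is only an illustration of the idea, not Alfred's implementation; `build_query`, `iter_mdfind` and `advanced_filter` are made-up names, and the query template in the usage note is just an example.

```python
import re
import subprocess
from itertools import islice


def build_query(template, query):
    """Field 1: substitute the user's input for the {query} placeholder.

    No escaping of quotes in the input is attempted here.
    """
    return template.replace("{query}", query)


def iter_mdfind(raw_query):
    """Yield result paths from mdfind one at a time (macOS only)."""
    proc = subprocess.Popen(["mdfind", raw_query],
                            stdout=subprocess.PIPE, text=True)
    try:
        for line in proc.stdout:
            yield line.rstrip("\n")
    finally:
        # Stop the search once the caller has stopped consuming results.
        proc.terminate()
        proc.stdout.close()
        proc.wait()


def advanced_filter(paths, reject_pattern, max_results):
    """Fields 2 and 3: reject paths matching the regex, stop after max_results."""
    reject = re.compile(reject_pattern)
    wanted = (p for p in paths if not reject.search(p))
    return list(islice(wanted, max_results))
```

On a Mac this could be called as `advanced_filter(iter_mdfind(build_query('kMDItemFSName == "*{query}*"c', "fav")), r"/node_modules/", 20)`. Because the paths are consumed lazily, the loop stops as soon as 20 acceptable results are found, but if the rejected paths come first it still has to read through all of them, which is the potential slowness mentioned above.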

 

Cheers,

Andrew

 

Link to comment
Share on other sites

On 14/03/2017 at 5:10 PM, Andrew said:

This is why I try to suggest creating file filters which have the file types you are looking for such as images, plain text etc (even replacing out the default file filters with ones with multiple types for general search). Then the node_modules folder doesn't need to be excluded.

 

Except that doesn't work at all in many cases.

 

@Andrew, you do exactly the same with Alfred workflows. info.plist has to be in the workflow's root directory, so there's literally no way to create a File Filter for the workflow that includes important files like info.plist or icon.png but excludes the folder with all your 3rd-party libs in.

 

TL;DR: Can we please stop pretending that, with sufficient workarounds, Alfred can handle "everything but X" situations? It really can't.

 

Edited by deanishe
Link to comment
Share on other sites

7 hours ago, deanishe said:

TL;DR: Can we please stop pretending that, with sufficient workarounds, Alfred can handle "everything but X" situations? It really can't.

 

I've never suggested this - I'm just trying to suggest solutions which work within the bounds and limitations of macOS and its metadata index which Alfred's file search and file filtering is built on.

 

This is also why I suggested a possible alternative workflow object.

Link to comment
Share on other sites
