Filtering JSON Output: A Very Basic Python Question

Jasondm007 · October 14, 2020

I have a very basic python question that I was hoping to get a little help with that involves filtering out list items in a script filter. At the moment, the script filter works great except that it includes a few items I'd prefer not to see in Alfred's output. Is there any easy way to remove items whose titles can be found in another list?

Admittedly, I normally do these sorts of things in AppleScript - which is pretty easy to do in this case - except that I’ve been trying to learn a little python, given all of the limitations with AppleScript (which @deanishe and @vitor have rightly reminded me of on numerous occasions, so hopefully this will make them proud 😃).

For example, let’s say that I have the following list:

titles_remove = {"Title A", "Title B", "Title C"}

And, before outputting my results in Alfred, I’d like to remove all items whose titles can be found in titles_remove.

At the moment, my script filter ends with the following line:

print(json.dumps(result))

And, its JSON output follows the usual format where each item has a title, subtitle, uid, and arg. Now, if I change that last line to store the output instead of printing it:

unfiltered_output = json.dumps(result)

What should I do next to remove items whose titles can be found in the titles_remove list?

I tried following several different python tutorials, but I kept receiving errors. I suspect that some of the methods weren't intended for dealing with strings. But I'm a complete newbie here.

Thanks for any help you can lend! I really appreciate it. And, if anyone has any advice for python newbies that might want to do things with Alfred, I'd greatly appreciate any recommendations on potential resources to check out, etc. Thanks!!

deanishe · October 14, 2020

1 hour ago, Jasondm007 said:

For example, let’s say that I have the following list

That syntax defines a set, not a list. It doesn’t matter here, but it’s important to know the difference. A set usually isn’t what you want.
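To illustrate the distinction, here is a minimal sketch. The name titles_remove_list is made up for the example. Curly braces around bare values create a set, square brackets create a list, and both support the "in" test used for filtering:

```python
# Curly braces with bare values build a set: unordered, no duplicates.
titles_remove = {"Title A", "Title B", "Title C"}
# Square brackets build a list: ordered, duplicates allowed.
titles_remove_list = ["Title A", "Title B", "Title C"]

print(type(titles_remove).__name__)       # set
print(type(titles_remove_list).__name__)  # list

# Membership tests work the same way on both.
print("Title A" in titles_remove)         # True
print("Title A" in titles_remove_list)    # True

# A list keeps its order and can be indexed; a set cannot.
print(titles_remove_list[0])              # Title A
```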

1 hour ago, Jasondm007 said:

And, before outputting my results in Alfred, I’d like to remove all items whose titles can be found in titles_remove

You use a for loop or a list comprehension to remove the items you don’t want:

# assuming results is a list of Alfred feedback dicts:
# [
#   {
#       'title': '...',
#   },
# ]
results = [d for d in results if d['title'] not in titles_remove]

1 hour ago, Jasondm007 said:

print(json.dumps(result))

This is rather inefficient. It’s better to use json.dump(result, sys.stdout)
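A minimal sketch of the two variants, using a made-up result dictionary. json.dumps builds the complete JSON string in memory before print writes it, whereas json.dump writes to the file object it is given. Note that the second variant needs sys to be imported:

```python
import json
import sys

# Sample feedback in Alfred's Script Filter format (made-up data).
result = {"items": [{"title": "Title A", "subtitle": "Example", "arg": "a"}]}

# Variant 1: build the JSON string first, then print it.
text = json.dumps(result)
print(text)

# Variant 2: write the JSON directly to standard output.
json.dump(result, sys.stdout)
sys.stdout.write("\n")
```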

Jasondm007 · October 14, 2020

@deanishe Thanks a ton for getting back to me. This is super helpful!

Although I've run into this issue on several occasions, this post was prompted by my frustration trying to modify a portion of Charles Ma's workflow (which I had originally changed to read a different plist file in DEVONthink). I tried implementing your suggestions in a couple of different ways, but am running into similar errors as before.

I've included one such iteration below. The line that is commented out is essentially where the script used to stop. I assume I'm implementing the "for" loop incorrectly?

#!/usr/bin/python
# -*- coding: UTF-8 -*-
import plistlib
import os
import json
titles_remove = {"Title A", "Title B", "Title C"}
filePath = os.path.expanduser("~/Library/Application Support/DEVONthink 3/SmartGroups.plist")
if os.path.exists(filePath):
    result = {"items": []}
    plObjList = plistlib.readPlist(filePath)
    for plobj in plObjList:
        result["items"].append({
            "title": plobj["name"],
            "subtitle": "Open Smart Group",
            "uid": plobj["name"],
            "arg": plobj["sync"]["UUID"]})
    #print(json.dumps(result))
    unfiltered_output = json.dumps(result)
    filtered_output = [d for d in unfiltered_output if d["title"] not in titles_remove]
    print(json.dumps(filtered_output))
else:
    print('{"items": [{"title": "Error","subtitle": "No Smart Groups Found"}]}')

Thanks again for all of your help!

Relatedly, do you have a favorite editor for testing python scripts? I've always liked the simplicity of Apple's Script Editor and was hoping to find something equally as easy to work with. At the moment, I've usually just tinkered around in Alfred with python scripts or Atom & BBEdit but I find their outputs difficult to see. Thanks again!

deanishe · October 15, 2020

5 hours ago, Jasondm007 said:

unfiltered_output = json.dumps(result)

filtered_output = [d for d in unfiltered_output if d["title"] not in titles_remove]

Get rid of unfiltered_output. That line converts result to a string. Second line should be result = [d for d in result if d['title'] not in titles_remove]

And if you have any more questions, please upload your workflow somewhere.

5 hours ago, Jasondm007 said:

Relatedly, do you have a favorite editor for testing python scripts?

What do you mean "testing"? I mostly use Sublime or neovim as an editor, in any case.

Jasondm007 · October 16, 2020

@deanishe Thanks again for all of your help. And, for the editor suggestions, too! I haven't tried either of those editors yet, so I'll check them out. Much appreciated!

As for the script itself, I'm afraid that I keep getting the same string indices error. I've included the script below, and uploaded a test workflow at the link provided. My apologies for not uploading the workflow earlier. I didn't think there'd be enough people with DEVONthink to have any interest (or people to test, for that matter). In any case, thanks again for any help you can lend!

Download Test Workflow >

#!/usr/bin/python
# -*- coding: UTF-8 -*-
import plistlib
import os
import json
titles_remove = {"Title A", "Title B", "Title C"}
filePath = os.path.expanduser("~/Library/Application Support/DEVONthink 3/SmartGroups.plist")
if os.path.exists(filePath):
    result = {"items": []}
    plObjList = plistlib.readPlist(filePath)
    for plobj in plObjList:
        result["items"].append({
            "title": plobj["name"],
            "subtitle": "Open Smart Group",
            "uid": plobj["name"],
            "arg": plobj["sync"]["UUID"]})
    #print(json.dumps(result))
    #unfiltered_output = json.dumps(result)
    #filtered_output = [d for d in unfiltered_output if d["title"] not in titles_remove]
    result = [d for d in result if d['title'] not in titles_remove]
    print(json.dumps(filtered_output))
else:
    print('{"items": [{"title": "Error","subtitle": "No Smart Groups Found"}]}')

deanishe · October 16, 2020

5 hours ago, Jasondm007 said:

result = [d for d in result if d['title'] not in titles_remove]

print(json.dumps(filtered_output))

Sorry, this bit was wrong (wrong data structure).

# needs "import sys" at the top of the script
result['items'] = [d for d in result['items'] if d['title'] not in titles_remove]
json.dump(result, sys.stdout)
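For reference, a self-contained sketch of the corrected filtering step, using made-up items in place of the plist data. Iterating over result directly walks the dictionary's keys (here the string "items"), which is why indexing d['title'] raised the string indices error. The list to filter is the one stored under the "items" key:

```python
import json
import sys

titles_remove = {"Title A", "Title B", "Title C"}

# Made-up feedback standing in for the items built from the plist.
result = {"items": [
    {"title": "Title A", "subtitle": "Open Smart Group", "uid": "Title A", "arg": "UUID-1"},
    {"title": "Keep Me", "subtitle": "Open Smart Group", "uid": "Keep Me", "arg": "UUID-2"},
    {"title": "Title C", "subtitle": "Open Smart Group", "uid": "Title C", "arg": "UUID-3"},
    {"title": "Also Keep", "subtitle": "Open Smart Group", "uid": "Also Keep", "arg": "UUID-4"},
]}

# result is a dict, so filter the list under "items", not result itself.
result["items"] = [d for d in result["items"] if d["title"] not in titles_remove]

json.dump(result, sys.stdout)
sys.stdout.write("\n")
```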

vitor · October 16, 2020

9 hours ago, Jasondm007 said:

I haven't tried either of those editors yet, so I'll check them out.

As a user of Neovim myself, I doubt @deanishe is recommending you use it. Vim’s a hard editor to grasp and you’ll be fighting to set it up and use it for a while, which is definitely not what you need right now.

Sublime Text is good, but it is paid (or it nags you). Consider Visual Studio Code. It’s simple, capable, free and open-source, actively developed by a major company (Microsoft), and insanely popular (which translates to a vast plugin system). Yes, it’s built on Electron, but it’s far from the worst use of it.

Jasondm007 · October 16, 2020

@deanishe Thanks for taking a look at it again. As usual, your suggestion worked perfectly! I'm looking forward to going back and updating some of my other scripts. Thanks a ton!!

@vitor Thanks for the editor suggestion, too. I just installed Visual Studio Code, and its output is definitely a lot easier to see than working directly in Alfred or using Atom or BBEdit. Thanks!!

By chance, do either of you guys have any suggestions for resources or tutorials for python newbies who would like to do a little tinkering in Alfred, etc.?

deanishe · October 17, 2020

https://diveintopython3.net/

Jasondm007 · February 20, 2021

@deanishe I have a quick question that relates to an iteration of the script above, but is, more generally, related to the way that script filters operate.

I was wondering if it is possible to capture or pass on the user's initial query/input (i.e., as the argument itself)?

For example, if you wanted to add a modifier/mods to the JSON output whose argument/arg is just the user's query, would this be possible?

I tried several versions of the following JSON lines, but the script filter always just passes the {query} on as a string (i.e., not the actual query that a user types into the script filter)

"mods": {
	"cmd": {
	"subtitle": "TEXT",
	"arg": "{query}",
	"icon": {"path": "fileName.png"},
	},
},

If you're wondering why in the world someone would want to do this: In circumstances where I can't find what I'm looking for with the script filter above, I was hoping to use a modifier that would just send the query to a different script filter. That way, my lazy @$$ doesn't have to type it in again.

Thanks again for all of your help with everything!

vitor · February 21, 2021

2 hours ago, Jasondm007 said:

I was wondering if it is possible to capture or pass on the user's initial query/input (i.e., as the argument itself)?

Yes, it’s trivial. But forget {query}: that’s an Alfred placeholder that only applies when the script is set to with input as {query}, which you’re not using. You’re (correctly) using with input as argv.

To do what you want, it should be something like "arg": sys.argv[0], (don’t forget to import sys).

Jasondm007 · February 21, 2021

Hi @vitor - Thanks for getting back to me!

When I tried your recommended approach, the good news is that it didn't error out, like many of my previous attempts. Unfortunately, it's giving back a file path to one of Alfred's caches:

/Users/USERNAMEHERE/Library/Caches/com.runningwithcrayons.Alfred/Workflow Scripts/73159C6F-2F8B-4397-AB51-EC5326BC69A8

I have "import sys" up at the top, as you indicated. Any other ideas what I might be missing?

Thanks again for all of your help!

deanishe · February 21, 2021

It should be sys.argv[1] for the first argument in Python. sys.argv[0] is the name of the script, as in $0, $1 etc.

Edited February 21, 2021 by deanishe

Jasondm007 · February 22, 2021

@deanishe Thanks for getting back to me.

Unfortunately, the script errors out when running it with 1 as the value.

Is there a certain type of run behavior, etc., that I should also use? Or, am I missing something else?

Thanks again for all of your help!

"mods": {
	"cmd": {
	"subtitle": "Favorites Only",
	"arg": sys.argv[1],
	"icon": {"path": "fam1.png"},
	},
},

deanishe · February 22, 2021

What does the error message say?

Edited February 22, 2021 by deanishe

Jasondm007 · February 22, 2021

@deanishe Sorry about that! I should have thought to include the error message.

Code 1: Traceback (most recent call last):
  File "/Users/jasonjohndumont/Library/Caches/com.runningwithcrayons.Alfred/Workflow Scripts/96A60515-7183-4C13-8C1F-1D650947A1C7", line 36, in <module>
    "arg": sys.argv[1],
IndexError: list index out of range

Thanks again!

deanishe · February 22, 2021

You’ve set up Alfred so that there might not be an argument. You need to check sys.argv[1] exists before using it.
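A minimal sketch of such a check. The helper name get_query is made up for the example. sys.argv[0] is always the script's own path, so the first real argument, if there is one, is sys.argv[1]:

```python
import sys


def get_query(argv):
    """Return the first command-line argument, or "" if none was passed."""
    # argv[0] is the script path; the first real argument is argv[1].
    if len(argv) > 1:
        return argv[1]
    return ""


print(repr(get_query(["script.py"])))             # ''
print(repr(get_query(["script.py", "reports"])))  # 'reports'

# In a real script, pass sys.argv.
the_query = get_query(sys.argv)
```

Falling back to an empty string means the rest of the script can use the query without a further check.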

Jasondm007 · February 23, 2021

@deanishe As usual, you're absolutely right! Unfortunately, this issue is probably beyond my limited python skillset.

After doing some research into what you suggested, I incorporated an IF statement that adds the query or just an empty placeholder for the argument:

if len(sys.argv) < 2:
    theQuery = ""
else:
    theQuery = sys.argv[1]

Then, I just set the mods as follows:

"mods": {
	"cmd": {
	"subtitle": "Text Here",
	"arg": theQuery,
	"icon": {"path": "iconhere.png"},
	},
},

Is this what you had in mind? Or is there a better approach?

The reason I ask is that I can't quite figure out how to set Alfred's run behavior using this approach. Before I incorporated this change, I used to just have Alfred filter everything. But Alfred doesn't appear to update quickly enough, as it always just gives the blank argument back. But when I try changing the different options, I can't find one that actually returns the query while still filtering things.

To avoid scrolling up this thread, below you will find a simplified version of the script that includes these amendments:

#!/usr/bin/python
# -*- coding: UTF-8 -*-

import plistlib
import os
import json
import sys

titles_remove = {"- - - - - - - - - -", "- - - - - - - -"}
filePathSG = os.path.expanduser("~/Library/Application Support/DEVONthink 3/SmartGroups.plist")

if len(sys.argv) < 2:
    theQuery = ""
else:
    theQuery = sys.argv[1]

if os.path.exists(filePathSG):
    if os.path.exists(filePathF):
        if os.path.exists(filePathSR):

            result = {"items": []}

            plObjListSG = plistlib.readPlist(filePathSG)
            for plobj in plObjListSG:
                result["items"].append({
                    "title": plobj["name"],
                    "subtitle": "Global Smart Group",
                    "match": "smart group sg " + plobj["name"],
					"icon": {"path": "sg1.png"},
                    "uid": plobj["name"],
                    "arg": "x-devonthink-smartgroup://" + plobj["sync"]["UUID"],
                    "mods": {
                        "cmd": {
                        "subtitle": "Text",
                        "arg": theQuery,
                        "icon": {"path": "icon1.png"},
                        },
                    },
                })           
                result['items'] = [d for d in result['items'] if d['title'] not in titles_remove]
            print(json.dumps(result))
else:
    print('{"items": [{"title": "Error","subtitle": "No Global Smart Groups Found"}]}')

Thanks again for all of your help!

deanishe · February 23, 2021

13 hours ago, Jasondm007 said:

Is this what you had in mind?

Yes, exactly.

13 hours ago, Jasondm007 said:

But when I try changing the different options, I can't find one that actually returns query while still filtering things.

There isn't one. If Alfred is doing the filtering, it makes no sense to pass the query to your script. If you have “Alfred filters results” selected, Alfred will run your script once—without a query—to get all the items.

If you don't have "Alfred filters results" selected, then Alfred will pass the query to your script (if there is one) and run your script again when the query changes.

Jasondm007 · February 23, 2021

@deanishe Got it! Thanks for your patience, and excellent explanation above. This all makes perfect sense (now, anyways 🤦‍♂️️)!

So, if I'd like to pass my query, how do I get my script to filter the results as the user inputs their query (similar to Alfred)? In layman's terms, how do I get the script filter to remove items from Alfred's visible output as the user inputs their query (based on the "match" criteria)? As you correctly pointed out, my script just dumps all of the results into Alfred - meaning that they all just kind of sit there.

Is this update as easy as adding a line or two of code to the script above? Or is it something that is going to require a more fundamental rethinking of everything? And, relatedly, since I'm not even sure where to start researching this sort of thing, do you have any suggestions for my code above, or other python-related workflows that you could point me to, where I might be able to learn how to accomplish this? Surprisingly, I only have a small handful of python workflows installed on my machine, and none appear to operate in this manner (at least from what my neophyte eyes can tell, anyways).

Thanks again @deanishe!! You rock.

deanishe · February 24, 2021

14 hours ago, Jasondm007 said:

Is this update as easy as adding a line or two of code to the script above?

Well, you have to do the searching/filtering (and possibly ranking) yourself, so it's as simple or as complex as you'd like that to be.

You can have something as simple as if theQuery in plobj['name']: (or more realistically if theQuery.lower() in plobj['name'].lower():) or you can implement a full fuzzy matching and ranking algorithm, like in my Go library. Here's a (relatively) simple custom filtering and ranking function I built for a workflow.

What exactly do you want to do differently from the way Alfred filters? There's not much point doing your own filtering unless you want to do it differently: Alfred can do it much, much faster.

Edited February 24, 2021 by deanishe

Jasondm007 · February 24, 2021

@deanishe Thanks for getting back to me. I appreciate your suggestions, but I'm afraid this stuff is way over my head. Honestly, your "simple" example almost made my head explode. And I really don't understand where to even add the code that you suggested.

I really didn't want to try to do the filtering myself. But I was under the impression that I had to, if I wanted to capture the query? When Alfred does the filtering, the script only runs once - which makes it impossible to capture the query, right? (unless you require the argument beforehand, etc.)

Is there a way that Alfred can do the filtering and I still get the final query (as an argument to pass on)?

Thanks for your patience with everything!

deanishe · February 24, 2021

1 hour ago, Jasondm007 said:

But I was under the impression that I had to, if I wanted to capture the query?

That's right. If you turn "Alfred filters results" on, it doesn't pass the query to your script.

What exactly are you trying to achieve?

1 hour ago, Jasondm007 said:

your "simple" example almost made my head explode

Not simple, only relatively simple. This stuff can get ungodly complex.

The example I posted splits your query into words and gives each task 1 point per "word starts with" match and 0.5 points per "word contains" match. If the task itself doesn't match, it filters its subtasks for matches, and if there are any, gives the task 0.5 points.

Except it actually awards negative points (i.e. -2 is better than -1) because the tasks are already sorted by priority (1 is higher than 2), so you want the results sorted lowest first.
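To make the scoring idea concrete, here is a simplified sketch. It is not the code from the linked workflow: the function names and sample titles are made up, subtasks are left out, and it assumes "word starts with" means a query word begins one of the title's words. Each query word earns 1 point for a "starts with" match or 0.5 for a "contains" match, scores are negated, and results are sorted lowest first:

```python
def score(query, title):
    """Return a negative score for title; more negative is a better match."""
    title_lower = title.lower()
    title_words = title_lower.split()
    points = 0.0
    for word in query.lower().split():
        if any(w.startswith(word) for w in title_words):
            points += 1.0   # query word starts one of the title's words
        elif word in title_lower:
            points += 0.5   # query word appears somewhere inside the title
    return -points


def filter_and_rank(query, titles):
    """Keep titles with at least one match, best (lowest) score first."""
    scored = [(score(query, t), t) for t in titles]
    matched = [(s, t) for s, t in scored if s < 0]
    # sorted() is stable, so titles with equal scores keep their input order.
    return [t for s, t in sorted(matched, key=lambda pair: pair[0])]


titles = ["Inbox Reports", "Unread", "Port Notes", "Archive"]
print(filter_and_rank("re", titles))  # ['Inbox Reports', 'Unread']
```

Here "Inbox Reports" scores -1 (a word starts with "re"), "Unread" scores -0.5 (it only contains "re"), and the other two titles are dropped.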

Jasondm007 · February 25, 2021

7 hours ago, deanishe said:

Not simple, only relatively simple. This stuff can get ungodly complex.

Hahaha @deanishe Your script looked so "simple" to me, that I thought you published Google's search algorithm or IBM Watson's thoughts

7 hours ago, deanishe said:

What exactly are you trying to achieve?

In layman's terms, I wanted the script to perform exactly how it usually does, with the only exception being that I wanted to be able to hit a modifier key ⌘ that would pass my current query to another script, in circumstances where I wasn't able to find what I was looking for. Because I haven't had any success implementing things, at the moment, I just let Alfred do the filtering and have my modifier pass an empty argument - which gets me where I want to go, but requires me to type the same thing into the next search.

From what I now understand about script filters - thanks to your help - I'd like Alfred to keep rerunning the script as the user types. That way, I can actually feed the query into the modifier's JSON argument. In addition, I'd like my script to filter the output using the item's title or match criteria, if one is provided (similar to the way it usually does). So, when the user types, the stuff that doesn't match simply goes away from Alfred's output ... just like it usually does.

To be honest, I'm a little surprised that there isn't an easier way of doing this already, given how common this request seems like it would be (e.g., an option in Alfred to recall the query, etc.).

deanishe · February 25, 2021

3 hours ago, Jasondm007 said:

given how common this request seems like it would be

But it doesn’t make sense, given the way filtering works. When Alfred filters your results, it only runs your script once—to get all the items—and that happens before the user has entered a query (or after they've entered the first character, depending on settings). Because Alfred never calls your script again, there's no opportunity to pass it the full user query.

3 hours ago, Jasondm007 said:

is that I'd like Alfred to keep rerunning the script as the user types

That's how a Script Filter works (provided you haven't turned on "Alfred filters results"). But it's up to you to filter the results based on the user's query. It's really not hard to add simple filtering. Just stick this at the top of your loop that creates the Alfred feedback:

for plobj in plObjListSG:
    match = "smart group sg " + plobj["name"]
    # ignore items that don't match query
    if theQuery and theQuery.lower() not in match.lower():
        continue

    result["items"].append({
        "title": plobj["name"],
        "subtitle": "Global Smart Group",
        "match": match,
        # ...
        # ...
    })
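Putting the pieces of the thread together, a self-contained sketch of a script that filters on the query itself and also hands the query to the ⌘ modifier. The function name build_feedback, the sample names, and the modifier subtitle are made up for the example, and a real script would read the names from SmartGroups.plist instead:

```python
import json
import sys


def build_feedback(query, names):
    """Build Script Filter feedback for the names that match query."""
    items = []
    for name in names:
        match = "smart group sg " + name
        # Skip items that don't match; an empty query keeps everything.
        if query and query.lower() not in match.lower():
            continue
        items.append({
            "title": name,
            "subtitle": "Global Smart Group",
            "match": match,
            "uid": name,
            "arg": name,
            # The cmd modifier passes on the typed query instead of the item.
            "mods": {"cmd": {"subtitle": "Search elsewhere", "arg": query}},
        })
    return {"items": items}


# Made-up names standing in for the entries read from SmartGroups.plist.
names = ["Unread Mail", "Recent PDFs", "Tagged Urgent"]
query = sys.argv[1] if len(sys.argv) > 1 else ""
json.dump(build_feedback(query, names), sys.stdout)
sys.stdout.write("\n")
```

This assumes "Alfred filters results" is turned off, so that the script is rerun with the current query each time it changes.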
