Suppressing Google, Amazon and Wikipedia


Recommended Posts

I've started to put together my first attempts at a workflow in Python 2 using @deanishe's Alfred-Workflow package. My keyword is "lz". However, when I come to test it, I type "lz" and, for a split second, my Title and Subtitle are displayed, only to be barged unceremoniously out of the way by entries for "Search Google for 'lz'", "Search Amazon for 'lz'" and "Search Wikipedia for 'lz'". How do I stop this from happening?

Link to comment
1 hour ago, carlcaulkett said:

I've started to put together my first attempts at a workflow in Python 2 using @deanishe's Alfred-Workflow package. My keyword is "lz". However, when I come to test it, I type "lz" and, for a split second, my Title and Subtitle are displayed, only to be barged unceremoniously out of the way by entries for "Search Google for 'lz'", "Search Amazon for 'lz'" and "Search Wikipedia for 'lz'". How do I stop this from happening?

 

Share your workflow so that we can help you :) 

Overall, to test your workflows, go to your workflow and press the little bug icon at the top right, which will open debugging mode. That way you can see what is going on with your workflow.

Link to comment

There are other issues as well. I'm trying to use BeautifulSoup to parse a web page, but that's causing me problems too. Anyhow, here's the project as an attachment...

Erm... I'm trying to attach the project as a zip, but this forum only accepts images as attachments. What's the easiest way to get it to you?

 

Link to comment

Of course, I could always copy and paste the python code...


 

import sys
from workflow import Workflow3, ICON_WEB, web
from bs4 import BeautifulSoup, SoupStrainer
def main(wf):
    url = 'http://lazarus-ccr.sourceforge.net/docs/'
    response = web.get(url, stream=True)
    # throw an error if request failed
    # Workflow will catch this and show it to the user
    response.raise_for_status()
    # Parse the response
    for link in BeautifulSoup(response.text, "html.parser").find_all('a', href=True):
        print link['href']
    # Loop through the returned posts and add an item for each to
    # the list of results for Alfred
    #for post in posts:
    #     wf.add_item(title=post['description'],
    #                  subtitle=post['href'],
    #                 icon=ICON_WEB)
    # Send the results to Alfred as JSON
    wf.send_feedback()
if __name__ == u"__main__":
    wf = Workflow3()
    sys.exit(wf.run(main))

 

Link to comment
23 minutes ago, carlcaulkett said:

There are other issues as well. I'm trying to use BeautifulSoup to parse a web page, but that's causing me problems too. Anyhow, here's the project as an attachment...

Erm... I'm trying to attach the project as a zip, but this forum only accepts images as attachments. What's the easiest way to get it to you?

 

 

https://dropfile.to/ or https://www.dropbox.com/ usually work best. You can easily export the workflow; that way we'd have exactly what you have, which makes it easier to debug.

Edited by FroZen_X
Link to comment

Actually, the intrusion by Google, Amazon and Wikipedia seems to happen when Alfred cannot find anything in its disk file indices - presumably because my workflow is bombing out early and not sending anything to Alfred. I've just added a dummy line of output to the workflow, and that works!
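
For reference, a minimal sketch of that "dummy output" idea, using the same Alfred-Workflow library as above (the item text is just a placeholder of mine); as long as the Script Filter emits at least one item, Alfred has something to show and won't fall back to its web searches:

import sys
from workflow import Workflow3

def main(wf):
    # Always emit at least one item so Alfred never sees an empty result set
    wf.add_item(title='Hello from lazdoc', subtitle='placeholder result')
    wf.send_feedback()

if __name__ == '__main__':
    sys.exit(Workflow3().run(main))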

Link to comment
4 minutes ago, carlcaulkett said:

Actually, the intrusion by Google, Amazon and Wikipedia seems to happen when Alfred cannot find anything in its disk file indices - presumably because my workflow is bombing out early and not sending anything to Alfred. I've just added a dummy line of output to the workflow, and that works!

 

The workflow I downloaded doesn't have deanishe's workflow library, oO. Other than that, you have to use python lazdoc.py "{query}" to send the query to the Python script, and change the Script Filter from "with input as argv" to "with input as {query}".
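
For illustration, a minimal sketch of how the script picks up that {query} argument (the lazdoc.py name comes from the post above; everything else here is just an assumed example):

import sys
from workflow import Workflow3

def main(wf):
    # {query} arrives as the first command-line argument;
    # wf.args is sys.argv[1:] decoded and normalised by the library
    query = wf.args[0] if wf.args else ''
    wf.add_item(title='You typed: %s' % query)
    wf.send_feedback()

if __name__ == '__main__':
    sys.exit(Workflow3().run(main))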

 

Link to comment

I didn't attach Deanishe's workflow because I guessed that you would have it anyway. I have got it installed, I promise! As for the second part of your post, that doesn't seem to be affecting things unduly. Oh, I've just seen the latest post. It's pretty late round these parts as well! The things we do to enjoy ourselves! 

Link to comment

I've also sussed out that if I have any print statements in the code to help me debug the Python from the console, they seem to stop the output reaching Alfred. Is there a way for my Python code to know whether it has been launched from Alfred or from the console?
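
One possible check, for what it's worth (my assumption here is that Alfred exports its workflow environment variables, such as alfred_version and alfred_workflow_bundleid, when it runs a Script Filter, and that they are absent in a normal Terminal session):

import os

def launched_from_alfred():
    # These variables are set by Alfred for workflow scripts;
    # from a plain console launch they should be missing
    return 'alfred_version' in os.environ or 'alfred_workflow_bundleid' in os.environ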

Edited by carlcaulkett
Link to comment
38 minutes ago, carlcaulkett said:

I've also sussed out that if I have any print statements in the code to help me debug the Python from the console, they seem to stop the output reaching Alfred. Is there a way for my Python code to know whether it has been launched from Alfred or from the console?

 

I've never used BeautifulSoup myself, but here is a working workflow that does what you want, I guess:

 

https://www.dropbox.com/s/p4t8qqfr93evzqr/LazDoc.alfredworkflow?dl=0

 

import sys
import httplib2
from workflow import Workflow, ICON_WEB, web
from bs4 import BeautifulSoup, SoupStrainer

def main(wf):
    url = 'http://lazarus-ccr.sourceforge.net/docs/'
    response = web.get(url, stream=True)
    
    http = httplib2.Http()
    status, response = http.request('http://lazarus-ccr.sourceforge.net/docs/')
    # throw an error if request failed
    # Workflow will catch this and show it to the user
    #response.raise_for_status()
    
    # Parse the response
    for link in BeautifulSoup(response, parseOnlyThese=SoupStrainer('a')):
        if link.has_attr('href'):
            wf.add_item(title=link['href'], arg=link['href'], valid=True, icon=ICON_WEB)
    

# Loop through the returned posts and add an item for each to
# the list of results for Alfred

#for post in posts:
#     wf.add_item(title=post['description'],
#                  subtitle=post['href'],
#                 icon=ICON_WEB)

# Send the results to Alfred as XML
    wf.send_feedback()

if __name__ == u"__main__":
    wf = Workflow()
    sys.exit(wf.run(main))

[Screenshot attachment: Screen Shot 2017-07-23 at 02.24.41]

 

Gonna get some sleep now ^^ @deanishe will probably have a better version tomorrow :) He's the king when it comes to Python.

 

Cheers,

 

Frozen

Edited by FroZen_X
Link to comment

Don't use print. That invalidates the JSON output. That's why logging is built into the library. 
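
A quick sketch of the difference, using the wf.logger that the library provides (the URL is just the one from this thread):

import sys
from workflow import Workflow3

def main(wf):
    url = 'http://lazarus-ccr.sourceforge.net/docs/'
    # wf.logger writes to Alfred's debugger and the workflow's log file,
    # not to stdout, so it cannot corrupt the feedback sent to Alfred
    wf.logger.debug('fetching %s', url)
    # print url  # <- this is what breaks things: it writes to stdout
    wf.add_item(title=url)
    wf.send_feedback()

if __name__ == '__main__':
    sys.exit(Workflow3().run(main))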

 

Alfred shows its "web search" fallback results when your script doesn't output any valid results. 

 

Don't use libraries installed globally. Include them in the workflow itself. Otherwise it won't work for other people, and we can't help you with it because we don't have the same workflow.
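
For example (the lib directory name and the pip invocation are my assumptions, not something from this thread): install the dependency into a folder inside the workflow with pip install --target=lib beautifulsoup4, then point the library at it so the bundled copy is found:

import sys
from workflow import Workflow3

def main(wf):
    # Imported here, after ./lib has been added to sys.path,
    # so the copy bundled inside the workflow is used
    import bs4
    wf.add_item(title='bs4 loaded from: %s' % bs4.__file__)
    wf.send_feedback()

if __name__ == '__main__':
    # libraries=['./lib'] adds the bundled directory to sys.path
    wf = Workflow3(libraries=['./lib'])
    sys.exit(wf.run(main))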

Link to comment

I've edited the script a wee bit. You were using workflow.web and httplib2 to fetch the URL twice.

 

import sys

from bs4 import BeautifulSoup, SoupStrainer
from workflow import Workflow3, ICON_WEB, web


def main(wf):
    url = 'http://lazarus-ccr.sourceforge.net/docs/'
    response = web.get(url)
    # throw an error if request failed
    # Workflow will catch this and show it to the user
    response.raise_for_status()

    # Parse the response
    for link in BeautifulSoup(response.text, parseOnlyThese=SoupStrainer('a')):
        if link.has_attr('href'):
            wf.add_item(title=link['href'], arg=link['href'], valid=True,
                        icon=ICON_WEB)


    # Send the results to Alfred as JSON
    wf.send_feedback()


if __name__ == "__main__":
    wf = Workflow3()
    sys.exit(wf.run(main))

 

Link to comment

Here's my latest code:


 

import sys
import os
from workflow import Workflow3, ICON_WEB, web
from bs4 import BeautifulSoup, SoupStrainer
from urlparse import urljoin

log = None

def dbg(msg):
    log.debug(msg)

def main(wf):
    url = 'http://lazarus-ccr.sourceforge.net/docs/'
    response = web.get(url, stream=True)
    laz_icon = os.path.join(sys.path[0], 'icon.png')

    # throw an error if request failed
    response.raise_for_status()

    # Parse the response
    soup = BeautifulSoup(response.text, "html.parser")
    links = soup.find_all('a')
    for link in links:
        href = link.get('href')
        if "sourceforge" not in href and link.string != "Documentation":
            href_abs = urljoin(url, href)
            wf.add_item(title=link.string,
                        subtitle=href_abs,
                        arg=href_abs,
                        valid=True,
                        icon=laz_icon)

    # Send the results to Alfred as JSON
    wf.send_feedback()

if __name__ == u'__main__':
    wf = Workflow3()
    log = wf.logger
    sys.exit(wf.run(main))

 

I think I've gotten rid of the double fetch of the URL. This is OK as far as it goes, and the resulting pages show up nicely in the browser. Now I want to feed the output back into the workflow, so that each list of links shown in the results is parsed the same way, until the only action left is to display a web page. If that makes sense!
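
Just to sketch one way that drill-down could work (this is a guess at the design, not an existing part of the workflow: it assumes the selected URL is fed back into the Script Filter as its argument, e.g. via a second keyword or an External Trigger):

import sys
from urlparse import urljoin

from bs4 import BeautifulSoup
from workflow import Workflow3, ICON_WEB, web

START_URL = 'http://lazarus-ccr.sourceforge.net/docs/'

def main(wf):
    # No argument: start at the docs index. With an argument: parse that page instead.
    url = wf.args[0] if wf.args else START_URL
    response = web.get(url)
    response.raise_for_status()

    soup = BeautifulSoup(response.text, 'html.parser')
    for link in soup.find_all('a', href=True):
        href = urljoin(url, link['href'])
        # arg carries the absolute URL back out of the workflow, where the
        # connected action (or a re-entry trigger) decides whether to parse
        # it again or open it in the browser
        wf.add_item(title=link.string or href,
                    subtitle=href,
                    arg=href,
                    valid=True,
                    icon=ICON_WEB)

    wf.send_feedback()

if __name__ == '__main__':
    sys.exit(Workflow3().run(main))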

Edited by carlcaulkett
Link to comment

Please, upload the entire workflow somewhere and post a link to that. It's unreasonable to expect us to try to recreate your workflow in order to help you.

 

41 minutes ago, carlcaulkett said:

[…] pages show up nicely in the browser. Now, I want to feed the output back into the workflow so that the list of links shown in the results is similarly parsed until the only possible action to take is to display a web page. If that makes sense!

 

I'm afraid I don't understand. You want to read a list of links from the page that's showing in your browser?

 

As regards the code:

def dbg(msg):
    log.debug(msg)

Don't do this. Just use log.debug() directly. The proper way to use logging is like this, which your wrapper breaks:

url = 'http://www.example.com'
r = web.get(url)
log.debug('[%d] %s', r.status_code, url)

Also, don't stream the response. It's a web page, not a multi-megabyte zip file:

response = web.get(url, stream=True)
# should be
response = web.get(url)

This is unnecessary:

laz_icon = os.path.join(sys.path[0], 'icon.png')

Just use:

laz_icon = 'icon.png'

It would only make a difference if you were running the script from a different directory, which is not a good idea (for a workflow). If you must do that, the correct way is:

laz_icon = os.path.join(os.path.dirname(__file__), 'icon.png')

The reason is that sys.path is not guaranteed to have your script's directory at the front. If you've used Workflow(libraries=[...]), for example, your code won't work as expected.

Edited by deanishe
accidentally a word
Link to comment

What you've uploaded is broken. The workflow and bs4 directories are empty.

 

Please export the workflow from Alfred (right-click on the workflow in Alfred Preferences and choose Export…) and upload the resulting .alfredworkflow file.

 

Link to comment
