
Suppressing Google, Amazon and Wikipedia



Posted

I've started to put together my first attempts at a workflow in Python 2 using @deanishe's Alfred-Workflow package. My keyword is "lz". However, when I come to test it, I type in "lz" and for a split second my Title and Subtitle are displayed, only to be barged unceremoniously out of the way by entries for "Search Google for 'lz'", "Search Amazon for 'lz'" and "Search Wikipedia for 'lz'". How do I stop this from happening?

Posted
1 hour ago, carlcaulkett said:

I've started to put together my first attempts at a workflow in Python 2 using @deanishe's Alfred-Workflow package. My keyword is "lz". However, when I come to test it, I type in "lz" and for a split second my Title and Subtitle are displayed, only to be barged unceremoniously out of the way by entries for "Search Google for 'lz'", "Search Amazon for 'lz'" and "Search Wikipedia for 'lz'". How do I stop this from happening?

 

Share your workflow so that we can help you :) 

In general, to test your workflows, open the workflow and press the little bug icon at the top right, which opens debugging mode. That way you can see what is going on with your workflow.

Posted

There are other issues as well. I'm trying to use BeautifulSoup to parse a web page, but that's causing me probs as well. Anyhow, here's the project as an attachment...

Erm... I'm trying to attach the project as a zip, but this forum only accepts images as attachments. What's the easiest way to get it to you?

 

Posted

Of course, I could always copy and paste the python code...


 

import sys
from workflow import Workflow3, ICON_WEB, web
from bs4 import BeautifulSoup, SoupStrainer
def main(wf):
    url = 'http://lazarus-ccr.sourceforge.net/docs/'
    response = web.get(url, stream=True)
    # throw an error if request failed
    # Workflow will catch this and show it to the user
    response.raise_for_status()
    # Parse the response
    for link in BeautifulSoup(response.text, "html.parser").find_all('a', href=True):
        print link['href']
    # Loop through the returned posts and add an item for each to
    # the list of results for Alfred
    #for post in posts:
    #     wf.add_item(title=post['description'],
    #                  subtitle=post['href'],
    #                 icon=ICON_WEB)
    # Send the results to Alfred as XML
    wf.send_feedback()
if __name__ == u"__main__":
    wf = Workflow3()
    sys.exit(wf.run(main))

 

Posted (edited)
23 minutes ago, carlcaulkett said:

There are other issues as well. I'm trying to use BeautifulSoup to parse a web page, but that's causing me probs as well. Anyhow, here's the project as an attachment...

Erm... I'm trying to attach the project as a zip, but this forum only accepts images as attachments. What's the easiest way to get it to you?

 

 

https://dropfile.to/ or https://www.dropbox.com/ usually work best. You can easily export the workflow; that way we'd have exactly what you have, which makes it easier to debug.

Edited by FroZen_X
Posted

Actually, the intrusion by Google, Amazon and Wikipedia seems to happen when Alfred cannot find anything in its disk file indices - presumably because my workflow is bombing out early and not currently sending anything to Alfred. I've just added a dummy line of output to the workflow and that works!
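For what it's worth, the "dummy line of output" was nothing fancier than a single hard-coded item, roughly along these lines (the title and subtitle text are just for illustration):

import sys
from workflow import Workflow3

def main(wf):
    # Emit one placeholder item so Alfred always has something to show
    # instead of falling back to its web searches.
    wf.add_item(title='lazdoc is alive', subtitle='dummy output for testing')
    wf.send_feedback()

if __name__ == '__main__':
    wf = Workflow3()
    sys.exit(wf.run(main))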

Posted
4 minutes ago, carlcaulkett said:

Actually, the intrusion by Google, Amazon and Wikipedia seems to happen when Alfred cannot find anything in its disk file indices - presumably because my workflow is bombing out early and not currently sending anything to Alfred. I've just added a dummy line of output to the workflow and that works!

 

The workflow I downloaded doesn't have deanishe's workflow library oO. Other than that, you have to use python lazdoc.py "{query}" to send the query to the Python script, and change "with input as argv" to "with input as {query}".
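Roughly how that looks on the Python side, if it helps (a minimal sketch; the item text is a placeholder):

import sys
from workflow import Workflow

def main(wf):
    # "{query}" from the Script Filter arrives as the first argument;
    # wf.args is essentially sys.argv[1:] after decoding/normalisation.
    query = wf.args[0] if wf.args else ''
    wf.add_item(title='You typed: %s' % query)
    wf.send_feedback()

if __name__ == '__main__':
    sys.exit(Workflow().run(main))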

 

Posted

I didn't attach deanishe's workflow library because I guessed that you would have it anyway. I have got it installed, I promise! As for the second part of your post, that doesn't seem to be affecting things unduly. Oh, I've just seen the latest post. It's pretty late round these parts as well! The things we do to enjoy ourselves!

Posted (edited)

I've also sussed out that if I have any print statements in the code to help debug the Python code from the console, that seems to have the effect of stopping the output to Alfred. Is there a way that my Python code can know whether it has been launched from Alfred or from the console?
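One idea I'm toying with is simply checking for Alfred's environment variables, since Alfred exports a number of alfred_* variables to workflow scripts. Something like this sketch, perhaps (untested):

import os

def launched_from_alfred():
    # Alfred sets variables such as alfred_version and
    # alfred_workflow_bundleid for scripts it runs, so their absence is
    # a reasonable hint that we were started from a terminal instead.
    return 'alfred_version' in os.environ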

Edited by carlcaulkett
Posted (edited)
38 minutes ago, carlcaulkett said:

I've also sussed out that if I have any print statements in the code to help debug the Python code from the console, that seems to have the effect of stopping the output to Alfred. Is there a way that my Python code can know whether it has been launched from Alfred or from the console?

 

I've never used BeautifulSoup myself, but here is a working workflow that does what you want, I guess:

 

https://www.dropbox.com/s/p4t8qqfr93evzqr/LazDoc.alfredworkflow?dl=0

 

import sys
import httplib2
from workflow import Workflow, ICON_WEB, web
from bs4 import BeautifulSoup, SoupStrainer

def main(wf):
    url = 'http://lazarus-ccr.sourceforge.net/docs/'
    response = web.get(url, stream=True)
    
    http = httplib2.Http()
    status, response = http.request('http://lazarus-ccr.sourceforge.net/docs/')
    # throw an error if request failed
    # Workflow will catch this and show it to the user
    #response.raise_for_status()
    
    # Parse the response
    for link in BeautifulSoup(response, parseOnlyThese=SoupStrainer('a')):
        if link.has_attr('href'):
            wf.add_item(title=link['href'], arg=link['href'], valid=True,
                        icon=ICON_WEB)

    # Loop through the returned posts and add an item for each to
    # the list of results for Alfred
    # for post in posts:
    #     wf.add_item(title=post['description'],
    #                 subtitle=post['href'],
    #                 icon=ICON_WEB)

    # Send the results to Alfred as XML
    wf.send_feedback()

if __name__ == u"__main__":
    wf = Workflow()
    sys.exit(wf.run(main))

[Screenshot attached: Screen Shot 2017-07-23 at 02.24.41]

 

Gonna get some sleep now ^^ @deanishe will probably have a better version tomorrow :) He's the king when it comes to Python.

 

Cheers,

 

Frozen

Edited by FroZen_X
Posted

Don't use print. That invalidates the JSON output. That's why logging is built into the library. 
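A quick sketch of the difference (the message and item text are only illustrative):

import sys
from workflow import Workflow3

def main(wf):
    # wf.logger writes to the workflow's log file and shows up in
    # Alfred's debugger; nothing goes to stdout, so the feedback that
    # Alfred reads stays intact.
    wf.logger.debug('fetched the docs index')
    wf.add_item(title='Example item')
    wf.send_feedback()

if __name__ == '__main__':
    sys.exit(Workflow3().run(main))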

 

Alfred shows its "web search" fallback results when your script doesn't output any valid results. 
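So a simple defensive pattern (just a sketch, not taken from your workflow) is to always emit at least one item, even when nothing matched:

import sys
from workflow import Workflow3

def main(wf):
    links = []  # imagine the parsed links ended up here
    if not links:
        # A single non-actionable placeholder keeps Alfred's web-search
        # fallbacks from taking over the results list.
        wf.add_item(title='Nothing found', valid=False)
    for href in links:
        wf.add_item(title=href, arg=href, valid=True)
    wf.send_feedback()

if __name__ == '__main__':
    sys.exit(Workflow3().run(main))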

 

Don't use libraries installed globally. Include them in the workflow itself. Otherwise it won't work for other people, and we can't help you with it because we don't have the same workflow.
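One common way to do that (a sketch; the lib folder name is just a convention) is to install the dependencies into a folder inside the workflow, e.g. pip install --target=lib beautifulsoup4, and tell Alfred-Workflow where to find them:

import sys
from workflow import Workflow3

def main(wf):
    # Imported from the bundled ./lib folder rather than from
    # site-packages, so the workflow carries its own copy of bs4.
    from bs4 import BeautifulSoup
    wf.add_item(title='bs4 loaded from the workflow bundle')
    wf.send_feedback()

if __name__ == '__main__':
    wf = Workflow3(libraries=['./lib'])
    sys.exit(wf.run(main))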

Posted

1. Yep, I've learnt not to use print. I'll investigate the logging in the morning.

 

2. Yep, I guessed as much.

 

3. Yeah, that makes sense. I'll bear that in mind for the future.

 

Thanks for all your help so far!

Posted

I've edited the script a wee bit. You were using workflow.web and httplib2 to fetch the URL twice.

 

import sys

from bs4 import BeautifulSoup, SoupStrainer
from workflow import Workflow3, ICON_WEB, web


def main(wf):
    url = 'http://lazarus-ccr.sourceforge.net/docs/'
    response = web.get(url)
    # throw an error if request failed
    # Workflow will catch this and show it to the user
    response.raise_for_status()

    # Parse the response
    for link in BeautifulSoup(response.text, parseOnlyThese=SoupStrainer('a')):
        if link.has_attr('href'):
            wf.add_item(title=link['href'], arg=link['href'], valid=True,
                        icon=ICON_WEB)


    # Send the results to Alfred as XML
    wf.send_feedback()


if __name__ == "__main__":
    wf = Workflow3()
    sys.exit(wf.run(main))

 

Posted
53 minutes ago, deanishe said:

I've edited the script a wee bit. You were using workflow.web and httplib2 to fetch the URL twice.

 

lol haha was too tired last night to notice ^^

Posted (edited)

Here's my latest code:


 

import sys
import os
from workflow import Workflow3, ICON_WEB, web
from bs4 import BeautifulSoup, SoupStrainer
from urlparse import urljoin

log = None

def dbg(msg):
    log.debug(msg)

def main(wf):
    url = 'http://lazarus-ccr.sourceforge.net/docs/'
    response = web.get(url, stream=True)
    laz_icon = os.path.join(sys.path[0], 'icon.png')

    # throw an error if request failed
    response.raise_for_status()

    # Parse the response
    soup = BeautifulSoup(response.text, "html.parser")
    links = soup.find_all('a')
    for link in links:
        href = link.get('href')
        if "sourceforge" not in href and link.string != "Documentation":
            href_abs = urljoin(url, href)
            wf.add_item(title=link.string,
                        subtitle=href_abs,
                        arg=href_abs,
                        valid=True,
                        icon=laz_icon)

    # Send the results to Alfred as XML
    wf.send_feedback()

if __name__ == u'__main__':
    wf = Workflow3()
    log = wf.logger
    sys.exit(wf.run(main))

 

I think I've gotten rid of the double fetch of the URL. This is OK as far as it goes, and the resultant pages show up nicely in the browser. Now, I want to feed the output back into the workflow so that the list of links shown in the results is similarly parsed until the only possible action to take is to display a web page. If that makes sense!

Edited by carlcaulkett
Posted (edited)

Please, upload the entire workflow somewhere and post a link to that. It's unreasonable to expect us to try to recreate your workflow in order to help you.

 

41 minutes ago, carlcaulkett said:

[…] pages show up nicely in the browser. Now, I want to feed the output back into the workflow so that the list of links shown in the results is similarly parsed until the only possible action to take is to display a web page. If that makes sense!

 

I'm afraid I don't understand. You want to read a list of links from the page that's showing in your browser?

 

As regards the code:

def dbg(msg):
    log.debug(msg)

Don't do this. Just use log.debug(). The proper way to use logging is like this, which you've broken:

url = 'http://www.example.com'
r = web.get(url)
log.debug('[%d] %s', r.status_code, url)

Also, don't stream the response. It's a web page, not a multi-megabyte zip file:

response = web.get(url, stream=True)
# should be
response = web.get(url)

This is unnecessary:

laz_icon = os.path.join(sys.path[0], 'icon.png')

Just use:

laz_icon = 'icon.png'

It would only make a difference if you were running the script from a different directory, which is not a good idea (for a workflow). If you must do that, the correct way is:

laz_icon = os.path.join(os.path.dirname(__file__), 'icon.png')

The reason is that sys.path is not guaranteed to have your script's directory at the front. If you've used Workflow(libraries=[...]), for example, your code won't work as expected.
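A tiny sketch of that pitfall (the ./lib folder is hypothetical):

import os
import sys
from workflow import Workflow3

def main(wf):
    # With bundled libraries prepended to sys.path, sys.path[0] is now
    # one of the library paths, not the directory containing this script.
    wf.logger.debug('sys.path[0] = %s', sys.path[0])
    wf.logger.debug('script dir  = %s', os.path.dirname(os.path.abspath(__file__)))
    wf.send_feedback()

if __name__ == '__main__':
    sys.exit(Workflow3(libraries=['./lib']).run(main))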

Edited by deanishe
accidentally a word
Posted

I've been looking at the "Searching for Folders" example workflow which structurally is very similar to what I want to do (it's a very useful utility in its own right!). But I haven't been able to make it work yet.

Posted

I think reading the stickied threads is fairly universally considered good ;)

 

 

Which reminds me: I'm moving this thread to the Workflow Help & Questions forum, where it belongs.

 

Posted

What you've uploaded is broken. The workflow and bs4 directories are empty.

 

Please export the workflow from Alfred (right-click on the workflow in Alfred Preferences and choose Export…) and upload the resulting .alfredworkflow file.

 
