
yekong


Posts posted by yekong

  1.  

    I've got the code up and running, I think. This is from your first script using my library.

     

    First of all, remove any print statements or make sure they write to STDERR (see previous post). STDOUT is where Alfred reads the XML from, so anything else printed to STDOUT will break the XML.
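     

    For example (a minimal sketch): either write to STDERR explicitly, or use wf.logger, which is what the fixed code below does with wf.logger.debug():

        import sys

        # Anything on STDOUT ends up in the XML Alfred parses, so send debug output to STDERR
        print >> sys.stderr, 'debug message goes to STDERR, not STDOUT'

        # or, inside main(wf), use the workflow's logger instead of print:
        # wf.logger.debug('also safe: goes to the workflow log, not STDOUT')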

     

    Secondly, the title, url and desc returned by parse_baidu_results() are all byte strings, not Unicode. You have to decode them yourself using .decode('utf-8') or wf.decode():

        title = wf.decode(title.replace('<em>', '').replace('</em>', ''))
        wf.logger.debug(title)
        url = u'http:' + wf.decode(part1.a['href'])
        part2 = table.find('div', {'class': 'c-abstract'})
        desc = wf.decode(part2.renderContents())
    

    Working with strings and Unicode is a PITA in Python 2, unfortunately :(
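     

    For example, here's the kind of thing that bites you if you skip the decoding (illustrative only; Baidu's results are UTF-8, so the bytes are never plain ASCII):

        # Illustrative failure mode in Python 2: mixing undecoded bytes with unicode
        title_bytes = '\xe7\x99\xbe\xe5\xba\xa6'   # UTF-8 bytes for u'百度'
        title_text = title_bytes.decode('utf-8')   # now unicode; safe to mix with u'...' literals
        # u'http:' + title_bytes                   # would raise UnicodeDecodeError (implicit ASCII decode)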

     

     

    Finally, the URL you're retrieving the results from isn't quite right. It won't work with multi-word queries, as the query isn't being URL-quoted. Here's a simple way to do it (web.py will correctly encode and quote it for you):

    def request_baidu_search(query):
        url = u'http://www.baidu.com/s'
        r = web.get(url, {'wd': query})
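     

    Roughly, what that does for a multi-word query is this (an illustrative stdlib sketch; web.py's internals may differ):

        import urllib

        query = u'hello world'
        # encode to UTF-8 bytes first, then quote for use in a query string
        quoted = urllib.quote_plus(query.encode('utf-8'))      # 'hello+world'
        full_url = 'http://www.baidu.com/s?wd=' + quoted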
    

    Here's my full working code:

    # encoding: utf-8
    
    from workflow import Workflow, ICON_WEB, web
    from BeautifulSoup import *
    import sys
    
    
    def request_baidu_search(query):
        url = u'http://www.baidu.com/s'
        r = web.get(url, {'wd': query})
    
        r.raise_for_status()
    
        return parse_baidu_results(r.content)
    
    
    def parse_baidu_results(content):
        soup = BeautifulSoup(content)
        tables = soup.findAll('div', {'class': 'result c-container '})
        results = []
        for table in tables:
            part1 = table.find(attrs={'class': 't'})
            title = part1.a.renderContents()
            title = wf.decode(title.replace('<em>', '').replace('</em>', ''))
            wf.logger.debug(title)
            url = u'http:' + wf.decode(part1.a['href'])
            part2 = table.find('div', {'class': 'c-abstract'})
            desc = wf.decode(part2.renderContents())
            results.append((title, url, desc))
        return results
    
    
    def main(wf):
        query = wf.args[0]
    
        def wrapper():
            return request_baidu_search(query)
    
        #results = wf.cached_data('results', wrapper, max_age=60)
        results = request_baidu_search(query)
    
        for result in results:
            wf.add_item(
                title=result[0],
                subtitle=result[1],
                arg=result[1],
                valid=True,
                icon=ICON_WEB)
    
        wf.send_feedback()
    
    if __name__ == '__main__':
        wf = Workflow()
        sys.exit(wf.run(main))
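     

    If you don't want to hit Baidu on every keystroke, you can swap the direct call back for the cached version that's commented out above:

        # cache the parsed results for 60 seconds instead of fetching on every run
        results = wf.cached_data('results', wrapper, max_age=60)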
    
    

     

    Very patient and detailed answer, THANK YOU SO MUCH!!! I've struggled for a whole day in vain...

  2. Dear all,

     

    I'm using the latest Alfred (v2.4 (279)), and I have been trying to write my first Alfred workflow in Python.

     

    Inspired by the article http://www.deanishe.net/alfred-workflow/tutorial.html#creating-a-new-workflow , I tried using my limited Python knowledge to write it myself. I've looked at two Alfred Python utils, https://github.com/deanishe/alfred-workflow and https://github.com/nikipore/alfred-python, but NEITHER of them works; they both fail with the following error:

     

    [ERROR: alfred.workflow.input.scriptfilter] XML Parse Error 'The operation couldn’t be completed. (NSXMLParserErrorDomain error 4.)'. Row (null), Col (null): 'Document is empty' in XML:

     

    The strange thing is that the code works only very occasionally; most of the time it just doesn't work! I don't think the problem is with the two Python utils themselves, or maybe I'm missing something here...

     

    I've also seen this post, but had no luck:

    http://www.alfredforum.com/topic/4238-xml-parse-error-in-alfred-22-fixed-23/?hl=%2Berror%3A+%2Balfred.workflow.input.scriptfilter

     

    If anyone knows what the issue is, please kindly post a reply. MANY THANKS!

     

    As the code I've written is still very short, I'll just paste it here: one version using the first Python util, the other using the second. The workflow's Script Filter is the same in both cases: python baidu_now.py "{query}". I'm using Python 2.7.5.

     

    Sorry about the code formatting; it seems there are no code blocks here...

     

    ==============first===================

     

    # encoding: utf-8
     
    from workflow import Workflow, ICON_WEB, web
    from BeautifulSoup import *
    import sys
    reload(sys)
    sys.setdefaultencoding('utf-8')
     
    def request_baidu_search(query):
        r = web.get(url)

        r.raise_for_status()

        return parse_baidu_results(r.content)

    def parse_baidu_results(content):
        soup = BeautifulSoup(content)
        tables = soup.findAll('div', {'class': 'result c-container '})
        results = []
        for table in tables:
            part1 = table.find(attrs={'class': 't'})
            title = part1.a.renderContents()
            title = title.replace('<em>', '').replace('</em>', '')
            print title
            url = u'http:' + part1.a['href']
            part2 = table.find('div', {'class': 'c-abstract'})
            desc = part2.renderContents()
            results.append((title, url, desc))
        return results

    def main(wf):
        query = wf.args[0]

        def wrapper():
            return request_baidu_search(query)

        results = request_baidu_search(query)#wf.cached_data('results', wrapper, max_age=60)

        for result in results:
            wf.add_item(
                title = result[0],
                subtitle = result[1],
                arg = result[1],
                valid = True,
                icon = ICON_WEB)
        wf.send_feedback()

    if __name__ == '__main__':
        wf = Workflow()
        sys.exit(wf.run(main))
     
     
    =========== second ===============
     
    # -*- coding: utf-8 -*-
     
    import alfred
    import requests
    from BeautifulSoup import BeautifulSoup
    import sys
    reload(sys)
    sys.setdefaultencoding('utf-8')
     
    def request_baidu_search(query):
        r = requests.get(url)
        return parse_baidu_results(r.content)
     
    def parse_baidu_results(content):
        soup = BeautifulSoup(content)
        tables = soup.findAll('div', {'class': 'result c-container '})
        results = []
        for table in tables:
            part1 = table.find(attrs={'class': 't'})
            title = part1.a.renderContents()
            title = title.replace('<em>', '').replace('</em>', '')
            print title
            url = u'http:' + part1.a['href']
            part2 = table.find('div', {'class': 'c-abstract'})
            desc = part2.renderContents()
            results.append((title, url, desc))
        return results
     
    def main():
        query = alfred.args()[0]
        results = request_baidu_search(query)
        items = [alfred.Item(
                    attributes = { 'uid': alfred.uid(0), 'arg': result[1] },
                    title = result[0],
                    subtitle = result[2])
                 for result in results]
        xml_header = u'<?xml version="1.0" encoding="utf-8"?>';
        xml = xml_header + alfred.xml(items)
        alfred.write(xml)
     
    if __name__ == u'__main__':
        sys.exit(main())
     