
Script Filter XML output in Unicode from Python 3



Hello.

 

I'm trying to write my first Script Filter workflow, but I've run into unexpected trouble with Unicode (non-ASCII) characters.

 

This is the simplest example I could come up with:

xmloutput = """
<?xml version="1.0" encoding="utf-8" ?>
<items>
  <item arg="testitem">
    <title>FooBar</title>
    <subtitle>foo bar is a test item</subtitle>
    <icon>icon.png</icon>
  </item>
</items>
"""

print(xmloutput)

(I call it from the Script Filter as /usr/local/bin/python3 outputtest.py.)

 

It works fine until I want to add some non-ASCII characters:

xmloutput = """
<?xml version="1.0" encoding="utf-8" ?>
<items>
  <item arg="testitem">
    <title>FooBær</title>
    <subtitle>foo bær is a «test item»</subtitle>
    <icon>icon.png</icon>
  </item>
</items>
"""

print(xmloutput)

Then it fails:

[ERROR: alfred.workflow.input.scriptfilter] Code 1: Traceback (most recent call last):
  File "outputtest.py", line 12, in <module>
    print(xmloutput)
UnicodeEncodeError: 'ascii' codec can't encode character '\xe6' in position 88: ordinal not in range(128)

Why does this happen? I can't believe that Alfred doesn't use Unicode for input/output. I think the root of the problem is something else, but I can't find out what it is.


This is a general problem with Python 3 that isn't restricted to Alfred. It's very unfortunate, because without this massive issue of only running properly in perfectly configured environments, Python 3 would be a very good language.
 
Alfred uses UTF-8 exclusively.
 
The problem is that Alfred runs workflows in an empty environment. As a result, software that follows the POSIX standard, like Python, defaults to ASCII encoding. That is to say, Alfred specifies the wrong locale by omission.
 
If you run the following in Alfred and in your shell, you'll see what I mean:
 
/usr/local/bin/python3 -c 'import sys; print(sys.stdout.encoding)'
 
Because Python 3 tries to do all the encoding/decoding for you and Alfred runs workflows in an incorrectly-configured (ASCII) environment, Py3 dies in flames with non-ASCII strings.
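
To make the dependence on the environment concrete, here is a minimal sketch that starts a child interpreter with a chosen environment and reports the encoding it picks for stdout. The helper name child_stdout_encoding is made up for this illustration, and PYTHONIOENCODING is used to simulate an ASCII environment rather than reproducing Alfred's actual one.

```python
import codecs
import os
import subprocess
import sys


def child_stdout_encoding(**overrides):
    """Return the normalized stdout encoding of a child Python process.

    Hypothetical helper for illustration: `overrides` are extra environment
    variables layered on top of the current environment.
    """
    env = dict(os.environ, **overrides)
    out = subprocess.check_output(
        [sys.executable, "-c", "import sys; print(sys.stdout.encoding)"],
        env=env,
    )
    # The child reports names such as "UTF-8" or "ascii"; normalize them
    # through the codec registry so different spellings compare equal.
    return codecs.lookup(out.decode("ascii").strip()).name
```

Calling child_stdout_encoding(PYTHONIOENCODING="ascii") imitates the situation described above, while child_stdout_encoding(PYTHONIOENCODING="UTF-8") shows what the third option below achieves. Newer Python releases (3.7 and later, as far as I know) may coerce a bare C/POSIX locale to UTF-8, so the one-liner above can report utf-8 there even in an empty environment.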
 
You have three options:

  • You can add a proper locale/encoding in your launchd environment. This will affect every program. This is the cleanest solution but has the problem that your workflows won't work on other people's machines.
  • Use Python 2. It doesn't suffer from this issue. Indeed, this is one of the reasons so many Python programmers won't use Py3.
  • Use Alfred's Script box like a shell. Set Language to /bin/bash and put export PYTHONIOENCODING=UTF-8 or export LANG=en_US.UTF-8 at the top:
export PYTHONIOENCODING=UTF-8
 
/usr/local/bin/python3 script.py "{query}"

If you're only ever going to run the code on your own machine, option 1 is the cleanest. If you plan to share your workflows, choose option 2 (Python 3 is not installed by default).

 

Option 3 is probably the best compromise. Python 3 without mucking around with your machine's configuration. This is what I do when I want to use a program sensitive to improperly-configured environments, like Py3 or pbpaste/pbcopy.
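
A further workaround, not one of the three options above, is to sidestep Python's text layer inside the script and write UTF-8 bytes directly. The sketch below assumes the script is free to write to sys.stdout.buffer; write_utf8 is a hypothetical helper name, not part of any library.

```python
import io
import sys


def write_utf8(text, stream=None):
    """Encode `text` as UTF-8 and write the bytes to a binary stream.

    Hypothetical helper: by default it writes to sys.stdout.buffer, the
    binary layer under sys.stdout, so the locale-derived text encoding
    is never consulted.
    """
    if stream is None:
        stream = sys.stdout.buffer
    stream.write(text.encode("utf-8"))
    stream.flush()
```

In the script from the first post, print(xmloutput) would become write_utf8(xmloutput). On Python 3.7 and later, sys.stdout.reconfigure(encoding="utf-8") before the first print should have a similar effect.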

Edited by deanishe

Thank you for the comprehensive answer, deanishe.

 

Well, the third option is fine for me.

I use Alfred's Script box in bash mode anyway, because Alfred can't use a custom interpreter (which, by the way, is kind of sad).

 

But why does Alfred use such an inconsistent environment?

It seems to me that it is a bug, and one that is not too hard to fix.

Or are there reasons for this behavior?


It is a bug in Alfred (one that won't be fixed until v3, because changing the environment might break a lot of existing workflows), but it is also a fundamental problem with Python 3.
 
Empty environments aren't good but they are very common (all Mac apps, launchd, cron, initd, systemd etc.). Unfortunately, Python 3 (and especially Python 3 fanboys) take the attitude that it's everybody else that's wrong. While this may be technically correct—and the world would be a better place if a UTF-8-clean OS like OS X specified a UTF-8 environment, not an ASCII one—that isn't how the real world works.
 
As a result, Python 3 is not a great language for use in the real world. It simply won't run in many common configurations, and even the very best Python programmers have been unable to work around the issue and deliberately terminate their programs instead.
 
It's a real shame because if Python 3 weren't so woefully fragile, it'd be an awesome language for things like workflows. Most of the encoding issues that bite less experienced users in Python 2 would just magically disappear.
 


 

 

changing the environment might break a lot of existing workflow

D’oh, backward compatibility. I was afraid of that explanation.

 

I completely agree with you: it is a real shame for Python.

I often use Python 3 as a scripting language and I've never run into this issue. After your first post in this thread, I couldn't believe at first that such a problem could exist in Python. Alas, I was apparently wrong.

 

The issue is not a big problem for me as long as I use Python 3 only for my local workflows (thanks to your advice), but I hope to eventually see Alfred v3 with better compatibility with my favorite scripting language.

 

Thanks for your help and explanations.


You'll run into this same problem if you try to run Python 3 scripts from Launch Agents or Hazel or the like.

 

The only solution is to always use a wrapper script for your Python 3 programs that ensures a sane environment before calling python3.
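
A wrapper of this kind can be as small as the shell snippet earlier in the thread. Purely as an illustration, here is the same idea sketched in Python; the function names and the en_US.UTF-8 fallback are assumptions chosen for the example, not anything Alfred or launchd prescribes.

```python
import os
import subprocess


def sane_environment(base=None):
    """Return a copy of the environment with UTF-8 I/O settings added.

    Hypothetical helper: `base` defaults to the current process environment.
    """
    env = dict(os.environ if base is None else base)
    # Force UTF-8 for the child's stdin/stdout/stderr regardless of locale.
    env["PYTHONIOENCODING"] = "UTF-8"
    # Supply a UTF-8 locale only when none is set; an existing LANG is kept.
    env.setdefault("LANG", "en_US.UTF-8")
    return env


def run_with_sane_environment(argv):
    """Run `argv` (for example ["python3", "script.py"]) in that environment
    and return the child's exit code."""
    return subprocess.call(argv, env=sane_environment())
```

A wrapper script would then call run_with_sane_environment with the real command line, for example ["/usr/local/bin/python3", "script.py", query], and pass the return value to sys.exit. The wrapper itself only handles environment variables and never prints non-ASCII text, so it should not trip over the problem it is working around.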

 

Well, that or just ignore Python 3 until they fix it. That's what a huge number of Python developers are doing, and that's why Python 3 still hasn't really caught on after 8 years. Certainly, I get the impression that a very large proportion of Py3 devs are new to Python, not Py2 developers that have switched.
