therockmandolinist Posted April 3, 2016
I'm trying to pass a character like γ into a Python script through Alfred (and also use it as a dict key within that script). I keep getting the error 'ascii' codec can't decode byte 0xce in position 0: ordinal not in range(128). Does anyone know what this is about, or how I can get Unicode character input into a Python script?
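For anyone landing here with the same message, here is a minimal sketch of where it comes from. It is written to run on Python 3, where the decode has to be explicit; on Python 2 the same failure is typically triggered implicitly when a byte string is mixed with a unicode object. The variable names are illustrative only and are not taken from the script under discussion.

```python
# encoding: utf-8
# The Greek letter gamma is one character, but two bytes in UTF-8.
raw = u"γ".encode("utf-8")
print(repr(raw))

# Decoding those bytes as ASCII reproduces the error from the question,
# because 0xce is outside the 0-127 range that ASCII covers.
try:
    raw.decode("ascii")
    message = None
except UnicodeDecodeError as err:
    message = str(err)
print(message)

# Decoding with the encoding the bytes were actually written in works.
text = raw.decode("utf-8")
print(text == u"γ")
```

The byte 0xce named in the error is the first of the two UTF-8 bytes of γ, which suggests that somewhere a UTF-8 byte string is being decoded with the default ASCII codec.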
deanishe Posted April 3, 2016
I know the topic backwards, forwards and sideways, but I'm not psychic and can't debug code I haven't seen. http://www.deanishe.net/alfred-workflow/user-manual/text-encoding.html It's very general, but that's the best you're going to get without posting the code that's causing the issue. You might even be using Python 3 for all I know, which would change all the answers compared to Python 2.
Edited April 3, 2016 by deanishe
therockmandolinist (Author) Posted April 4, 2016
Thanks for the reply. I'm using Python 2; very rough/working code is below. I'm working on a calculator/expression parser for Alfred that uses sympy's parse_expr. I'm passing in arguments with a bash script, as in 'python myscript.py "{query}"'. The dictionary 'var_dict' in the script, which holds the Unicode characters, defines custom variables for sympy, so if I pass 'γ' to Alfred (which then goes through "{query}" as described above), it would be recognized as 1.4. This works when I pass in ASCII characters, but not Unicode ones. I probably don't need docopt here, but I've found it fun to use.
    '''calculate.py [args]

    Usage:
        calculate.py <query>

    Options:
        -h'''

    from sympy.parsing.sympy_parser import (parse_expr,standard_transformations,
        convert_xor,implicit_multiplication_application,
        split_symbols_custom,_token_splittable,TokenError)
    import sympy
    from sympy import N,SympifyError
    from workflow import Workflow
    import sys
    import re

    #reload(sys)
    #sys.setdefaultencoding('UTF8')

    # sympy.cosd = lambda x : sympy.cos( sympy.mpmath.radians(x) )
    # sympy.sind = lambda x : sympy.sin( sympy.mpmath.radians(x) )
    sympy.cosd = lambda x : sympy.cos( sympy.mpmath.radians(x) )
    sympy.sind = lambda x : sympy.sin( sympy.mpmath.radians(x) )
    sympy.tand = lambda x : sympy.tan( sympy.mpmath.radians(x) )

    var_dict={u'R':287,u'gamma':1.4,u'gammae':1.3,u'γ':1.4,u'g':9.81}

    def can_split(symbol):
        if symbol not in (var_dict.keys()):
            return _token_splittable(symbol)
        return False

    transformation=split_symbols_custom(can_split)
    transformations = (standard_transformations
        +(transformation,convert_xor,implicit_multiplication_application))

    def main(wf):
        from docopt import docopt
        args = docopt(__doc__,wf.args)
        query=args.get('<query>').decode('UTF-8')
        with open('history.txt') as historyFile:
            historyList=historyFile.read().splitlines()
        history=[tuple(x.split(',')) for x in historyList[::-1]]
        if 'v:' in query:
            quer=re.compile(query.split('v:')[1],re.IGNORECASE)
            ordered=sorted(var_dict.keys(), key=lambda s: s.lower())
            for i in ordered:
                if quer.search(i) or quer.search(str(var_dict[i])):
                    wf.add_item(i,
                                unicode(var_dict[i]),
                                autocomplete=query.split('v:')[0]+i)
            wf.send_feedback()
            return 0
        elif 'h:' in query:
            quer=re.compile(query.split('h:')[1],re.IGNORECASE)
            for i in history:
                if quer.search(i[0]) or quer.search(i[1]):
                    wf.add_item(i[0],
                                i[1],
                                icon='history.png',
                                autocomplete=query.split('h:')[0]+i[0])
            wf.send_feedback()
            return 0
        try:
            parsed=parse_expr(query,local_dict=var_dict,transformations=transformations)
            result=unicode(N(parsed).round(10))
        except TypeError:
            try:
                parsed=parse_expr(query,local_dict=var_dict,transformations=transformations)
                result=unicode(N(parsed))
            except TypeError:
                result=u'...'
        except (TokenError,SyntaxError,SympifyError):
            result=u'...'
        #parsed=query
        result=result.replace('**','^')
        #parsed=(unicode(parsed).replace('**','^') if unicode(parsed)[0:2]!='0-' else unicode(parsed)[1:].replace('**','^'))
        query=(unicode(query) if unicode(query)!='0-' else unicode(query[1:]))
        wf.add_item(result,
                    query,
                    arg=result+','+query,
                    icon='rightarrow.png',
                    valid=True,
                    largetext=result)
        for i in history:
            wf.add_item(i[0],
                        i[1],
                        autocomplete=i[1],
                        icon='history.png')
        wf.send_feedback()

    if __name__==u"__main__":
        wf=Workflow()
        sys.exit(wf.run(main))

Edited April 4, 2016 by therockmandolinist
deanishe Posted April 4, 2016
Can you edit your post to indent the code properly? It's Python, so indentation matters and the code as you've posted it is invalid. Also, you're getting a line number in your traceback, so could you post it? On top of that, the input (if there is any), the expected output and the actual output. If I can't replicate the issue, it's usually almost impossible to fix.
deanishe Posted April 4, 2016
A couple of things that I can discern from the unindented code:

Calling docopt with wf.args can lead to issues (though mostly when multiple arguments are permitted, IIRC). Docopt expects the raw, encoded-string arguments, not the Unicode objects in wf.args. That shouldn't be a problem here, but you're trying to decode a Unicode object with query=args.get('<query>').decode('UTF-8'). That's incorrect if you're passing wf.args to docopt: args.get('<query>') already returns a normalised Unicode string. It's also a better idea to use wf.decode() than str.decode('utf-8') because wf.decode() will also normalise the Unicode. Fundamentally with docopt, you should call docopt first and wf.decode() on the results.

history contains encoded strings, not Unicode. Right there, anything from history is a potentially non-ASCII timebomb waiting to blow up. It should be:

    historyList = wf.decode(historyFile.read()).splitlines()

Edited April 4, 2016 by deanishe
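The "decode and normalise at the boundary" advice above can be sketched with the standard library alone. This is not Alfred-Workflow's implementation of wf.decode(); to_unicode is a hypothetical helper, and the UTF-8 encoding, the NFC normalisation form and the sample history bytes are all assumptions made for illustration.

```python
# encoding: utf-8
import unicodedata

def to_unicode(value, encoding="utf-8", form="NFC"):
    """Hypothetical helper: decode bytes if needed, then normalise."""
    if isinstance(value, bytes):
        value = value.decode(encoding)
    return unicodedata.normalize(form, value)

# The same visible text can arrive as different code point sequences.
composed = to_unicode(b"caf\xc3\xa9")      # "é" as a single code point
decomposed = to_unicode(b"cafe\xcc\x81")   # "e" followed by a combining accent
print(composed == decomposed)

# Applied to file contents (made-up sample data): decode once, right after
# reading, so every later comparison or regex search works on Unicode text.
history_bytes = b"1.4,\xce\xb3\n9.81,g\n"
history = [tuple(line.split(u","))
           for line in to_unicode(history_bytes).splitlines()]
print(history)
```

Normalising matters for dictionary keys and searches in particular: without it, two strings that look identical on screen can compare as unequal.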
deanishe Posted April 4, 2016
You can't use γ as a dictionary key in your script because you haven't specified an encoding, therefore Python treats it as ASCII. Put # encoding: utf-8 at the top of the script. Other than that, I don't see any obvious issues other than the ones mentioned above.
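A small sketch of the two pieces together: the source-encoding declaration, plus decoding the incoming argument before using it as a key. The lookup helper and the trimmed-down var_dict are illustrative assumptions, not the poster's code. On Python 3 source files default to UTF-8, so the declaration is only strictly needed on Python 2.

```python
# encoding: utf-8
# The declaration above has to be on the first or second line of the file
# (PEP 263) for Python 2 to accept non-ASCII characters in the source.

var_dict = {u"gamma": 1.4, u"γ": 1.4, u"g": 9.81}

def lookup(raw_arg):
    """Hypothetical helper: decode a command-line argument, then look it up."""
    if isinstance(raw_arg, bytes):
        name = raw_arg.decode("utf-8")
    else:
        name = raw_arg
    return var_dict.get(name)

# The UTF-8 bytes for γ and the Unicode string both find the same entry.
print(lookup(b"\xce\xb3"))
print(lookup(u"γ"))
```

The keys are written as unicode literals so that they compare equal to the decoded input; a byte-string key would not match a unicode query containing non-ASCII characters.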
therockmandolinist (Author) Posted April 5, 2016
Thanks for all the advice, it's always appreciated. Your suggestions did clear up a couple of my initial issues, but I then found that getting sympy's parse_expr function to recognize Unicode characters in Python 2 (especially as custom variable names) is not really optimal or easily possible. So for now I'm handling it with a substitution on the input to that function and a re-substitution on its output (replacing 'γ' with 'gamma' and back again). It sort of gives me more peace of mind that way anyway, no funky stuff. Thanks again for the help!
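The substitution workaround described above might look roughly like the sketch below. The poster's actual code was not shown, so ALIASES, to_ascii_names and to_symbols are hypothetical names, and the parsing step itself is left out so the example runs without sympy.

```python
# encoding: utf-8
import re

# Hypothetical alias table: Unicode symbol -> ASCII name the parser accepts.
ALIASES = {u"γ": u"gamma"}

def to_ascii_names(expr):
    """Replace Unicode symbols with ASCII names before parsing."""
    for symbol, name in ALIASES.items():
        expr = expr.replace(symbol, name)
    return expr

def to_symbols(expr):
    """Put the Unicode symbols back into the text shown to the user."""
    for symbol, name in ALIASES.items():
        # The lookarounds leave longer names such as "gammae" untouched
        # while still matching after a digit, as in "2gamma".
        pattern = r"(?<![A-Za-z_])" + re.escape(name) + r"(?![A-Za-z0-9_])"
        expr = re.sub(pattern, symbol, expr)
    return expr

query = u"2*γ + gammae"
ascii_query = to_ascii_names(query)   # this is what would go to parse_expr
print(ascii_query)
print(to_symbols(ascii_query) == query)
```

One limitation of this sketch: the reverse step cannot tell a 'gamma' the user typed in ASCII from one produced by the forward substitution, so both come back as γ.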