Sensible defaults or a clear explanation of Escaping options

deanishe · February 20, 2015

Getting the Escaping options right is notoriously difficult.

So many workflows fail to work or choke on certain input because the Escaping options are incorrect, and it's a pretty esoteric topic. It bites experienced coders as well as relative neophytes.

Even the built-in Google and Amazon Suggest example workflows have the wrong Escaping options (Backslashes should also be selected — the Google workflow chokes on, e.g., "what does \n mean?")
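To make that failure concrete, here is a minimal Python sketch (not Alfred's actual code) of what happens when {query} is pasted into a double-quoted string literal. It assumes that escaping works by prefixing a backslash to each selected character; the variable names are made up for illustration.

```python
import ast

# The raw text the user typed: a literal backslash followed by "n".
query = r"what does \n mean?"

# Only "Double Quotes" selected: the backslash passes through untouched.
quotes_only = '"' + query.replace('"', '\\"') + '"'

# "Backslashes" and "Double Quotes" selected: escape backslashes first so the
# backslashes added for the quotes are not themselves doubled.
with_backslashes = '"' + query.replace("\\", "\\\\").replace('"', '\\"') + '"'

# Evaluate both literals the way a Python script would see them.
seen_quotes_only = ast.literal_eval(quotes_only)
seen_with_backslashes = ast.literal_eval(with_backslashes)

print(seen_quotes_only == query)       # the \n was turned into a real newline
print(seen_with_backslashes == query)  # the text survives intact
```

Other languages treat backslash sequences in double-quoted strings in broadly similar ways, which is presumably why the Google example chokes.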

That said, for any given language, there is a correct set of Escaping options.

In my opinion, Alfred should automatically select the right Escaping options for the selected Language in Script Filters and Run Script actions and/or clearly document the correct Escaping options for each available language.

Edited February 20, 2015 by deanishe

Andrew · February 20, 2015

While I definitely agree with this, it's not quite as simple as a correct set of escaping options for a given language... it depends upon how people use {query} within the script (for example, they may wrap it in quotes, single or double).

In the future, I want to offer alternative methods for passing in the query argument (e.g. an environment variable or using stdin, in which case, they shouldn't have to be escaped).

Cheers,

Andrew

deanishe · February 20, 2015

While I definitely agree with this, it's not quite as simple as a correct set of escaping options for a given language... it depends upon how people use {query} within the script (for example, they may wrap it in quotes, single or double).

People do put {query} in single quotes, but that's just another thing they get wrong when it comes to {query} and escaping.

'{query}' is not a good alternative to "{query}" because Alfred doesn't have an option to escape single quotes. If you use '{query}', your workflow will choke on any input with an apostrophe and there's nothing you can do about it (except use double quotes, of course). So while single quotes work just fine in many situations, using '{query}' is a bad habit.
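A quick way to see the apostrophe problem, sketched in Python under the assumption that the query is pasted into the script verbatim (no single-quote escaping is available); the helper name parses is hypothetical.

```python
import ast

query = "what's up"  # any input containing an apostrophe


def parses(source):
    """Return True if `source` is a valid Python literal."""
    try:
        ast.literal_eval(source)
        return True
    except SyntaxError:
        return False


# '{query}' form: the apostrophe ends the string early.
single_ok = parses("'" + query + "'")

# "{query}" form: the apostrophe is just another character.
double_ok = parses('"' + query + '"')

print(single_ok, double_ok)
```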

I dare say there are other things people do with {query} (in heredocs, for example), but pre-selecting sensible defaults based on the selected language would be a very helpful step in the right direction, imo.

Using stdin/envvars sounds like a more foolproof method, however.
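For comparison, a rough sketch of how a script might pick up the query if it arrived via the environment or stdin instead of being pasted into the source. The variable name alfred_query and the helper read_query are assumptions for illustration only; nothing here reflects an existing Alfred feature.

```python
import io
import os
import sys


def read_query(environ=None, stdin=None, var="alfred_query"):
    """Return the query from an environment variable, else from stdin.

    `var` is a hypothetical variable name. No escaping is involved because
    the text never becomes part of the script's source code.
    """
    environ = os.environ if environ is None else environ
    stdin = sys.stdin if stdin is None else stdin
    if var in environ:
        return environ[var]
    return stdin.read().rstrip("\n")


# Characters that would need escaping inside "{query}" pass through unchanged.
from_env = read_query({"alfred_query": 'it\'s "quoted" \\n $HOME'}, io.StringIO(""))
from_stdin = read_query({}, io.StringIO("a `b` $c\n"))
print(from_env)
print(from_stdin)
```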

Edited February 20, 2015 by deanishe

Andrew · February 20, 2015

I dare say there are other things people do with {query} (in heredocs, for example), but pre-selecting sensible defaults based on the selected language would be a very helpful step in the right direction, imo.

I do agree with this... One other thing which could be useful is if the script has yet to be populated, provide a default assigned variable for the escaping for that language... e.g.

$q = "{query}";

I already have a ticket to look into envs and stdin, but I may escalate that too.

Cheers,

Andrew

deanishe · February 20, 2015

I do agree with this... One other thing which could be useful is if the script has yet to be populated, provide a default assigned variable for the escaping for that language... e.g.

$q = "{query}";

I think that would be a very good idea.

IMO, you probably need a good reason to use any other form than "{query}".

If you end up working on envvars, could you drop a cheeky PYTHONIOENCODING=utf-8 in there? It's essential for Python 3 to work properly within Alfred (though I don't think Py3 is coming to OS X anytime soon), but should also mitigate a few of the problems folks have with text encoding in Python 2 workflows.

Edited February 20, 2015 by deanishe

Andrew · February 20, 2015

I think that would be a very good idea.

IMO, you probably need a good reason to use any other form than "{query}".

If you end up working on envvars, could you drop a cheeky PYTHONIOENCODING=utf-8 in there? It's essential for Python 3 to work properly within Alfred (though I don't think Py3 is coming to OS X anytime soon), but should also mitigate a few of the problems folks have with text encoding in Python 2 workflows.

I've been playing with environment variables today and it works very well indeed... having said that, unfortunately, when passing very large arguments you get a posix spawn error, so I think this may be a no-go.

I'll add the PYTHONIOENCODING=utf-8 though - From your experience, can I just blanket add that without adding another selectable option and it won't break previous stuff?

Cheers,

Andrew

deanishe · February 20, 2015

It should "just work". I will run a few tests when I get home, however.

Basically in Alfred, Python 2 and 3 default to ASCII encoding for STDIN/-OUT/-ERR and ARGV, instead of UTF-8 like in a properly configured shell. So when you try to print a Unicode string (Python tries to automatically encode it for you), scripts that work fine in a shell can blow up in Alfred (or other apps). Setting PYTHONIOENCODING=utf-8 will just make Python scripts behave more like they would when run from a shell.

In Py2, decoding and encoding strings is left up to the coder, and a lot of people get it wrong because it's rather tricky. Setting PYTHONIOENCODING will make a lot of print('my unicode') calls that should arguably be print('my unicode'.encode('utf-8')) work.

In Py3, the language tries to do all the encoding/decoding for you, which is fantastic when it works, but unfortunately it falls completely to bits in an ASCII environment and it's painful to work around when Py3 has chosen the wrong encoding for you. With PYTHONIOENCODING=utf-8, Py3 should be much easier to write workflows in than Py2. Without it, it's a lot more difficult than Py2. You have to set it yourself in your workflow before calling python3 (stupidly, there's no way to change it from within a running Python program).
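The effect can be reproduced outside Alfred. The sketch below (Python 3, my own illustration rather than anything Alfred does) runs the same one-line child script twice, once with PYTHONIOENCODING forced to ascii to imitate the environment described above and once with utf-8. The run_child helper is a made-up name.

```python
import os
import subprocess
import sys

# Child script: print a string containing a non-ASCII character (e-acute).
CHILD = "print('caf\\u00e9')"


def run_child(io_encoding):
    """Run CHILD in a fresh interpreter with the given PYTHONIOENCODING."""
    env = dict(os.environ, PYTHONIOENCODING=io_encoding)
    return subprocess.run(
        [sys.executable, "-c", CHILD],
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )


ascii_run = run_child("ascii")  # imitates the ASCII-only environment
utf8_run = run_child("utf-8")   # what setting the variable buys you

print(ascii_run.returncode)  # non-zero: the child died with UnicodeEncodeError
print(utf8_run.stdout)       # the UTF-8 encoded bytes of the text
```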

Edited February 21, 2015 by deanishe

Andrew · February 20, 2015

Thanks for the explanation... I've put the environment variable in for the next release and I'll play it by ear later

deanishe · February 21, 2015

I've been trying it out, and PYTHONIOENCODING is basically a Python-specific LC_CTYPE. Setting it fixes a lot of the tricky writing to STDOUT/STDERR, where a script works in a shell, but not in Alfred. It doesn't have a detrimental effect on code that previously worked, because that was ignoring the environment encoding and treating IO as UTF-8 anyway.

But the more I think about it, I think there's a broader, underlying issue in that Alfred isn't setting LC_CTYPE to UTF-8 when it would arguably be the correct thing to do, and would fix a lot of the encoding issues that come up (or rather, make them non-issues).

I mean, Alfred communicates with workflows exclusively via UTF-8-encoded strings, but tells its subprocesses by omission that they're in an ASCII environment (POSIX default).

A lot of the encoding issues that come up with python, ruby, pbcopy & -paste etc. result from the software doing what POSIX says it should and treating IO as ASCII, when it's actually UTF-8. The fixes/workarounds generally revolve around telling whichever interpreter/tool that the environment is in fact UTF-8, after which things go swimmingly.

So, I'd argue that Alfred should set LC_CTYPE=UTF-8 in its subprocesses. The IO is UTF-8 (because Alfred uses and requires it), and LC_CTYPE/LANG is how most of the interpreters Alfred runs expect to be told what kind of environment they're running in.

I've been mulling over the impact of this on existing workflows, and it shouldn't actually break anything, should it?

I mean, any existing code that does encoding correctly is already treating IO as UTF-8. Code that doesn't may be broken or just doesn't care, but if it works fine thinking the IO is ASCII, it'll work fine thinking it's UTF-8.

Thing is, I don't think it could work as a settings toggle (except on a workflow-by-workflow level): it's no help to coders if the environment can be marked as either ASCII or UTF-8. Better to just prefix every script with export LC_CTYPE=UTF-8 in that case.

Edited February 21, 2015 by deanishe

deanishe · February 21, 2015

Regarding environmental variables for input: how big do Alfred queries get?

Andrew · February 22, 2015

Regarding environmental variables for input: how big do Alfred queries get?

Thanks for your insight and research into this... I'll add LC_CTYPE as UTF-8 to the NSTask environment

As for queries, there is no hard limit on query size in 2.6 (for 2.6.1, I've actually added artificial limits for certain aspects which helps speed things up and prevent Alfred from bloating, e.g. sending hundreds of characters into an address book api query)... so basically, if somebody pasted the source code to the universe into Alfred, it would attempt to pass that to your script as {query}.

I think I may still do the environment variable for {query} but give a caveat that if that mode is selected, queries longer than e.g. 1k chars will automatically be discarded.

Cheers,

Andrew

deanishe · February 22, 2015

Would using environmental vars/STDIN for {query} be a workflow-specific setting, then, not 3 always-on alternatives?

Andrew · February 22, 2015

Would using environmental vars/STDIN for {query} be a workflow-specific setting, then, not 3 always-on alternatives?

It would be an option per script, strictly for memory reasons. The argument is passed into every currently matching workflow (system and user) for every character typed. Alfred would need to allocate 3 times the memory for setting the environment var, string replacing the {query} and passing into STDIN and if somebody were to paste in a huge chunk of text, this could make Alfred appear bloated until OS X purged inactive memory - this doesn't fit with my ethos for being as lightweight as possible at all times.

It would also allow highlighting the nuances between the different options e.g. escaping needed for {query} replacement, and [possible] ~~concat~~ query trimming or failure with env var.

deanishe · February 23, 2015

Duh. I forgot Alfred often calls multiple workflows at once. Using an envvar does sound like a cleaner way.

What do you mean by "[possible] concat"? Would Alfred append the query to any existing value rather than replacing it?

Andrew · February 23, 2015

Duh. I forgot Alfred often calls multiple workflows at once. Using an envvar does sound like a cleaner way.

What do you mean by "[possible] concat"? Would Alfred append the query to any existing value rather than replacing it?

Sorry, I meant trim, not concat... as there seems to be a forced size limit for environment variables through shell or NSTask where I'm getting the posix error on launch. The only way around this would be to either trim massive inputs or to just not run the NSTask when the input query is huge.

By having the user select the different mode for the query argument, the help text would update... or the escaping tickboxes would be hidden etc, just making things a bit more clear really.

deanishe · February 23, 2015

Sorry, I meant trim, not concat... as there seems to be a forced size limit for environment variables through shell or NSTask where I'm getting the posix error on launch. The only way around this would be to either trim massive inputs or to just not run the NSTask when the input query is huge.

AFAIK, 256 KB is the maximum total allowable size for all environmental variables combined on OS X.

My gut feeling is it would be better to not run the script and to log an error instead ("query too large"). Silently truncating the input sounds like a bad idea: presumably that would often lead to workflows silently producing incorrect output without informing the user that the output is probably not what he/she is expecting.
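Roughly what I have in mind, as a Python sketch. The limit, the variable name alfred_query and the names QueryTooLarge and env_for_query are all placeholders I made up, not anything Alfred defines.

```python
MAX_QUERY_BYTES = 1024  # hypothetical limit, picked only for illustration


class QueryTooLarge(ValueError):
    """Raised instead of silently truncating an oversized query."""


def env_for_query(query, base_env, limit=MAX_QUERY_BYTES):
    """Return a copy of `base_env` carrying the query, or refuse outright."""
    size = len(query.encode("utf-8"))
    if size > limit:
        # The caller would log this and skip running the script.
        raise QueryTooLarge("query too large: %d bytes (limit %d)" % (size, limit))
    env = dict(base_env)
    env["alfred_query"] = query  # hypothetical variable name
    return env


small = env_for_query("hello", {"PATH": "/usr/bin"})

try:
    env_for_query("x" * 5000, {})
    refused = False
except QueryTooLarge as err:
    refused = True
    message = str(err)
```

The point of raising rather than trimming is that the script never runs on input it was not given in full.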

By having the user select the different mode for the query argument, the help text would update... or the escaping tickboxes would be hidden etc, just making things a bit more clear really.

For my money, the way you described above is a great solution: if the Script box is empty (or only contains one of Alfred's "suggestions"), Alfred should pre-populate it with something appropriate to the language, e.g. $q = "{query}"; or $q = $_SERVER['alfred_query']; for PHP or for Python, q = "{query}" or

import os
q = os.environ['alfred_query']

rice.shawn · February 23, 2015

If you do pass it as an environmental variable, please keep {query} itself so that it doesn't break every workflow in existence. That would be a nightmare.

I do think that language-contextual sensible defaults for escaping would be a good idea because it would remove a lot of bugs that people run into. Granted, we can still check/uncheck more boxes so it doesn't break anything. Often the bugs aren't caught until they're released into the wild because that's when people start using characters that you didn't expect to escape.

I'd be a fan of $query rather than $q because it matches the current syntax well, but it is also more informative for newer developers.

If you do start adding in variables to the scripts, maybe you could do a few other things that would help remove bugs. For instance, PHP throws a notice when you try to use date but haven't set a default timezone in your php.ini (which most users haven't created). So I always use:

// Set date/time to avoid warnings/errors.
if ( ! ini_get( 'date.timezone' ) ) {
	ini_set( 'date.timezone', exec( 'tz=`ls -l /etc/localtime` && echo ${tz#*/zoneinfo/}' ) );
}

to set the timezone from the system setting and avoid those notices. Also, I often need to include:

// This is needed because Macs don't read EOLs well.
if ( ! ini_get( 'auto_detect_line_endings' ) ) {
	ini_set( 'auto_detect_line_endings', true );
}

because problems occur when reading files.

Andrew · February 23, 2015

If you do pass it as an environmental variable, please keep {query} itself so that it doesn't break every workflow in existence. That would be a nightmare.

It's definitely going to be an option, and will default to {query} for current workflows.

I'd like to keep extra environment vars to a bare minimum... as I understand it, the Python ones are essential to stop things from breaking in Py3.

Andrew · February 26, 2015

So these are the defaults I have, could you correct me if any are inaccurate, thanks!

[UPDATED]

bash, zsh:

Backquotes, Double Quotes, Backslashes, Dollars

query="{query}"

php:

Double Quotes, Backslashes, Dollars

$query = "{query}";

ruby:

Backquotes, Double Quotes, Backslashes, Dollars

query = "{query}"

python:

Double Quotes, Backslashes

query = "{query}"

perl:

Backquotes, Double Quotes, Backslashes, Dollars

$query = "{query}";

osascript:

Double Quotes, Backslashes

set theQuery to "{query}"
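The list above can be read as a lookup table. A rough Python sketch, assuming escaping means prefixing each selected character with a backslash; DEFAULT_ESCAPES and fill_template are hypothetical names, and the character sets simply transcribe the list as posted.

```python
BACKQUOTE, DOUBLE_QUOTE, BACKSLASH, DOLLAR = "`", '"', "\\", "$"

# Per-language defaults, transcribed from the list as posted.
DEFAULT_ESCAPES = {
    "bash": {BACKQUOTE, DOUBLE_QUOTE, BACKSLASH, DOLLAR},
    "zsh": {BACKQUOTE, DOUBLE_QUOTE, BACKSLASH, DOLLAR},
    "php": {DOUBLE_QUOTE, BACKSLASH, DOLLAR},
    "ruby": {BACKQUOTE, DOUBLE_QUOTE, BACKSLASH, DOLLAR},
    "python": {DOUBLE_QUOTE, BACKSLASH},
    "perl": {BACKQUOTE, DOUBLE_QUOTE, BACKSLASH, DOLLAR},
    "osascript": {DOUBLE_QUOTE, BACKSLASH},
}


def fill_template(language, template, query):
    """Escape `query` for `language` and substitute it for {query}."""
    to_escape = DEFAULT_ESCAPES[language]
    # Single pass over the input, so added backslashes are never re-escaped.
    escaped = "".join("\\" + ch if ch in to_escape else ch for ch in query)
    return template.replace("{query}", escaped)


# Round trip for Python: the generated assignment must yield the raw input.
original = 'say "hi" \\n $HOME `ls`'
script = fill_template("python", 'query = "{query}"', original)
namespace = {}
exec(script, namespace)
print(namespace["query"] == original)
```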

Cheers,

Andrew

rice.shawn · February 26, 2015

To be extra paranoid, you might consider adding backquotes (backticks?) to PHP. They can be used to execute shell commands.

Although, I'm not perfectly sure:

php -r "echo `ls`;"

gives PHP Parse error: parse error, expecting `','' or `';'' in Command line code on line 2

php -r 'echo `ls`;'

properly executes the ls command.

Maybe pound signs (hash?):

php -r 'echo "hello"; # echo "hello"'

prints 'hello' only once rather than twice.

Reversing the double and single quotes has the same behavior.

You can also run shell commands in Ruby with backquotes.

Andrew · February 26, 2015

To be extra paranoid, you might consider adding backquotes (backticks?) to PHP. They can be used to execute shell commands.

You can also run shell commands in Ruby with backquotes.

Hey Shawn,

Thanks! Alfred's defaults assume wrapped in " marks, so if I escape backquote, you just see \` in the output... I've added backquotes in ruby and perl now though which doesn't affect the output.

I'll update above!

Cheers,

Andrew

deanishe · February 26, 2015

PHP doesn't execute backticks within double quotes, so it's not necessary to escape backticks.

@Andrew: Python:

Double Quotes, Backslashes

query = "{query}"

(Spaces around the = sign.)

Andrew · February 26, 2015

PHP doesn't execute backticks within double quotes, so it's not necessary to escape backticks.

@Andrew: Python:

Double Quotes, Backslashes
query = "{query}"
(Spaces around the = sign.)

Thanks Dean!

Because we are wrapping in double quotes, can a space be put around the = sign for all of the language types without adding whitespace (to make things look more consistent)?

Cheers,

Andrew

rice.shawn · February 26, 2015

Good to know. I didn't test the cases thoroughly. As a rule, now, I avoid backquotes to execute commands in every programming language in favor of the more explicit `exec` (!) commands for clarity, etc, so I don't know their behavior terribly well.

deanishe · February 26, 2015

To be extra paranoid, you might consider adding backquotes (backticks?) to PHP. They can be used to execute shell commands.

Although, I'm not perfectly sure:
php -r "echo `ls`;"
gives PHP Parse error: parse error, expecting `','' or `';'' in Command line code on line 2

Bash is executing the `ls` before it reaches PHP, which is why this works:

php -r 'echo `ls`;'
properly executes the ls command.

Maybe pound signs (hash?):
php -r 'echo "hello"; # echo "hello"'
prints 'hello' only once rather than twice.

Reversing the double and single quotes has the same behavior.

You can also run shell commands in Ruby with backquotes.

You're confusing embedding {query} in a shell command that runs PHP/Ruby with embedding it directly within PHP/Ruby code (i.e. with Language set to /usr/bin/ruby etc.)

None of Python, PHP or Ruby will execute backticks enclosed within double quotes, so none of them needs Backquotes escaping.
