kballard

Workflow arguments are always decomposed

13 posts in this topic

It looks like Alfred is automatically converting workflow arguments into decomposed form. I swear it didn't use to do this, but I can't be certain.

 

I've created a workflow you can use to test this. The workflow is invoked with the "char" keyword and shows the Unicode code points for the workflow argument. If I paste in a precomposed character, the workflow shows me the info for the decomposed form. I've verified by running the workflow binary in the Terminal that the workflow does properly handle precomposed characters, so it must be Alfred decomposing it.

 

To test, install the workflow and type "char 각". It should return U+AC01 HANGUL SYLLABLE GAG but instead it returns U+1100 HANGUL CHOSEONG KIYEOK, U+1161 HANGUL JUNGSEONG A, U+11A8 HANGUL JONGSEONG KIYEOK.
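
The difference can be reproduced outside Alfred. The following is a minimal sketch in Python, not the workflow's own code; the helper name `describe` is made up for illustration. It uses the standard `unicodedata` module to list the code points of the precomposed syllable and of its NFD form.

```python
import unicodedata

def describe(text):
    """Return a 'U+XXXX NAME' string for each code point in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in text]

composed = "\uAC01"  # 각 as a single precomposed code point
decomposed = unicodedata.normalize("NFD", composed)

print(describe(composed))    # one entry
print(describe(decomposed))  # three entries, one per jamo
```

The first list corresponds to the expected result above and the second to what the workflow actually receives.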

2 hours ago, kballard said:

It looks like Alfred is automatically converting workflow arguments into decomposed form

 

It probably isn't Alfred. In this case, it's likely NSTask, but OS X fundamentally prefers NFD (which all HFS+ filenames are normalised to).

 

2 hours ago, kballard said:

It should return

 

Why? I know naff-all about Asian alphabets, but I'm having trouble understanding why OSX should return the composed Unicode form, not the decomposed one.

Edited by deanishe

12 minutes ago, deanishe said:

 

It probably isn't Alfred. In this case, it's likely NSTask, but OS X fundamentally prefers NFD (which all HFS+ filenames are normalised to).

 

What do HFS+ filenames have to do with passing arguments to the command line?

 

It does appear, though, that NSTask does convert arguments to NFD, though I have no idea why that would be. I'm also not sure why that's particularly relevant here; I'm passing the input using {query}, not as arguments, so presumably Alfred is dynamically constructing a script that embeds my input and then running that script, which means NSTask doesn't ever see the input directly (and therefore cannot convert it to NFD).

 

Edit: Or is Alfred evaluating the script by passing it to /bin/bash -c, and therefore the whole script is handled as an argument?

 

Quote

Why? I know naff-all about Asian alphabets, but I'm having trouble understanding why OSX should return the composed Unicode form, not the decomposed one.

 

It should return whatever input I give it. It shouldn't be converting my input into either NFD or NFC form, just using it as-is. I'll grant you in most cases it won't really matter, but in some cases, like my workflow here, the difference is very important.

Edited by kballard

23 minutes ago, kballard said:

What do HFS+ filenames have to do with passing arguments to the command line?

 

It's generally symptomatic of OS X's preference for NFD-normalised UTF-8 text.

 

24 minutes ago, kballard said:

It does appear, though, that NSTask does convert arguments to NFD

 

See what I mean?

 

24 minutes ago, kballard said:

though I have no idea why that would be

 

Because OS X prefers NFD.

 

24 minutes ago, kballard said:

which means NSTask doesn't ever see the input directly

 

How else would such a command be run? AFAIK, it all goes through NSTask.

 

25 minutes ago, kballard said:

Or is Alfred evaluating the script by passing it to /bin/bash -c

 

It still goes via NSTask.

 

26 minutes ago, kballard said:

It should return whatever input I give it. It shouldn't be converting my input into either NFD or NFC form, just using it as-is

 

Nah. Normalisation is a sensible default. I wish more platforms did it.

 

99% of the time, you're interested in the characters that make up a text, not the codepoints/bytes that represent them.

 

I mean, who the hell else but a programmer cares whether "ü" is represented as "ü" or as "u+¨"? OTOH, everyone expects the two to match.

 

Users couldn't care less about normalisation or bytes.

7 minutes ago, deanishe said:

It's generally symptomatic of OS X's preference for NFD-normalised UTF-8 text.

 

In most cases OS X does not normalize your text either way.

 

7 minutes ago, deanishe said:

Because OS X prefers NFD.

 

HFS+ prefers NFD. NSTask here is the only other case I can think of where it's forcing your text to NFD, and even that was a complete surprise to me. The only justification I can think of for why is if it's using -[NSString fileSystemRepresentation] to create the C strings that it passes to the underlying POSIX APIs, and the only real reason to do that is to handle the weird edge cases with programs that accept input and then do byte-wise comparisons against the filesystem (as opposed to passing the string to the filesystem APIs and letting them do the comparison).

 

But in general, OS X doesn't care if you're using composed or decomposed strings.

 

7 minutes ago, deanishe said:

How else would such a command be run? AFAIK, it all goes through NSTask.

 

If you write a script to a file, and then invoke that script via NSTask, the NSTask APIs never actually see the input string and therefore won't have a chance to decompose it.

 

7 minutes ago, deanishe said:

Nah. Normalisation is a sensible default. I wish more platforms did it.

 

Why? There's no need for normalization in most cases. There's certainly no benefit to normalizing the arguments passed to Alfred workflows.

 

7 minutes ago, deanishe said:

99% of the time, you're interested in the characters that make up a text, not the codepoints/bytes that represent them.

 

And so whether it's composed or decomposed doesn't matter. That's not an argument for decomposing strings. That's just an argument for using Unicode canonical equivalence when comparing strings.
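
A minimal sketch of that idea in Python (the function name `canonically_equal` is hypothetical): normalise only inside the comparison, and leave the strings themselves untouched.

```python
import unicodedata

def canonically_equal(a, b):
    """Compare two strings under Unicode canonical equivalence."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)

precomposed = "\u00FC"  # ü as one code point
combining = "u\u0308"   # u followed by COMBINING DIAERESIS

print(precomposed == combining)                   # raw code point comparison
print(canonically_equal(precomposed, combining))  # comparison after normalising both sides
```

Both inputs keep their original form; only the temporary copies used for the comparison are normalised.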


Back when this first cropped up, I did quite a bit of research in this area, and the behaviour that Alfred has settled on is the outcome of that. Alfred keeps normalisation consistent between {query} mode (whole script run as an argument) and argv mode (script run as a file with arguments passed) when running a script, which is why you're seeing NSTask's default decomposition behaviour.

 

At that time, I also wrote a small utility to help normalise strings:

https://dl.dropboxusercontent.com/u/6749767/Alfred/normalise.zip
 
If you include this in your workflow itself, you should be able to run it directly like this:
usage: ./normalise -form NFC й
(You can add -verbose after NFC to see what is happening, or no arguments to see the options.)
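
If bundling a binary isn't convenient, a script can do the same kind of conversion itself. This is a minimal sketch in Python, assuming the query arrives as the first command-line argument; it uses the standard `unicodedata` module and is not the source of the normalise utility. The helper name `to_form` is made up for illustration.

```python
import sys
import unicodedata

def to_form(text, form="NFC"):
    """Return text converted to the given normalisation form (NFC, NFD, NFKC or NFKD)."""
    return unicodedata.normalize(form, text)

if __name__ == "__main__":
    # Assumption: the workflow passes the query as argv[1].
    query = sys.argv[1] if len(sys.argv) > 1 else ""
    print(to_form(query, "NFC"))
```

Note that this produces a chosen form; it cannot recover whichever form the user originally typed.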

 

If you're not happy with what's happening above, you could try using a write file output object to write the script to a location complete with your {query} objects, and then use a run script to run that specific script. That would customise the behaviour, but you'd need to escape the {query} yourself.

 

Cheers,

Andrew

 

[Moving to Workflow Help as this behaviour in Alfred is expected]


What's the point of deliberately normalizing, though? In nearly all cases it won't matter, it just screws with cases like my workflow where I explicitly care about the difference.


Also, I have no idea what you're suggesting with Write Text File. My workflow is a Script Filter workflow. There doesn't appear to be any way I can possibly get the input argument passed to my workflow without normalization. Alfred's behavior here is completely ******* with my workflow for no good reason and I don't see a workaround.

7 hours ago, kballard said:

In nearly all cases it won't matter, it just screws with cases like my workflow where I explicitly care about the difference.

 

Because it does matter sometimes. Comparing a user's query to filenames is far more common than a workflow that needs unnormalised input.

 

 

 

7 hours ago, kballard said:

Also, I have no idea what you're suggesting with Write Text File. My workflow is a Script Filter workflow. There doesn't appear to be any way I can possibly get the input argument passed to my workflow without normalization.

 

He means as a way to pass input around so Alfred doesn't normalise it.

 

For a Script Filter, I think the clipboard (i.e. pbpaste) is the only way to get your input past Alfred's filters.

13 hours ago, kballard said:

What's the point of deliberately normalizing, though? In nearly all cases it won't matter, it just screws with cases like my workflow where I explicitly care about the difference.

 

13 hours ago, kballard said:

Also, I have no idea what you're suggesting with Write Text File. My workflow is a Script Filter workflow. There doesn't appear to be any way I can possibly get the input argument passed to my workflow without normalization. Alfred's behavior here is completely ******* with my workflow for no good reason and I don't see a workaround.

 

The point is, when using NSTask, you literally have no choice: macOS automatically normalises. When this first cropped up, I added options to force normalisation to allow the user to pick the normalisation type, but NSTask re-normalised and undid any changes. Any arguments passed to the script are normalised in the same way by macOS, and this is how Alfred / macOS has always worked.

 

Alfred works consciously within these constraints to give consistent behaviour across all the different modes of running scripts in Alfred.

 

Did you try using the normalise tool I provided? This should be able to set the normalisation to the type you need or require after being passed to your script.

 

Also, swearing is NOT tolerated on this forum; I have edited your post to remove it.

The community guidelines are here: https://www.alfredforum.com/guidelines/

 

Andrew

On 3/18/2017 at 3:31 AM, Andrew said:

The point is, when using NSTask, you literally have no choice: macOS automatically normalises. When this first cropped up, I added options to force normalisation to allow the user to pick the normalisation type, but NSTask re-normalised and undid any changes. Any arguments passed to the script are normalised in the same way by macOS, and this is how Alfred / macOS has always worked.

 

Alfred works consciously within these constraints to give consistent behaviour across all the different modes of running scripts in Alfred.

 

Your "consistent behavior" is my "broken behavior". You're telling me you're doing extra work to ensure there's no way for me to get un-normalized text, and that's extremely annoying.

 

Quote

Did you try using the normalise tool I provided? This should be able to set the normalisation to the type you need or require after being passed to your script.

 

I feel like you don't actually understand my problem. I don't want normalized text. If I needed a particular normalization, I'd do it. But I want to pass the input exactly as provided to my script, because my script behaves differently when given composed vs decomposed characters, and that difference in behavior is very important. If I try to pass it a composed character, it should be given that composed character. And if I try to pass it decomposed characters, it should be given decomposed characters.

 

I have a suggestion for an alternative workaround here. What if you added a third option for input, to pass it in via stdin (instead of argv or {query})? Then you could simply not normalize the stdin approach, because it's far less likely to be used for filenames than it is to be used for arbitrary text. And NSTask won't normalize for you here.
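
To illustrate the stdin idea, here is a minimal sketch using Python's `subprocess` module rather than NSTask, so it shows only that a pipe carries the bytes through unchanged and says nothing about what Alfred does internally. The helper name `codepoints_via_stdin` and the inline child script are made up for the example.

```python
import subprocess
import sys

# Child script: read raw bytes from stdin and report each code point.
CHILD = (
    "import sys\n"
    "text = sys.stdin.buffer.read().decode('utf-8')\n"
    "print(' '.join(f'U+{ord(ch):04X}' for ch in text))\n"
)

def codepoints_via_stdin(text):
    """Pipe text to a child process over stdin and return what it saw."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        input=text.encode("utf-8"),
        capture_output=True,
        check=True,
    )
    return result.stdout.decode("utf-8").strip()

print(codepoints_via_stdin("\uAC01"))              # precomposed input
print(codepoints_via_stdin("\u1100\u1161\u11A8"))  # decomposed input
```

Each call reports the code points exactly as they were sent, composed or decomposed.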

 

As an aside, I just tested and it appears that current script actions are executed without closing off stdin. I made a script that ran `cat` as part of its processing, and the script never completed. I would have expected Alfred to run scripts with stdin either closed directly or connected to /dev/null, so that way anything that tries to read from stdin won't hang forever.
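
The expected behaviour can be sketched with Python's `subprocess` module (again an illustration, not how Alfred launches scripts): with stdin attached to the null device, `cat` sees end-of-file immediately and exits instead of hanging.

```python
import subprocess

# Run `cat` with stdin connected to the null device so it reads EOF at once
# instead of waiting on an inherited, never-closed stdin.
result = subprocess.run(
    ["cat"],
    stdin=subprocess.DEVNULL,
    capture_output=True,
    timeout=5,  # fail loudly rather than hang if stdin were left open
)

print(result.returncode, result.stdout)
```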

Edited by kballard


Has something changed?

 

You had exactly this same conversation 4 years ago:

 

TL;DR: NSTask normalises strings. 


@kballard I am not doing extra work to normalise strings. I am using NSTask, and this is the documented behaviour of NSTask and Alfred, and works consistently across every way of running a script with Alfred.

 

I do understand your problem. The workflow requirement @deanishe linked to is one corner case which is currently not possible with Alfred's use of NSTask and the Script Filter. One possible workaround could be to use a keyword -> write file -> run script -> large type workflow instead, which would allow you to write the typed argument to a file and then load that in the script to give the output to large type.
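
The script side of that workaround might look like the following minimal sketch in Python. It assumes the write file step stores the typed text as UTF-8 without altering it, which is not confirmed here; the file name `query.txt` and the helper `read_query` are hypothetical, and a temporary directory stands in for wherever the workflow would write the file.

```python
import tempfile
from pathlib import Path

def read_query(path):
    """Read the saved query exactly as written, with no normalisation applied."""
    return Path(path).read_text(encoding="utf-8")

# Stand-in for the file a write file step would produce (hypothetical location).
with tempfile.TemporaryDirectory() as tmp:
    query_file = Path(tmp) / "query.txt"
    query_file.write_text("\uAC01", encoding="utf-8")
    query = read_query(query_file)
    print([f"U+{ord(ch):04X}" for ch in query])
```

Because the text travels through a file rather than through the argument list, the script reads back whatever code points were written.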

 

It's possible that Alfred will offer stdin in the future, but this hasn't been required or requested often, so it's not high priority. I've attached this thread to the ticket.

 

Andrew
