possible php issue


chadv

UPDATE: I found a workaround for my issue. The problem was Unicode normalization. See the last post.

 

I'm getting unexpected results while running a php script in alfred. This is a reduced version of my script:

echo "<items><item><title>".htmlentities(htmlentities('&é'))."</title></item></items>\n";

When I run this from command line php, I get the expected result, with the é encoded into &eacute;

<items><item><title>&amp;&eacute;</title></item></items>

When I run this in an alfred workflow, the é is not encoded. Here's the debug console output:

[INFO: alfred.workflow.input.scriptfilter] <items><item><title>&amp;é</title></item></items>

If I replace the htmlentities() functions with &eacute;, it outputs correctly, so I don't think there's a problem with the console.

 

I'm running Alfred v2.3 (264) on OS X 10.9.3. I have not customized the php setup on my system.

 

I appreciate any help in getting to the bottom of this. Thanks.

Edited by chadv
Link to comment
Share on other sites

I found two apparent issues (PHP is not one of them):

  1. You are calling htmlentities twice on the same piece of text. The first run produces "&amp;&eacute;", which is valid. The second run re-encodes the ampersands and corrupts the output.
  2. Your XML is missing the <?xml version="1.0" encoding="UTF-8"?> header. I'm not sure if that's part of the issue.

Hope this helps!

Link to comment
Share on other sites

Hi Tyler, Thanks for taking the time to look at this.

 

I know about the double htmlentities(). That's intentional. When the debug console output says &amp;&eacute;, the Alfred prompt displays the text the way I want.

 

I've tried adding the <?xml version="1.0" encoding="UTF-8"?> header, but it makes no difference. I've also tried an encoding value of ISO-8859-1, and that does not help either.

 

I'm at a loss for why this é isn't being encoded.

 

My initial suspicion was that it's related to the context that Alfred runs php in, since PHP's encoding functions change their behavior based on the system locale. However, based on the following tests, I'm not so sure:

  • When I compare the output of phpinfo(), the command line context has the LANG var set to en_US.UTF-8, while in the alfred context, the LANG is not set.
  • I tried adding $_ENV["LANG"] = "en_US.UTF-8" or putenv("LANG=en_US.UTF-8") or setlocale(LC_ALL, "en_US.UTF-8") to my script, but none of them make any difference.
  • I followed the advice in this thread, to set the system LANG and LC_ALL vars using launchd. Alfred does pick up these changes, and they are reflected in the phpinfo() output, but the é remains unencoded.

One more interesting difference is with this function: iconv('UTF-8', 'ASCII//TRANSLIT', 'é'). On the command line, it outputs 'e, but in alfred, it outputs an empty string.
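
The environment comparison from the first bullet can also be made without phpinfo(), by running the same few lines once in a terminal and once as the workflow's script. This is only a minimal sketch, assuming a POSIX shell; `locale_env` is a hypothetical helper name, not something Alfred provides.

```shell
#!/bin/sh
# locale_env is a hypothetical helper: list the locale-related
# variables that this process inherited from its parent.
locale_env() {
    # env prints every exported variable; keep only LANG and LC_* entries.
    env | grep -E '^(LANG|LC_[A-Z_]+)=' || echo 'no locale variables set'
}

locale_env
```

If the two runs print different lists, the script is seeing a different environment in each context, which matches the LANG difference described above.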

Edited by chadv
Link to comment
Share on other sites

I did more testing.

 

I switched to using an external php file. When the é is written in the external file, it encodes as expected. When the é is passed in from Alfred, it does not get encoded.

 

I no longer think this is specific to php. My current hunch is that something is happening to the é in Alfred's string handling, and php is receiving a non-standard é.

 

Here is the setup I used to test, and the output I received.

 

bash script, written directly in Alfred:

LANG=en_US.UTF-8 /usr/bin/php x.php "alfred: é";

x.php, external file:

<?php
$query = $argv[1].', file: é';
$encode = htmlentities(htmlentities($query));
echo "<items><item><title>$encode</title></item></items>\n";
?>

outputs:

[INFO: alfred.workflow.input.scriptfilter] <items><item><title>alfred: é, file: &eacute;</title></item></items>
Link to comment
Share on other sites

I figured it out.

 

The issue is unicode normalization. Alfred is converting unicode input to decomposed characters.

 

In my test case Alfred is converting U+00E9 LATIN SMALL LETTER E WITH ACUTE into U+0065 LATIN SMALL LETTER E and U+0301 COMBINING ACUTE ACCENT.
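
The two forms look identical on screen, so the difference only shows up at the byte level. Here is a minimal sketch of how to see it, assuming a POSIX shell with `od` available; `hex_bytes` is a hypothetical helper name introduced just for this illustration.

```shell
#!/bin/sh
# hex_bytes is a hypothetical helper: print the bytes of its argument
# as space-separated hex.
hex_bytes() {
    # -v stops od from abbreviating repeated lines; leaving $(...) unquoted
    # lets the shell collapse od's padding and line breaks into single spaces.
    echo $(printf '%s' "$1" | od -An -v -tx1)
}

# Octal escapes keep the example independent of how this file is normalized.
precomposed=$(printf '\303\251')   # U+00E9
decomposed=$(printf 'e\314\201')   # U+0065 followed by U+0301

hex_bytes "$precomposed"   # c3 a9
hex_bytes "$decomposed"    # 65 cc 81
```

With PHP's default HTML 4.01 entity table, htmlentities() has an entity for U+00E9 but none for a plain e or for U+0301, which would explain why the decomposed form passes through unencoded.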

 
My solution was to renormalize to precomposed characters using the tool that @Andrew provided in this thread:
php encode.php "$(./normalise -form NFC "{query}")"
UPDATE: I had posted a different solution, using the iconv command, but it did not handle emoji. This latest solution handles everything I've tried so far.
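
For a workflow that may run on machines without the normalise tool, the two approaches mentioned in this post could be combined into one wrapper. This is only a sketch under assumptions: `to_nfc` is a hypothetical function name, the normalise binary is assumed to sit in the workflow folder as in the command above, and the `utf-8-mac` encoding is only known to some iconv builds (such as the one shipped with OS X). As noted above, the iconv route did not handle emoji in my testing.

```shell
#!/bin/sh
# to_nfc is a hypothetical wrapper, not part of Alfred or of the normalise tool.
to_nfc() {
    if [ -x ./normalise ]; then
        # Same invocation as the command in this post.
        ./normalise -form NFC "$1"
    elif out=$(printf '%s' "$1" | iconv -f utf-8-mac -t utf-8 2>/dev/null); then
        # Fallback: iconv builds that know utf-8-mac recompose on conversion.
        printf '%s\n' "$out"
    else
        # No normalizer available: pass the text through unchanged.
        printf '%s\n' "$1"
    fi
}
```

In the workflow's bash script this could then be called as `php encode.php "$(to_nfc "{query}")"`, mirroring the command above.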
Edited by chadv
Link to comment
Share on other sites

I figured it out.

 

The issue is unicode normalization. Alfred is converting unicode input to decomposed characters.

 

In my test case Alfred is converting U+00E9 LATIN SMALL LETTER E WITH ACUTE into U+0065 LATIN SMALL LETTER E and U+0301 COMBINING ACUTE ACCENT.

 
My solution was to renormalize to precomposed characters using the Unix iconv command. This is what my bash script looks like (escaping enabled for backquotes, double quotes, dollars, and backslashes):
php encode.php "$(echo "{query}" | iconv -f utf-8-mac -t utf-8)"

Maybe it would be better for Alfred to normalize to precomposed characters or not do any normalizing at all. This would help Alfred's script output better match raw terminal output.

 

This is indeed to do with normalisation, and a consequence of using NSTask. Alfred doesn't have control over the normalisation so I created a little command line app for users to re-encode, or you can use iconv. More details (and the command line tool) in this topic:

 

http://www.alfredforum.com/topic/2015-encoding-issue/

 

Cheers,

Andrew

 

[moving to closed]
