Alfred PDF Tools – Optimize, encrypt and manipulate PDF files

2pilvic · April 11, 2022

27 minutes ago, xilopaint said:

Thanks guys. All you guys can show your love by giving a star on the workflow's GitHub repo.

Btw I would appreciate if anyone with an ARM Mac could tell me if this issue is still present in v3.

It works great, even "optimize". Thank you very much for your wonderful workflow (iMac, M1, MacOS 12.2.1.)

xilopaint · April 11, 2022

2 hours ago, 2pilvic said:

It works great, even "optimize". Thank you very much for your wonderful workflow (iMac, M1, MacOS 12.2.1.)

Do you remember having any issues with Optimize in v2 and your ARM Mac?

Edited April 11, 2022 by xilopaint

2pilvic · April 12, 2022

23 hours ago, xilopaint said:

Do you remember having any issues with Optimize in v2 and your ARM Mac?

Sorry I can't tell you, but I haven't used the "optimize" function before. Thanks again

xilopaint · April 12, 2022

24 minutes ago, 2pilvic said:

Sorry I can't tell you, but I haven't used the "optimize" function before. Thanks again

No problem. Thank you too.

xilopaint · April 13, 2022

Update (v3.1)

• Added a Deskew file action.

Edited April 13, 2022 by xilopaint

h2ner · April 19, 2022

Thank you very much for this great workflow. Would you also consider adding repair via some other library such as mupdf? Fairly often, I have PDFs that aren't repairable with gs but mupdf works great. Having such automated would be fantastic.

famaus · April 22, 2022

Hi,

Thanks for this workflow.

I only see below file actions in Alfred and none of the one's you mentioned. Could you please advice what am I missing?

image.png.d60592b6d360204c9fc5c2006a8e68e2.png

Version details:

macOS: 12.3.1

Alfred: 4.6.4

Skimmer Workflow: 3.1

PyPDF: 4-1.27.0

Thanks!

xilopaint · April 22, 2022

2 minutes ago, famaus said:

Hi,

Thanks for this workflow.

I only see below file actions in Alfred and none of the one's you mentioned. Could you please advice what am I missing?

Version details:

macOS: 12.3.1

Alfred: 4.6.4

Skimmer Workflow: 3.1

PyPDF: 4-1.27.0

Thanks!

This is not a Skimmer workflow thread.

famaus · April 22, 2022

Sorry, my bad.. 🤯

Your workflow, Alfred PDF Tools works great .

Thanks again!

xilopaint · April 25, 2022

Update (v3.1.1)

• Fixed critical bug that allowed command injection via subprocess.Popen.

• Code refactoring.

Edited April 25, 2022 by xilopaint

xilopaint · April 25, 2022

Guys, this is an important release as it fixes a critical bug that allowed shell injection. Anyone interested in having a harmless proof of concept of the bug can rename a PDF file as $(say boom).pdf and run the Optimize file action.

Besides the critical nature of the bug, it’s extremely unlikely that it has ever been exploited since it depends on the user to run either Optimize or Deskew file actions in a PDF with a filename that causes command substitution.

The bug was fixed by validating the user input — which is the file path in the case of a file action — and preventing subprocess.Popen to run. Now, an exception is raised if the input is illegal and the user receives a macOS notification with the warning.

For better security, now the workflow is Bandit compliant.

Edited April 25, 2022 by xilopaint

xilopaint · April 25, 2022

Btw, it would be nice if some shell wizard like @vitoror someone else could try to hack the latest version of the workflow, but I think the bug is already fixed.

vitor · April 25, 2022

Are you sending your external commands to Popen as a single unescaped string? You should always shellescape that input or (better) send the arguments as an array (Popen should support that default).

Here’s as explanation of the principle, in Ruby.

xilopaint · April 25, 2022

1 hour ago, vitor said:

Are you sending your external commands to Popen as a single unescaped string?

Here you can see the start of the code block. I'm sending the command through this variable:

cmd = f'echo -y | {os.path.dirname(__file__)}/bin/k2pdfopt "{pdf_path}" -as -mode copy -n -o "%s [optimized].pdf" -x'

pdf_path is a variable that stores the input of the file action, basically Alfred {query}. As you can see it’s enclosed in double quotes.

1 hour ago, vitor said:

or (better) send the arguments as an array (Popen should support that default)

I tried this at some point, but it didn't work for some reason, so I found this in the docs and gave up:

Quote

The shell argument (which defaults to False) specifies whether to use the shell as the program to execute. If shell is True, it is recommended to pass args as a string rather than as a sequence.

In my case shell=True.

Edited April 25, 2022 by xilopaint

vitor · April 25, 2022

There’s your problem. Because you’re using the method which uses “the shell as the program to execute”, you’re open to shell injection. Change it to the array method and you solve the problem.

1 hour ago, xilopaint said:

I tried this at some point, but it didn't work for some reason

I’d bet on the the pipe tripping you up. Those tend to be harder to implement in the array method. Find a way to send STDIN to your command.

As a last resort, as mentioned, shellescape (there should be something in the standard library) the user input before feeding it into the string. But again, I recommend learning the proper method. Checking user input and warning of “illegal” input shouldn’t be the solution, especially if it’s bespoke.

Edited April 26, 2022 by vitor

xilopaint · April 26, 2022

1 hour ago, vitor said:

There’s your problem. Because you’re using the method which uses “the shell as the program to execute”, you’re open to shell injection.

Yes, but it's necessary in my case because I need to read STDOUT to give the user the possibility to track the progress of the shell process through a script filter.

1 hour ago, vitor said:

Change it to the array method and you solve the problem.

I had given up on the array method because Python docs recommend against it if shell=True (I don't understand why, btw).

1 hour ago, vitor said:

As a last resort, as mentioned, shellescape (there should be something in the standard library) the user input before feeding it into the string. But again, I recommend learning the proper method.

Could you give me an example on how a shellescaped string of a filepath would look like in this case?

1 hour ago, vitor said:

But again, I recommend learning the proper method. Checking user input and warning of “illegal” input shouldn’t be the solution, especially if it’s bespoke.

What do you mean with bespoke? Btw, I check the user input for preventing subprocess.Popen to run. Why is that not a solution if it solves the problem? Anyway I'm still open for a more elegant solution, of course.

Edited April 26, 2022 by xilopaint

vitor · April 26, 2022

21 minutes ago, xilopaint said:

but it's necessary in my case because I need to read stdout

There’s no reason the array method shouldn’t give you STDOUT.

21 minutes ago, xilopaint said:

I had given up on the array method because Python docs recommend against it if shell=True

But why do you want shell=True? You probably shouldn’t be using that.

21 minutes ago, xilopaint said:

Could you give me an example on how a shellescaped string of a filepath would look like in this case?

$ printf '%q' '/some/path here/with spaces/and (stran"ge) Charac!{r$'
/some/path\ here/with\ spaces/and\ \(stran\"ge\)\ Charac\!\{r\$

21 minutes ago, xilopaint said:

What do you mean with bespoke?

As in “if you wrote the verification code yourself instead of relying on something from the standard library”.

21 minutes ago, xilopaint said:

Why is that not a solution if it solves the problem?

It is a solution, but not the best. The proper way is to use the proper tools to provide proper arguments. If you give the arguments in the correct way you don’t have to block the user because they have a weird file name, it’ll just work.

Edited April 26, 2022 by vitor

xilopaint · April 26, 2022

16 minutes ago, vitor said:

There’s no reason the array method shouldn’t give you STDOUT.

I mean shell=True and not the array method.

16 minutes ago, vitor said:

But why do you want shell=True? You probably shouldn’t be using that.

Because I offer some kind of progress bar in real-time with a re-running script filter. It shows how many pages of the PDF file were already processed and the progress percentage. As far as I know I can’t get this in real-time with any other subprocess method than subprocess.Popen with shell=True.

Edited April 26, 2022 by xilopaint

xilopaint · April 26, 2022

Hey @vitor, ignore my last post. I investigated the issue more deeply and will get back with a new post on why I think there’s no easy alternative to using shell=True in this case.

vitor · April 26, 2022

It may be that you’re correct. I dislike Python, so I avoid writing it, so I won’t be familiar with all intricacies without looking at the docs.

But it doesn’t matter for the solution. If you do really need to invoke a shell, escape the input (note how even in that answer they recommended the array method in bold).

Edited April 26, 2022 by vitor

xilopaint · April 26, 2022

@vitor, this is my command now:

cmd = f'echo | {os.path.dirname(__file__)}/bin/k2pdfopt {shlex.quote(pdf_path)} -as -mode copy -n -o "%s [optimized].pdf" -x'

As you can see I use echo to pass a newline to the next command where the binary k2pdfopt is executed. This is necessary because k2pdfopt is a CLI that waits for a newline to proceed:

This is the reason why I need shell=True as you cannot run multiple commands without a shell. Also, I cannot use the array method.

On 4/26/2022 at 7:08 AM, vitor said:

If you do really need to invoke a shell, escape the input (note how even in that answer they recommended the array method in bold).

Thanks for the link! I was not aware about shlex.quote. I just released a new version using this function and it’s enough to prevent command injection through command substitution. Unfortunately, k2pdfopt cannot handle filepaths with double quotes yet, so I reported this to developer. Now the only blocked paths are those with double quotes. I consider this a win.

Edited May 13, 2022 by xilopaint

xilopaint · April 26, 2022

Update (v3.1.2)

• Improved fix for the shell injection bug.

vitor · April 27, 2022

4 hours ago, xilopaint said:

As you can see I use echo to pass a newline to the next command where the binary k2pdfopt is executed. This is necessary because k2pdfopt is a CLI that waits for a newline to proceed

That is, more likely than not, a “useless use of echo”. If you want to send a newline as input, then (as already mentioned) find a way to send STDIN to your command. But first make sure there really is no way to run the tool without interactivity.

4 hours ago, xilopaint said:

as you cannot run multiple commands without a shell.

You’re not running multiple commands, you’re piping from one to another. Also, from that user’s specific example they shouldn’t be using the shell either. They should be running two processes sequentially, which is what the shell would do anyway.

xilopaint · April 27, 2022

1 hour ago, vitor said:

That is, more likely than not, a “useless use of echo”. If you want to send a newline as input, then (as already mentioned) find a way to send STDIN to your command.

It’s possible to send data to STDIN via subprocess.Popen.communicate but it's not an option for me because this method waits for the child process to finish to return and I can't read from STDOUT meanwhile.

Edited April 27, 2022 by xilopaint

xilopaint · May 13, 2022

Update (v3.1.3)

• Improved fix for the shell injection bug.

Alfred PDF Tools – Optimize, encrypt and manipulate PDF files

Recommended Posts

Link to comment

Top Posters In This Topic

Popular Days

Top Posters In This Topic

Popular Days

Popular Posts

xilopaint

xilopaint

alwaysaugust

Posted Images

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Link to comment

Create an account or sign in to comment

Create an account

Sign in