Jump to content

Alfred PDF Tools – Optimize, encrypt and manipulate PDF files


Recommended Posts

27 minutes ago, xilopaint said:

Thanks guys. All you guys can show your love by giving a star on the workflow's GitHub repo.

 

Btw I would appreciate if anyone with an ARM Mac could tell me if this issue is still present in v3.

It works great, even "optimize". Thank you very much for your wonderful workflow (iMac, M1, MacOS 12.2.1.)

Link to comment

Thank you very much for this great workflow. Would you also consider adding repair via some other library such as mupdf? Fairly often, I have PDFs that aren't repairable with gs but mupdf works great. Having such automated would be fantastic.

Link to comment

Hi,

Thanks for this workflow.

 

I only see below file actions in Alfred and none of the one's you mentioned. Could you please advice what am I missing?

 

image.png.d60592b6d360204c9fc5c2006a8e68e2.png

 

Version details:

macOS: 12.3.1

Alfred: 4.6.4

Skimmer Workflow: 3.1

PyPDF: 4-1.27.0

 

Thanks!

Link to comment
2 minutes ago, famaus said:

Hi,

Thanks for this workflow.

 

I only see below file actions in Alfred and none of the one's you mentioned. Could you please advice what am I missing?

 

image.png.d60592b6d360204c9fc5c2006a8e68e2.png

 

Version details:

macOS: 12.3.1

Alfred: 4.6.4

Skimmer Workflow: 3.1

PyPDF: 4-1.27.0

 

Thanks!

 

This is not a Skimmer workflow thread.

Link to comment

Guys, this is an important release as it fixes a critical bug that allowed shell injection. Anyone interested in having a harmless proof of concept of the bug can rename a PDF file as $(say boom).pdf and run the Optimize file action.


Besides the critical nature of the bug, it’s extremely unlikely that it has ever been exploited since it depends on the user to run either Optimize or Deskew file actions in a PDF with a filename that causes command substitution.


The bug was fixed by validating the user input — which is the file path in the case of a file action — and preventing subprocess.Popen to run. Now, an exception is raised if the input is illegal and the user receives a macOS notification with the warning.

 

For better security, now the workflow is Bandit compliant.

Edited by xilopaint
Link to comment
1 hour ago, vitor said:

Are you sending your external commands to Popen as a single unescaped string?

 

Here you can see the start of the code block. I'm sending the command through this variable:

 

cmd = f'echo -y | {os.path.dirname(__file__)}/bin/k2pdfopt "{pdf_path}" -as -mode copy -n -o "%s [optimized].pdf" -x'

 

pdf_path is a variable that stores the input of the file action, basically Alfred {query}. As you can see it’s enclosed in double quotes.

 

1 hour ago, vitor said:

or (better) send the arguments as an array (Popen should support that default)

 

I tried this at some point, but it didn't work for some reason, so I found this in the docs and gave up:

 

Quote

The shell argument (which defaults to False) specifies whether to use the shell as the program to execute. If shell is True, it is recommended to pass args as a string rather than as a sequence.

 

In my case shell=True.

Edited by xilopaint
Link to comment

There’s your problem. Because you’re using the method which uses “the shell as the program to execute”, you’re open to shell injection. Change it to the array method and you solve the problem.

 

1 hour ago, xilopaint said:

I tried this at some point, but it didn't work for some reason

 

I’d bet on the the pipe tripping you up. Those tend to be harder to implement in the array method. Find a way to send STDIN to your command.

 

As a last resort, as mentioned, shellescape (there should be something in the standard library) the user input before feeding it into the string. But again, I recommend learning the proper method. Checking user input and warning of “illegal” input shouldn’t be the solution, especially if it’s bespoke.

Edited by vitor
Link to comment
1 hour ago, vitor said:

There’s your problem. Because you’re using the method which uses “the shell as the program to execute”, you’re open to shell injection.

 

Yes, but it's necessary in my case because I need to read STDOUT to give the user the possibility to track the progress of the shell process through a script filter.

 

1 hour ago, vitor said:

Change it to the array method and you solve the problem.

 

I had given up on the array method because Python docs recommend against it if shell=True (I don't understand why, btw).

 

1 hour ago, vitor said:

As a last resort, as mentioned, shellescape (there should be something in the standard library) the user input before feeding it into the string. But again, I recommend learning the proper method.

 

Could you give me an example on how a shellescaped string of a filepath would look like in this case?

 

1 hour ago, vitor said:

But again, I recommend learning the proper method. Checking user input and warning of “illegal” input shouldn’t be the solution, especially if it’s bespoke.

 

What do you mean with bespoke? Btw, I check the user input for preventing subprocess.Popen to run. Why is that not a solution if it solves the problem? Anyway I'm still open for a more elegant solution, of course.

Edited by xilopaint
Link to comment
21 minutes ago, xilopaint said:

but it's necessary in my case because I need to read stdout

 

There’s no reason the array method shouldn’t give you STDOUT.

 

21 minutes ago, xilopaint said:

I had given up on the array method because Python docs recommend against it if shell=True

 

But why do you want shell=True? You probably shouldn’t be using that.

 

21 minutes ago, xilopaint said:

Could you give me an example on how a shellescaped string of a filepath would look like in this case?

 

$ printf '%q' '/some/path here/with spaces/and (stran"ge) Charac!{r$'
/some/path\ here/with\ spaces/and\ \(stran\"ge\)\ Charac\!\{r\$

 

21 minutes ago, xilopaint said:

What do you mean with bespoke?

 

As in “if you wrote the verification code yourself instead of relying on something from the standard library”.

 

21 minutes ago, xilopaint said:

Why is that not a solution if it solves the problem?

 

It is a solution, but not the best. The proper way is to use the proper tools to provide proper arguments. If you give the arguments in the correct way you don’t have to block the user because they have a weird file name, it’ll just work.

Edited by vitor
Link to comment
16 minutes ago, vitor said:

There’s no reason the array method shouldn’t give you STDOUT.

 

I mean shell=True and not the array method.

 

16 minutes ago, vitor said:

But why do you want shell=True? You probably shouldn’t be using that.

 

Because I offer some kind of progress bar in real-time with a re-running script filter. It shows how many pages of the PDF file were already processed and the progress percentage. As far as I know I can’t get this in real-time with any other subprocess method than subprocess.Popen with shell=True.

Edited by xilopaint
Link to comment

It may be that you’re correct. I dislike Python, so I avoid writing it, so I won’t be familiar with all intricacies without looking at the docs.

 

But it doesn’t matter for the solution. If you do really need to invoke a shell, escape the input (note how even in that answer they recommended the array method in bold).

Edited by vitor
Link to comment

@vitor, this is my command now:

cmd = f'echo | {os.path.dirname(__file__)}/bin/k2pdfopt {shlex.quote(pdf_path)} -as -mode copy -n -o "%s [optimized].pdf" -x'

As you can see I use echo to pass a newline to the next command where the binary k2pdfopt is executed. This is necessary because k2pdfopt is a CLI that waits for a newline to proceed:

 

P2zhc6r.png

 

This is the reason why I need shell=True as you cannot run multiple commands without a shell. Also, I cannot use the array method.

 

On 4/26/2022 at 7:08 AM, vitor said:

If you do really need to invoke a shell, escape the input (note how even in that answer they recommended the array method in bold).

 

Thanks for the link! I was not aware about shlex.quote. I just released a new version using this function and it’s enough to prevent command injection through command substitution. Unfortunately, k2pdfopt cannot handle filepaths with double quotes yet, so I reported this to developer. Now the only blocked paths are those with double quotes. I consider this a win.

Edited by xilopaint
Link to comment
4 hours ago, xilopaint said:

As you can see I use echo to pass a newline to the next command where the binary k2pdfopt is executed. This is necessary because k2pdfopt is a CLI that waits for a newline to proceed

 

That is, more likely than not, a “useless use of echo”. If you want to send a newline as input, then (as already mentioned) find a way to send STDIN to your command. But first make sure there really is no way to run the tool without interactivity.

 

4 hours ago, xilopaint said:

 

You’re not running multiple commands, you’re piping from one to another. Also, from that user’s specific example they shouldn’t be using the shell either. They should be running two processes sequentially, which is what the shell would do anyway.

Link to comment
1 hour ago, vitor said:

That is, more likely than not, a “useless use of echo”. If you want to send a newline as input, then (as already mentioned) find a way to send STDIN to your command.

 

It’s possible to send data to STDIN via subprocess.Popen.communicate but it's not an option for me because this method waits for the child process to finish to return and I can't read from STDOUT meanwhile.

Edited by xilopaint
Link to comment
  • 3 weeks later...

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
×
×
  • Create New...