jonteamere, Posted October 27, 2016

I'm looking for a workflow that can combine multiple lines of plain text into a single line. I guess it'd be similar to a 'reformat paragraph' function. If a workflow would be difficult, are there any plain text apps that can handle that function?
deanishe, Posted October 27, 2016

Run Script, Language = /usr/bin/python, with input as argv. Script:

    from __future__ import print_function
    import re
    import sys

    print(re.sub(r'\n+', ' ', sys.argv[1]), end='')

That will replace one or more consecutive newlines with a space. Connect to whichever inputs/outputs you need.
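For anyone who wants to try the substitution outside Alfred, here is a minimal sketch of the same regex wrapped in a function. The name `join_lines` and the sample text are made up for illustration and are not part of the workflow.

```python
import re


def join_lines(text):
    """Replace each run of one or more newlines with a single space."""
    return re.sub(r'\n+', ' ', text)


# Hypothetical sample: two single line breaks and one blank line
sample = 'first line\nsecond line\n\nthird line'
print(join_lines(sample))
```

Note that this pattern only looks for `\n`, so any other kind of line ending passes through unchanged.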
jonteamere (Author), Posted October 27, 2016

Works like a charm! Thank you, Sir.
jonteamere (Author), Posted October 29, 2016

It suddenly stopped working. I don't know what's changed. @deanishe, are there any fixes or alternative ways?
deanishe, Posted October 29, 2016

How am I supposed to tell what's wrong from "Suddenly stopped working"? Provide the input, the expected output and the actual output. And anything shown in the debugger.
jonteamere (Author), Posted October 31, 2016

@deanishe I think I figured it out. It came down to \n vs. \r (which appeared to be "newlines"). Is there a way to change the code to recognize either newlines or carriage returns?

Also, at the risk of annoying you... Say I have a document like the example below. It consists of groups of text separated by an empty line. So, TEXT + \n \n + TEXT repeated. It'd be nice to highlight all content in the document and have the script search and reformat just the TEXT portions (if they contained carriage returns or newlines) while preserving the empty line between each 'paragraph' of text.

Quote:

Nonetheless, while the true burden may be underestimated, the economic burden from both a sufferer’s and societal perspective is profound. One database analysis found that direct endometriosis-related costs were considerable and appeared driven by hospitalizations. An actuarial analysis revealed that women with endometriosis incur total medical costs that are, on average, 63% higher than medical costs for the average woman in a commercially insured group. Others reported being “totally incapacitated” and even dismissed from or left their jobs due to symptoms. Most recent data indicate that the total annual burden of endometriosis-associated symptoms in the United States has reached a staggering $119 billion.

I appreciate any help that you can provide.
deanishe, Posted October 31, 2016

2 minutes ago, jonteamere said:
> @deanishe I think I figured it out. It came down to \n vs. \r (which appeared to be "newlines"). Is there a way to change the code to recognize either newlines or carriage returns?

Easy peasy:

    from __future__ import print_function
    import re
    import sys

    # Replace any combination of \n and/or \r with a single space
    print(re.sub(r'[\r\n]+', ' ', sys.argv[1]), end='')

2 minutes ago, jonteamere said:
> Say I have a document like the example below. It consists of groups of text separated by an empty line. So, TEXT + \n \n + TEXT repeated. It'd be nice to highlight all content in the document and have the script search and reformat just the TEXT portions (if they contained carriage returns or newlines) while preserving the empty line between each 'paragraph' of text. I appreciate any help that you can provide.

    from __future__ import print_function
    import sys

    # Split text into lines and strip any whitespace at line ends
    lines = [l.strip() for l in sys.argv[1].strip().splitlines()]

    # Collect lines into paragraphs
    paras = []
    buf = []
    for l in lines:
        if not l:  # empty line, i.e. new paragraph
            s = ' '.join(buf).strip()
            if s:
                paras.append(s)
            buf = []
        else:
            buf.append(l)

    # Add last paragraph if there is one
    s = ' '.join(buf).strip()
    if s:
        paras.append(s)

    print('\n\n'.join(paras), end='')
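The paragraph logic can also be sketched more compactly as a function, which makes it easier to test outside the workflow. This is an alternative take rather than deanishe's script: `reflow` is a hypothetical name, and the sketch assumes that a "blank line" means a line that is empty or contains only whitespace.

```python
import re


def reflow(text):
    """Join the lines of each paragraph, keeping one blank line between paragraphs."""
    # Normalise \r\n and bare \r to \n so all line-ending styles are treated alike
    text = text.replace('\r\n', '\n').replace('\r', '\n')
    paras = []
    # Split wherever one or more empty or whitespace-only lines occur
    for block in re.split(r'\n\s*\n', text.strip()):
        joined = ' '.join(line.strip() for line in block.splitlines())
        if joined:
            paras.append(joined)
    return '\n\n'.join(paras)


# Hypothetical sample mixing \n, \r and \r\n line endings
sample = 'one\ntwo\r\rthree\r\nfour'
print(reflow(sample))
```

Normalising the line endings first means the split only ever has to look for `\n`, at the cost of one extra pass over the text.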
jonteamere (Author), Posted October 31, 2016

@deanishe Brilliant! Just wonderful. Thanks so much.
jonteamere (Author), Posted August 10, 2017

Hey, @deanishe

So, I tried to use the above script on a rather long plaintext file (450,000+ characters) containing my notes. Selected all text → triggered workflow → got an error (attached below). Is there a character or paragraph limit? It's been working like a dream since you wrote it. Let me know if I can provide you with any other info.
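The error attachment isn't reproduced here, so what follows is only a guess: operating systems limit how much data can be passed as command-line arguments, and 450,000+ characters sent as argv may run into that limit. Below is a minimal sketch of the same paragraph logic that works on any sequence of lines, so the text could come from a file or standard input instead of argv. `reflow_lines` is a hypothetical helper, not something from the scripts above.

```python
def reflow_lines(lines):
    """Yield one joined paragraph for each group of consecutive non-blank lines."""
    buf = []
    for line in lines:
        line = line.strip()
        if line:
            buf.append(line)
        elif buf:
            # Blank line: the current paragraph is complete
            yield ' '.join(buf)
            buf = []
    # Emit the last paragraph if the text did not end with a blank line
    if buf:
        yield ' '.join(buf)


# Hypothetical sample; in practice the lines would come from a file or stdin
sample = 'one\rtwo\n\nthree\r\nfour'
print('\n\n'.join(reflow_lines(sample.splitlines())))
```

If the text really can be delivered on standard input (an assumption about how the script is invoked, not something confirmed in this thread), the last line would become something like `sys.stdout.write('\n\n'.join(reflow_lines(sys.stdin.read().splitlines())))` after an `import sys`.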