Add LLM / Artificial Intelligence API as "Action" widget for workflows


Most of the LLM APIs can be used in a very simple way:

1. Send a block of text to the API

2. Receive a block of text back

 

As a variant of this, the APIs can be made to return a data structure (usually JSON).
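
As a rough illustration of the JSON variant, here is a minimal Python sketch that checks a reply really is the JSON list that was asked for. The name `parse_json_list` is made up for this example, and the fence-stripping step assumes the model may wrap its JSON in a Markdown code fence; no particular API is being called here.

```python
import json

# Three backticks, built this way so the marker does not appear literally.
FENCE = "`" * 3


def parse_json_list(reply: str) -> list[str]:
    """Parse a model reply that is expected to be a JSON list of strings."""
    text = reply.strip()
    # Assumption: the reply may be wrapped in a Markdown code fence; strip it.
    if text.startswith(FENCE):
        lines = text.splitlines()
        if lines[-1].strip() == FENCE:
            lines = lines[1:-1]  # drop the opening and closing fence lines
        else:
            lines = lines[1:]  # only an opening fence was present
        text = "\n".join(lines)
    data = json.loads(text)  # raises json.JSONDecodeError on malformed JSON
    if not isinstance(data, list) or not all(isinstance(x, str) for x in data):
        raise ValueError("expected a JSON list of strings")
    return data
```

A widget could run a check like this before handing the list to the next step in the workflow, and fail loudly when the model returns prose instead.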

 

I'd propose that you let users register an API key for the popular APIs (e.g. Claude, ChatGPT, or whatever) somewhere in the app.

Then, once that's set up, the workflows could offer a GenAI widget that builds a prompt by merging variables into a text template. You might have a toggle to indicate whether you want to force JSON. You might also support a couple of specific JSON formats (such as a simple list, a file list, a filename, etc.).

In many ways the function of this thing would be very comparable to the "Run Script" widget.  You could actually accomplish this with the Run Script widget and some custom code but it would be fussy.
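
To make the prompt-building step concrete, here is a minimal Python sketch of merging variables into a text template with a "force JSON" toggle. Everything in it is an assumption for illustration: the name `build_prompt`, the `$variable` placeholder syntax (from Python's standard `string.Template`), and the wording of the JSON instruction. It only builds the prompt text; sending it to an API is left out.

```python
from string import Template

# Hypothetical instruction appended when the "force JSON" toggle is on.
JSON_SUFFIX = "\n\nRespond with valid JSON only, with no surrounding prose."


def build_prompt(template: str, variables: dict[str, str], force_json: bool = False) -> str:
    """Merge variables into a $placeholder template and optionally ask for JSON."""
    # substitute() raises KeyError if the template names a variable that was not supplied.
    prompt = Template(template).substitute(variables)
    if force_json:
        prompt += JSON_SUFFIX
    return prompt


prompt = build_prompt(
    "Below is a bulleted list of items. Give each a filename ending in $ext.\n\n$items",
    {"ext": ".md", "items": "- first idea\n- second idea"},
    force_json=True,
)
print(prompt)
```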


There is an insane number of configuration options, inputs, output formats, API differences, and more, which would make this impractical. Not to mention that they change all the time. The best way to tame that complexity is with workflows that handle exactly what you need, the way you need it. Such workflows can then be called via External Triggers and more.

 

There are the ChatGPT / DALL-E and Ayai workflows for chatting, for example, but also Writing Assistant, which uses the same APIs for a different purpose. A single object wouldn’t be able to cover the complexity, the changing landscape, and the multitude of tasks these are used for, especially while keeping its configuration manageable.


I'd like to respectfully disagree. 90% of the people who use ChatGPT use it in an extremely naive way and are very happy with it. They open ChatGPT, type their question into the box, and read the answer.

I'm actually a developer of Gen AI based business applications (for pay), and while there are indeed oodles of settings you could adjust (temperature, etc.), most people get good results without altering anything in the default configuration.

Text in, text out. That'll solve 90% of the use cases.

I wanted it for use cases like these (which I'm currently solving in Obsidian, which has an AI working essentially as I describe):

  • Below is a bulleted list of items. For each item, give a unique but valid Mac filename with the extension ".md". Give me this in a single JSON list structure.
  • Anonymize the data below, replacing real people's names, addresses, emails, etc. with fake ones.
  • Below is a snippet of Python code. Toggle the commenting on the block so that any line containing code that has been commented out (by putting '# ' at the start of the line) will be uncommented, and any code line that is not commented will become commented (by putting '# ' at the start of the line). Do not alter lines that contain no Python code.
  • The text below was captured by speech-to-text technology. It has many grammatical and punctuation errors resulting from that process. Please attempt to fix the text to remove obvious errors. Insert the marker '<<???>>' in any place where you can't make a confident guess about what the text should be.
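
For reference, the comment-toggling prompt in the list above can be read as the following rule. This is a minimal Python sketch of one interpretation, with the made-up name `toggle_comments`; it treats every line starting with '# ' as commented-out code and every blank line as "no code". It cannot tell an ordinary prose comment from commented-out code, which is the sort of judgment call the prompt leaves to the model.

```python
def toggle_comments(block: str) -> str:
    """Flip the '# ' prefix on each non-blank line; leave blank lines alone."""
    out = []
    for line in block.split("\n"):
        if not line.strip():
            out.append(line)  # no code on this line: leave it untouched
        elif line.startswith("# "):
            out.append(line[2:])  # commented-out code: uncomment it
        else:
            out.append("# " + line)  # live code: comment it out
    return "\n".join(out)
```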

With that said, it's your product and your vision.  I'm just a guy with a half-baked idea for something I've seen other products do.


9 hours ago, DaveF41 said:

90% of the people who use ChatGPT use it in an extremely naive way and are very happy with it. They open ChatGPT, type their question into the box, and read the answer.


Agreed. The workflows cover exactly that use case, including being able to continue the conversation, start a new one, or send specific queries before a piece of data. You can also, after reading the answer, press a shortcut to copy the last reply or the entire conversation.

 

9 hours ago, DaveF41 said:

I wanted it for use cases like these


You can accomplish those with the ChatGPT / DALL-E workflow (and likely Ayai too) via the Universal Action, which sends some text with a prefix (that pairs nicely with your “Below is” prompts).

 

If those are specific prompts you really use all the time, that’s covered by the workflow’s External Trigger. See “How can I reuse pre-made prompts?” in the FAQ.

 

Oh, and welcome to the forum!
