
Using alternative and local models with the ChatGPT / DALL-E workflow


Recommended Posts

@vitor, so I've modified the workflow to accept a new env variable (see https://github.com/alfredapp/openai-workflow/pull/16).

 


 

This allows you, for example, to use the free Gemma model from Google via https://openrouter.ai — you can get free API keys there with no credit card required.
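For example, to use OpenRouter you would point the new variable at their base URL and pick google/gemma-7b-it:free as the model (hypothetical value; the script appends /v1/chat/completions itself, hence the /api suffix):

chatgpt_api_endpoint = https://openrouter.ai/api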

 

However, I get the following script error on first use:

 

[09:57:11.777] ERROR: ChatGPT / DALL-E - COPY[Text View] Code 1: /Users/ian/Library/CloudStorage/Dropbox/Assorted/Alfred Settings/Alfred.alfredpreferences/workflows/user.workflow.B857589D-A0F4-44DB-AB2C-DA78E4EE0FAA/chatgpt: execution error: Error: TypeError: undefined is not an object (evaluating 'chunks.slice(-1)[0]["choices"]') (-2700)

 

The order of requests then gets a bit mixed up once this error occurs.

 

A modified workflow for testing is here (I changed the keyword from chatgpt to askai so it doesn't clash with the original): https://0x0.st/Xrtj.zip

 

Link to comment

Here is what the workflow sends (using the excellent Proxyman):

 

POST /api/v1/chat/completions HTTP/1.1
Host: openrouter.ai
User-Agent: curl/8.4.0
Accept: */*
Content-Type: application/json
Authorization: Bearer sk-or-v1-xxx
Content-Length: 120

{"model":"google/gemma-7b-it:free","messages":[{"role":"user","content":"What is the capital of Ghana?"}],"stream":true}

 

And the raw response:

 

HTTP/1.1 200 OK
Date: Thu, 21 Mar 2024 02:37:24 GMT
Content-Type: text/event-stream
Transfer-Encoding: chunked
Connection: keep-alive
access-control-allow-credentials: true
access-control-allow-headers: Authorization, User-Agent, X-Api-Key, X-CSRF-Token, X-Requested-With, Accept, Accept-Version, Content-Length, Content-MD5, Content-Type, Date, X-Api-Version, HTTP-Referer, X-Windowai-Title, X-Openrouter-Title, X-Title, X-Stainless-Lang, X-Stainless-Package-Version, X-Stainless-OS, X-Stainless-Arch, X-Stainless-Runtime, X-Stainless-Runtime-Version
access-control-allow-methods: GET,OPTIONS,PATCH,DELETE,POST,PUT
access-control-allow-origin: *
Cache-Control: no-cache
strict-transport-security: max-age=63072000
x-matched-path: /api/v1/chat/completions
x-vercel-id: hkg1::np57k-1710988642888...
CF-Cache-Status: DYNAMIC
Server: cloudflare
CF-RAY: 867a8f49c...

: OPENROUTER PROCESSING

data: {"id":"gen-hR4SSt1tMFgZ9uec7mox6phaQIij","model":"google/gemma-7b-it:free","created":1710988644,"object":"chat.completion","choices":[{"index":0,"delta":{"role":"assistant","content":"\n\n"}}]}

data: {"id":"gen-hR4SSt1tMFgZ9uec7mox6phaQIij","model":"google/gemma-7b-it:free","created":1710988644,"object":"chat.completion","choices":[{"index":0,"delta":{"role":"assistant","content":"Acc"}}]}

data: {"id":"gen-hR4SSt1tMFgZ9uec7mox6phaQIij","model":"google/gemma-7b-it:free","created":1710988644,"object":"chat.completion","choices":[{"index":0,"delta":{"role":"assistant","content":"ra"}}]}

data: {"id":"gen-hR4SSt1tMFgZ9uec7mox6phaQIij","model":"google/gemma-7b-it:free","created":1710988644,"object":"chat.completion","choices":[{"index":0,"delta":{"role":"assistant","content":"."}}]}

data: {"id":"gen-hR4SSt1tMFgZ9uec7mox6phaQIij","model":"google/gemma-7b-it:free","created":1710988645,"object":"chat.completion","choices":[{"index":0,"delta":{"role":"assistant","content":"\n\n"}}]}

data: {"id":"gen-hR4SSt1tMFgZ9uec7mox6phaQIij","model":"google/gemma-7b-it:free","created":1710988645,"object":"chat.completion","choices":[{"index":0,"delta":{"role":"assistant","content":"Acc"}}]}

data: {"id":"gen-hR4SSt1tMFgZ9uec7mox6phaQIij","model":"google/gemma-7b-it:free","created":1710988645,"object":"chat.completion","choices":[{"index":0,"delta":{"role":"assistant","content":"ra"}}]}

data: {"id":"gen-hR4SSt1tMFgZ9uec7mox6phaQIij","model":"google/gemma-7b-it:free","created":1710988645,"object":"chat.completion","choices":[{"index":0,"delta":{"role":"assistant","content":" is"}}]}

data: {"id":"gen-hR4SSt1tMFgZ9uec7mox6phaQIij","model":"google/gemma-7b-it:free","created":1710988645,"object":"chat.completion","choices":[{"index":0,"delta":{"role":"assistant","content":" the"}}]}

data: {"id":"gen-hR4SSt1tMFgZ9uec7mox6phaQIij","model":"google/gemma-7b-it:free","created":1710988645,"object":"chat.completion","choices":[{"index":0,"delta":{"role":"assistant","content":" capital"}}]}

data: {"id":"gen-hR4SSt1tMFgZ9uec7mox6phaQIij","model":"google/gemma-7b-it:free","created":1710988645,"object":"chat.completion","choices":[{"index":0,"delta":{"role":"assistant","content":" and"}}]}

data: {"id":"gen-hR4SSt1tMFgZ9uec7mox6phaQIij","model":"google/gemma-7b-it:free","created":1710988645,"object":"chat.completion","choices":[{"index":0,"delta":{"role":"assistant","content":" largest"}}]}

data: {"id":"gen-hR4SSt1tMFgZ9uec7mox6phaQIij","model":"google/gemma-7b-it:free","created":1710988645,"object":"chat.completion","choices":[{"index":0,"delta":{"role":"assistant","content":" city"}}]}

data: {"id":"gen-hR4SSt1tMFgZ9uec7mox6phaQIij","model":"google/gemma-7b-it:free","created":1710988645,"object":"chat.completion","choices":[{"index":0,"delta":{"role":"assistant","content":" of"}}]}

data: {"id":"gen-hR4SSt1tMFgZ9uec7mox6phaQIij","model":"google/gemma-7b-it:free","created":1710988645,"object":"chat.completion","choices":[{"index":0,"delta":{"role":"assistant","content":" Ghana"}}]}

data: {"id":"gen-hR4SSt1tMFgZ9uec7mox6phaQIij","model":"google/gemma-7b-it:free","created":1710988645,"object":"chat.completion","choices":[{"index":0,"delta":{"role":"assistant","content":"."}}]}

data: {"id":"gen-hR4SSt1tMFgZ9uec7mox6phaQIij","model":"google/gemma-7b-it:free","created":1710988645,"object":"chat.completion","choices":[{"index":0,"delta":{"role":"assistant","content":""}}]}

data: {"id":"gen-hR4SSt1tMFgZ9uec7mox6phaQIij","model":"google/gemma-7b-it:free","created":1710988645,"object":"chat.completion","choices":[{"index":0,"delta":{"role":"assistant","content":""}}]}

data: [DONE]

 

So the error may be due to this line:

 

const finishReason = chunks.slice(-1)[0]["choices"][0]["finish_reason"]

 

There is no finish reason in this response; apparently this depends on the model:

 

https://openrouter.ai/docs#responses
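
For comparison, a stream that implements the API fully sends an explicit finish reason in its final content chunk before the [DONE] sentinel, along these lines (illustrative, not a real capture):

data: {"id":"gen-...","object":"chat.completion","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]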

Link to comment

Please add further information on the GitHub issue, so we don’t split the discussion. But remember that as per the moderator’s note:

 

On 2/27/2024 at 1:03 AM, iandol said:

This is complex and requires advanced configuration, not something we can officially provide support for. This thread was split from the main one so members of the community can help each other setting up their own specific models.

 

So if you find the cause and can submit a PR (something janky is fine, as long as it works and I can evaluate it) I can look further into it. Otherwise, as stated previously, it would be untenable to support every quirk of every model. Either a model supports the API correctly or it doesn't, and the workflow works with the ones that do.

Link to comment
Posted (edited)

I closed the GitHub issue as your test version works well.

 

The cause is in the response data: Google's Gemma model does not send `finish_reason`, even in its final delta chunk:

 

data: {"id":"gen-hR4SSt1tMFgZ9uec7mox6phaQIij","model":"google/gemma-7b-it:free","created":1710988645,"object":"chat.completion","choices":[{"index":0,"delta":{"role":"assistant","content":""}}]}

 

So I assume that:

 

const finishReason = chunks.slice(-1)[0]["choices"][0]["finish_reason"]

 

That line causes the "undefined is not an object (evaluating 'chunks.slice(-1)[0]["choices"]')" error. Is there some way to use a bit more defensive testing there (check whether the key exists, then whether it is null)? The final streamed chunk was just [DONE], but as far as I understand your parsing code, that one would be dropped as it isn't JSON?
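
For illustration, optional chaining would make that lookup safe (a sketch only, assuming `chunks` is the array of parsed JSON payloads the script accumulates):

// Hypothetical defensive variant: a missing "choices" array or
// "finish_reason" key yields undefined instead of throwing a TypeError.
const lastChunk = chunks.slice(-1)[0]
const finishReason = lastChunk?.["choices"]?.[0]?.["finish_reason"]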

 

I'll see what I can do as far as a pull request...

Edited by iandol
Link to comment
11 minutes ago, iandol said:

I closed the github issue as your test version works well.

 

It’s still possible to comment on closed issues.

 

11 minutes ago, iandol said:

Is there some way to use a bit more defensive testing there (check whether the key exists, then whether it is null)?

 

Not without making the code slower and more convoluted. The finish reason is imperative for the code to know when to stop and how to proceed. If that model doesn’t send that key, it’s not implementing the API correctly.

Link to comment

@vitor: I don't understand this change on line 200 of chatgpt:

 

const apiEndpoint = envVar("dalle_api_endpoint") || "https://api.openai.com"

 

This seems to force the DALL-E API endpoint? How does `chatgpt_api_endpoint` get used?

Link to comment
On 3/19/2024 at 12:58 PM, Cipri said:

I'd love to see a workflow like this for Gemini since their API is free.

Actually, you can use Ollama (https://github.com/ollama/ollama) to serve any available open-source model locally (including Gemma from Google, Llama 2 from Meta, Mistral from the French startup, and many others).

Ollama is compatible with the OpenAI API, so I just hacked the current workflow a bit by changing `chatgpt_api_endpoint` to `http://localhost:11434`,


plus changing the model name and label in `userconfigurationconfig`.


 


 

And to my great surprise it works ...
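
As a quick sanity check outside Alfred, you can hit the local endpoint directly. A minimal sketch (Node 18+, run as an ES module; it assumes Ollama is running and a model tagged gemma:7b has been pulled):

// Smoke test for Ollama's OpenAI-compatible chat endpoint.
// Assumes `ollama pull gemma:7b` has been run and the server listens on 11434.
const response = await fetch("http://localhost:11434/v1/chat/completions", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "gemma:7b",
    messages: [{ role: "user", content: "What is the capital of Ghana?" }],
    stream: false,
  }),
})
const body = await response.json()
console.log(body.choices[0].message.content)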

Of course, your Mac should be robust enough. I forgot to mention I have an M3 Max with 48 GB of RAM, but I'm pretty sure it would work reasonably well on an M1 with 8 GB of RAM. On my M3 the answer is very fast.

It would be good to have an update of the workflow adding the option (plus a how-to) to use Ollama...

 

_oho.

 

Edited by _oho
Link to comment
10 hours ago, vitor said:

@_oho Don’t edit the plist directly, that’s what the Workflow Environment Variables section is for. Your changes will be overridden on updates. The next version will have an option to override the model as well, you can download a preview of that. I’ve moved your post to the correct thread for the discussion.

Thank you, Vitor, for moving my post. I realized it was not appropriate, as it was a truly savage dirty hack :). To be honest, I haven't played with Alfred workflows for around 10 years; I've just been using it (every day) without modifications. I have to admit it has improved so much (congrats to the team, by the way).

 

I have tried your preview but can't make it work :(.

The only environment variable I changed was `chatgpt_api_endpoint` (I don't know what entry is expected in `chatgpt_model_override`, as changing it does not seem to help).

I also changed `gpt_model` in the Configuration Builder to add the local models I currently use.


Then I could pick my chosen model in `Configure Workflow`.


But whatever I do, I keep getting an API key error, meaning it's not using my local endpoint.


 

Any idea?

I hope this is not a very dumb thing I'm doing wrong (although, seeing how dirty my hacks can get, I'm sure you would not be surprised :) ).

 

O.

Link to comment
14 minutes ago, _oho said:

I have tried your preview but can't make it work :(.

OK, I think I found it (maybe? At least I got it to work).

I had activated the logs (cool feature) and saw the error coming from the `chatgpt` script. Editing this script, I can't see any use of the `chatgpt_api_endpoint` variable; instead only `dalle_api_endpoint` is used (maybe for both ChatGPT and DALL-E?).

const apiEndpoint = envVar("dalle_api_endpoint") || "https://api.openai.com"

Since I have no paid credits for DALL-E anyway, I overrode `dalle_api_endpoint` in the Workflow Environment Variables, and it seems to work.
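
If both features are meant to be overridable separately, a fallback chain in the `chatgpt` script would do it (hypothetical sketch, reusing the workflow's existing envVar helper):

// Hypothetical: prefer the chat-specific override, fall back to the
// DALL-E variable, then to the official endpoint.
const apiEndpoint = envVar("chatgpt_api_endpoint") || envVar("dalle_api_endpoint") || "https://api.openai.com"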


 

Thank you for your quick answer yesterday, and thank you for your enormous, genius workflow, which I love. It actually reminds me how great Alfred is...

 

Can't wait for your new version that will allow easy integration of other models (like Ollama).

I was wondering: do you think there is a way to get syntax highlighting for code in the ChatGPT workflow's answers?

 

O.

Link to comment
