

Posts posted by iandol

  1. I at least got the notion that @vitor wanted to keep that workflow "streamlined". It certainly would be possible to magically switch to claude based on the model name (claude model names begin with `claude`, that is how Kiki from @gloogloo does this), thus could be done without any new UI, but would require more complex code on the backend. You could make a pull request on github, then @vitor could evaluate if he is willing to make this change?
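A minimal sketch of the model-name switch described above, assuming the two endpoints quoted in the curl examples elsewhere in this thread. `pickBackend` and its return shape are hypothetical names for illustration, not code from this workflow or from Kiki:

```javascript
// Hypothetical helper: pick an API flavour purely from the model name.
// Assumption: Claude model names begin with "claude", as noted in the post.
function pickBackend(modelName) {
  const name = String(modelName || "").toLowerCase();
  if (name.startsWith("claude")) {
    return {
      flavour: "anthropic",
      url: "https://api.anthropic.com/v1/messages",
      keyHeader: "x-api-key",
    };
  }
  // Everything else is treated as OpenAI-compatible.
  return {
    flavour: "openai",
    url: "https://api.openai.com/v1/chat/completions",
    keyHeader: "Authorization",
  };
}
```

This keeps the UI unchanged, but the backend would still need separate request and response handling for each flavour.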

     

    offtopic: your µBib looks awesome!

  2. 4 hours ago, CharlesVermeulen said:

    This is a noob question, but then again, I'm a noob. 

    I'm running the free version of Chatgpt

     

    There is no "free" version of the API according to the pricing page:

     

    https://openai.com/pricing

     

    There is a free web page interface (https://chat.openai.com), but that is not the API. For the API they give you some tokens at the start if I remember correctly, then you must pay-per-use.  

     

     

    If you want free: use a local LLM (the model runs on your Mac, so there are no costs and none of the privacy concerns of OpenAI hoovering up all your data), or use a wrapper tool like https://openrouter.ai (which can use OpenAI, Claude, or many different free open-source models through a single unified API). Sadly there is a small API incompatibility between this workflow and OpenRouter (@vitor may accept a pull request to fix it but I haven't had time...), so that leaves you with: give your credit card details to OpenAI, or use a local model...

     

  3. Note: this workflow also supports OpenRouter as well as local LLM tools like https://lmstudio.ai, https://gpt4all.io/index.html and https://ollama.com. This means you are not tied to OpenAI and its (in my personal opinion) problematic "hijack" of LLMs into a paid corporate tool. LLMs are based on academically "open" technology, and OpenAI was originally started as a way to democratise these tools before profit and corporate battles turned ChatGPT into a closed walled garden...

     

  4. 22 hours ago, gloogloo said:

    @iandol I've just released a small update making the main request script a separate file. Feel free to add what you need. Man, I'm actually very close to being code illiterate and most of what you see in there both in bash or Javascript came from GPT 4 itself, so any help is welcome and it's great if you can contribute adding something that helps you or others.

     

    Well what a good advert for the utility of LLMs then! I made a pull request to your github...

  5. @gloogloo: I think you've done an amazing job. The workflow is not too complex for the user, and you've provided nice documentation (given this is something you did mostly for yourself). The feature set makes this workflow useful in its own right. Supporting OpenRouter, for example, means many more Alfred users who either dislike OpenAI as a company (that's me), don't have a credit card, or can't afford or don't want to pay rolling fees can use the open source models easily.

     

    I can add a pull request for a custom API endpoint if you want. As I understand from a quick look, the main request is made using a bash script and curl, right? I'm fine with bash, though I never learnt JavaScript so I am more wary of tweaking that code... It would maybe help if you made the bash script a file rather than embedding it in the workflow, as I think that makes it easier to contribute to via GitHub etc.

     

    Perhaps a thread on these forums for your workflow would raise visibility and maybe garner some help for future feature updates. I personally don't mind your use of dialogs for the UI, but I'm sure some questions to @vitor would help in implementing the super cool new views that Alfred 5 enables.

  6. Do you see a log of your individual request activity in the OpenAI accounts page? With OpenRouter (screenshot below), each request is shown with the tokens sent and received and then the costs (in this case Mistral is an open-source and therefore free-to-use model). That could help you work out whether pruning may help?

     

    [Screenshot: OpenRouter activity log listing each request with tokens and cost]

  7. 4 hours ago, Gold said:

    I'm really enjoying this ChatGPT workflow so far.

     

    However, I'm curious if anyone else is concerned about the cost of usage? I've noticed that my prompt queries in Alfred's ChatGPT workflow are costing me 3x to 5x times higher than what I experienced with my former ChatGPT app(s).

     

    Don't forget there are local LLMs, which while less powerful, are totally free to use (and easier to customise), and also other "wrapper" tools which give you more flexibility in terms of what to use as the model backend (like https://openrouter.ai/).

     

    I don't use OpenAI myself but I assume costs increase as message context grows? If so, one way to keep costs down could be to "prune" the previous message context, but that does make the chat less accurate. One way to do that already is to start new chats whenever you don't really need the previous message context?
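A rough sketch of what such pruning could look like, assuming the conversation is kept as an OpenAI-style `messages` array. `pruneMessages` and `keepLast` are made-up names, and this is not how the workflow currently behaves:

```javascript
// Hypothetical helper: keep any system prompt plus only the most recent
// `keepLast` non-system messages, so less context is resent per request.
function pruneMessages(messages, keepLast) {
  const system = messages.filter((m) => m.role === "system");
  const rest = messages.filter((m) => m.role !== "system");
  // slice(-0) would return everything, so handle zero explicitly.
  const kept = keepLast > 0 ? rest.slice(-keepLast) : [];
  return system.concat(kept);
}
```

A smaller `keepLast` means fewer tokens sent each turn, at the price of the model forgetting earlier parts of the chat.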

  8. On 4/7/2024 at 4:00 AM, Alfred0 said:

    I have a question about this idea. I'm aiming to create several workflows that apply specific instructions to the given text e.g. I'm trying to create a workflow that takes the selected text and improves the writing based on a custom prompt/instructions.

     

    @Alfred0 — you can have a look at Kiki (https://github.com/afadingthought/kiki-ai-workflow), which supports different profiles and I think is designed more for your workflow. I also use different system prompts and variables to "guide" the LLM (in my case using BetterTouchTool, but the underlying idea is the same, have an LLM for editing, one for creative writing, one for coding support or geek stuff etc.). I believe @vitor wants to keep this workflow "lean'n'clean" in terms of the core feature set, and the great thing about Alfred is how easy it is to modify workflows for specific purposes 😍

  9. Wow, kiki has a great feature set (and well documented), thanks @gloogloo — I expect local LLM tool LMStudio will also work with Kiki without any changes (it is great to have a free LLM running without need for internet or accounts etc.) Do you use stream=true in your code? You should add this to the gallery if it isn't already there.

  10. Hi @giovanni — I am getting this error at present:

     

    Traceback (most recent call last):
      File "/Users/ian/Library/CloudStorage/Dropbox/Assorted/Alfred Settings/Alfred.alfredpreferences/workflows/user.workflow.FA895A69-DD0A-4EDF-AA0A-3934D0B00C79/convert.py", line 19, in <module>
        from pint import UnitRegistry, UndefinedUnitError, DimensionalityError
      File "/Users/ian/Library/CloudStorage/Dropbox/Assorted/Alfred Settings/Alfred.alfredpreferences/workflows/user.workflow.FA895A69-DD0A-4EDF-AA0A-3934D0B00C79/pint/__init__.py", line 17, in <module>
        import pkg_resources
      File "/Users/ian/Library/CloudStorage/Dropbox/Assorted/Alfred Settings/Alfred.alfredpreferences/workflows/user.workflow.FA895A69-DD0A-4EDF-AA0A-3934D0B00C79/pkg_resources/__init__.py", line 57, in <module>
        from pkg_resources.extern import six
    ImportError: cannot import name 'six' from 'pkg_resources.extern' (/Users/ian/Library/CloudStorage/Dropbox/Assorted/Alfred Settings/Alfred.alfredpreferences/workflows/user.workflow.FA895A69-DD0A-4EDF-AA0A-3934D0B00C79/pkg_resources/extern/__init__.py)

     

    I have to admit I don't quite understand how the python dependencies are packaged in the workflow. My python is 3.12.1 (installed with pyenv)

  11. @vitor's ChatGPT workflow would "almost" work for Claude. The Claude API looks like:

     

    https://docs.anthropic.com/claude/reference/messages-streaming

     

     

    curl https://api.anthropic.com/v1/messages \
         --header "anthropic-version: 2023-06-01" \
         --header "content-type: application/json" \
         --header "x-api-key: $ANTHROPIC_API_KEY" \
         --data \
    '{
      "model": "claude-3-opus-20240229",
      "messages": [{"role": "user", "content": "Hello"}],
      "max_tokens": 256,
      "stream": true
    }'

     

    And the responses look like:

     

    event: message_start
    data: {"type": "message_start", "message": {"id": "msg_1nZdL29xx5MUA1yADyHTEsnR8uuvGzszyY", "type": "message", "role": "assistant", "content": [], "model": "claude-3-opus-20240229", "stop_reason": null, "stop_sequence": null, "usage": {"input_tokens": 25, "output_tokens": 1}}}
    
    event: content_block_start
    data: {"type": "content_block_start", "index": 0, "content_block": {"type": "text", "text": ""}}

     

    It is similar to OpenAI API, but there are enough differences that even with some of the advanced env variables that the current test release has it won't work. You could use something like OpenRouter, which "wraps" Claude's API with the OpenAI one. But the current workflow doesn't work with the stream end markers and so can be buggy.
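To make the difference concrete, here is a hedged sketch of pulling the text out of one streamed chunk in either format. The OpenAI-style shape (`choices[0].delta.content`) matches the OpenRouter responses quoted elsewhere in this thread; the `content_block_delta` shape with `delta.text` is my understanding of Anthropic's streaming docs and is not shown in the excerpt above, so treat it as an assumption. `extractDeltaText` is a hypothetical name:

```javascript
// Hypothetical helper: return the text fragment carried by one parsed
// streaming chunk, or "" if the chunk carries no text.
function extractDeltaText(chunk) {
  if (!chunk || typeof chunk !== "object") return "";

  // Assumed Anthropic shape: {"type":"content_block_delta","delta":{"text":"..."}}
  if (chunk.type === "content_block_delta" && chunk.delta &&
      typeof chunk.delta.text === "string") {
    return chunk.delta.text;
  }

  // OpenAI-style shape: {"choices":[{"delta":{"content":"..."}}]}
  const choice = Array.isArray(chunk.choices) ? chunk.choices[0] : undefined;
  if (choice && choice.delta && typeof choice.delta.content === "string") {
    return choice.delta.content;
  }

  // message_start, content_block_start and similar events carry no text.
  return "";
}
```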

     

    https://openrouter.ai

     

     

  12. 15 hours ago, vitor said:


    The API doesn’t allow that, so support can’t be added until OpenAI adds it.

     

    Couldn't the system prompt help with that? At least for my local LLM workflow, I set up different system prompts which "guide" the LLM to answer in a particular way (as an English editor, as a computer specialist etc.); in my case BetterTouchTool triggers these with different keystrokes. The issue is how to manage these flexibly in an Alfred workflow. Currently you support a single system prompt. If you could have a set of these, and a way to specify which one to use, this would at least help in guiding the LLM to respond in a specific way. Each time the system prompt changed you would reset the conversation too.
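As a sketch of the idea only: a named set of system prompts, where switching to a different one starts a fresh conversation. The profile names, prompt texts and `switchProfile` are all invented for illustration and are not part of the workflow:

```javascript
// Hypothetical set of system prompts, keyed by profile name.
const PROFILES = {
  editor: "You are a careful English editor.",
  geek: "You are a computer specialist.",
};

// Hypothetical helper: return the conversation state to use for the
// requested profile. Changing profile resets the message history.
function switchProfile(state, profileName, profiles) {
  if (!Object.prototype.hasOwnProperty.call(profiles, profileName)) {
    throw new Error("Unknown profile: " + profileName);
  }
  if (state && state.profile === profileName) {
    return state; // same profile: keep the existing conversation
  }
  return {
    profile: profileName,
    messages: [{ role: "system", content: profiles[profileName] }],
  };
}
```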

  13. I closed the github issue as your test version works well.

     

    The cause is in the response data. Google's Gemma model does not use `finish_reason`:

     

    data: {"id":"gen-hR4SSt1tMFgZ9uec7mox6phaQIij","model":"google/gemma-7b-it:free","created":1710988645,"object":"chat.completion","choices":[{"index":0,"delta":{"role":"assistant","content":""}}]}

     

    So I assume that:

     

    const finishReason = chunks.slice(-1)[0]["choices"][0]["finish_reason"]

     

    causes the "undefined is not an object (evaluating 'chunks.slice(-1)[0]["choices"]')" error. Could the test there be a bit more defensive (check that the key exists, then check whether it is null)? The final streamed chunk was just [DONE], but as far as I understand your parsing code, this would be dropped as it isn't JSON?
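Something along these lines is what I mean by defensive testing. It is only a sketch: `safeFinishReason` is a hypothetical name, and I am assuming `chunks` is an array of already-parsed JSON chunks:

```javascript
// Hypothetical replacement for the one-liner: returns the finish_reason of
// the last chunk, or null whenever any part of the path is missing.
function safeFinishReason(chunks) {
  if (!Array.isArray(chunks) || chunks.length === 0) return null;
  const last = chunks[chunks.length - 1];
  if (!last || !Array.isArray(last.choices) || last.choices.length === 0) return null;
  const choice = last.choices[0];
  if (!choice || choice.finish_reason === undefined) return null;
  return choice.finish_reason; // may itself be null while still streaming
}
```

With the Gemma chunk shown above this returns null instead of throwing, so the caller can decide how to treat a stream that ends without a finish reason.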

     

    I'll see what I can do as far as a pull request...

  14. Here is what the workflow sends (using the excellent Proxyman):

     

    POST /api/v1/chat/completions HTTP/1.1
    Host: openrouter.ai
    User-Agent: curl/8.4.0
    Accept: */*
    Content-Type: application/json
    Authorization: Bearer sk-or-v1-xxx
    Content-Length: 120
    
    {"model":"google/gemma-7b-it:free","messages":[{"role":"user","content":"What is the capital of Ghana?"}],"stream":true}

     

    And the raw response:

     

    HTTP/1.1 200 OK
    Date: Thu, 21 Mar 2024 02:37:24 GMT
    Content-Type: text/event-stream
    Transfer-Encoding: chunked
    Connection: keep-alive
    access-control-allow-credentials: true
    access-control-allow-headers: Authorization, User-Agent, X-Api-Key, X-CSRF-Token, X-Requested-With, Accept, Accept-Version, Content-Length, Content-MD5, Content-Type, Date, X-Api-Version, HTTP-Referer, X-Windowai-Title, X-Openrouter-Title, X-Title, X-Stainless-Lang, X-Stainless-Package-Version, X-Stainless-OS, X-Stainless-Arch, X-Stainless-Runtime, X-Stainless-Runtime-Version
    access-control-allow-methods: GET,OPTIONS,PATCH,DELETE,POST,PUT
    access-control-allow-origin: *
    Cache-Control: no-cache
    strict-transport-security: max-age=63072000
    x-matched-path: /api/v1/chat/completions
    x-vercel-id: hkg1::np57k-1710988642888...
    CF-Cache-Status: DYNAMIC
    Server: cloudflare
    CF-RAY: 867a8f49c...
    
    : OPENROUTER PROCESSING
    
    data: {"id":"gen-hR4SSt1tMFgZ9uec7mox6phaQIij","model":"google/gemma-7b-it:free","created":1710988644,"object":"chat.completion","choices":[{"index":0,"delta":{"role":"assistant","content":"\n\n"}}]}
    
    data: {"id":"gen-hR4SSt1tMFgZ9uec7mox6phaQIij","model":"google/gemma-7b-it:free","created":1710988644,"object":"chat.completion","choices":[{"index":0,"delta":{"role":"assistant","content":"Acc"}}]}
    
    data: {"id":"gen-hR4SSt1tMFgZ9uec7mox6phaQIij","model":"google/gemma-7b-it:free","created":1710988644,"object":"chat.completion","choices":[{"index":0,"delta":{"role":"assistant","content":"ra"}}]}
    
    data: {"id":"gen-hR4SSt1tMFgZ9uec7mox6phaQIij","model":"google/gemma-7b-it:free","created":1710988644,"object":"chat.completion","choices":[{"index":0,"delta":{"role":"assistant","content":"."}}]}
    
    data: {"id":"gen-hR4SSt1tMFgZ9uec7mox6phaQIij","model":"google/gemma-7b-it:free","created":1710988645,"object":"chat.completion","choices":[{"index":0,"delta":{"role":"assistant","content":"\n\n"}}]}
    
    data: {"id":"gen-hR4SSt1tMFgZ9uec7mox6phaQIij","model":"google/gemma-7b-it:free","created":1710988645,"object":"chat.completion","choices":[{"index":0,"delta":{"role":"assistant","content":"Acc"}}]}
    
    data: {"id":"gen-hR4SSt1tMFgZ9uec7mox6phaQIij","model":"google/gemma-7b-it:free","created":1710988645,"object":"chat.completion","choices":[{"index":0,"delta":{"role":"assistant","content":"ra"}}]}
    
    data: {"id":"gen-hR4SSt1tMFgZ9uec7mox6phaQIij","model":"google/gemma-7b-it:free","created":1710988645,"object":"chat.completion","choices":[{"index":0,"delta":{"role":"assistant","content":" is"}}]}
    
    data: {"id":"gen-hR4SSt1tMFgZ9uec7mox6phaQIij","model":"google/gemma-7b-it:free","created":1710988645,"object":"chat.completion","choices":[{"index":0,"delta":{"role":"assistant","content":" the"}}]}
    
    data: {"id":"gen-hR4SSt1tMFgZ9uec7mox6phaQIij","model":"google/gemma-7b-it:free","created":1710988645,"object":"chat.completion","choices":[{"index":0,"delta":{"role":"assistant","content":" capital"}}]}
    
    data: {"id":"gen-hR4SSt1tMFgZ9uec7mox6phaQIij","model":"google/gemma-7b-it:free","created":1710988645,"object":"chat.completion","choices":[{"index":0,"delta":{"role":"assistant","content":" and"}}]}
    
    data: {"id":"gen-hR4SSt1tMFgZ9uec7mox6phaQIij","model":"google/gemma-7b-it:free","created":1710988645,"object":"chat.completion","choices":[{"index":0,"delta":{"role":"assistant","content":" largest"}}]}
    
    data: {"id":"gen-hR4SSt1tMFgZ9uec7mox6phaQIij","model":"google/gemma-7b-it:free","created":1710988645,"object":"chat.completion","choices":[{"index":0,"delta":{"role":"assistant","content":" city"}}]}
    
    data: {"id":"gen-hR4SSt1tMFgZ9uec7mox6phaQIij","model":"google/gemma-7b-it:free","created":1710988645,"object":"chat.completion","choices":[{"index":0,"delta":{"role":"assistant","content":" of"}}]}
    
    data: {"id":"gen-hR4SSt1tMFgZ9uec7mox6phaQIij","model":"google/gemma-7b-it:free","created":1710988645,"object":"chat.completion","choices":[{"index":0,"delta":{"role":"assistant","content":" Ghana"}}]}
    
    data: {"id":"gen-hR4SSt1tMFgZ9uec7mox6phaQIij","model":"google/gemma-7b-it:free","created":1710988645,"object":"chat.completion","choices":[{"index":0,"delta":{"role":"assistant","content":"."}}]}
    
    data: {"id":"gen-hR4SSt1tMFgZ9uec7mox6phaQIij","model":"google/gemma-7b-it:free","created":1710988645,"object":"chat.completion","choices":[{"index":0,"delta":{"role":"assistant","content":""}}]}
    
    data: {"id":"gen-hR4SSt1tMFgZ9uec7mox6phaQIij","model":"google/gemma-7b-it:free","created":1710988645,"object":"chat.completion","choices":[{"index":0,"delta":{"role":"assistant","content":""}}]}
    
    data: [DONE]
    

     

    So the error may be due to this line:

     

    const finishReason = chunks.slice(-1)[0]["choices"][0]["finish_reason"]

     

    There is no finish reason in this response; this apparently depends on the model:

     

    https://openrouter.ai/docs#responses
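For reference, a minimal sketch of parsing a raw event stream like the one above, skipping the `: OPENROUTER PROCESSING` comment line and treating `[DONE]` as an end marker rather than JSON. `parseEventStream` is a hypothetical name and this is not the workflow's actual parser:

```javascript
// Hypothetical parser for the body of a text/event-stream response.
// Returns the parsed JSON chunks plus whether a [DONE] marker was seen.
function parseEventStream(raw) {
  const chunks = [];
  let done = false;
  for (const line of String(raw).split(/\r?\n/)) {
    // Blank lines and comment lines (starting with ":") carry no data.
    if (!line.startsWith("data:")) continue;
    const payload = line.slice("data:".length).trim();
    if (payload === "[DONE]") {
      done = true;
      continue;
    }
    try {
      chunks.push(JSON.parse(payload));
    } catch (err) {
      // Ignore payloads that are not (yet) complete JSON.
    }
  }
  return { chunks, done };
}
```

Tracking `done` separately means the end of the stream can be detected even for models that never send a `finish_reason`.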

  15. @vitor, so I've modified the workflow to accept a new env variable (see https://github.com/alfredapp/openai-workflow/pull/16):

     

    [Screenshot: the new environment variable in the workflow configuration]

     

    This for example allows you to use the free Gemma model from Google, using https://openrouter.ai — you can get free API keys with no need for a credit card etc.

     

    I get the following script error on first use however:

     

    [09:57:11.777] ERROR: ChatGPT / DALL-E - COPY[Text View] Code 1: /Users/ian/Library/CloudStorage/Dropbox/Assorted/Alfred Settings/Alfred.alfredpreferences/workflows/user.workflow.B857589D-A0F4-44DB-AB2C-DA78E4EE0FAA/chatgpt: execution error: Error: TypeError: undefined is not an object (evaluating 'chunks.slice(-1)[0]["choices"]') (-2700)

     

    The order of requests then gets a bit mixed up once this error occurs.

     

    A modified workflow for testing is here (I changed the keyword from chatgpt to askai so it doesn't clash with the original): https://0x0.st/Xrtj.zip

     

  16. Well poe.com has a web search bot for AI:

     

    https://poe.com/Web-Search


    ...but I don't imagine it supports a simple GET-style request[1]... Remember they want to monetise / control access to this, which is why you are stuck with a bunch of APIs rather than an open HTTPS interface... This depends on the websites and what they are willing to offer.

    ----

    [1] You could use Proxyman to work out what the traffic to that is; when I use it I get a uuid or encoded unique URL back, but I didn't test what the traffic is.

  17. I'll have a look. The OpenAI API is pretty simple to be honest, taking their simple guide:

     

    https://platform.openai.com/docs/guides/text-generation/chat-completions-api

     

    curl https://api.openai.com/v1/chat/completions \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer $OPENAI_API_KEY" \
      -d '{
        "model": "gpt-3.5-turbo",
        "messages": [
          {
            "role": "system",
            "content": "You are a helpful assistant."
          },
          {
            "role": "user",
            "content": "Hello!"
          }
        ]
      }'

     

    We have: API address, API key, the model name & the messages as the core components. Messages are obvious, but the address, key and model name are essential and also required for online alternatives like openrouter.ai and local tools like LM Studio. The hard coded model names for OpenAI do not work for any other alternative, so a way to override it is needed. These are definitely "if there was only one more version, what should be included" options... I think having the standard drop-down for models hard coded is great for beginners (your UI is clean and simple), and the env variable as a text field is perfect for more advanced use.

     

    There are a bunch of other parameters for fine tuning the model response: temperature, max_tokens, n, top_p etc. — of these I think none are really essential, though if I was forced to pick I'd have temperature (guiding the fidelity vs. creativity of the model responses) and max_tokens (as at least local models have specific token count limits):

     

    https://platform.openai.com/docs/api-reference/chat/create
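Put together, the core components plus the two optional parameters could be assembled roughly like this. It is a sketch under assumptions: `buildChatRequest` and its option names are hypothetical, and the real workflow builds its request differently (bash and curl):

```javascript
// Hypothetical helper: assemble a streaming chat completion request from
// the essentials (endpoint, key, model, messages) and two optional extras.
function buildChatRequest(opts) {
  const body = {
    model: opts.model,
    messages: opts.messages,
    stream: true,
  };
  // Only send the tuning parameters when the user actually set them.
  if (typeof opts.temperature === "number") body.temperature = opts.temperature;
  if (typeof opts.maxTokens === "number") body.max_tokens = opts.maxTokens;

  return {
    url: opts.endpoint,
    headers: {
      "Content-Type": "application/json",
      Authorization: "Bearer " + opts.apiKey,
    },
    body: JSON.stringify(body),
  };
}
```

Leaving unset parameters out of the body lets each backend fall back to its own defaults.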

     

    These options are certainly very specialist. I agree that stream=off is not worth supporting, as it adds substantial backend complexity for you with minimal gain (while I love GPT4All, I just won't use it with Alfred, and LM Studio, Ollama and others can take its place...)

     

  18. Feature Request: you have added the custom API address, which is great. There are services like OpenRouter which support the OpenAI API, with many other models too: https://openrouter.ai/docs#principles (I think Poe is another example). To get these to work you must specify the model (e.g. "openai/gpt-3.5-turbo"). At the moment you hard-code the model values, so this will fail. If you allow more flexible model input then services like OpenRouter could also be used by this workflow. The simplest is to use an env var to override the model if set (assume this is advanced user only). Otherwise a text entry option in the workflow UI?
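The env var override could be as small as the following sketch. The variable name `model_override` and the function `resolveModel` are hypothetical, chosen only for illustration:

```javascript
// Hypothetical helper: prefer a non-empty override from the environment,
// otherwise fall back to the model picked in the standard drop-down.
function resolveModel(env, dropdownChoice) {
  const override = String((env && env.model_override) || "").trim();
  return override !== "" ? override : dropdownChoice;
}
```

Beginners never see the override, while advanced users can point the workflow at any model name a third-party service expects.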

  19. 22 hours ago, llityslife said:

    @iandol I am sure there is no problem with the use. I can use the third-party API normally using https://github.com/chrislemke/ChatFred.

     

    Well, your experience tells us there is a problem. Again, there is not one API method; there are several endpoints, each with different requirements and functions. At a minimum there are two endpoints (/v1/completions & /v1/chat/completions) and there is a streaming and a non-streaming mode. As I said, for local use I tried two different apps that both "support" the OpenAI API: only one works, and the other doesn't (because it needs stream=false). Just because a service says X, that does not specify X.y or X.z. What API mode does ChatFred use, is it the same? My point is to help you determine what the problem is; without knowing the problem you have no hope of finding the solution...
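One way to narrow this down is to capture a raw response (for example with Proxyman) and check which mode the service actually answered in. A rough sketch, assuming a streamed reply uses `Content-Type: text/event-stream` with `data:` lines and a non-streamed reply is a single JSON document; `responseMode` is a hypothetical helper:

```javascript
// Hypothetical diagnostic: classify a captured response as streaming,
// non-streaming, or unknown from its Content-Type header and body.
function responseMode(contentType, body) {
  const type = String(contentType || "").toLowerCase();
  if (type.includes("text/event-stream")) return "streaming";

  const text = String(body || "").trim();
  // Event streams consist of "data:" lines and ":" comment lines.
  if (text.startsWith("data:") || text.startsWith(":")) return "streaming";

  try {
    JSON.parse(text);
    return "non-streaming";
  } catch (err) {
    return "unknown";
  }
}
```

If a service reports "non-streaming" even though the request asked for `"stream": true`, that would explain the workflow failing while other tools work.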

  20. But does their API support streaming mode or not? I suspect even with streaming if their API is slow then this will cause connection stalled errors (there is a timeout in the code, if you tweak it perhaps you can recover the error)? I know there are some services that bypass the country limitations (I am in China, so must go through a VPN for example), and this can add latency to the connection also...

  21. 13 minutes ago, llityslife said:

    but I tested using a third-party API to find errors, and I've set the API URL and API key.

     

    What third-party tool are you testing? It needs to support stream=true to work... I see the same connection stalled error with GPT4All (https://gpt4all.io/index.html), which uses a non-streaming OpenAI API, but LMStudio (https://lmstudio.ai) which supports streaming does work...
