Posts posted by zeitlings

  1. 3 hours ago, iandol said:

    Wow, amazing set of supported interfaces, thank you @zeitlings -- beautiful icon too!!!

     

    Thanks 🤗

     

    3 hours ago, iandol said:

    On github, alpha.1 is newer than alpha.2, I assume alpha.2 is the better one to download though?

     

    Is it? I noticed that the sorting looks wrong when viewing all releases; is that what you mean? Either way, alpha.2 should be set as the latest release. So yes, alpha.2 has some fixes and changes over alpha.1. However, keep in mind that the problem with the data directory not being created is not fixed yet.

  2. Only messages that the model deems to require real-time information will trigger a detour through exa.ai. OpenAI calls this mechanism tool calling, or function calling.

    I instruct the model to be extremely conservative about calling the function, so ideally it only happens when really necessary. Whenever it does happen, you will notice that a "Searching" notification is injected into the conversation! Only then is the external API contacted; otherwise exa sees nothing. And even then it doesn't see your message, only a search query constructed by the model to get the information it needs.
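    For the curious, the mechanism looks roughly like this. A minimal sketch of such a tool definition, assuming a hypothetical search_web function; the workflow's actual tool name, description, and schema may differ:

    // Hypothetical tool definition included with each chat completion request.
    let searchToolJSON = """
    {
        "type": "function",
        "function": {
            "name": "search_web",
            "description": "Search the web for current, real-time information. Use only when the answer cannot be given from prior knowledge.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": { "type": "string", "description": "The search query to forward to the search API." }
                },
                "required": ["query"]
            }
        }
    }
    """
    // When the model decides to call the tool, the response carries a tool call whose
    // arguments contain the model-generated query. Only that query is forwarded to
    // exa.ai; the original message never leaves the conversation.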

  3. 17 minutes ago, vitor said:

    Note that some apps, e.g. Reeder, only have one single unnamed window. So this change (haven’t tested, basing this on the description) may make that app disappear entirely.

     

    Yeah, the solution is a trade-off. I've added the option to explicitly preserve unnamed windows to compensate for those cases. By default, these windows are hidden, but they can be made visible if necessary, at the risk of potentially mixing in other unwanted windows. Given that unnamed windows are the exception in my experience, I think this should be an adequate solution for most situations.
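    Conceptually, the filter boils down to something like the following. A minimal sketch, assuming window titles and visibility are read via the Accessibility API; the names are illustrative, not the workflow's actual code:

    // Illustrative window record; the real values come from the Accessibility API
    // (kAXTitleAttribute, etc.).
    struct Window {
        let title: String
        let isOnScreen: Bool
    }

    // With the new option disabled (the default), unnamed windows are dropped,
    // which filters out most of the bogus windows some apps create.
    func visibleWindows(_ windows: [Window], preserveUnnamed: Bool) -> [Window] {
        windows.filter { $0.isOnScreen && (preserveUnnamed || !$0.title.isEmpty) }
    }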

  4. Hey @dood,

     

    I did some digging and found that Arc does some strange things: each modal or popup seems to be created as a unique window. The same appears to happen when a “booster” is or has been added, but this behavior is inconsistent. These are not valid windows, even though their properties suggest otherwise; fortunately, most of them are unnamed. A small caveat is that sometimes the unnamed Arc windows could be considered valid, e.g. when opening the configuration of an extension.

     

    Anyway, I've modified the code so that you now have to explicitly opt in to keep unnamed windows. This seems to catch most if not all of the invalid Arc windows.

    Also note the new hidden environment variable to blacklist specific window names if necessary.

     

    v1.3.0

    • Added configuration option to explicitly preserve or dismiss unnamed windows
    • Added hidden environment variable ignored_window_names to blacklist window names as an additional failsafe
      • Note: Enter the names as a comma-separated list (see the sketch below)
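    For illustration, the blacklist could be applied roughly like this. A minimal sketch, assuming the variable arrives through the process environment the way Alfred passes workflow variables; the example names are made up:

    import Foundation

    // Read the hidden workflow variable, e.g. "Untitled, Extension Settings".
    let raw = ProcessInfo.processInfo.environment["ignored_window_names"] ?? ""
    let ignoredNames = Set(raw.split(separator: ",").map { $0.trimmingCharacters(in: .whitespaces) })

    // Windows whose title matches an entry are dismissed as an additional failsafe.
    func isIgnored(_ title: String) -> Bool {
        ignoredNames.contains(title)
    }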
  5. 2 hours ago, vitor said:

    What are some examples of situations where you set up multiple combos in a row? I ask in particular because you’re a skilled coder so I want to understand the situations you’re having to resort to multiple key combos.

     

    I am working on an extensible "Inference Task" configuration scheme for my GPT workflows. Such a "task" defines, among other things, the system prompt, the parameters, and how the result should be handled, e.g. whether it should be pasted into the frontmost application at the end or just copied.

     

    Concrete, simple examples are definitions for Universal Actions to correct the spelling of text, change the tone, or create summaries. Another use case would be a configuration for a chat session that effectively turns the current session into a translation engine or an etymological dictionary.

     

    And yet another use case, and this would require the key combo dispatches, is "snippet triggers". If the predefined combos could be evaluated in one go, it would be possible to define very specific, application-oriented behaviors. For example, in text editors, actions could be configured to select only the last paragraph as the target for the task (⌥⇧↑, ⌘C, →, ↩, ↩), or the entire page, or actions could be primed to react very specifically to certain applications (e.g. Logseq).

     

    If this were possible via Alfred, I wouldn't have to rebuild the entire NSEvent simulation just for that, i.e. I would probably just drop that whole part. 😄

     

    I hope that makes sense, and granted, this may be a bit niche. But I figured, since the core functionality already exists, maybe it's not such a stretch to ask.

    If the internal implementation doesn't make this easy, so be it~ As for the delay issue, maybe an approach similar to that of the "simulate typing text" automation task would suffice: fast, medium, or slow dispatch speed.

     

    If you're interested, I'd be happy to show you the drafts of the configuration scheme to give you a better idea of what's going on, or if you're curious about pkl (github), which I'm playing with to set it up.

  6. This sounds like an addition that would fit into [Extra Automation Tasks > Keyboard Simulation].

    The idea is to pass in an array of string sequences that will be evaluated as a chain of "Dispatch Key Combo" objects.

     

    Example input: 

    [
    	"⇧⌘←",
    	"⇧↑",
    	"⇧↑",
    	"⌘B"
    ]

     

    That way we could reduce the number of individual components on the canvas and programmatically define key combos that require more than one dispatch.
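    For illustration, dispatching such a chain could look roughly like this at the CGEvent level. A minimal sketch, assuming the symbol strings have already been mapped to modifier flags and virtual key codes; purely illustrative, not a proposed implementation:

    import Foundation
    import CoreGraphics

    // Dispatch a chain of key combos, pausing briefly between each dispatch.
    func dispatch(_ combos: [(flags: CGEventFlags, key: CGKeyCode)], delay: TimeInterval = 0.05) {
        for combo in combos {
            let down = CGEvent(keyboardEventSource: nil, virtualKey: combo.key, keyDown: true)
            let up   = CGEvent(keyboardEventSource: nil, virtualKey: combo.key, keyDown: false)
            down?.flags = combo.flags
            up?.flags = combo.flags
            down?.post(tap: .cghidEventTap)
            up?.post(tap: .cghidEventTap)
            Thread.sleep(forTimeInterval: delay)
        }
    }

    // The example input above, "⇧⌘←", "⇧↑", "⇧↑", "⌘B":
    dispatch([
        ([.maskShift, .maskCommand], 123), // ⇧⌘←  (123 = left arrow)
        ([.maskShift], 126),               // ⇧↑   (126 = up arrow)
        ([.maskShift], 126),               // ⇧↑
        ([.maskCommand], 11)               // ⌘B   (11 = "b")
    ])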

  7. 6 hours ago, rudraadavee said:

    hey messed with the workflow for a while and found that the "new chat" function has stopped working

     

    I've just checked all the possible ways to start a new chat, and they all work on my end. (Without more information, I can't infer where it stopped working for you or why.)

    Don't forget to keep an eye on the debugger to identify potential problems more easily.

     

    6 hours ago, rudraadavee said:

    also could you help me with the path in "openai_alternative_local_url_path" in case of local LLMs?

     

    Sure, what's the problem? 

  8. NoteCmdr looks neat! Coincidentally, I just stumbled upon another extension today that the author wrote, and it looks like he is using Apple's Accessibility API to "hack" into Apple Notes.

     

    But it looks like nothing that fancy is necessary for your use case. I just tested the approach with dispatching key combos and it does the trick.

    https://transfer.archivete.am/10YNHX/Test | Apple Notes Snippet.alfredworkflow

     

    (Demo GIF)

  9. The problem is that those checklist bullets are not ordinary Unicode characters, but more complex custom objects internal to Apple Notes.

    You'll notice that you can't even select them. If you copy a checklist, however, Apple Notes converts the checklist bullets to the markdown notation: - [ ]

    Expanding those will, however, not automatically convert them back into the interactive checklist objects. I'm afraid this won't work with regular text expansions.

     

    You can, however, try to create a custom workflow that inserts your snippet and then dispatches some key combos (e.g. shift+cmd+←, shift+↑, shift+↑, shift+cmd+L). The last one is the Apple Notes shortcut for creating checklists.

  10. Just an observation: the Menu Bar Search workflow can't find the “Tags...” item either. It looks like you've run into an inconsistency with this particular menu item. Since both the Automation Task and the Menu Bar Search workflow fail to interact with it, this item might simply be an outlier in its behaviour.

     

    Aaand some probing seems to confirm it. Apparently the item doesn't even exist 🥲

     

    on run argv
    	tell application "System Events"
    		set finderProcess to a reference to (first application process whose name is "Finder")
    		set fileMenuBar to (a reference to menu 1 of menu bar item "File" of menu bar 1 of finderProcess)
    		--set tagsMenuItem to (first menu item of fileMenuBar whose name begins with "Tags") -- no match
    		set allMenuItems to menu items of fileMenuBar
    		--set probablyTagsMenuItem to item 39 of allMenuItems -- Should be "Tags..." but fails
    		--click probablyTagsMenuItem -- no effect
    		--return
    
    		set allMenuItemsCount to count of allMenuItems
    		repeat with i from 1 to allMenuItemsCount 
    			set menuItem to item i of allMenuItems -- Get the menu item at index i
    			set theName to name of menuItem as text
    			display dialog theName & " - Number: " & (i as text)
    		end repeat
    	end tell
    end run

     

  11. I want to call attention to some aspects that are easily overlooked and share some tips.

     

    Note the hidden options, i.e. the modifiers mentioned in the documentation, to make use of:

    • Multi-line prompt editing
    • Vocalization of the most recent answer (see here for how to change the voice)
    • The HUD to quickly check your most important current settings.

    Some third-party proxies allow you to use certain models for free.

    • For example, Groq currently grants access to Llama 3 70B, which is on par with Gemini 1.5 Pro according to some benchmarks.
    • OpenRouter curates a list of free models that can be used with their API.
    • The new API for Google's Gemini is currently in beta. If your API requests are detected as coming from the US, Gemini comes with a substantial free tier quota for both Gemini 1.5 Flash and Pro.

    I also encourage you to explore local LLMs. Meta recently released their Llama 3 model, and the "tiny" Llama 3 8B performs beautifully on newer Macs. According to some benchmarks, the 8B version competes with GPT-3.5 Turbo, and I, at least, was very positively surprised by its capabilities. Related: Ollama Workflow.

     

     

  12. Hey everyone!

     

    I'm excited to share the alpha preview of Ayai - GPT Nexus with you, a workflow that lets you interact with various AI language models.

    The workflow integrates APIs from OpenAI, Anthropic, and Google, while also supporting local LLMs.

     

    Why an Alpha Preview?

     

    While Ayai - GPT Nexus is still a work in progress, it’s already quite useful. Development has slowed down recently, as I’m not working on it as actively anymore. Rather than keep it under wraps until it's perfect, I wanted to release it now so you can take advantage of its current features and provide feedback that could shape its future development.

     

    Current Features

    • Chat with AI Models: Interact with ChatGPT, Claude, Perplexity, and Gemini.
    • Third-Party Proxies: Connect through OpenRouter, Groq, Fireworks, Together.ai, and other proxies compatible with OpenAI's response format.
    • Local LLMs: Use local models via interfaces like Ollama or LM Studio.
    • Live Web Search: Enable live web search for OpenAI models with additional Exa.ai or Azure API keys (experimental).

     

    What’s Missing?

     

    The "Ayai" part of the name hints at some personal AI assistant features I have in mind and plan to explore in the future. Additionally, the text processing features like paraphrasing, simplifying, changing the tone of text, or summarizing documents (PDFs, docx, text files, etc.) without needing to go through the chat window are still on the to-do list.

     

    Your feedback is, of course, appreciated!

     

    Download here: Ayai · GPT Nexus on Github

     

    (Screenshot)

     

     

    Looking forward to hearing what you think!

     

  13. Window Navigator v1.2.0

    I successfully rewrote the program to rely solely on the Accessibility API 🎉

    With all AppleScript components removed, the program is now more predictable, more reliable, and faster.
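    For anyone curious what that looks like under the hood, here is a minimal sketch of reading an application's window titles via the Accessibility API; illustrative only, not the workflow's actual code:

    import ApplicationServices

    // Requires the Accessibility permission (System Settings > Privacy & Security).
    func windowTitles(forPID pid: pid_t) -> [String] {
        let app = AXUIElementCreateApplication(pid)
        var windowsRef: CFTypeRef?
        guard AXUIElementCopyAttributeValue(app, kAXWindowsAttribute as CFString, &windowsRef) == .success,
              let windows = windowsRef as? [AXUIElement] else { return [] }
        return windows.compactMap { window in
            var titleRef: CFTypeRef?
            AXUIElementCopyAttributeValue(window, kAXTitleAttribute as CFString, &titleRef)
            return titleRef as? String
        }
    }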

     

  14. Curious. You can try the updated version now to see if the problem is fixed!

    FYI, if you want the workflow to be more responsive, you can install the Xcode Command Line Tools to create and use a compiled version.

     

    xcode-select --install

     

  15. What I was missing in the existing window switchers was a way to navigate between windows of the same application that are scattered across different desktop spaces, so I created one that does just that 😄

     

    Window Navigator


    Navigate to any window of the currently focused application or any application across all desktops, or switch windows within the current desktop space.

     

    Download on GitHub

    Usage

    1. Search the windows of the active app globally using the Navigator keyword.
    2. Search app windows in the current desktop space using the Switcher keyword.
    3. Search all visible windows of all apps globally using the Global keyword.
    • ⏎ to navigate to the selected window.
    • ⌘⏎ to close the selected window.
    • ⌥⏎ to quit the owning application.
    • Configure the hotkeys for quick access.

     

    1. Navigator


    2. Switcher


    3. Global


  16. For instance, to allow a theme to be specified for a particular Text View or Script Filter instance.

     

    Possible implementation:

    {
      "alfredworkflow" : {
        "arg" : "{var:runtime_argument}",
        "config" : {
          "fontsizing" : "{var:text_view_font_size}"
        },
        "variables" : {
          "alfred_theme" : "{var:text_view_theme_override}"
        }
      }
    }

     

    Where text_view_theme_override has been set to, e.g., theme.custom.914F4F4C-49AF-499A-A4BE-7BD657F7D4F6 in the environment variables.

  17. Neat! There are a few minor problems with your initial release:

    • You forgot to remove your compiled versions of Workflow.swift, altr, and Notifier from the workflow
    • You are not referencing the environment variable keyword in your script filter (should be: {var:keyword})
    • You probably want to remove the 6MB preview gif as it is not even animated in the configuration preview 😄

     

    One thing that is important to me is being able to quickly inspect a workflow's cache and data folders.

    If you're interested in adding this use case and want some inspiration, you could take a look at the approach I took to add this to Acidham's workflow: #15
