Actually, you can use Ollama (https://github.com/ollama/ollama) to serve many open-source models locally (including Gemma from Google, Llama 2 from Meta, Mistral from the French startup of the same name, and many others ...).
Ollama is compatible with the OpenAI API, so I just hacked the current workflow a bit by changing `chatgpt_api_endpoint` to `http://localhost:11434`
+ changing the model name and label in the user configuration.
And to my great surprise it works ...
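For anyone who wants to try the same thing, here is roughly what the local setup looks like. This is a sketch, not the workflow's own instructions: the model name `llama2` is just an example, and the `/v1/chat/completions` path comes from Ollama's OpenAI-compatible API.

```shell
# Download a model first (Ollama listens on http://localhost:11434 by default;
# on macOS the server usually runs automatically after installation)
ollama pull llama2

# Quick sanity check against the OpenAI-compatible chat endpoint
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama2",
    "messages": [{"role": "user", "content": "Say hello"}]
  }'
```

If the curl call returns a JSON completion, pointing the workflow's endpoint at `http://localhost:11434` should work the same way.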
Of course, your Mac needs to be powerful enough. I forgot to mention that I have an M3 Max with 48 GB of RAM, but I'm pretty sure it would work reasonably well on an M1 with 8 GB of RAM. On my M3, answers come back very fast.
It would be good to have an update of the workflow that adds the option (plus a how-to) to use Ollama ...