SAFIA LLM Providers: Lunos, Groq, OpenAI and Custom
Configure SAFIA to use Lunos, Groq, OpenAI, or any OpenAI-compatible API — including OpenRouter and LM Studio — for chat completions and vision.
SAFIA supports four LLM providers out of the box. All use an OpenAI-compatible API interface, so switching between them is a matter of changing a few lines in your .env. You choose your provider during safia setup and can change it at any time with safia config.
Voice transcription always uses Groq (Whisper) — regardless of which provider handles your chat completions. If you want to enable voice messages, you need a GROQ_API_KEY in your .env even when using a different primary provider. The only exception is when LLM_PROVIDER=groq, in which case your LLM_API_KEY is reused for transcription automatically.
Lunos is the default provider and requires no extra URL configuration: SAFIA knows the endpoint automatically when you set LLM_PROVIDER=lunos.
What is Lunos?
Lunos is an AI gateway that provides access to a range of models via an OpenAI-compatible API. It’s the recommended starting point for SAFIA users.
.env configuration
LLM_PROVIDER=lunos
LLM_API_KEY=sk-your-lunos-api-key
# Optional: override the default model
LLM_MODEL=openai/gpt-oss-120b
# Always required for voice messages
GROQ_API_KEY=gsk_your-groq-api-key
The default model openai/gpt-oss-120b works well for SAFIA’s financial assistant tasks. You can change LLM_MODEL to any model ID available on your Lunos account.
Vision (receipt scanning)
Lunos also handles receipt photo scanning via the VISION_MODEL setting. The default vision model works out of the box.
Groq offers fast inference with a generous free tier, making it a great choice if you want low latency or are getting started without spending money. When Groq is your primary provider, your LLM_API_KEY is also used for voice transcription, so you don’t need a separate GROQ_API_KEY.
What is Groq?
Groq runs open-weight models (LLaMA, Mixtral, etc.) on custom LPU hardware, delivering very low token-generation latency.
.env configuration
LLM_PROVIDER=groq
LLM_API_KEY=gsk_your-groq-api-key
# LLM_API_KEY is automatically reused for Whisper voice transcription
# when LLM_PROVIDER=groq, so no separate GROQ_API_KEY is needed.
LLM_MODEL=llama-3.3-70b-versatile
Groq’s free tier has per-minute and per-day rate limits. If your bot serves many users, you may hit these limits during peak usage. Consider lowering DAILY_MESSAGE_LIMIT in your .env to stay within Groq’s free tier quotas.
Use this provider to connect SAFIA directly to OpenAI’s official API. This gives you access to GPT-4o and other flagship OpenAI models.
.env configuration
LLM_PROVIDER=openai
LLM_API_KEY=sk-your-openai-api-key
LLM_MODEL=gpt-4o
# Still required for voice transcription
GROQ_API_KEY=gsk_your-groq-api-key
gpt-4o supports vision natively. If it is your LLM_MODEL, set VISION_MODEL=gpt-4o to use the same model for both chat and receipt scanning, which keeps your setup simple.
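As a sketch, a complete .env for this single-model setup could look like the following. Both key values are placeholders to replace with your own.

```shell
# .env sketch: one OpenAI model for both chat and receipt scanning
LLM_PROVIDER=openai
LLM_API_KEY=sk-your-openai-api-key
LLM_MODEL=gpt-4o
VISION_MODEL=gpt-4o
# Still required for voice transcription
GROQ_API_KEY=gsk_your-groq-api-key
```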
Use this provider to connect SAFIA to any OpenAI-compatible API, including aggregator services like OpenRouter, local servers like LM Studio or Ollama, or your own inference endpoint.
.env configuration
LLM_PROVIDER=custom
LLM_BASE_URL=https://your-api-endpoint.com/v1
LLM_API_KEY=your-api-key
LLM_MODEL=your-model-id
# Still required for voice transcription
GROQ_API_KEY=gsk_your-groq-api-key
LLM_BASE_URL must be an OpenAI-compatible endpoint. SAFIA appends /chat/completions and /models to this base URL.
OpenRouter
OpenRouter aggregates hundreds of models from multiple providers behind a single API key.
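A minimal .env sketch for OpenRouter might look like this. The base URL is OpenRouter's OpenAI-compatible endpoint; the API key and model ID are placeholders, not real values.

```shell
# .env sketch for OpenRouter (key and model ID are placeholders)
LLM_PROVIDER=custom
LLM_BASE_URL=https://openrouter.ai/api/v1
LLM_API_KEY=sk-or-your-openrouter-api-key
LLM_MODEL=your-openrouter-model-id
# Still required for voice transcription
GROQ_API_KEY=gsk_your-groq-api-key
```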
You can use any model listed on openrouter.ai/models; paste its ID as LLM_MODEL.
LM Studio
LM Studio runs local models and exposes an OpenAI-compatible server on your machine. Start the local server in LM Studio and note the port (default 1234).
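Assuming the server is running on the default port, an .env sketch for LM Studio could look like this. The model ID is a placeholder for whichever model you have loaded, and the API key is an arbitrary string.

```shell
# .env sketch for LM Studio on the default port 1234
LLM_PROVIDER=custom
LLM_BASE_URL=http://localhost:1234/v1
# Arbitrary placeholder string
LLM_API_KEY=lm-studio
# Placeholder: use the ID of the model loaded in LM Studio
LLM_MODEL=your-loaded-model-id
# Still required for voice transcription
GROQ_API_KEY=gsk_your-groq-api-key
```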
The LLM_API_KEY value is not validated by LM Studio but must be non-empty.
Ollama (OpenAI compatibility mode)
Ollama exposes an OpenAI-compatible endpoint under its built-in /v1 compatibility path (http://localhost:11434/v1 by default). Setting OLLAMA_HOST=0.0.0.0 makes Ollama listen on all network interfaces, which is only needed when SAFIA reaches it from another machine or container.
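For a local Ollama instance on its default port (11434), an .env sketch might look like the following. The API key is an arbitrary placeholder: Ollama does not check it, and this sketch assumes SAFIA still expects a non-empty value, as it does for LM Studio.

```shell
# .env sketch for Ollama via its OpenAI-compatible /v1 path
LLM_PROVIDER=custom
LLM_BASE_URL=http://localhost:11434/v1
# Placeholder; assumed to be required non-empty, ignored by Ollama
LLM_API_KEY=ollama
LLM_MODEL=llama3.2
# Still required for voice transcription
GROQ_API_KEY=gsk_your-groq-api-key
```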
Make sure the model is already pulled (ollama pull llama3.2) before starting SAFIA.
Local models (LM Studio, Ollama) run on your own hardware. Performance and quality depend entirely on your machine’s GPU/CPU and the model you load. SAFIA’s financial reasoning prompts work best with models that have at least 7B parameters.
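Because SAFIA calls /chat/completions and /models under LLM_BASE_URL, one way to sanity-check a custom endpoint before starting the bot is to build those URLs yourself and request the model list. The helper name endpoint_url is hypothetical, and stripping a trailing slash is a choice made for this sketch, not documented SAFIA behavior.

```shell
# Hypothetical helper: join a base URL and an API path with exactly one "/".
endpoint_url() {
  # $1 = base URL, $2 = path such as "models" or "chat/completions"
  printf '%s/%s\n' "${1%/}" "$2"
}

# Placeholder value; use the LLM_BASE_URL from your own .env.
LLM_BASE_URL="http://localhost:1234/v1"

# Print the two URLs that would be requested under this base URL.
endpoint_url "$LLM_BASE_URL" "models"
endpoint_url "$LLM_BASE_URL" "chat/completions"

# To query the server itself, uncomment the next line. A JSON list of models
# suggests the endpoint speaks the OpenAI-compatible protocol.
# curl -sS "$(endpoint_url "$LLM_BASE_URL" models)" -H "Authorization: Bearer $LLM_API_KEY"
```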
Regardless of your LLM_PROVIDER choice, voice messages are transcribed using Whisper via Groq. This is a separate API call that uses GROQ_API_KEY, or your LLM_API_KEY when LLM_PROVIDER=groq.
Why does voice always use Groq?
Groq provides one of the fastest and cheapest Whisper implementations available. Keeping transcription on a single provider simplifies configuration and ensures consistent latency regardless of which LLM you use for chat.
To enable voice messages, add your Groq API key to .env:
GROQ_API_KEY=gsk_your-groq-api-key
If GROQ_API_KEY is absent and LLM_PROVIDER is not groq, voice messages are silently disabled — the bot will ignore audio messages.
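The key-selection rule can be sketched as a small shell check that you could run against your own settings. The function names are hypothetical and this is not SAFIA's actual code. It also assumes that GROQ_API_KEY takes precedence when both keys are set, which the text above does not specify.

```shell
# Hypothetical helpers mirroring the documented rule; not SAFIA's implementation.
transcription_key() {
  if [ -n "${GROQ_API_KEY:-}" ]; then
    # A dedicated Groq key is present (assumed to take precedence).
    printf '%s' "$GROQ_API_KEY"
  elif [ "${LLM_PROVIDER:-}" = "groq" ]; then
    # Groq is the primary provider, so its key is reused for Whisper.
    printf '%s' "${LLM_API_KEY:-}"
  fi
}

voice_enabled() {
  # Voice works only when some Groq key can be resolved.
  [ -n "$(transcription_key)" ]
}

if voice_enabled; then
  echo "voice messages: enabled"
else
  echo "voice messages: disabled"
fi
```

With LLM_PROVIDER=openai and no GROQ_API_KEY, the check reports voice as disabled; with LLM_PROVIDER=groq and only LLM_API_KEY set, it reports enabled.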
You can switch providers at any time without reinstalling. Run safia config, select AI Provider & API Key, choose the new provider, enter the new key, and save. Then restart:
safia config
safia restart
Switching providers does not affect your stored data (transactions, debts, portfolios, knowledge base documents). Only the LLM used for new conversations changes.