SAFIA LLM Providers: Lunos, Groq, OpenAI and Custom

SAFIA supports four LLM providers out of the box. All use an OpenAI-compatible API interface, so switching between them is a matter of changing a few lines in your .env. You choose your provider during safia setup and can change it at any time with safia config.

Voice transcription always uses Groq (Whisper) — regardless of which provider handles your chat completions. If you want to enable voice messages, you need a GROQ_API_KEY in your .env even when using a different primary provider. The only exception is when LLM_PROVIDER=groq, in which case your LLM_API_KEY is reused for transcription automatically.

Provider comparison

Provider	Best for	Free tier	Notes
Lunos	Default, multilingual	Check site	OpenAI-compatible gateway; default model is `openai/gpt-oss-120b`
Groq	Low-latency inference	Yes	Excellent free tier; also handles voice transcription
OpenAI	Official GPT models	No	Direct access to GPT-4o, GPT-4 Turbo, GPT-3.5
Custom	Any compatible API	Varies	OpenRouter, LM Studio, Ollama, or your own server

Configuration by provider

Lunos
Groq
OpenAI
Custom

Lunos is the default provider and requires no extra URL configuration — SAFIA knows the endpoint automatically when you set LLM_PROVIDER=lunos.What is Lunos? Lunos is an AI gateway that provides access to a range of models via an OpenAI-compatible API. It’s the recommended starting point for SAFIA users..env configuration

LLM_PROVIDER=lunos
LLM_API_KEY=sk-your-lunos-api-key

# Optional: override the default model
LLM_MODEL=openai/gpt-oss-120b

# Always required for voice messages
GROQ_API_KEY=gsk_your-groq-api-key

The default model openai/gpt-oss-120b works well for SAFIA’s financial assistant tasks. You can change LLM_MODEL to any model ID available on your Lunos account.Vision (receipt scanning)Lunos also handles receipt photo scanning via the VISION_MODEL setting. The default vision model works out of the box:

VISION_MODEL=mistralai/mistral-small-3.2-24b-instruct

Groq offers fast inference with a generous free tier, making it a great choice if you want low latency or are getting started without spending money. When Groq is your primary provider, your LLM_API_KEY is also used for voice transcription — you don’t need a separate GROQ_API_KEY.What is Groq? Groq runs open-weight models (LLaMA, Mixtral, etc.) on custom LPU hardware, delivering very low token-generation latency..env configuration

LLM_PROVIDER=groq
LLM_API_KEY=gsk_your-groq-api-key

# LLM_API_KEY is automatically reused for Whisper voice transcription
# when LLM_PROVIDER=groq — no separate GROQ_API_KEY needed.

LLM_MODEL=llama-3.3-70b-versatile

Recommended models

Model ID	Context	Notes
`llama-3.3-70b-versatile`	128k	Strong general reasoning; good default
`mixtral-8x7b-32768`	32k	Efficient mixture-of-experts model
`llama-3.1-8b-instant`	128k	Very fast; useful for simple queries

Get your Groq API key at console.groq.com.

Groq’s free tier has per-minute and per-day rate limits. If your bot serves many users, you may hit these limits during peak usage. Consider lowering DAILY_MESSAGE_LIMIT in your .env to stay within Groq’s free tier quotas.

Use this provider to connect SAFIA directly to OpenAI’s official API. This gives you access to GPT-4o and other flagship OpenAI models..env configuration

LLM_PROVIDER=openai
LLM_API_KEY=sk-your-openai-api-key

LLM_MODEL=gpt-4o

# Still required for voice transcription
GROQ_API_KEY=gsk_your-groq-api-key

Recommended models

Model ID	Notes
`gpt-4o`	Best quality; multimodal (handles vision too)
`gpt-4-turbo`	Strong reasoning; larger context window
`gpt-3.5-turbo`	Faster and cheaper; good for high-volume use

Get your OpenAI API key at platform.openai.com.

If you use gpt-4o as your LLM_MODEL, it also supports vision natively. Set VISION_MODEL=gpt-4o to use the same model for both chat and receipt scanning, keeping your setup simple.

Use this provider to connect SAFIA to any OpenAI-compatible API — including aggregator services like OpenRouter, local servers like LM Studio or Ollama, or your own inference endpoint..env configuration

LLM_PROVIDER=custom
LLM_BASE_URL=https://your-api-endpoint.com/v1
LLM_API_KEY=your-api-key

LLM_MODEL=your-model-id

# Still required for voice transcription
GROQ_API_KEY=gsk_your-groq-api-key

LLM_BASE_URL must be an OpenAI-compatible endpoint. SAFIA appends /chat/completions and /models to this base URL.

OpenRouterOpenRouter aggregates hundreds of models from multiple providers behind a single API key.

LLM_PROVIDER=custom
LLM_BASE_URL=https://openrouter.ai/api/v1
LLM_API_KEY=sk-or-your-openrouter-key
LLM_MODEL=anthropic/claude-3.5-sonnet

You can use any model listed on openrouter.ai/models — paste its ID as LLM_MODEL.

LM StudioLM Studio runs local models and exposes an OpenAI-compatible server on your machine. Start the local server in LM Studio and note the port (default 1234).

LLM_PROVIDER=custom
LLM_BASE_URL=http://localhost:1234/v1
LLM_API_KEY=lm-studio
LLM_MODEL=your-loaded-model-name

The LLM_API_KEY value is not validated by LM Studio but must be non-empty.

Ollama (OpenAI compatibility mode)Ollama exposes an OpenAI-compatible endpoint when started with the environment variable OLLAMA_HOST=0.0.0.0 or accessed via its built-in compatibility path.

LLM_PROVIDER=custom
LLM_BASE_URL=http://localhost:11434/v1
LLM_API_KEY=ollama
LLM_MODEL=llama3.2

Make sure the model is already pulled (ollama pull llama3.2) before starting SAFIA.

Local models (LM Studio, Ollama) run on your own hardware. Performance and quality depend entirely on your machine’s GPU/CPU and the model you load. SAFIA’s financial reasoning prompts work best with models that have at least 7B parameters.

Voice transcription

Regardless of your LLM_PROVIDER choice, voice messages are transcribed using Whisper via Groq. This is a separate API call using a separate key.

Why does voice always use Groq?

Groq provides one of the fastest and cheapest Whisper implementations available. Keeping transcription on a single provider simplifies configuration and ensures consistent latency regardless of which LLM you use for chat.

To enable voice messages, add your Groq API key to .env:

GROQ_API_KEY=gsk_your-groq-api-key

If GROQ_API_KEY is absent and LLM_PROVIDER is not groq, voice messages are silently disabled — the bot will ignore audio messages.

Switching providers

You can switch providers at any time without reinstalling. Run safia config, select AI Provider & API Key, choose the new provider, enter the new key, and save. Then restart:

safia config
safia restart

Switching providers does not affect your stored data (transactions, debts, portfolios, knowledge base documents). Only the LLM used for new conversations changes.

​Provider comparison

​Configuration by provider

​Voice transcription

​Switching providers

Provider comparison

Configuration by provider

Voice transcription

Switching providers