infra: ai compose profile — Ollama + LiteLLM for local dev #86

alvis · 2026-04-17T08:10:11Z

Goal

Make it possible to run the full AI stack locally without depending on Agap services.

Services

ollama — image ollama/ollama, port 127.0.0.1:11434:11434, volume /mnt/ssd/dbs/oo/ollama for model weights
litellm — image ghcr.io/berriai/litellm:main-latest, port 127.0.0.1:4000:4000, config mounts infra/litellm/config.yaml
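A minimal sketch of how these two services might look under an `ai` profile in `docker-compose.yml`. Images, ports, and host paths come from the list above. Everything else is an assumption: the container-side paths (`/root/.ollama`, `/app/config.yaml`) and the `--config` flag follow the usual conventions of the upstream images, `container_name: ollama` is added so the `docker exec ollama ...` command in the notes resolves, and the relative mount assumes the compose file sits at the repo root.

```yaml
services:
  ollama:
    image: ollama/ollama
    container_name: ollama            # assumption: matches `docker exec ollama ...`
    profiles: ["ai"]
    ports:
      - "127.0.0.1:11434:11434"       # bound to loopback only
    volumes:
      # assumption: /root/.ollama is where the image keeps model weights
      - /mnt/ssd/dbs/oo/ollama:/root/.ollama

  litellm:
    image: ghcr.io/berriai/litellm:main-latest
    profiles: ["ai"]
    depends_on:
      - ollama
    ports:
      - "127.0.0.1:4000:4000"
    volumes:
      # assumption: compose file lives at the repo root
      - ./infra/litellm/config.yaml:/app/config.yaml:ro
    command: ["--config", "/app/config.yaml", "--port", "4000"]
    environment:
      # only needed for the `judge` model; left empty if unset on the host
      ANTHROPIC_API_KEY: ${ANTHROPIC_API_KEY:-}
```

Because both services carry `profiles: ["ai"]`, a plain `docker compose up` would leave them stopped; they start only when the `ai` profile is requested.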

Env vars to document

OLLAMA_URL — defaults to http://localhost:11434; in prod points to Agap Ollama
LITELLM_URL — defaults to http://localhost:4000; in prod points to llm.alogins.net

LiteLLM config (`infra/litellm/config.yaml`)

model_list:
  - model_name: tip-generator
    litellm_params:
      model: ollama/qwen2.5:7b
      api_base: http://ollama:11434
  - model_name: embedder
    litellm_params:
      model: ollama/nomic-embed-text
      api_base: http://ollama:11434
  - model_name: judge
    litellm_params:
      model: anthropic/claude-haiku-4-5-20251001
      # only used in offline sim; requires ANTHROPIC_API_KEY

Notes

Add ai profile to existing docker-compose.yml
Start with: docker compose --profile ai up
When running with --profile core or --profile full, the production Agap services (Ollama at localhost:11434, LiteLLM at llm.alogins.net) are used instead: set OLLAMA_URL and LITELLM_URL to point at them
Pull models on first start: docker exec ollama ollama pull qwen2.5:7b && docker exec ollama ollama pull nomic-embed-text

alvis added this to the M2 — AI tips + multi-source signals milestone 2026-04-17 08:10:11 +00:00

alvis closed this issue

2026-04-17 14:22:49 +00:00

alvis referenced this issue from a commit

2026-04-24 15:10:18 +00:00

feat: M2 AI tips — LiteLLM gateway, context assembler, end-to-end generation pipeline

alvis referenced this issue from a commit

2026-05-12 15:37:06 +00:00

chore(m2): close out remaining loose ends (#80, #86, #90)
