Ported from taskpile experiments/clustering_eval (prompt v1, qwen2.5:1.5b).
The experiment showed ARI 0.22→0.77 and AUROC 0.76→0.91 on synthetic tasks
when embedding LLM-expanded descriptions instead of raw titles.
- Expand each task title via LiteLLM tip-generator before embedding
- Prefix with "clustering: " (nomic-embed-text task instruction prefix)
- Cache expansions in-memory by content hash within a compute cycle
- Falls back to raw title if enrichment fails; no change to fallback behaviour
Fixes#129
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The old code called Ollama's /api/embeddings one task at a time, which caused
silent fallback to project-based grouping when host.docker.internal:11434 was
unreachable from the ml-serving container.
- Switch to LiteLLM /embeddings (model alias "embedder") as primary path
- Batch all task contents in one request instead of N serial calls
- Fall back to Ollama /api/embed (updated to current API) when LITELLM_URL is absent
- Update tests to mock _embed_batch instead of the removed _embed
Fixes#123
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>