Ported from taskpile experiments/clustering_eval (prompt v1, qwen2.5:1.5b).
The experiment showed ARI 0.22→0.77 and AUROC 0.76→0.91 on synthetic tasks
when embedding LLM-expanded descriptions instead of raw titles.

- Expand each task title via LiteLLM tip-generator before embedding
- Prefix with "clustering: " (nomic-embed-text task instruction prefix)
- Cache expansions in-memory by content hash within a compute cycle
- Falls back to raw title if enrichment fails; no change to fallback behaviour

Fixes #129

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
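For reviewers, the enrichment flow described above can be sketched roughly as follows. This is an illustration only, not the code in this change: `build_embedding_inputs`, the `expand` callable (standing in for the LiteLLM tip-generator call), and the choice of SHA-256 as the content hash are all hypothetical. The sketch also assumes the "clustering: " prefix is applied on the fallback path and that fallbacks are cached too, neither of which the message states.

```python
import hashlib
from typing import Callable, Dict, List, Optional

# Task instruction prefix for nomic-embed-text, as named in the commit message.
PREFIX = "clustering: "


def build_embedding_inputs(
    titles: List[str],
    expand: Callable[[str], Optional[str]],  # hypothetical stand-in for the LLM expansion call
    cache: Optional[Dict[str, str]] = None,  # per-compute-cycle cache, keyed by content hash
) -> List[str]:
    """Return the strings to embed: expanded title if available, else the raw title."""
    if cache is None:
        cache = {}
    inputs = []
    for title in titles:
        # Assumption: SHA-256 of the title text serves as the content hash.
        key = hashlib.sha256(title.encode("utf-8")).hexdigest()
        if key not in cache:
            try:
                expanded = expand(title)
            except Exception:
                expanded = None  # enrichment failed; fall back below
            has_text = bool(expanded and expanded.strip())
            # Fall back to the raw title when expansion fails or is empty.
            cache[key] = expanded.strip() if has_text else title
        inputs.append(PREFIX + cache[key])
    return inputs
```

Passing a fresh `cache` dict (or `None`) at the start of each compute cycle keeps expansions from leaking across cycles, while duplicate titles within one cycle trigger only a single expansion call.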