diff --git a/CLAUDE.md b/CLAUDE.md
index eac1f68..f0db4e7 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -92,7 +92,7 @@ oO generates tips through a multi-agent pipeline (ADR-0013): pre-compute agents
 | Alias | Model | Used by |
 |-------|-------|---------|
 | `tip-generator` | qwen2.5:1.5b (default) | `ml/serving` tip generation |
-| `embedder` | nomic-embed-text | task clustering, dedup |
+| `embedder` | nomic-embed-text | task clustering (after LLM enrichment), dedup |
 | `judge` | claude-haiku-4-5 (cloud, eval only) | offline sim |
 
 Env vars: `LITELLM_URL` (prod `https://llm.alogins.net`), `OLLAMA_URL` (Agap host, `http://host.docker.internal:11434` from containers).
@@ -197,7 +197,7 @@ Recent completions:
 - ADR-0012 — ε-greedy v2 (D=12) — 2026-04-26 (now superseded by ADR-0013)
 - ADR-0014 complete: unified Profile schema + backfill, manifest plumbing, `/api/profile` read-through, registry-driven eligibility filter, inference framework + per-agent inference, legacy consent column drop — 2026-05-05
 - Rich per-agent inference for all four active agents (#112, #114, #115, #116) — 2026-05-06: quiet/peak hours (time-of-day), z-score baseline (momentum), p50 lateness + project realness (overdue-task), adaptive lookback + weekly/daily cycles (recent-patterns)
-- Semantic task clustering via nomic-embed-text + focus-area preferred_areas inference (#97, #113) — 2026-05-06: `ml/agents/clustering.py`, focus-area v2.0.0
+- Semantic task clustering via nomic-embed-text + LLM enrichment (#97, #113, #129) — 2026-05-12: `ml/agents/clustering.py`; titles expanded via `tip-generator` before embedding; persistent cache in `task_enrichments` table; recompute gated on task-list hash change; focus-area v3.0.0 outputs all clusters with enriched descriptions
 - Per-user feature freshness SLAs (#61) — 2026-05-06: `invalidated_by` mirrored into `ProfileFeature`; drift-detection test added
 - MLflow tracing added to `ml/serving` for all agent calls — 2026-05-06: `ml/serving/mlflow_client.py`;
 activated by `MLFLOW_TRACKING_URI=http://mlflow:5000` (default in compose `full` profile); requires `--profile mlops` for the MLflow container. Issue #118 (M4) tracks removal from production critical path.
@@ -223,7 +223,7 @@ Lives in `ml/agents/inference/`. `run_inference(manifest, history)` evaluates al
 - `infer()` error → emit `cold_start_default` (never crashes)
 - Results written to `user_preferences` with `source='inferred'`; keys with `source='user'` are never overwritten
 
-All five agents are at v1.2.0. Per-agent inferred params (all live in `ml/agents/.py`):
+Per-agent inferred params (all live in `ml/agents/.py`):
 
 | Agent | Inferred params | Notes |
 |-------|----------------|-------|
@@ -231,7 +231,7 @@ All five agents are at v1.2.0. Per-agent inferred params (all live in `ml/agents
 | `momentum` | `engagement_trend`, `baseline_completions_per_day`, `stdev` | Baseline = 28d rolling mean done/day; snippet uses z-score language |
 | `overdue-task` | `lateness_tolerance_days`, `project_realness` | Tolerance = p50 lateness from TaskCompletion history; realness = project median vs global median |
 | `recent-patterns` | `lookback_days`, `weekly_cycle`, `daily_cycle` | Lookback sized to ≥30 done events; cycles use peak-to-mean ratio; snippet hints when strength > 0.5 |
-| `focus-area` | `preferred_areas` | Top-2 project IDs by task completion count; semantic clustering via `ml/agents/clustering.py` in compute() |
+| `focus-area` | *(none)* | No inferred params. Clusters tasks via LLM-enriched embeddings and outputs all areas with expanded descriptions. Recomputes only when task list changes (hash-gated). |
 
 `UserHistory` carries both `events: list[FeedbackEvent]` and `task_completions: list[TaskCompletion]`. `AgentInferRequest` (ml/serving) accepts `task_completions: list[dict]` alongside `feedback_history`.
 
diff --git a/ml/agents/focus_area.py b/ml/agents/focus_area.py
index 9eca856..5f84d74 100644
--- a/ml/agents/focus_area.py
+++ b/ml/agents/focus_area.py
@@ -10,7 +10,7 @@ from .manifest import AgentManifest
 MANIFEST = AgentManifest(
     id="focus-area",
     version="3.0.0",  # output all clusters as context; no scoring (#129)
-    description="Clusters the user's task list and summarises all areas for the orchestrator.",
+    description="Clusters tasks semantically, enriches titles via LLM, and outputs a full area summary with expanded descriptions for the orchestrator.",
     pref_schema={"type": "object", "additionalProperties": False, "properties": {}},
     context_schema=["todoist.tasks"],
     required_consents=["data:core", "data:todoist"],
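
Reviewer note: the CLAUDE.md hunks say focus-area recomputes only when the task list changes (hash-gated), but this diff does not show the gating code. Below is a minimal sketch of one way such a gate could work, not the implementation in `ml/agents/clustering.py`. The function names `task_list_hash` and `should_recompute`, and the assumption that each task is a dict with `id` and `content` keys, are hypothetical and chosen only for illustration.

```python
import hashlib
import json
from typing import Optional


def task_list_hash(tasks: list[dict]) -> str:
    """Return an order-independent SHA-256 digest of the task list.

    Assumption (not from the diff): each task is a dict with "id" and
    "content" keys, and only those two fields affect clustering.
    """
    # Sort so that reordering the same tasks does not change the digest.
    canonical = sorted((str(t["id"]), t["content"]) for t in tasks)
    payload = json.dumps(canonical, ensure_ascii=False, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def should_recompute(tasks: list[dict], previous_hash: Optional[str]) -> tuple[bool, str]:
    """Return (recompute?, current_hash) given the last stored hash, if any."""
    current = task_list_hash(tasks)
    return current != previous_hash, current
```

With a gate like this, a caller would store the returned hash next to the cached clusters and skip enrichment and embedding whenever the first element of the result is `False`. Hashing only the fields that feed clustering keeps unrelated task edits (for example, a due-date change) from triggering a recompute, if that is the intended behaviour.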