docs: update CLAUDE.md with session learnings (#97, #113)

- focus-area v2.0.0 completion in recent completions; remove from active work
- Update focus-area inferred params table row
- min_history gotcha: checked against events, not task_completions
- httpx trust_env=False rule for ml/ code
- Agent test command

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-06 06:56:17 +00:00
parent 26fc67776f
commit a75be0d832
1 changed files with 7 additions and 2 deletions
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -66,6 +66,7 @@ docs/              architecture notes, ADRs, API specs
 - ADRs go in `docs/adr/NNNN-title.md` for any decision that constrains future work.
 - No secrets in repo. Local dev via `.env.local` (gitignored), prod via the server's secret store (Vaultwarden now; k8s secrets later).
 - Compose profiles: `core` (api + web + admin), `full` (adds ml-serving), `mlops` (adds MLflow), `ai` (adds Ollama + LiteLLM). Mix as needed.
+- Run Python agent tests from repo root: `python3 -m pytest ml/agents/tests/ -x -q` (tests add repo root to `sys.path` themselves).

 ## Definition of done (per feature)

@@ -90,6 +91,8 @@ Env vars: `LITELLM_URL` (prod `https://llm.alogins.net`), `OLLAMA_URL` (Agap hos

 Ollama and LiteLLM are **shared Agap services**, not oO services — they live in `agap_git/openai/docker-compose.yml` along with langfuse (observability). oO never starts them; ml-serving just calls the alias.

+All `httpx` calls in `ml/` must use `trust_env=False` to bypass the system proxy — same rule as `bw` and curl. Pattern: `httpx.Client(trust_env=False, timeout=N)`.
+
 **Multi-agent tip generation pipeline (ADR-0013):**
 1. Pre-compute agents (`ml/agents/<id>/`) run on a schedule, each emitting a snippet into `agent_outputs` with a per-agent TTL
 2. On request, `recommender` (TS) loads the eligible agent set (registry-driven, ADR-0014) and pulls the freshest non-expired snippets
@@ -108,10 +111,10 @@ Recent completions:
 - ADR-0012 — ε-greedy v2 (D=12) — 2026-04-26 (now superseded by ADR-0013)
 - ADR-0014 complete: unified Profile schema + backfill, manifest plumbing, `/api/profile` read-through, registry-driven eligibility filter, inference framework + per-agent inference, legacy consent column drop — 2026-05-05
 - Rich per-agent inference for all four active agents (#112, #114, #115, #116) — 2026-05-06: quiet/peak hours (time-of-day), z-score baseline (momentum), p50 lateness + project realness (overdue-task), adaptive lookback + weekly/daily cycles (recent-patterns)
+- Semantic task clustering via nomic-embed-text + focus-area preferred_areas inference (#97, #113) — 2026-05-06: `ml/agents/clustering.py`, focus-area v2.0.0

 Active work (M2):
 - Per-user feature freshness SLAs (#61, ADR-0011 phase B)
- Embedding-based task clustering for focus-area inference (#97, #113)

 ## ADR-0014 endpoint map (as of step 6)

@@ -140,10 +143,12 @@ All five agents are at v1.2.0. Per-agent inferred params (all live in `ml/agents
 | `momentum` | `engagement_trend`, `baseline_completions_per_day`, `stdev` | Baseline = 28d rolling mean done/day; snippet uses z-score language |
 | `overdue-task` | `lateness_tolerance_days`, `project_realness` | Tolerance = p50 lateness from TaskCompletion history; realness = project median vs global median |
 | `recent-patterns` | `lookback_days`, `weekly_cycle`, `daily_cycle` | Lookback sized to ≥30 done events; cycles use peak-to-mean ratio; snippet hints when strength > 0.5 |
-| `focus-area` | *(none yet)* | Needs project-level feedback linkage (#78) |
+| `focus-area` | `preferred_areas` | Top-2 project IDs by task completion count; semantic clustering via `ml/agents/clustering.py` in compute() |

 `UserHistory` carries both `events: list[FeedbackEvent]` and `task_completions: list[TaskCompletion]`. `AgentInferRequest` (ml/serving) accepts `task_completions: list[dict]` alongside `feedback_history`.

+`min_history` is checked against `len(history.events)` (feedback events), **not** `task_completions`. Agents that infer from completions should set `min_history=0` and guard inside `infer()`.
+
 ## What NOT to do

 - Don't copy Todoist's data into our DB. Store the OAuth token + computed features/derivatives we need, fetch raw on demand.