docs: update CLAUDE.md with session learnings (#97, #113)

- Add focus-area v2.0.0 to recent completions; remove it from active work
- Update the focus-area row in the inferred params table
- Document min_history gotcha: it is checked against events, not task_completions
- Add httpx trust_env=False rule for ml/ code
- Add agent test command

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-06 06:56:17 +00:00
parent 26fc67776f
commit a75be0d832


@@ -66,6 +66,7 @@ docs/ architecture notes, ADRs, API specs
- ADRs go in `docs/adr/NNNN-title.md` for any decision that constrains future work.
- No secrets in repo. Local dev via `.env.local` (gitignored), prod via the server's secret store (Vaultwarden now; k8s secrets later).
- Compose profiles: `core` (api + web + admin), `full` (adds ml-serving), `mlops` (adds MLflow), `ai` (adds Ollama + LiteLLM). Mix as needed.
- Run Python agent tests from repo root: `python3 -m pytest ml/agents/tests/ -x -q` (tests add repo root to `sys.path` themselves).
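The note that tests add the repo root to `sys.path` themselves could be implemented along these lines. This is a minimal sketch, not the repository's actual code: the helper names `repo_root_for` and `ensure_on_sys_path` are hypothetical, and it assumes test files sit directly in `ml/agents/tests/`.

```python
import sys
from pathlib import Path


def repo_root_for(test_file: Path) -> Path:
    """Return the repo root for a file at <root>/ml/agents/tests/<name>.py."""
    # parents[0] = tests, parents[1] = agents, parents[2] = ml, parents[3] = root
    return test_file.parents[3]


def ensure_on_sys_path(root: Path) -> None:
    """Prepend root to sys.path once so top-level packages resolve."""
    entry = str(root)
    if entry not in sys.path:
        sys.path.insert(0, entry)
```

A test module would then call `ensure_on_sys_path(repo_root_for(Path(__file__).resolve()))` before importing from `ml`.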
## Definition of done (per feature)
@@ -90,6 +91,8 @@ Env vars: `LITELLM_URL` (prod `https://llm.alogins.net`), `OLLAMA_URL` (Agap hos
Ollama and LiteLLM are **shared Agap services**, not oO services — they live in `agap_git/openai/docker-compose.yml` along with langfuse (observability). oO never starts them; ml-serving just calls the alias.
All `httpx` calls in `ml/` must use `trust_env=False` to bypass the system proxy — same rule as `bw` and curl. Pattern: `httpx.Client(trust_env=False, timeout=N)`.
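One way to keep the rule from being forgotten is to build every client through a single helper. The sketch below is illustrative only: `ml_client_kwargs` and `make_client` are hypothetical names, and the 10-second default timeout is an assumption, not a project setting.

```python
from typing import Any


def ml_client_kwargs(timeout: float = 10.0) -> dict[str, Any]:
    """Keyword arguments for httpx.Client / httpx.AsyncClient in ml/ code."""
    # trust_env=False makes httpx ignore HTTP_PROXY / HTTPS_PROXY and friends,
    # so requests bypass the system proxy.
    return {"trust_env": False, "timeout": timeout}


def make_client(timeout: float = 10.0):
    """Build a proxy-free httpx.Client (httpx is imported lazily)."""
    import httpx

    return httpx.Client(**ml_client_kwargs(timeout))
```

Callers would use it as a context manager, for example `with make_client(timeout=5) as client: ...`, rather than constructing `httpx.Client` directly.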
**Multi-agent tip generation pipeline (ADR-0013):**
1. Pre-compute agents (`ml/agents/<id>/`) run on a schedule, each emitting a snippet into `agent_outputs` with a per-agent TTL
2. On request, `recommender` (TS) loads the eligible agent set (registry-driven, ADR-0014) and pulls the freshest non-expired snippets
@@ -108,10 +111,10 @@ Recent completions:
- ADR-0012 — ε-greedy v2 (D=12) — 2026-04-26 (now superseded by ADR-0013)
- ADR-0014 complete: unified Profile schema + backfill, manifest plumbing, `/api/profile` read-through, registry-driven eligibility filter, inference framework + per-agent inference, legacy consent column drop — 2026-05-05
- Rich per-agent inference for all four active agents (#112, #114, #115, #116) — 2026-05-06: quiet/peak hours (time-of-day), z-score baseline (momentum), p50 lateness + project realness (overdue-task), adaptive lookback + weekly/daily cycles (recent-patterns)
- Semantic task clustering via nomic-embed-text + focus-area preferred_areas inference (#97, #113) — 2026-05-06: `ml/agents/clustering.py`, focus-area v2.0.0
Active work (M2):
- Per-user feature freshness SLAs (#61, ADR-0011 phase B)
- Embedding-based task clustering for focus-area inference (#97, #113)
## ADR-0014 endpoint map (as of step 6)
@@ -140,10 +143,12 @@ All five agents are at v1.2.0. Per-agent inferred params (all live in `ml/agents
| `momentum` | `engagement_trend`, `baseline_completions_per_day`, `stdev` | Baseline = 28d rolling mean done/day; snippet uses z-score language |
| `overdue-task` | `lateness_tolerance_days`, `project_realness` | Tolerance = p50 lateness from TaskCompletion history; realness = project median vs global median |
| `recent-patterns` | `lookback_days`, `weekly_cycle`, `daily_cycle` | Lookback sized to ≥30 done events; cycles use peak-to-mean ratio; snippet hints when strength > 0.5 |
| `focus-area` | *(none yet)* | Needs project-level feedback linkage (#78) |
| `focus-area` | `preferred_areas` | Top-2 project IDs by task completion count; semantic clustering via `ml/agents/clustering.py` in `compute()` |
`UserHistory` carries both `events: list[FeedbackEvent]` and `task_completions: list[TaskCompletion]`. `AgentInferRequest` (ml/serving) accepts `task_completions: list[dict]` alongside `feedback_history`.
`min_history` is checked against `len(history.events)` (feedback events), **not** `task_completions`. Agents that infer from completions should set `min_history=0` and guard inside `infer()`.
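The `min_history=0` plus in-`infer()` guard pattern might look like the following for a completion-driven agent. This is a simplified sketch under stated assumptions: the dataclasses are stand-ins for the real `UserHistory` and `TaskCompletion`, the `project_id` field name and the `MIN_COMPLETIONS` threshold are hypothetical, and the real `infer()` signature may differ.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class TaskCompletion:
    project_id: str  # assumed field name


@dataclass
class UserHistory:
    events: list = field(default_factory=list)  # feedback events
    task_completions: list[TaskCompletion] = field(default_factory=list)


class FocusAreaSketch:
    # The framework gate counts len(history.events), so disable it here.
    min_history = 0
    # Hypothetical threshold, enforced inside infer() instead.
    MIN_COMPLETIONS = 5

    def infer(self, history: UserHistory) -> dict:
        if len(history.task_completions) < self.MIN_COMPLETIONS:
            return {}  # too few completions: infer nothing
        counts = Counter(c.project_id for c in history.task_completions)
        # Top-2 project IDs by completion count.
        return {"preferred_areas": [pid for pid, _ in counts.most_common(2)]}
```

With this shape, a user who has many completions but no feedback events still gets `preferred_areas`, which a non-zero `min_history` would have blocked.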
## What NOT to do
- Don't copy Todoist's data into our DB. Store the OAuth token + computed features/derivatives we need, fetch raw on demand.