diff --git a/CLAUDE.md b/CLAUDE.md index bec8897..a8d0109 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -66,6 +66,7 @@ docs/ architecture notes, ADRs, API specs - ADRs go in `docs/adr/NNNN-title.md` for any decision that constrains future work. - No secrets in repo. Local dev via `.env.local` (gitignored), prod via the server's secret store (Vaultwarden now; k8s secrets later). - Compose profiles: `core` (api + web + admin), `full` (adds ml-serving), `mlops` (adds MLflow), `ai` (adds Ollama + LiteLLM). Mix as needed. +- Run Python agent tests from repo root: `python3 -m pytest ml/agents/tests/ -x -q` (tests add repo root to `sys.path` themselves). ## Definition of done (per feature) @@ -90,6 +91,8 @@ Env vars: `LITELLM_URL` (prod `https://llm.alogins.net`), `OLLAMA_URL` (Agap hos Ollama and LiteLLM are **shared Agap services**, not oO services — they live in `agap_git/openai/docker-compose.yml` along with langfuse (observability). oO never starts them; ml-serving just calls the alias. +All `httpx` calls in `ml/` must use `trust_env=False` to bypass the system proxy — same rule as `bw` and curl. Pattern: `httpx.Client(trust_env=False, timeout=N)`. + **Multi-agent tip generation pipeline (ADR-0013):** 1. Pre-compute agents (`ml/agents//`) run on a schedule, each emitting a snippet into `agent_outputs` with a per-agent TTL 2. On request, `recommender` (TS) loads the eligible agent set (registry-driven, ADR-0014) and pulls the freshest non-expired snippets @@ -108,10 +111,10 @@ Recent completions: - ADR-0012 — ε-greedy v2 (D=12) — 2026-04-26 (now superseded by ADR-0013) - ADR-0014 complete: unified Profile schema + backfill, manifest plumbing, `/api/profile` read-through, registry-driven eligibility filter, inference framework + per-agent inference, legacy consent column drop — 2026-05-05 - Rich per-agent inference for all four active agents (#112, #114, #115, #116) — 2026-05-06: quiet/peak hours (time-of-day), z-score baseline (momentum), p50 lateness + project realness (overdue-task), adaptive lookback + weekly/daily cycles (recent-patterns) +- Semantic task clustering via nomic-embed-text + focus-area preferred_areas inference (#97, #113) — 2026-05-06: `ml/agents/clustering.py`, focus-area v2.0.0 Active work (M2): - Per-user feature freshness SLAs (#61, ADR-0011 phase B) -- Embedding-based task clustering for focus-area inference (#97, #113) ## ADR-0014 endpoint map (as of step 6) @@ -140,10 +143,12 @@ All five agents are at v1.2.0. Per-agent inferred params (all live in `ml/agents | `momentum` | `engagement_trend`, `baseline_completions_per_day`, `stdev` | Baseline = 28d rolling mean done/day; snippet uses z-score language | | `overdue-task` | `lateness_tolerance_days`, `project_realness` | Tolerance = p50 lateness from TaskCompletion history; realness = project median vs global median | | `recent-patterns` | `lookback_days`, `weekly_cycle`, `daily_cycle` | Lookback sized to ≥30 done events; cycles use peak-to-mean ratio; snippet hints when strength > 0.5 | -| `focus-area` | *(none yet)* | Needs project-level feedback linkage (#78) | +| `focus-area` | `preferred_areas` | Top-2 project IDs by task completion count; semantic clustering via `ml/agents/clustering.py` in compute() | `UserHistory` carries both `events: list[FeedbackEvent]` and `task_completions: list[TaskCompletion]`. `AgentInferRequest` (ml/serving) accepts `task_completions: list[dict]` alongside `feedback_history`. +`min_history` is checked against `len(history.events)` (feedback events), **not** `task_completions`. Agents that infer from completions should set `min_history=0` and guard inside `infer()`. + ## What NOT to do - Don't copy Todoist's data into our DB. Store the OAuth token + computed features/derivatives we need, fetch raw on demand.