Compare commits

53 Commits

Author SHA1 Message Date
ac1226c367 feat(integrations): migrate google-health from Fit REST to Google Health API v4
Google Fit REST API was closed to new sign-ups on 2024-05-01 and shuts down
end of 2026, surfacing as "Access blocked: this app's request is invalid"
when starting the OAuth flow.

- Swap the 10 fitness.* OAuth scopes for the 3 googlehealth.*.readonly
  scopes (activity_and_fitness, health_metrics_and_measurements, sleep).
- Replace fitness/v1 dataset:aggregate + sessions calls with
  health.googleapis.com/v4/users/me/dataTypes/{steps,total-calories,
  heart-rate,sleep}/dataPoints, filtered to today's window.
- Read the v4 DataPoint union defensively (the per-type schema is sparsely
  documented) and log the first raw sample at debug so we can refine field
  paths after the first real OAuth.
- Output Signal contract is unchanged — agents and downstream consumers
  see the same steps/activity/heart_rate/sleep signals.

Cloud Console still needs: enable Google Health API, add the 3 scopes to
the consent screen, add test user (all googlehealth scopes are Restricted).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 05:42:05 +00:00
2159d4cbd1 fix(infra): unblock docker builds for stars agent and web
- Dockerfile.ml: install build-essential so pyswisseph (stars agent) compiles
- Dockerfile.web: copy root package.json + pnpm-workspace.yaml + pnpm-lock.yaml into builder stage so pnpm --filter resolves the workspace
- CLAUDE.md: record both gotchas alongside the existing Docker rebuild notes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 04:46:20 +00:00
522454ab61 feat(agents): stars agent — astrological transits via pyswisseph (#121)
Computes natal chart (Sun/Moon/Mercury/Venus/Mars/Jupiter/Saturn) from
birth_date and finds active transits (conjunction/sextile/square/trine/
opposition) between today's sky and the user's natal positions. Top 3
most-exact transits are passed to the orchestrator as interpretive themes
to colour the tip — grounded and actionable, not predictive.

Birth date sourced from agent_prefs (populated by a connected Google
data source); requires data:google-health consent. Agent self-silences
when birth_date is absent. pyswisseph added to ml/serving/requirements.txt.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-14 10:59:10 +00:00
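The transit search described in 522454ab61 can be pictured with a small sketch. It assumes the ecliptic longitudes (in degrees) are already computed; the commit uses pyswisseph for that step, and no pyswisseph call is shown here. The 6-degree orb and the function names are illustrative assumptions, not values taken from the commit.

```python
from itertools import product

# Standard aspect angles in degrees.
ASPECTS = {"conjunction": 0, "sextile": 60, "square": 90, "trine": 120, "opposition": 180}
MAX_ORB = 6.0  # hypothetical tolerance; the commit does not state one


def separation(a: float, b: float) -> float:
    """Smallest angle between two ecliptic longitudes, in [0, 180]."""
    diff = abs(a - b) % 360.0
    return min(diff, 360.0 - diff)


def top_transits(natal: dict[str, float], sky: dict[str, float], limit: int = 3):
    """Return the `limit` most exact (orb, transiting, aspect, natal) tuples."""
    hits = []
    for (t_name, t_lon), (n_name, n_lon) in product(sky.items(), natal.items()):
        sep = separation(t_lon, n_lon)
        for aspect, angle in ASPECTS.items():
            orb = abs(sep - angle)
            if orb <= MAX_ORB:
                hits.append((orb, t_name, aspect, n_name))
    # Sorting by orb first puts the most exact aspects at the front.
    return sorted(hits)[:limit]
```

Sorting on the orb is one plausible reading of "most-exact"; the real agent may rank transits differently.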
be8c006a4d feat(agents): tarot agent — daily three-card draw (situation/action/outcome) (#120)
Draws 3 Major Arcana cards from a daily seed (user_id + date) so the
reading is stable within a day and unique per user. Card meanings and
action hints are precomputed in the agent; the orchestrator receives a
structured prompt snippet and is instructed to weave the themes into a
grounded, practical tip without explaining the cards.

No inferred params, no external data — requires only data:core consent.
TTL 6 h (refreshes at most twice daily).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-14 10:52:55 +00:00
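A minimal sketch of the seeded draw from be8c006a4d, assuming the seed is derived by hashing `user_id` and the ISO date. The hash choice, the `daily_draw` name, and the use of card indices rather than card names are assumptions made for illustration.

```python
import hashlib
import random
from datetime import date

MAJOR_ARCANA_COUNT = 22  # cards 0 (The Fool) to 21 (The World)
POSITIONS = ("situation", "action", "outcome")


def daily_draw(user_id: str, day: date) -> dict[str, int]:
    """Map each position to a distinct Major Arcana index for this user and day."""
    # Hash user_id + date so the seed does not depend on Python's salted hash().
    digest = hashlib.sha256(f"{user_id}:{day.isoformat()}".encode()).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    cards = rng.sample(range(MAJOR_ARCANA_COUNT), k=len(POSITIONS))
    return dict(zip(POSITIONS, cards))
```

Because the generator is seeded from the user and the date only, repeated calls within a day return the same three cards, which matches the "stable within a day" behaviour in the commit message.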
8474468614 feat(integrations): add Google Health card to connect page (#119)
The OAuth backend (signal source, /connect and /callback routes, token
refresh, consent grant) was already complete. This adds the missing UI:
a Google Health card in /connect with Connect/Disconnect actions, and
broadens the "See my tip →" CTA to appear when any integration is
connected (not only Todoist).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-13 10:28:14 +00:00
ad43a8f06a fix(recommender): serve fallback tips to users with no integrations (#117)
The integration-token gate returned 422 for users with no connected
sources, blocking them from any tip. Users with no integrations now go
through the full orchestrator pipeline; if it fails (or returns nothing
because agent outputs are also empty), randomFallbackTip() fires and
serves a generic advice tip instead of an error.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-13 09:54:54 +00:00
56fda0d737 chore(scheduler): skip agents whose data sources aren't granted (#128)
Check getEligibleAgentIds per user in runCycle before calling
computeAndStore — agents without consented data sources, silenced by
active context, or disabled via preference are skipped rather than
computed unconditionally. Eligibility check failure skips the whole
user (fail-closed). Skipped count added to cycle-complete log line.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 15:45:08 +00:00
b1bd3d465f docs(readme): replace inline issue checklists with Gitea milestone links
Roadmap phase sections now show shipped summaries only; open work lives
in Gitea milestones. Eliminates duplicate source-of-truth between README
and issue tracker.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 15:34:45 +00:00
8fd08379d7 chore(m2): close out remaining loose ends (#80, #86, #90)
- Add `ai` compose profile — Ollama + LiteLLM containers for local dev
  when Agap shared services are unavailable; use with LITELLM_URL /
  OLLAMA_URL env vars pointing ml-serving at localhost
- Mark #90 done (LLM schema validation + fallback shipped in 85a332b)
- Mark #80 superseded by ADR-0013 (multi-agent orchestrator is the pipeline)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 15:31:25 +00:00
85a332b22b feat(recommender): LLM schema validation + hardcoded fallback tips on AI failure (#90)
Python (ml/serving):
- Validate tip item after JSON parse: non-empty content, valid kind
- Retry on schema failure with a targeted clarification prompt, same 2× retry budget
- JSON parse failures keep the existing retry suffix

TypeScript (recommender):
- Add TipSource 'fallback' to shared-types
- FALLBACK_TIPS: 12 general-purpose life tips (hardcoded, no DB read)
- fetchOrchestratorTip returns {ok} discriminated union instead of null
- On !res.ok or fetch error: serve a random fallback tip with rationale 'AI service issues'
- Update tests: 204 path removed; both failure cases now expect source='fallback'

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 15:21:03 +00:00
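The validate-then-retry loop on the Python side of 85a332b22b might look roughly like the sketch below. `call_llm`, the retry suffixes, and the contents of `VALID_KINDS` are hypothetical stand-ins; the commit only states that content must be non-empty, kind must be valid, and the retry budget is two.

```python
import json
from typing import Callable, Optional

VALID_KINDS = {"task", "advice"}  # hypothetical; the real set lives in the repo
MAX_RETRIES = 2


def validate_tip(item: object) -> bool:
    """A tip is valid when it has non-empty content and a known kind."""
    if not isinstance(item, dict):
        return False
    content = item.get("content")
    return isinstance(content, str) and bool(content.strip()) and item.get("kind") in VALID_KINDS


def generate_tip(call_llm: Callable[[str], str], prompt: str) -> Optional[dict]:
    """Try once plus MAX_RETRIES retries; return None so the caller can fall back."""
    suffix = ""
    for _ in range(1 + MAX_RETRIES):
        raw = call_llm(prompt + suffix)
        try:
            item = json.loads(raw)
        except json.JSONDecodeError:
            suffix = "\nReturn valid JSON only."  # parse failure keeps its own suffix
            continue
        if validate_tip(item):
            return item
        # Schema failure gets a targeted clarification instead.
        suffix = "\nInclude a non-empty 'content' and a valid 'kind'."
    return None
```

Returning `None` after the budget is spent leaves the fallback decision to the caller, which is where the TypeScript recommender serves a hardcoded tip.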
772bb6e194 feat(consents): auto-grant data:<provider> on connect; remove agent: consents (ADR-0015)
- integrations.ts: grant data:<provider> on OAuth callback, revoke on disconnect
- Backfill migration: INSERT OR IGNORE data:<provider> for all active tokens
- Agent manifests: drop agent:<id> from required_consents (momentum, time-of-day,
  overdue-task, recent-patterns, health-vitals) — per-agent control is a preference
- eligibility.ts: update comment to reflect data:-only consent model
- test_manifest.py: assert no agent: consents remain in any manifest
- migrations.test.ts: backfill idempotency tests for issue #127
- Dockerfile.api: drop --offline flag (fixes ERR_PNPM_NO_OFFLINE_META)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 15:09:58 +00:00
34925310cf docs: update focus-area manifest description and CLAUDE.md
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 15:00:06 +00:00
f66f337779 feat(focus-area): use enriched descriptions in cluster output
cluster_tasks now attaches enriched_description to each task dict.
focus-area reads enriched_description (falling back to raw content) when
building the area summary, so the orchestrator sees the expanded 3-sentence
descriptions instead of terse raw titles.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 14:58:31 +00:00
f6b89fc849 refactor(focus-area): output all clusters as context; remove scoring and preferred_areas
The agent no longer picks a winner — it summarises every cluster so the
orchestrator can decide what's relevant. Scoring by overdue count overlapped
with the overdue-task agent. preferred_areas (project-ID based, broken label
matching) removed entirely.

Output format: numbered list of areas with task titles included.
Snapshot: {cluster_count, clusters: [{label, task_count, tasks}]}.
Version bumped to 3.0.0; inferred_params cleared.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 14:57:04 +00:00
12c956b588 fix(clustering): drop TTL check from isUpToDate; task hash is the only signal
If tasks haven't changed, the output is valid forever. If they changed,
always recompute regardless of age. TTL on focus-area restored to 24h —
it only controls recommender eligibility, not recompute frequency.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 14:46:43 +00:00
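The hash-only freshness check from 12c956b588 can be sketched as follows. The original lives in TypeScript (`computeAndStore`); this Python version is an illustration, and the newline separator and function names are assumptions. The `_task_hash` key comes from the earlier commit d12f11d29d.

```python
import hashlib
from typing import Optional


def task_hash(task_contents: list[str]) -> str:
    """MD5 over the sorted task contents, so task order does not matter."""
    joined = "\n".join(sorted(task_contents))  # separator is an assumption
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


def is_up_to_date(snapshot: Optional[dict], task_contents: list[str]) -> bool:
    """True when the stored snapshot was computed from the same task list."""
    if not snapshot:
        return False
    # No TTL check: an unchanged task list keeps the output valid.
    return snapshot.get("_task_hash") == task_hash(task_contents)
```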
d12f11d29d feat(clustering): 1h TTL + skip recompute when tasks unchanged
focus-area now recomputes at most once per hour, and only if the task list
actually changed since the last compute.

- focus-area TTL: 43200s → 3600s; version bumped to 2.1.0
- computeAndStore hashes sorted task contents (MD5) and checks the stored
  _task_hash in the existing snapshot; skips the ml-serving call when the
  hash matches and the output isn't expired
- ml-serving injects _task_hash into the snapshot so the next cycle can compare

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 14:45:15 +00:00
9ddeea6cac feat(clustering): persistent enrichment cache in task_enrichments table
Each unique task title is now enriched by LiteLLM once and cached in the DB.
Subsequent agent compute cycles (every 12h) fetch the cache before calling
ml-serving; only new titles hit the tip-generator.

- DB: task_enrichments(content_hash PK, description, model, created_at)
- TS: fetchEnrichmentCache / persistEnrichments helpers in agent-outputs.ts;
  enrichment_cache passed in compute request, new_enrichments persisted from response
- Python: AgentComputeRequest.enrichment_cache / AgentComputeResponse.new_enrichments;
  AgentInput.enrichment_cache; _enrich_batch returns (descriptions, new_entries);
  cluster_tasks returns (clusters, new_enrichments)
- FocusAreaAgent stashes new_enrichments in signals_snapshot under _new_enrichments;
  compute_agent endpoint pops it before storing the snapshot

Closes part of #129

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 14:39:35 +00:00
08d08ad7b0 feat(clustering): LLM-enrichment before embedding (port from taskpile #129)
Ported from taskpile experiments/clustering_eval (prompt v1, qwen2.5:1.5b).
The experiment showed ARI 0.22→0.77 and AUROC 0.76→0.91 on synthetic tasks
when embedding LLM-expanded descriptions instead of raw titles.

- Expand each task title via LiteLLM tip-generator before embedding
- Prefix with "clustering: " (nomic-embed-text task instruction prefix)
- Cache expansions in-memory by content hash within a compute cycle
- Falls back to raw title if enrichment fails; no change to fallback behaviour

Fixes #129

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 14:20:48 +00:00
1ca2351488 fix(clustering): route embeddings through LiteLLM instead of Ollama directly
The old code called Ollama's /api/embeddings one task at a time, which caused
silent fallback to project-based grouping when host.docker.internal:11434 was
unreachable from the ml-serving container.

- Switch to LiteLLM /embeddings (model alias "embedder") as primary path
- Batch all task contents in one request instead of N serial calls
- Fall back to Ollama /api/embed (updated to current API) when LITELLM_URL is absent
- Update tests to mock _embed_batch instead of the removed _embed

Fixes #123

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 13:42:53 +00:00
4e9210fcef fix(web): wrap loadTip in arrow fn to satisfy MouseEventHandler type
2026-05-12 13:34:46 +00:00
59c493323f fix(recommender): remove Todoist fallback on orchestrator failure; add snooze exclusion
When fetchOrchestratorTip returned null (LiteLLM timeout, bad JSON, etc.)
the recommender silently fell back to randomPolicy, serving a raw Todoist
task with no rationale — explaining both reported symptoms.

- Remove randomPolicy/signalToCandidate; return 204 when orchestrator fails
  so the UI shows "All clear" instead of a confusing Todoist task
- Pass recent_tip through the stack (frontend → POST /recommend →
  fetchOrchestratorTip → ml/serving RecommendRequest → build_orchestrator_messages)
  so after snooze the LLM is instructed not to repeat the snoozed content

Fixes #122

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 13:28:32 +00:00
d4b40e2590 docs: document MLflow trace API, span inspection, and no-agent diagnosis
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-11 11:23:13 +00:00
a0a069c525 fix(admin): break redirect loop on /forbidden for non-admin users
The middleware was redirecting non-admins to /forbidden but /forbidden
wasn't excluded from the matcher, so the middleware ran again on that
page, saw a non-admin, and redirected again — infinite loop. Added
/forbidden to the pass-through list alongside /login.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-11 11:12:16 +00:00
d1f28666b0 feat(integrations): add Google Health (Fit) integration with full permissions
OAuth2 flow with all 11 Google Fitness scopes (activity, body, sleep,
heart rate, nutrition, location, blood glucose/pressure/temperature,
oxygen saturation, reproductive health). Stores access + refresh tokens;
auto-refreshes on expiry.

GoogleHealthSignalSource fetches steps, sleep sessions, active minutes,
calories, and heart rate from the Fit aggregate + sessions APIs. Signals
flow into both the tip orchestrator and the health-vitals pre-compute
agent, which generates prompt snippets about step progress, sleep
deficit, sedentary time, and elevated heart rate.

Signal.kind extended with 'health'; IntegrationProvider extended with
'google-health'. Agent compute signal mapping enriched to include source,
kind, and all features so health-vitals can filter its own signals.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-11 11:12:11 +00:00
161e654027 feat(serving): replace MLflow run logging with native trace spans
Convert ml-serving from isolated MLflow runs to nested traces using
mlflow.start_span_no_context(). The recommend endpoint now emits a full
span tree: recommend (CHAIN) → build_context (TOOL), agent:* (AGENT) ×N,
llm_orchestrator (LLM). Compute and infer endpoints each emit a single span.

Supporting changes:
- mlflow-skinny>=3.1.0 added to requirements
- MLflow configured with --serve-artifacts + mlflow-artifacts:/ default root
  for cross-container artifact proxy (spans now persist from ml-serving)
- --allowed-hosts extended to include mlflow:5000 (SDK includes port in Host)
- science_destiny slider wired through prompts.py and recommend endpoint
- Config page exposes science/destiny slider (0=data-driven, 100=intuitive)
- Tip page shows rationale inline on tap

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-11 08:26:05 +00:00
afacc34969 fix(agents): instruct orchestrator to output tip in English
Small models (qwen2.5:1.5b) mirror the language of task title content
in the prompt. Adding an explicit English note to snippets that embed
raw task titles (focus-area, overdue-task) prevents language bleed.
Also added the instruction to the orchestrator system prompt and user
message as belt-and-suspenders.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-06 11:53:21 +00:00
c124ff4d24 docs: update CLAUDE.md with session learnings (#118 tracing, compose gotchas)
- Clarify compose profile requirement for build/up (silent no-op without --profile)
- Add --force-recreate pattern for env-var-only changes
- Document MLflow host_header and auth gotchas for container-to-container calls
- Record MLflow tracing addition and #118 M4 tracking issue

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-06 10:41:57 +00:00
95e1b342b4 fix(serving): wire MLflow auth and Host header for container-to-container calls
- Pass MLFLOW_ADMIN_PASSWORD as fallback password credential
- Set host_header='localhost' to satisfy MLflow's --allowed-hosts check
  (MLflow rejects Host: mlflow but accepts Host: localhost)
- Default MLFLOW_TRACKING_URI to http://mlflow:5000 in compose so the
  env_file value is not silently overridden to empty

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-06 10:39:08 +00:00
c43dbaf23d feat(serving): add MLflow tracing to ml-serving for all agent calls
Logs one MLflow run per /recommend (params, token metrics, latency,
full prompt + tip as artifacts) and per /agents/{id}/compute and
/infer call (signals snapshot, inferred prefs, latency).

Tracing is a no-op when MLFLOW_TRACKING_URI is unset; ml-serving
starts and serves tips correctly without MLflow configured.

Refs #118 (M4: remove from production / move off critical path).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-06 10:30:24 +00:00
488a764519 docs: mark M2 complete in README
All M2 items shipped: ADR-0014 (unified profile + inference framework),
per-agent auto-inference, tip generator, TipCandidate schema, prompt
versioning, model benchmark, task clustering, UX refinements.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-06 08:02:44 +00:00
c67f2b14c4 docs: update CLAUDE.md with #61 completion and feature test patterns
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-06 07:45:40 +00:00
17b9516903 feat(features): mirror invalidatedBy into Python ProfileFeature (#61)
Adds invalidated_by: tuple[str, ...] to ProfileFeature, mirroring the
invalidatedBy bus subjects from registry.ts. Adds a test that parses the
TS source and asserts Python stays in sync — same drift-detection pattern
used for names and ttlSec.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-06 07:10:36 +00:00
a75be0d832 docs: update CLAUDE.md with session learnings (#97, #113)
- focus-area v2.0.0 completion in recent completions; remove from active work
- Update focus-area inferred params table row
- min_history gotcha: checked against events, not task_completions
- httpx trust_env=False rule for ml/ code
- Agent test command

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-06 06:56:17 +00:00
26fc67776f feat(agents): semantic task clustering + focus-area inferred preferred_areas (#97, #113)
- New ml/agents/clustering.py: embed task content via nomic-embed-text
  (Ollama), greedy cosine clustering (threshold 0.72, max 6 clusters),
  graceful fallback to project-id grouping when Ollama is unreachable
- focus_area v2.0.0: compute() uses semantic clusters as focus areas;
  adds preferred_areas InferredParam inferred from top-2 projects by
  task_completion count
- 135 tests, all passing

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-06 06:54:46 +00:00
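For readers unfamiliar with greedy cosine clustering, here is a minimal sketch using the threshold (0.72) and cluster cap (6) from 26fc67776f. Comparing against each cluster's first member, and assigning overflow items to the nearest existing cluster once the cap is reached, are assumptions; `ml/agents/clustering.py` may use centroids or a different overflow rule.

```python
import math

THRESHOLD = 0.72
MAX_CLUSTERS = 6


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def greedy_cluster(vectors: list[list[float]]) -> list[list[int]]:
    """Return clusters as lists of indices into `vectors`."""
    clusters: list[list[int]] = []
    for i, vec in enumerate(vectors):
        # Compare against each cluster's first member (its seed).
        scored = [(cosine(vec, vectors[c[0]]), c) for c in clusters]
        best_sim, best = max(scored, key=lambda s: s[0], default=(-1.0, None))
        if best is not None and (best_sim >= THRESHOLD or len(clusters) >= MAX_CLUSTERS):
            best.append(i)  # similar enough, or no room for a new cluster
        else:
            clusters.append([i])
    return clusters
```

The result depends on input order, which is the usual trade-off of a single-pass greedy scheme compared with k-means or agglomerative clustering.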
336644a90a docs: update CLAUDE.md with rich per-agent inference completions (#112–#116)
- Inference framework table updated: all agents at v1.2.0 with full param list
- Documents UserHistory.task_completions and AgentInferRequest.task_completions
- Marks #112/114/115/116 complete in recent completions
- Active work updated: #78 closed, #61 and #97/#113 as next priorities

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-06 06:28:30 +00:00
1d9a395591 feat(agents): quiet window + peak hours + tz prefs for time-of-day agent (#112)
Adds four InferredParams (all TTL=24h, min_history=50 except preferred_hour=10):
- quiet_start / quiet_end: longest contiguous below-baseline hour run (HH:MM)
- peak_hours: top-quartile done-event hours, sorted ascending
- tz: cold-start only ("UTC"); populated from auth provider, no inference function

compute() updated:
- in_quiet check (quiet window) takes precedence over peak hours
- in_peak emits "peak productivity hour" language when current hour is in peak_hours
- approaching peak (within 2h) surfaces for orchestrator timing
- tz surfaced in snippet header when not UTC
- snapshot adds peak_hours, in_quiet, in_peak, tz

- Agent bumped to v1.2.0
- 21 new tests: night-owl, early-bird, shift-worker, quiet/peak snippet rendering
- Fixed test_snapshot_keys in test_agents.py to include new snapshot fields

Closes #112

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-06 06:05:51 +00:00
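The quiet-window and peak-hour inference in 1d9a395591 could be approximated as below. This sketch assumes the baseline is the mean hourly count, that the quiet run may wrap past midnight, and that "top-quartile" means the six busiest hours of the day; none of these details are confirmed by the commit message.

```python
from typing import Optional


def quiet_window(hourly_counts: list[int]) -> Optional[tuple[str, str]]:
    """Longest run of hours below the mean, wrapping past midnight.

    Returns (quiet_start, quiet_end) as HH:MM, with quiet_end exclusive.
    """
    if len(hourly_counts) != 24:
        raise ValueError("expected 24 hourly counts")
    baseline = sum(hourly_counts) / 24  # assumed baseline: the hourly mean
    below = [c < baseline for c in hourly_counts]
    best_start, best_len = None, 0
    for start in range(24):
        # Only begin a run at a below-baseline hour whose predecessor is not.
        if not below[start] or below[start - 1]:
            continue
        length = 0
        while length < 24 and below[(start + length) % 24]:
            length += 1
        if length > best_len:
            best_start, best_len = start, length
    if best_start is None:
        return None
    end = (best_start + best_len) % 24
    return f"{best_start:02d}:00", f"{end:02d}:00"


def peak_hours(hourly_counts: list[int]) -> list[int]:
    """The six busiest hours with at least one event, sorted ascending."""
    ranked = sorted(range(24), key=lambda h: hourly_counts[h], reverse=True)
    return sorted(h for h in ranked[: 24 // 4] if hourly_counts[h] > 0)
```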
bc71dc203d feat(agents): adaptive lookback + weekly/daily cycle detection for recent-patterns (#116)
Replaces the coarse density-bucket window_days with three InferredParams (all TTL=24h):
- lookback_days: min window containing ≥30 done events, capped at 30d (min_history=5)
- weekly_cycle: per-DOW peak-to-mean strength list (min_history=21, ≥3 weeks of signal)
- daily_cycle: per-hour peak-to-mean strength list (min_history=14)

compute() renders cycle hints when strength > 0.5:
  "User tends to complete tips on Tuesdays and Saturdays."
  "User is most active around 8pm."
Legacy window_days pref key still accepted as a fallback.

- window_days pref renamed lookback_days; backward-compat fallback in compute()
- Agent bumped to v1.2.0
- 19 new tests: weekend-warrior, weekday-only, evening-person, no-pattern,
  legacy compat, snippet rendering with strong/weak signals

Closes #116

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-06 05:51:45 +00:00
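One way to read the `lookback_days` rule in bc71dc203d ("min window containing ≥30 done events, capped at 30d") is sketched here. Representing events by their age in days, rounding up to whole days, and returning the cap when fewer than 30 events exist are all assumptions for illustration.

```python
import math

TARGET_EVENTS = 30
MAX_DAYS = 30


def lookback_days(event_ages_days: list[float]) -> int:
    """Smallest whole-day window holding TARGET_EVENTS events, capped at MAX_DAYS.

    `event_ages_days` holds the age of each done event in days (0 = now).
    """
    ages = sorted(event_ages_days)
    if len(ages) < TARGET_EVENTS:
        return MAX_DAYS  # not enough events: use the widest window
    # The 30th most recent event fixes how far back the window must reach.
    needed = math.ceil(ages[TARGET_EVENTS - 1])
    return max(1, min(needed, MAX_DAYS))
```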
4cade4868b feat(agents): per-user baseline + stdev inference for momentum agent (#114)
Adds two InferredParams (TTL=7d) computed from 28-day rolling daily done counts:
- baseline_completions_per_day: mean done events/day over the window
- stdev: stdev of daily counts (floored at 0.1 to avoid division by zero)

MomentumAgent.compute() now calculates a z-score from recent done events in
inp.feedback_history vs the inferred baseline. Snippet language switches to
z-score framing ("above your usual pace", "slowing down") when |z| >= 1.0,
falling back to engagement_trend labels when in the normal range.

- engagement_trend InferredParam preserved for backward compatibility
- momentum_window pref added (default 7, user-overridable)
- 14 new tests covering power user, casual user, returning-from-break, and
  relative stdev comparison; engagement_trend tests updated for z-score priority
- Agent bumped to v1.2.0

Closes #114

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-06 05:18:29 +00:00
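The z-score framing from 4cade4868b reduces to a few lines of arithmetic. This sketch assumes a population standard deviation and compares the mean of the recent daily counts against the 28-day baseline; the function names and the exact definition of "recent" are illustrative.

```python
import statistics


def momentum_z(daily_counts: list[int], recent_counts: list[int]) -> float:
    """Z-score of the recent daily done rate against the rolling baseline."""
    baseline = statistics.fmean(daily_counts)
    stdev = max(statistics.pstdev(daily_counts), 0.1)  # floor avoids division by zero
    recent = statistics.fmean(recent_counts)
    return (recent - baseline) / stdev


def framing(z: float) -> str:
    """Pick snippet language; inside the normal range, defer to engagement_trend."""
    if z >= 1.0:
        return "above your usual pace"
    if z <= -1.0:
        return "slowing down"
    return "engagement_trend"  # placeholder for the trend-label fallback
```

The 0.1 floor matters for very regular users: with identical daily counts the raw stdev is 0, and any deviation would otherwise divide by zero.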
04212ff318 feat(agents): p50-lateness tolerance + per-project realness for overdue-task (#115)
Replaces snooze-rate heuristic with p50 of actual task lateness (completedAt − dueAt).
Adds project_realness inference: projects with chronic lateness get realness < 1 and
the agent softens its snippet language from "overdue" to "past target date".

- TaskCompletion added to UserHistory with lateness_days computed property
- _infer_lateness_tolerance: p50 of task_completions, clipped at 0, float
- _infer_project_realness: per-project median lateness normalised by global median
- Both InferredParams use 7d TTL; cold_start = 0.0 / {}
- AgentInferRequest accepts task_completions; endpoint wires them through
- 12 new tests covering punctual/chronic/mixed users and language softening
- Agent bumped to v1.2.0

Closes #115

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-06 05:14:04 +00:00
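A minimal sketch of the p50 lateness tolerance from 04212ff318. The `TaskCompletion` fields shown here are assumed from the commit's "completedAt − dueAt" wording, and clipping the median (rather than each sample) is an assumption about what "clipped at 0" means.

```python
import statistics
from dataclasses import dataclass
from datetime import datetime


@dataclass
class TaskCompletion:
    due_at: datetime
    completed_at: datetime

    @property
    def lateness_days(self) -> float:
        """Positive when the task was completed after its due date."""
        return (self.completed_at - self.due_at).total_seconds() / 86400


def infer_lateness_tolerance(completions: list[TaskCompletion]) -> float:
    """Median lateness in days, clipped at 0; 0.0 is the cold-start value."""
    if not completions:
        return 0.0
    return max(0.0, float(statistics.median(c.lateness_days for c in completions)))
```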
35257b7756 docs: mark ADR-0014 complete in CLAUDE.md
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-05 11:50:42 +00:00
ed1705cb5d feat(db): drop users.consentGiven/consentAt (ADR-0014 step 8)
Backfills consent_given=1 rows into user_consents as data:core before
dropping the legacy columns. auth.ts now writes user_consents on signup;
POST /consent writes user_consents; admin/user routes cleaned of the old
fields. Migration is idempotent — DROP COLUMN is wrapped in try/catch so
it no-ops on fresh DBs that never had the columns.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-05 11:50:27 +00:00
afb0e9b0cb feat(agents): per-agent inference — momentum, overdue-task, recent-patterns, focus-area (ADR-0014 step 7)
All four agents bumped to v1.1.0.

momentum (#114): infers engagement_trend ('up'|'stable'|'down') by comparing
done-rate in the last 7 days vs the prior 7 days. Agent surfaces the trend
in its snippet ("trending up — build on the momentum").

overdue-task (#115): infers lateness_tolerance_days (0/1/2) from snooze rate.
Agent now filters tasks against the tolerance so low-urgency users aren't
nagged about tasks that are only hours overdue.

recent-patterns (#116): infers window_days (7/14/30) from feedback event
density — sparse users get a wider window so the snippet isn't always empty.

focus-area (#113): no inferred params (project-level feedback linkage needed,
tracked under #78). preferred_areas pref was declared but ignored; agent now
honours it as a tiebreaker and mentions it in the snippet.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-05 11:21:10 +00:00
ad6747c242 feat(profile): /api/profile + eligibility filter + inference framework (ADR-0014 steps 4-6)
Step 4 — /api/profile read-through API:
  GET  /api/profile          → { user, prefs, consents, contexts }
  PATCH /api/profile/prefs/:scope  upsert user_preferences (source='user')
  PATCH /api/profile/consents      grant / revoke consent keys
  PATCH /api/profile/contexts      create / activate / deactivate contexts
  Legacy consentGiven bit folded in as data:core fallback.

Step 5 — registry-driven eligibility filter:
  fetchRegistry() exported from agent-registry.ts.
  profile/eligibility.ts: getEligibleAgentIds(userId) — filters by required
  consents, silenced_in_contexts, and user_preferences[enabled=false].
  fetchOrchestratorTip filters agent_outputs to eligible set before calling
  ml/serving /recommend. Fail-closed: registry unavailable → empty set.

Step 6 — shared context-inference framework (#111) + time-of-day proof (#112):
  ml/agents/inference/: UserHistory, FeedbackEvent, run_inference().
  Framework: cold-start, min_history gating, error fallback, structured logs.
  TimeOfDayAgent v1.1.0: inferred_params=[preferred_hour]; also reads
  quiet_start/quiet_end from agent_prefs. agent_prefs injected by TS caller.
  AgentInput gains agent_prefs field.
  ml/serving: POST /agents/{agent_id}/infer endpoint.
  agent-outputs.ts computeAndStore: loads prefs before compute, calls /infer
  after, persists results (source='inferred'); user overrides never touched.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-05 11:14:25 +00:00
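The step 5 eligibility filter in ad6747c242 is implemented in TypeScript (`getEligibleAgentIds`); the Python sketch below only illustrates the three checks and the fail-closed rule. The `Manifest` shape is trimmed to the fields the filter needs, and the argument names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Manifest:
    id: str
    required_consents: set[str] = field(default_factory=set)
    silenced_in_contexts: set[str] = field(default_factory=set)


def eligible_agent_ids(
    registry: Optional[list[Manifest]],
    consents: set[str],
    active_contexts: set[str],
    disabled_agents: set[str],
) -> set[str]:
    """Fail closed: an unavailable registry yields an empty set."""
    if registry is None:
        return set()
    return {
        m.id
        for m in registry
        if m.required_consents <= consents  # every required consent is granted
        and not (m.silenced_in_contexts & active_contexts)  # no silencing context active
        and m.id not in disabled_agents  # user has not set enabled=false
    }
```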
305eeae38b feat(agents): manifest plumbing + GET /agents/registry (ADR-0014 step 3)
Each agent now exports a module-level MANIFEST declaring id, version,
pref_schema, required_consents, ttl_sec, and silenced_in_contexts. The
registry surfaces both the agent and its manifest, and rejects on
mismatch so the two cannot drift.

ml/serving exposes GET /agents/registry; services/api proxies it as
GET /api/agents/registry with a 60s in-process cache so admin pageviews
don't hammer upstream. Failures aren't cached.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-05 10:55:54 +00:00
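The 60s in-process cache from 305eeae38b sits in the TypeScript API proxy; the idea, including "failures aren't cached", can be sketched in Python as follows. The `RegistryCache` class and its injectable clock are illustrative assumptions, not the repo's code.

```python
import time
from typing import Any, Callable


class RegistryCache:
    """Cache one upstream value for ttl_sec seconds; never cache a failure."""

    def __init__(self, fetch: Callable[[], Any], ttl_sec: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch, self._ttl, self._clock = fetch, ttl_sec, clock
        self._value: Any = None
        self._expires_at = float("-inf")

    def get(self) -> Any:
        now = self._clock()
        if now < self._expires_at:
            return self._value
        # If fetch raises, nothing is stored, so the next call retries upstream.
        value = self._fetch()
        self._value, self._expires_at = value, now + self._ttl
        return value
```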
5d43339616 feat(api): unified Profile schema + consent backfill (ADR-0014 step 1-2)
Adds user_preferences, user_consents, user_contexts and the tone /
tip_kinds_json columns on users. Backfills consent_given=1 rows into
user_consents as data:core; INSERT OR IGNORE keeps it idempotent and
respects later revocations.

Migration body moves to db/migrations.ts so tests can apply it to a
fresh in-memory handle without opening the prod DB on import.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-05 10:28:47 +00:00
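The idempotent backfill in 5d43339616 relies on `INSERT OR IGNORE` skipping rows that already exist. The sketch below demonstrates that behaviour with Python's `sqlite3`; the table and column layout (in particular the `granted` flag and the composite primary key) is a hypothetical simplification of the real schema.

```python
import sqlite3

# Hypothetical, simplified schema for illustration only.
SCHEMA = """
CREATE TABLE users (id TEXT PRIMARY KEY, consent_given INTEGER);
CREATE TABLE user_consents (
    user_id TEXT,
    consent_key TEXT,
    granted INTEGER,
    PRIMARY KEY (user_id, consent_key)
);
"""


def backfill_core_consent(db: sqlite3.Connection) -> None:
    """Copy legacy consent_given=1 rows into user_consents as data:core."""
    db.execute(
        """
        INSERT OR IGNORE INTO user_consents (user_id, consent_key, granted)
        SELECT id, 'data:core', 1 FROM users WHERE consent_given = 1
        """
    )
    db.commit()
```

Because an existing `(user_id, consent_key)` row is left untouched, re-running the migration neither duplicates rows nor re-grants a consent that was revoked in the meantime.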
d454a0a8bf docs: ADR-0014 — unified Profile model + agent registry
Propose a shared substrate for per-user prefs, contexts, per-key
consents, and per-agent state so adding an agent stays a manifest
change. Updates CLAUDE.md, README, and architecture docs to reflect
the multi-agent pipeline (ADR-0013) and the registry direction.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-05 10:19:07 +00:00
41302d9f36 fix: repair Docker build — TS errors and missing docs in image
- Remove unused `httpx` import from bench.ts (package does not exist)
- Add explicit `IRouter` type on `router` in agent-outputs.ts and bench.ts
  to resolve TS2742 portable-type errors
- Remove `docs` from .dockerignore so Dockerfile.admin can copy it into
  the runner image (DOCS_ROOT=/app/docs is read at runtime by the admin)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 10:52:27 +00:00
05f748159b chore: remove shadow policy machinery (ADR-0013 step 10)
Deletes shadowPolicies map, getShadowPolicies, setPolicyActive from
recommender.ts; removes /api/admin/policies routes from admin.ts; removes
getPolicies, togglePolicy, PolicyInfo from admin api.ts; removes the
policy toggle section from the ops page.

168 API tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 10:45:32 +00:00
8e9718e8ba chore(ml): remove bandit endpoints + helpers (ADR-0013 step 9)
Deletes all LinUCB and ε-greedy code from ml/serving: score, reward,
stats, reset, features endpoints; feature vector builders; per-user state
file helpers; related Pydantic models; numpy/math/time imports.

Removes test_score.py (pure bandit unit tests). 40 remaining tests pass.
STATE_DIR kept — nats_consumer still writes sync metadata there.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 10:41:58 +00:00
c65bedcf68 feat(api): orchestrator cutover — replace bandit with multi-agent pipeline (ADR-0013 step 6)
POST /recommend now calls ml/serving /recommend with pre-computed agent
snippets + task context instead of /generate + /score/egreedy/v2. Falls
back to a random signal candidate when ml/serving is unavailable.

Removes: remotePolicy, fetchLlmCandidates, sendRewardWithRetry,
candidateCache, pickPromptVersion. Feedback handler keeps inferReward +
tipFeedback writes for observability; reward delivery to the bandit is gone.
tipScores.policy is now 'orchestrator'; promptVersion is 'v4-orchestrator'.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 10:37:15 +00:00
7e958a779d feat(api): agent pre-compute scheduler (ADR-0013 step 5)
Extracts computeAndStore() from the /agents/:agentId/compute route so it
can be called without an HTTP round-trip. startAgentPrecomputeScheduler()
runs every 15 min: fetches active users (tip view in 48h), runs all agents
in parallel per user, then purges outputs expired >24h. Agent IDs are
resolved from ml/serving /health at startup with a fallback hardcoded list.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 10:29:50 +00:00
37aec4fee1 chore: ADR-0007/0012 superseded status + admin users ID column
ADR-0007 and ADR-0012 both superseded by ADR-0013 as of 2026-05-01.
UsersTable gains a truncated ID column for quick user identification.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 10:20:44 +00:00
b3cf588f2f feat(ml): multi-agent context framework + v4 orchestrator prompt
Adds ml/agents/ — five specialised sub-agents (overdue_task, momentum,
time_of_day, recent_patterns, focus_area) each producing a prompt snippet
from user signals. A registry wires them up; the orchestrator prompt in
ml/serving/prompts.py synthesises their outputs into one tip via LiteLLM.

Also wires /api/agents route in the API and updates the Dockerfile to copy
the full ml/ tree with PYTHONPATH=/app so agent imports resolve correctly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 10:20:05 +00:00
86 changed files with 7427 additions and 2116 deletions

@@ -12,7 +12,6 @@
 **/.env
 **/.env.local
 **/*.log
-docs
 infra/docker/data
 **/__tests__
 **/*.test.ts

164
CLAUDE.md
@@ -65,7 +65,18 @@ docs/ architecture notes, ADRs, API specs
 - One PR = one concern. Conventional-commit prefixes (`feat:`, `fix:`, `chore:`, `docs:`, `refactor:`).
 - ADRs go in `docs/adr/NNNN-title.md` for any decision that constrains future work.
 - No secrets in repo. Local dev via `.env.local` (gitignored), prod via the server's secret store (Vaultwarden now; k8s secrets later).
-- Compose profiles: `core` (api + web + admin), `full` (adds ml-serving), `mlops` (adds MLflow), `ai` (adds Ollama + LiteLLM). Mix as needed.
+- Compose profiles: `core` (api + web + admin), `full` (adds ml-serving + nats), `mlops` (adds MLflow), `ai` (adds Ollama + LiteLLM). Mix as needed. Always pass `--profile <name>` to `build`/`up` — without a profile, no services are selected and builds silently do nothing.
+- Docker rebuild: use `--force-recreate` on `up` when only env vars changed (no image rebuild needed); new env vars in `.env.local` are not picked up by a running container until it is recreated.
+- Docker rebuild gotchas:
+- **Never run two `docker compose up --build` at once** — both grab the same `--mount=type=cache,id=pnpm` and deadlock on the API's `pnpm --prod deploy` step. Symptom: build sits silent for hours on `[api builder 8/8]`. Before starting any build, check `ps aux | grep "docker compose"` and kill any prior `up --build` (`kill -9 <pid>` — the wrapper bash and the docker compose binary are separate PIDs; kill the docker compose one).
+- **Don't add `--offline` to `pnpm --prod deploy`** — pnpm's metadata cache (`/root/.cache/pnpm/`) is not in the `/pnpm/store` cache mount, so `--offline` fails with `ERR_PNPM_NO_OFFLINE_META` for transitive devDeps (e.g. vite via vitest). Leave the deploy step network-on; it works.
+- **All TS Dockerfiles need `python3 make g++`** in the base stage — `better-sqlite3` rebuilds natively on install. Missing from `Dockerfile.admin` historically caused `gyp ERR! find Python` failures.
+- **`Dockerfile.ml` needs `build-essential`** (not just `gcc`) — `pyswisseph` (stars agent) compiles C from source and fails with `fatal error: math.h: No such file or directory` if only `gcc` is installed; it needs `libc-dev` too, easiest via `build-essential`.
- **`Dockerfile.web` builder stage needs root `package.json` + `pnpm-workspace.yaml` + `pnpm-lock.yaml`** copied in. Without them, `pnpm --filter @oo/shared-types build` fails with `[ERR_PNPM_NO_PKG_MANIFEST] No package.json found in /app`. The deps stage has them but the builder is a fresh layer; selective copies must include them.
- **A clean build of `--profile core` takes ~3 min total** when the buildx cache is warm. If it's been silent for >10 min, check for the parallel-build deadlock above before assuming "still going".
- Run Python agent tests: `python3 -m pytest ml/agents/tests/ -x -q` (tests add repo root to `sys.path` themselves).
- Run Python feature tests: `python3 -m pytest ml/features/ -x -q`
- `ml/features/` files are Python mirrors of TS registries — TS is source of truth. Tests parse `registry.ts` with regex to detect drift; follow the same pattern whenever a new field is added to `ProfileFeature`.
## Definition of done (per feature) ## Definition of done (per feature)
@@ -78,45 +89,162 @@ docs/ architecture notes, ADRs, API specs
## AI stack ## AI stack
oO generates tips with an LLM and ranks them with a bandit. All LLM calls route through **LiteLLM** at `llm.alogins.net` using model aliases — swapping models is a config change, not a code change. oO generates tips through a multi-agent pipeline (ADR-0013): pre-compute agents emit prompt snippets, an orchestrator LLM assembles them into one tip. All LLM calls route through **LiteLLM** at `llm.alogins.net` using model aliases — swapping models is a config change, not a code change.
| Alias | Model | Used by | | Alias | Model | Used by |
|-------|-------|---------| |-------|-------|---------|
| `tip-generator` | qwen2.5:1.5b (default) | `ml/serving` tip generation | | `tip-generator` | qwen2.5:1.5b (default) | `ml/serving` tip generation |
| `embedder` | nomic-embed-text | task clustering, dedup | | `embedder` | nomic-embed-text | task clustering (after LLM enrichment), dedup |
| `judge` | claude-haiku-4-5 (cloud, eval only) | offline sim | | `judge` | claude-haiku-4-5 (cloud, eval only) | offline sim |
Env vars: `LITELLM_URL` (prod `https://llm.alogins.net`), `OLLAMA_URL` (Agap host, `http://host.docker.internal:11434` from containers). Env vars: `LITELLM_URL` (prod `https://llm.alogins.net`), `OLLAMA_URL` (Agap host, `http://host.docker.internal:11434` from containers).
Ollama and LiteLLM are **shared Agap services**, not oO services — they live in `agap_git/openai/docker-compose.yml` along with langfuse (observability). oO never starts them; ml-serving just calls the alias. Ollama and LiteLLM are **shared Agap services**, not oO services — they live in `agap_git/openai/docker-compose.yml` along with langfuse (observability). oO never starts them; ml-serving just calls the alias.
**LLM tip generation pipeline:** All `httpx` calls in `ml/` must use `trust_env=False` to bypass the system proxy — same rule as `bw` and curl. Pattern: `httpx.Client(trust_env=False, timeout=N)`.
1. `ml/features/context.py` assembles user signals → structured prompt context
2. `POST /generate` in `ml/serving` calls LiteLLM → returns `TipCandidate[]` MLflow container-to-container calls: always pass `host_header="localhost"` to `MLflowClient` — MLflow's `--allowed-hosts` rejects `Host: mlflow` (the container DNS name) with 403. Auth credential is `MLFLOW_ADMIN_PASSWORD`. MLflow REST API lives at the origin root, not under the `/mlflow` UI prefix.
3. Bandit policy in `ml/serving` scores + ranks candidates
4. Best candidate returned as tip; reaction closes the online reward loop ### MLflow API versions — runs vs traces
MLflow uses **two API versions** — use the right one or you'll get 405:
| What | API prefix | Example |
|------|-----------|---------|
| Runs, experiments, metrics | `/api/2.0/mlflow/` | `runs/search`, `experiments/list` |
| Traces (LLM observability) | `/api/3.0/mlflow/traces/` | `traces/{trace_id}` |
**Experiment IDs:** `3` = oO/serving. Artifacts stored as run tags prefixed `artifact:<path>`.
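A minimal sketch of the prefix rule in the table above, in case it helps when scripting against the API. The two prefixes come from the table; the `mlflow_url` helper, the `API_PREFIX` mapping, and the `kind` labels are hypothetical names introduced here for illustration, not part of `ml/serving`.

```python
# Prefixes are taken from the table above; everything else is illustrative.
API_PREFIX = {
    "runs": "/api/2.0/mlflow/",
    "experiments": "/api/2.0/mlflow/",
    "traces": "/api/3.0/mlflow/traces/",
}


def mlflow_url(base: str, kind: str, path: str) -> str:
    """Build a REST URL with the API version that matches the resource kind.

    `kind` is a hypothetical label ("runs", "experiments", "traces").
    An unknown kind raises KeyError rather than guessing a prefix.
    """
    return base.rstrip("/") + API_PREFIX[kind] + path.lstrip("/")


if __name__ == "__main__":
    print(mlflow_url("http://localhost:5000", "runs", "runs/search"))
    print(mlflow_url("http://localhost:5000", "traces", "tr-<trace_id>"))
```

Keeping the version choice in one lookup makes it harder to send a trace request to `/api/2.0/` by accident, which is the 405 case described above.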
### Querying from the host shell
Always strip the proxy and pass `Host: localhost` (no port — `localhost:5000` fails the DNS-rebinding check).
```bash
# Search recent runs (experiment 3)
env -u HTTPS_PROXY -u HTTP_PROXY -u ALL_PROXY -u https_proxy -u http_proxy -u all_proxy \
curl -s -H "Host: localhost" -u "admin:${MLFLOW_ADMIN_PASSWORD}" \
-X POST http://localhost:5000/api/2.0/mlflow/runs/search \
-H "Content-Type: application/json" \
-d '{"experiment_ids":["3"],"max_results":5,"order_by":["start_time DESC"]}'
# Get a trace by ID (note: /api/3.0/, not /api/2.0/)
env -u HTTPS_PROXY -u HTTP_PROXY -u ALL_PROXY -u https_proxy -u http_proxy -u all_proxy \
curl -s -H "Host: localhost" -u "admin:${MLFLOW_ADMIN_PASSWORD}" \
http://localhost:5000/api/3.0/mlflow/traces/tr-<trace_id> | python3 -m json.tool
```
The trace response includes `trace_metadata.mlflow.traceInputs/Outputs`, `trace_metadata.mlflow.trace.sizeStats` (num_spans), and `tags.mlflow.traceName`.
### Getting spans (Python client from inside the container)
The REST API has **no endpoint for spans**: `/api/3.0/mlflow/traces/{id}/spans` returns 404. Use the Python client inside `oo-ml-serving-1`:
```bash
docker exec oo-ml-serving-1 python3 -c "
import mlflow, json, os
mlflow.set_tracking_uri('http://mlflow:5000')
os.environ['MLFLOW_TRACKING_USERNAME'] = 'admin'
os.environ['MLFLOW_TRACKING_PASSWORD'] = os.environ.get('MLFLOW_ADMIN_PASSWORD', '')
client = mlflow.tracking.MlflowClient()
trace = client.get_trace('tr-<trace_id>')
for span in trace.data.spans:
print(span.name, '| parent:', span.parent_id, '| status:', span.status)
print(' inputs:', json.dumps(span.inputs)[:200])
print(' outputs:', json.dumps(span.outputs)[:200])
print(' attrs:', span.attributes)
"
```
### Span structure for a tip generation trace
A healthy `recommend` trace has 3 spans:
| Span | Type | Parent | Key attributes |
|------|------|--------|---------------|
| `recommend` | CHAIN | (root) | `agent_count`, `latency_ms`; inputs include `agent_ids` list |
| `build_context` | TOOL | recommend | `agent_count`, `task_count`, `science_destiny` |
| `llm_orchestrator` | LLM | recommend | `prompt_tokens`, `completion_tokens`, `model`, `attempts` |
### Diagnosing "no agents in trace"
If the trace shows `agent_ids: []` and `agent_count: 0` in the root span, and the orchestrator prompt says *"No pre-computed agent context available"*, it means the recommender found zero eligible snippets at request time. Causes:
1. **Agent compute hasn't run** — no `agent_outputs` rows for this user yet
2. **Snippets expired** — TTL elapsed since last compute
3. **Eligibility filter dropped all agents** — none passed the manifest-driven check
Diagnose with:
```bash
docker exec oo-api-1 psql "$DATABASE_URL" -c \
"SELECT agent_id, computed_at, expires_at FROM agent_outputs WHERE user_id='<uid>' ORDER BY computed_at DESC LIMIT 10;"
```
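The three causes above can be told apart from the rows that query returns. The following is a rough sketch only: `diagnose_no_agents` is a hypothetical helper, and it assumes the rows have already been parsed into dicts with timezone-aware `expires_at` datetimes.

```python
from datetime import datetime, timedelta, timezone


def diagnose_no_agents(rows: list[dict], now: datetime) -> str:
    """Map agent_outputs rows for one user to the likeliest cause.

    rows: dicts with 'agent_id', 'computed_at', 'expires_at' (assumed shape).
    """
    if not rows:
        # Cause 1: nothing was ever computed for this user.
        return "agent compute hasn't run"
    if all(row["expires_at"] <= now for row in rows):
        # Cause 2: rows exist, but every snippet is past its TTL.
        return "snippets expired"
    # Cause 3: at least one fresh snippet exists, so it was filtered out.
    return "eligibility filter dropped all agents"


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    stale = {"agent_id": "momentum", "computed_at": now - timedelta(days=2),
             "expires_at": now - timedelta(days=1)}
    print(diagnose_no_agents([stale], now))
```

If the result points at the eligibility filter, the next place to look is the user's consents and active context rather than the scheduler.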
**Multi-agent tip generation pipeline (ADR-0013):**
1. Pre-compute agents (`ml/agents/<id>/`) run on a schedule, each emitting a snippet into `agent_outputs` with a per-agent TTL
2. On request, `recommender` (TS) loads the eligible agent set (registry-driven, ADR-0014) and pulls the freshest non-expired snippets
3. `POST /recommend` in `ml/serving` assembles the orchestrator prompt (`v4-orchestrator`) and calls LiteLLM via the `tip-generator` alias
4. Returned tip is logged in `tip_scores` with the contributing agent set; reaction is logged for observability (no bandit reward loop)
## Current phase ## Current phase
**M1 shipped (core + admin). M2 (AI tips) in progress.** See `README.md` for the phase roadmap and `docs/architecture/` for diagrams. Work is tracked as Gitea milestones + issues on `alvis/oO`. **M1 shipped (core + admin). M2 (AI tips) in progress.** See `README.md` for the phase roadmap and `docs/architecture/` for diagrams. Work is tracked as Gitea milestones + issues on `alvis/oO`.
Recent completions (M1 add-on): Recent completions:
- ADR-0012: ε-greedy v2 promotion (profile features, D=12) — 2026-04-26 - ADR-0013: multi-agent recommendation: pre-computed agent snippets + orchestrator LLM (replaces ε-greedy bandit) — 2026-05-01
- Offline sim framework + MLflow integration — shipped in M1 add-on
- Token-based admin auth for Playwright/CI — secured auth boundary
Active work (M2):
- Signal abstraction for multi-source support (#78)
- Per-user feature freshness SLAs (#61, ADR-0011 phase B)
- LLM context assembler + tip generation scaffold (#79, #88) - LLM context assembler + tip generation scaffold (#79, #88)
- Model benchmarking for tip generation (#93) - Model benchmarking for tip generation (#93, #95)
- Admin UX refinements: feedback consolidation, settings placement (#100–102) - Admin UX refinements: feedback consolidation, settings placement (#100–102)
- ADR-0012 — ε-greedy v2 (D=12) — 2026-04-26 (now superseded by ADR-0013)
- ADR-0014 complete: unified Profile schema + backfill, manifest plumbing, `/api/profile` read-through, registry-driven eligibility filter, inference framework + per-agent inference, legacy consent column drop — 2026-05-05
- Rich per-agent inference for all four active agents (#112, #114, #115, #116) — 2026-05-06: quiet/peak hours (time-of-day), z-score baseline (momentum), p50 lateness + project realness (overdue-task), adaptive lookback + weekly/daily cycles (recent-patterns)
- Semantic task clustering via nomic-embed-text + LLM enrichment (#97, #113, #129) — 2026-05-12: `ml/agents/clustering.py`; titles expanded via `tip-generator` before embedding; persistent cache in `task_enrichments` table; recompute gated on task-list hash change; focus-area v3.0.0 outputs all clusters with enriched descriptions
- Per-user feature freshness SLAs (#61) — 2026-05-06: `invalidated_by` mirrored into `ProfileFeature`; drift-detection test added
- MLflow tracing added to `ml/serving` for all agent calls — 2026-05-06: `ml/serving/mlflow_client.py`; activated by `MLFLOW_TRACKING_URI=http://mlflow:5000` (default in compose `full` profile); requires `--profile mlops` for the MLflow container. Issue #118 (M4) tracks removal from production critical path.
Active work (M2): *(all M2 items complete — see README for M3 planning)*
## ADR-0014 endpoint map (as of step 6)
| Endpoint | Purpose |
|----------|---------|
| `GET /api/profile` | Read-through: user globals + prefs (by scope) + consents + contexts |
| `PATCH /api/profile/prefs/:scope` | Upsert user_preferences rows (source='user') |
| `PATCH /api/profile/consents` | Grant / revoke consent keys |
| `PATCH /api/profile/contexts` | Create / activate / deactivate named contexts |
| `GET /api/agents/registry` | Manifest list (proxy to ml/serving; 60 s cache) |
| `POST /api/agents/:agentId/compute` | Internal: run agent compute for (user, agent) |
| `POST /agents/{agent_id}/infer` *(ml/serving)* | Run inference framework → `{inferred_prefs}` |
## Inference framework (ADR-0014 §3)
Lives in `ml/agents/inference/`. `run_inference(manifest, history)` evaluates all `InferredParam` entries in the manifest and returns `{key: value}`. Rules:
- Below `min_history` → emit `cold_start_default`
- `infer()` error → emit `cold_start_default` (never crashes)
- Results written to `user_preferences` with `source='inferred'`; keys with `source='user'` are never overwritten
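The three rules above can be sketched as follows. This is an illustration under assumptions, not the code in `ml/agents/inference/`: the dataclass shapes, the `merge_inferred` helper, and passing a plain list of params instead of a manifest are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class UserHistory:  # assumed shape
    events: list = field(default_factory=list)
    task_completions: list = field(default_factory=list)


@dataclass
class InferredParam:  # assumed shape
    key: str
    min_history: int
    cold_start_default: Any
    infer: Callable[[UserHistory], Any]


def run_inference(params: list[InferredParam], history: UserHistory) -> dict:
    """Evaluate every param; fall back to the cold-start default when unsure."""
    out = {}
    for p in params:
        if len(history.events) < p.min_history:
            out[p.key] = p.cold_start_default  # rule 1: not enough history
            continue
        try:
            out[p.key] = p.infer(history)
        except Exception:
            out[p.key] = p.cold_start_default  # rule 2: infer() never crashes the run
    return out


def merge_inferred(existing: dict, inferred: dict) -> dict:
    """existing maps key -> (value, source); user-set keys are left alone (rule 3)."""
    merged = dict(existing)
    for key, value in inferred.items():
        if key in merged and merged[key][1] == "user":
            continue
        merged[key] = (value, "inferred")
    return merged
```

Catching a broad `Exception` is deliberate in this sketch: a single faulty `infer()` degrades to its default instead of failing the whole agent.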
Per-agent inferred params (all live in `ml/agents/<name>.py`):
| Agent | Inferred params | Notes |
|-------|----------------|-------|
| `time-of-day` | `preferred_hour`, `quiet_start`, `quiet_end`, `peak_hours`, `tz` | Quiet window = longest below-baseline hour run; peak = top-quartile done hours; tz cold-start only (from auth provider) |
| `momentum` | `engagement_trend`, `baseline_completions_per_day`, `stdev` | Baseline = 28d rolling mean done/day; snippet uses z-score language |
| `overdue-task` | `lateness_tolerance_days`, `project_realness` | Tolerance = p50 lateness from TaskCompletion history; realness = project median vs global median |
| `recent-patterns` | `lookback_days`, `weekly_cycle`, `daily_cycle` | Lookback sized to ≥30 done events; cycles use peak-to-mean ratio; snippet hints when strength > 0.5 |
| `focus-area` | *(none)* | No inferred params. Clusters tasks via LLM-enriched embeddings and outputs all areas with expanded descriptions. Recomputes only when task list changes (hash-gated). |
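The hash-gated recompute mentioned for `focus-area` could look roughly like this. It is a sketch under assumptions: the real gate in `ml/agents/clustering.py` may hash different fields, and `task_list_hash`, `maybe_recompute`, and the order-insensitive canonical form are hypothetical choices made here.

```python
import hashlib
import json
from typing import Callable, Optional


def task_list_hash(task_titles: list[str]) -> str:
    """Stable digest of the task list; sorting makes it order-insensitive (assumption)."""
    canonical = json.dumps(sorted(task_titles), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def maybe_recompute(
    task_titles: list[str],
    cached_hash: Optional[str],
    compute: Callable[[list[str]], object],
) -> tuple[str, Optional[object]]:
    """Run `compute` only when the task list changed since `cached_hash`.

    Returns (new_hash, result); result is None when the recompute was skipped.
    """
    current = task_list_hash(task_titles)
    if current == cached_hash:
        return current, None
    return current, compute(task_titles)
```

The caller would persist the returned hash next to the cached clusters, so an unchanged task list costs one digest rather than an embedding pass.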
`UserHistory` carries both `events: list[FeedbackEvent]` and `task_completions: list[TaskCompletion]`. `AgentInferRequest` (ml/serving) accepts `task_completions: list[dict]` alongside `feedback_history`.
`min_history` is checked against `len(history.events)` (feedback events), **not** `task_completions`. Agents that infer from completions should set `min_history=0` and guard inside `infer()`.
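A completion-based `infer()` that guards itself, as recommended above, might be sketched like this. The `days_late` field, the threshold of 5, and the default of 1.0 are hypothetical values chosen for illustration; only the p50 idea comes from the `overdue-task` row in the table.

```python
from statistics import median

COLD_START_TOLERANCE_DAYS = 1.0  # hypothetical default
MIN_COMPLETIONS = 5              # hypothetical guard threshold


def infer_lateness_tolerance(task_completions: list[dict]) -> float:
    """p50 lateness over completions, with an internal cold-start guard.

    The param itself would be declared with min_history=0, because the
    framework's min_history check only counts feedback events.
    """
    lateness = [c["days_late"] for c in task_completions if "days_late" in c]
    if len(lateness) < MIN_COMPLETIONS:
        return COLD_START_TOLERANCE_DAYS
    return float(median(lateness))
```

Without the internal guard, a user with many feedback events but two completions would get a median computed from two points.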
## What NOT to do ## What NOT to do
- Don't copy Todoist's data into our DB. Store the OAuth token + computed features/derivatives we need, fetch raw on demand. - Don't copy Todoist's data into our DB. Store the OAuth token + computed features/derivatives we need, fetch raw on demand.
- Don't implement auth by hand. Auth.js behind an OIDC-shaped boundary (ADR-0004); swap to a dedicated OIDC provider only when mobile ships. - Don't implement auth by hand. Auth.js behind an OIDC-shaped boundary (ADR-0004); swap to a dedicated OIDC provider only when mobile ships.
- Don't hardwire a recommender. The contract is `POST /recommend → {tip}`. Swap internals (bandit, LLM, hybrid), keep contract. - Don't hardwire a recommender. The contract is `POST /recommend → {tip}`. Swap internals (multi-agent orchestrator today, future LLM/hybrid variants), keep contract.
- Don't hardcode the agent list. The orchestrator is registry-driven (ADR-0014); adding/removing an agent is a manifest change in `ml/agents/<id>/`, never a recommender edit.
- Don't replace a policy in one step. New policies deploy shadow-first; promoted only after offline + online agreement with the incumbent (ADR-0002). - Don't replace a policy in one step. New policies deploy shadow-first; promoted only after offline + online agreement with the incumbent (ADR-0002).
- Don't over-split processes. Extract a service when pressure demands it, not in anticipation (ADR-0003). - Don't over-split processes. Extract a service when pressure demands it, not in anticipation (ADR-0003).
- Don't call LLMs directly from application code. All LLM calls go through `ml/serving` (Python) via `LITELLM_URL`. The TS recommender never holds a model name. - Don't call LLMs directly from application code. All LLM calls go through `ml/serving` (Python) via `LITELLM_URL`. The TS recommender never holds a model name.

174
README.md

@@ -69,7 +69,7 @@ docs/ architecture, adr, api
## AI stack ## AI stack
oO is AI-native: the recommender's job is to **rank**, not to write. An LLM generates candidate tips from the user's context; the bandit picks the best one. oO is AI-native. Domain-specialized agents pre-compute snippets describing the user's state from one angle each; an orchestrator LLM reasons over the assembled snippets and produces one tip (ADR-0013). The orchestrator iterates a registry, not a hardcoded list (ADR-0014) — adding an agent is a manifest change, nothing else.
### Three-tier layout ### Three-tier layout
@@ -79,25 +79,28 @@ oO is AI-native: the recommender's job is to **rank**, not to write. An LLM gene
| Routing | **LiteLLM** | Unified OpenAI-compatible API; model aliases; cloud fallback | `llm.alogins.net` (Agap shared) | | Routing | **LiteLLM** | Unified OpenAI-compatible API; model aliases; cloud fallback | `llm.alogins.net` (Agap shared) |
| Testing | **OpenWebUI** | Prompt iteration, model comparison, manual evals | `ai.alogins.net` (Agap shared) | | Testing | **OpenWebUI** | Prompt iteration, model comparison, manual evals | `ai.alogins.net` (Agap shared) |
### Tip generation pipeline (Phase 2 target) ### Tip generation pipeline (ADR-0013, M2)
``` ```
User signals ──▶ Context assembler ──▶ LiteLLM ──▶ Ollama (local) User signals Pre-compute agents (every 15 min)
(tasks, calendar, (ml/features/) (routing) or cloud fallback (tasks, calendar, ──▶ ml/agents/{overdue-task, momentum, ──▶ agent_outputs
patterns, time) patterns, time) time-of-day, recent-patterns, (per-agent TTL)
focus-area, ...}
Eligibility filter: required consents + │
active context + per-user prefs (ADR-0014) ◀──┘
N typed TipCandidates Orchestrator prompt (`v4-orchestrator`)
{content, kind, model, = global prefs + active context + snippets
prompt_version, confidence}
Bandit policy (ml/serving) LiteLLM ──▶ Ollama (local) / cloud fallback
scores + ranks candidates
Best tip shown Tip shown to user
User reaction (done / snooze / dismiss + dwell) User reaction (done / snooze / dismiss + dwell)
Online bandit update + prompt_version tracking Logged to tip_feedback for observability
(no online ML reward loop — see ADR-0013)
``` ```
**Why LiteLLM as gateway:** All LLM calls use a single `LITELLM_URL` env var. Swapping from qwen2.5 to llama3.2, or routing a fraction to Claude for A/B, is a config change in LiteLLM — zero code change in oO. The model name in `tip_scores` tells you exactly which model produced each tip. **Why LiteLLM as gateway:** All LLM calls use a single `LITELLM_URL` env var. Swapping from qwen2.5 to llama3.2, or routing a fraction to Claude for A/B, is a config change in LiteLLM — zero code change in oO. The model name in `tip_scores` tells you exactly which model produced each tip.
@@ -118,158 +121,31 @@ All model calls route through **LiteLLM** at `llm.alogins.net` (or `LITELLM_URL`
## Roadmap ## Roadmap
Issues and open work are tracked in [Gitea milestones](http://localhost:3000/alvis/oO/milestones). Pick an issue, check its milestone (= phase), read the service's `README.md`, ship.
### Phase 0 — Walking skeleton *(M0)* ✓ shipped ### Phase 0 — Walking skeleton *(M0)* ✓ shipped
Goal: a single user signs in with Google, connects Todoist, and sees one random Todoist task on a black page. Deletion works. Single user signs in with Google, connects Todoist, sees one random task on a black page. Deletion works. Auth, integrations, recommender stub, PWA, feedback loop, ToS/privacy, metrics baseline.
- [x] Monorepo scaffold, docker-compose dev env
- [x] `auth` — Google OAuth2/PKCE via openid-client v6; session cookie; Next.js middleware guard
- [x] `integrations/todoist` — OAuth2 flow, token stored in DB, disconnect supported
- [x] `recommender` with `RandomPolicy`; stable `POST /recommend` contract; 30s task cache
- [x] `apps/web` — sign-in, connect, tip pages; PWA manifest + icons
- [x] Feedback: `done / snooze / dismiss`; reward inferred from dwell-time (`inferReward`); marks task complete in Todoist
- [x] Deploy modular monolith to Agap VM via Caddy at `o.alogins.net`
- [x] ToS + Privacy Policy pages (`/legal/terms`, `/legal/privacy`); implicit consent on sign-in
- [x] Account deletion: revokes tokens, purges data, soft-deletes profile; button on /connect
- [x] Metrics baseline: `tip_views` table (tip served) + `tip_feedback` (reactions) — activation + reaction rate queryable
### Phase 1 — Real signal + in-the-moment delivery *(M1)* ✓ shipped ### Phase 1 — Real signal + in-the-moment delivery *(M1)* ✓ shipped
Goal: tips are picked, not drawn from a hat — and they arrive at the right moment on the web. Tips are picked, not drawn from a hat. Event bus, Todoist sync, task features, ε-greedy policy (v1 + v2), web push, NATS JetStream bridge, shadow-policy registry, offline sim framework, per-user profile features, admin + ML ops console (`apps/admin`).
- [x] Event bus scaffold: typed in-process EventEmitter with 500-event ring buffer; subjects match future NATS JetStream — swap is mechanical
- [x] Todoist sync emits `signals.task.synced`; tip served/feedback emit `signals.tip.*`
- [x] Features extracted per task: `is_overdue`, `task_age_days`, `priority`; context: `hour_of_day`, `day_of_week`
- [x] **ε-greedy v1** (d=7, ε=0.10, day-of-week sin/cos features); per-user state persisted to disk
- [x] **ε-greedy v2** (d=12, profile features: completion rate, dismiss rate, dwell, preferred hour, tip volume) in shadow; promoted to active policy (ADR-0012)
- [x] `RemotePolicy` in recommender: calls ml/serving, falls back to RandomPolicy on timeout/error; logs explainability to `tip_scores`
- [x] Feedback loop: dwell-time inferred reward (`inferReward`) → online model update; `done` in 15 s–2 min = +1.0 (magic zone)
- [x] Offline simulation framework (`ml/experiments/sim`): rule/LLM/claude-code judges, two-policy comparison, results persisted to `sim_runs` + `sim_events`
- [x] **Web Push** (VAPID): SW, subscribe/unsubscribe API, "notify me" button on tip page
- [x] Shadow-policy registry: run N shadow policies per request, log picks without serving them (#56)
- [x] NATS JetStream bridge — durable `signals.>` and `feedback.>` streams; in-process bus stays the source of truth, every publish bridges out (#21, shipped)
- [x] Per-user profile features (completion rate, dismiss rate, dwell, preferred hour, tip volume) — event-driven, JIT invalidation (#81)
- [ ] Quiet-hours + dedupe for push delivery
- [ ] Delayed rewards: tasks completed directly in Todoist (requires webhook from Todoist)
- [ ] Apple OAuth (deferred to M3)
#### M1 add-on — Admin & ML Ops Console *(fully shipped)* ### Phase 2 — AI tips + multi-source signals *(M2)* ✓ shipped
Tips are AI-generated from user context. Multi-agent pipeline (ADR-0013): five pre-compute agents (`overdue-task`, `momentum`, `time-of-day`, `recent-patterns`, `focus-area`) emit prompt snippets; orchestrator LLM produces one tip. Unified Profile + agent registry + auto-inference framework (ADR-0014). LLM output validation + fallback. LiteLLM gateway, model benchmarking, prompt research, MLflow tracing.
oO is ML-heavy. Without a cockpit, every model change ships blind. This console is the team's single pane for users, signals, features, models, experiments, and tip outcomes — with the ability to *act* on them (revoke a token, replay an event, promote a model, reset a bandit).
**Framework pick — `apps/admin` on Next.js 15 + Tremor + shadcn/ui.** Analytics-first UI for an analytics-first product, stays on our existing TS/React/Tailwind stack, reuses `packages/shared-types`, `sdk-js`, and the Auth.js session. Specialized ML tooling (MLflow) runs as a **separate external service** linked from the admin shell; Grafana panels are embedded.
| Layer | Tool | Why |
|-------|------|-----|
| App shell | **Next.js 15** (new `apps/admin`) | Same stack as `apps/web`; reuses auth, types, SDK |
| Dashboards / charts | **[Tremor](https://tremor.so)** | Analytics-first React + Tailwind — KPI cards, time-series, categorical, heatmaps |
| CRUD primitives | **[shadcn/ui](https://ui.shadcn.com)** | Copy-paste Radix components; forms, dialogs, command palette |
| Heavy grids | **[TanStack Table v8](https://tanstack.com/table)** | Sortable / paginated / virtualized tables (events, users, tips) |
| Extra charts | **[Recharts](https://recharts.org)** / **[visx](https://airbnb.io/visx)** | Fallbacks where Tremor falls short (e.g. force graphs, Sankey) |
| Model registry / experiments | **[MLflow](https://mlflow.org)** *(external — `o.alogins.net/mlflow`)* | Experiment tracking, artifact browser, model registry; own basic-auth |
| Infra metrics | **[Grafana](https://grafana.com)** *(embedded panels)* | One ops source of truth |
| Ad-hoc analysis | **[Marimo](https://marimo.io)** reactive notebooks | Python-native for the ML side; launch-out link |
| AuthZ | `profile.role='admin'` + Next.js middleware | Reuses existing session; no new auth surface |
**Rejected alternatives (so we don't re-litigate):**
- *Retool / AppSmith* — low-code speed, but admin logic leaves our repo; weak analytics affordances for an analytics product
- *Streamlit / Gradio / Dash* — Python-first; thin RBAC and routing; splits our frontend stack in two
- *React-admin / Refine.dev* — strong CRUD scaffolding, but analytics/ML views feel bolted on; we'd rebuild Tremor-style dashboards ourselves
- *Superset / Metabase as the admin surface* — excellent for BI, poor for operational **writes** (revoke, replay, promote). Plan: **adopt Superset in M4** for BI alongside batch pipelines; ship a read-only SQL widget inside admin for now
**Build sequence:**
1. [x] **ADR-0006** — record the framework choice + "embed, don't rebuild" rule for MLflow/Grafana
2. [x] **Scaffold**: `apps/admin` with Next.js 15, Tailwind, Tremor; deploy behind Caddy at `admin.o.alogins.net`
3. [x] **RBAC**: `role` column on `users`; admin-only Next.js middleware; seed first admin via `ADMIN_SEED_EMAIL` env; `admin_actions` audit-log table
4. [x] **Overview dashboard** — DAU/WAU KPI cards, tips served, reaction breakdown, activation funnel
5. [x] **User explorer** — list + detail page: identity, consents, integrations, last tip, reward history; revoke-integration + reset-bandit + rebuild-profile actions
6. [x] **Event stream viewer** — live tail of `signals.*` with filters by subject/user/time; same UI when the bus swaps to NATS
7. [x] **Features page** — features sent to `ml/serving` per scoring call; per-user profile features with freshness; diff across time
8. [x] **Tips page** — tips served, scored, feedback reactions with policy/model breakdown
9. [x] **Reward analytics** — reaction distribution over time; per-policy / per-model / per-prompt-version compare; slice by `hour_of_day`, `priority`, cohort
10. [x] **Data quality widget** — missing-feature rate, stale-token rate, daily completeness heatmap; per-feature freshness SLA status
11. [x] **Ops actions** — revoke token (Users page), rebuild profile, reset bandit, enable/disable shadow policies; every action audit-logged
12. [x] **Health rollup**: `/admin/health` surfaces api, ml/serving, SQLite, event-bus, MLflow; auto-refreshes every 15s
13. [x] **Read-only SQL runner** — SELECT-only runner against SQLite + saved queries (sunsets to Superset in M4)
14. [x] **Offline simulation runner** — launch `ml/experiments/sim` from admin UI; track sim runs, judge, policy comparison
15. [x] **Token-based admin auth**: `POST /api/auth/token` for Playwright/CI; `ADMIN_TOKEN` env var (#105)
16. [x] **Docs pages** — admin documentation and runbooks inline
### Phase 2 — AI tips + multi-source signals *(M2)* in progress
Goal: tips are AI-generated from user context, not just raw Todoist tasks. Multiple signal sources feed a generalized pipeline. Research-intensive milestone.
**AI infrastructure (unblock everything else):**
- [ ] `ai` compose profile — Ollama + LiteLLM for local dev; env vars `OLLAMA_URL` / `LITELLM_URL` (#86)
- [ ] AI gateway — wire `ml/serving` to LiteLLM; model aliases `tip-generator` + `embedder` (#87)
**AI tip generation pipeline:**
- [x] Context assembler — user signals + feature store → structured prompt context (`ml/features/context.py`); skeleton implemented
- [ ] Tip generator endpoint — `POST /generate` in `ml/serving`; LLM → N typed `TipCandidate` objects (#79)
- [ ] `TipCandidate` shared schema — `{content, kind, source, model, prompt_version, confidence}`; update recommender pipeline (#89)
- [ ] LLM output validation + retry — JSON schema gate, clarification retry (2×), fallback to task-based (#90)
- [ ] Prompt versioning — `prompt_version` + `model` columns in `tip_scores`; content-hash invalidation (#91)
- [x] LLM tip quality dashboard — reaction breakdown by model / prompt_version in `/admin/reward-analytics` (#92)
**Evaluation & model selection:**
- [ ] Model benchmark — compare qwen2.5:7b / llama3.2:3b / gemma3:4b via offline sim + LLM judge (#93)
- [ ] LLM prompt research — persona design, context injection strategies, few-shot examples (#84)
**Pipeline architecture:**
- [x] Signal source abstraction — `SignalSource` interface for Todoist + extensible design (#78)
- [ ] Generalized recommendation pipeline — candidate → rank → render stages (#80)
- [x] Feature registry + user profile builder — centralized features, persistent profiles, event-driven invalidation (#81)
- [ ] Tip kind system — task, advice, insight, reminder with kind-aware UI + rewards (#82)
**Policy research:**
- [ ] Next-gen policies — Thompson sampling, neural bandits, hybrid transfer learning (#83)
**Integrations & infra (carried from M1):**
- [ ] Apple OAuth (#7)
- [x] NATS JetStream replacing in-process bus (#21) — adapter ships in `services/api/src/events/nats.ts`; in-proc bus is the producer, JetStream is the durable mirror
- [x] Todoist sync via events (#22) — background scheduler in `services/api/src/signals/scheduler.ts` emits `signals.task.synced` every `TODOIST_SYNC_INTERVAL_MS`; on-demand fetch remains as freshness fallback
- [x] Event schema registry + protobuf CI gate (#54) — buf lint/breaking checks on every PR
- [x] Per-user freshness SLAs for features (#61) — context-feature (JIT) vs profile-feature (batched) spec in ADR-0011; CONTEXT_FEATURES in ml/features/context.py
- [x] Observability (#18) — structured logs via pino, W3C trace IDs, Sentry hooks, trace correlation end-to-end
- [ ] CI skeleton (#3), E2E tests (#20)
**Bugs & UX (fix before new features):**
- [x] TipFeedback type mismatch (#73)
- [x] Todoist token refresh (#74) — OAuth token auto-refresh on 401
- [x] Reward fire-and-forget (#75) — retry logic + logging
- [x] Data retention purge (#76) — daily purge of 30-day-old tip_scores/tip_feedback
- [x] Port mismatch (#77) — fixed in docker-compose + env var config
- [ ] UX refinements (#100–102) — "done/snooze/dismiss" feedback only, config page UI, settings gear button
 ### Phase 3 — Native mobile *(M3)*
-- [ ] iOS app (SwiftUI) with APNs push
-- [ ] Android app (Compose) with FCM push
-- [ ] `notifier` gains APNs + FCM channels, per-device rate limits
-- [ ] Migrate auth from Auth.js to dedicated OIDC provider (trigger from ADR-0004)
-- [ ] Consolidate MLflow behind shared OIDC (SSO for all internal services)
-- [ ] Decide-and-deliver scheduler: per-user "is this tip worth interrupting now?" threshold
+iOS (SwiftUI + APNs) and Android (Compose + FCM). `notifier` service gains APNs + FCM channels. Auth migrated from Auth.js to dedicated OIDC provider. Decide-and-deliver scheduler. See [M3 milestone](http://localhost:3000/alvis/oO/milestone/3).
 ### Phase 4 — MLOps at scale *(M4)*
-- [x] MLflow deployed as external service (`mlops` compose profile); own auth; health check integrated
-- [ ] Write first retraining pipeline + first MLflow experiment logging from `ml/serving` + JetStream consumers (#98)
-- [ ] Feature-to-prompt pipeline — nightly batch job materializes context for LLM; cuts inline latency (#94)
-- [ ] Prompt optimization loop — sim A/B → MLflow experiment → human-approved promotion (#95)
-- [ ] LLM fine-tuning — tip reactions as training signal; LoRA on base model; MLflow tracks runs (#96)
-- [ ] Embedding-based task clustering — `nomic-embed-text` for dedup + user pattern features (#97)
-- [ ] Modular-monolith packaging + import-boundary lint (#47)
-- [ ] Consolidate MLflow auth into shared OIDC provider (tracked as M3 issue #85)
-- [ ] Shadow → A/B → launch pipeline as first-class in MLflow
-- [ ] Online experiments framework: deterministic assignment + bandit policies alongside fixed-split A/B
-- [ ] Cross-user collaborative features (opt-in only); cohort slicing; fairness checks
-- [ ] Drift monitoring (feature + prediction + reward drift); model cards per LLM version
+Retraining pipeline, feature-to-prompt batch jobs, prompt optimization loop, LLM fine-tuning on reaction signals, modular-monolith import-boundary lint, online experiments framework, drift monitoring. See [M4 milestone](http://localhost:3000/alvis/oO/milestone/4).
 ### Phase 5 — Production hardening *(M5)*
-- [ ] Audit logging, rotation of provider tokens + internal signing keys
-- [ ] **k3s** on existing VM, then k8s + HPA once multi-node justified (no cliff)
-- [ ] Multi-region failover, Postgres PITR, event-bus mirroring
-- [ ] Public integration SDK; sandbox tenancy for third-party connectors
-- [ ] Billing + subscription tiers
+Audit logging, key rotation, k3s → k8s, multi-region, public integration SDK, billing. See [M5 milestone](http://localhost:3000/alvis/oO/milestone/5).
 ---
 ## Contributing
-This repo is split into independent modules; most tickets belong to exactly one. Pick an issue, check its milestone (= phase), read the service's `README.md`, ship.
+This repo is split into independent modules; most tickets belong to exactly one. Pick an issue from [Gitea](http://localhost:3000/alvis/oO/issues), read the service's `README.md`, ship.
 Conventions and per-service guidance live in [`CLAUDE.md`](CLAUDE.md).

@@ -1,32 +1,17 @@
 'use client';
-import { useEffect, useState } from 'react';
+import { useState } from 'react';
 import { AdminShell } from '@/components/AdminShell';
-import { getPolicies, togglePolicy, replaySignal, PolicyInfo } from '@/lib/api';
+import { replaySignal } from '@/lib/api';
 const VALID_SUBJECTS = ['signals.tip.served', 'signals.tip.feedback', 'signals.task.synced'];
 export default function OpsPage() {
-const [policies, setPolicies] = useState<PolicyInfo[]>([]);
 const [replaySubject, setReplaySubject] = useState(VALID_SUBJECTS[0]);
 const [replayPayload, setReplayPayload] = useState('{\n "userId": "",\n "tipId": ""\n}');
 const [msg, setMsg] = useState('');
 const [error, setError] = useState('');
-useEffect(() => {
-getPolicies().then((r) => setPolicies(r.policies)).catch(() => {});
-}, []);
-const handleToggle = async (name: string, active: boolean) => {
-try {
-await togglePolicy(name, active);
-setPolicies((prev) => prev.map((p) => p.name === name ? { ...p, active } : p));
-setMsg(`Policy "${name}" ${active ? 'enabled' : 'disabled'}.`);
-} catch (e: any) {
-setError(e.message);
-}
-};
 const handleReplay = async () => {
 let payload: Record<string, unknown>;
 try {
@@ -50,36 +35,14 @@ export default function OpsPage() {
 <div>
 <h1 className="text-xl font-semibold">Ops</h1>
 <p className="text-sm text-gray-500 mt-1">
-Live system controls toggle shadow recommendation policies, replay past signals
-for backfill or debugging, and find per-user actions (token revoke, bandit reset)
-on the <a href="/users" className="text-indigo-400 hover:underline">Users page</a>.
+Live system controls replay past signals for backfill or debugging, and find
+per-user actions (token revoke) on the{' '}
+<a href="/users" className="text-indigo-400 hover:underline">Users page</a>.
 </p>
 </div>
 {msg && <p className="text-green-400 text-sm">{msg}</p>}
 {error && <p className="text-red-400 text-sm">{error}</p>}
-{/* Policy toggles */}
-<section className="space-y-3">
-<h2 className="text-base font-medium text-gray-300">Policies</h2>
-{policies.length === 0 ? (
-<p className="text-gray-500 text-sm">No shadow policies registered. Shadow policies can be added to the recommender source.</p>
-) : (
-<div className="space-y-2">
-{policies.map((p) => (
-<div key={p.name} className="flex items-center justify-between bg-gray-900 border border-gray-800 rounded p-3">
-<span className="text-sm text-gray-300 font-mono">{p.name}</span>
-<button
-onClick={() => handleToggle(p.name, !p.active)}
-className={`px-3 py-1 rounded text-xs ${p.active ? 'bg-green-800 text-green-200' : 'bg-gray-800 text-gray-400'}`}
->
-{p.active ? 'Active' : 'Disabled'}
-</button>
-</div>
-))}
-</div>
-)}
-</section>
 {/* Replay signal */}
 <section className="space-y-3">
 <h2 className="text-base font-medium text-gray-300">Replay signal</h2>

@@ -37,7 +37,7 @@ export function UsersTable() {
 <table className="w-full text-sm">
 <thead className="bg-gray-900 border-b border-gray-800">
 <tr>
-{['Email', 'Name', 'Role', 'Consent', 'Joined', 'Status'].map((h) => (
+{['ID', 'Email', 'Name', 'Role', 'Consent', 'Joined', 'Status'].map((h) => (
 <th
 key={h}
 className="text-left px-4 py-2.5 text-xs text-gray-500 font-medium uppercase tracking-wide"
@@ -50,13 +50,13 @@ export function UsersTable() {
 <tbody className="divide-y divide-gray-800">
 {loading ? (
 <tr>
-<td colSpan={6} className="px-4 py-6 text-center text-gray-500">
+<td colSpan={7} className="px-4 py-6 text-center text-gray-500">
 Loading
 </td>
 </tr>
 ) : users.length === 0 ? (
 <tr>
-<td colSpan={6} className="px-4 py-6 text-center text-gray-500">
+<td colSpan={7} className="px-4 py-6 text-center text-gray-500">
 No users yet.
 </td>
 </tr>
@@ -66,6 +66,9 @@ export function UsersTable() {
 key={u.id}
 className="hover:bg-gray-900 transition-colors cursor-pointer"
 >
+<td className="px-4 py-2.5 text-gray-500 text-xs font-mono tabular-nums">
+{u.id.slice(0, 8)}
+</td>
 <td className="px-4 py-2.5">
 <Link href={`/users/${u.id}`} className="hover:underline text-indigo-400">
 {u.email}

@@ -91,10 +91,6 @@ export interface HealthStatus {
 services: { name: string; status: string; latencyMs: number }[];
 }
-export interface PolicyInfo {
-name: string;
-active: boolean;
-}
 export interface SavedQuery {
 id: string;
@@ -223,16 +219,6 @@ export function getHealth() {
 return apiFetch<HealthStatus>('/admin/health');
 }
-export function getPolicies() {
-return apiFetch<{ policies: PolicyInfo[] }>('/admin/policies');
-}
-export function togglePolicy(name: string, active: boolean) {
-return apiFetch<{ ok: boolean }>(`/admin/policies/${name}/toggle`, {
-method: 'POST',
-body: JSON.stringify({ active }),
-});
-}
 export function replaySignal(subject: string, payload: Record<string, unknown>) {
 return apiFetch<{ ok: boolean }>('/admin/replay-signal', {

@@ -4,8 +4,8 @@ import type { NextRequest } from 'next/server';
 export async function middleware(req: NextRequest) {
 const { pathname } = req.nextUrl;
-// Pass through the login page and API calls
-if (pathname.startsWith('/login') || pathname.startsWith('/api/')) {
+// Pass through the login page, forbidden page, and API calls
+if (pathname.startsWith('/login') || pathname.startsWith('/forbidden') || pathname.startsWith('/api/')) {
 return NextResponse.next();
 }

@@ -1,12 +1,27 @@
 'use client';
 import { useEffect, useState, useCallback } from 'react';
-import { getVapidPublicKey, subscribePush } from '@/lib/api';
+import { getVapidPublicKey, subscribePush, getOrchestatorPrefs, updateOrchestratorPref } from '@/lib/api';
 type PushState = 'idle' | 'subscribed' | 'denied';
 export default function ConfigPage() {
 const [pushState, setPushState] = useState<PushState>('idle');
+const [scienceDestiny, setScienceDestiny] = useState(50);
+const [prefSaving, setPrefSaving] = useState(false);
+useEffect(() => {
+getOrchestatorPrefs().then((prefs) => {
+if (typeof prefs.science_destiny === 'number') setScienceDestiny(prefs.science_destiny);
+}).catch(() => {});
+}, []);
+const handleScienceDestinyChange = useCallback(async (value: number) => {
+setScienceDestiny(value);
+setPrefSaving(true);
+try { await updateOrchestratorPref('science_destiny', value); }
+finally { setPrefSaving(false); }
+}, []);
 useEffect(() => {
 if (typeof Notification !== 'undefined') {
@@ -87,6 +102,41 @@ export default function ConfigPage() {
 </div>
 </section>
+{/* Tip style */}
+<section style={{ marginBottom: '2.5rem' }}>
+<h3 style={{ fontSize: '0.75rem', letterSpacing: '0.12em', textTransform: 'uppercase', color: 'rgba(255,255,255,0.35)', marginBottom: '1rem', fontWeight: 400 }}>
+Tip style
+</h3>
+<div style={{
+border: '1px solid rgba(255,255,255,0.1)',
+borderRadius: '0.75rem',
+padding: '1.25rem 1.5rem',
+}}>
+<div style={{ display: 'flex', justifyContent: 'space-between', alignItems: 'baseline', marginBottom: '0.875rem' }}>
+<span style={{ fontSize: '0.85rem', fontWeight: 500 }}>Science</span>
+<span style={{ fontSize: '0.7rem', color: 'rgba(255,255,255,0.25)' }}>
+{prefSaving ? 'saving…' : scienceDestiny === 50 ? 'balanced' : scienceDestiny < 50 ? 'data-driven' : 'intuitive'}
+</span>
+<span style={{ fontSize: '0.85rem', fontWeight: 500 }}>Destiny</span>
+</div>
+<input
+type="range"
+min={0}
+max={100}
+value={scienceDestiny}
+onChange={(e) => handleScienceDestinyChange(Number(e.target.value))}
+style={{ width: '100%', accentColor: 'var(--white)', cursor: 'pointer' }}
+/>
+<div style={{ color: 'rgba(255,255,255,0.3)', fontSize: '0.7rem', marginTop: '0.75rem' }}>
+{scienceDestiny < 30
+? 'Tips lean on patterns and data'
+: scienceDestiny > 70
+? 'Tips lean on intuition and meaning'
+: 'Tips balance logic and intuition'}
+</div>
+</div>
+</section>
 {/* Integrations */}
 <section>
 <h3 style={{ fontSize: '0.75rem', letterSpacing: '0.12em', textTransform: 'uppercase', color: 'rgba(255,255,255,0.35)', marginBottom: '1rem', fontWeight: 400 }}>

@@ -51,6 +51,8 @@ function ConnectPageInner() {
 }
 const todoistConnected = isConnected('todoist');
+const googleHealthConnected = isConnected('google-health');
+const anyConnected = todoistConnected || googleHealthConnected;
 return (
 <main style={{ minHeight: '100vh', padding: '4rem 2rem', maxWidth: '480px', margin: '0 auto' }}>
@@ -85,7 +87,6 @@ function ConnectPageInner() {
 marginBottom: '1rem',
 }}>
 <div style={{ display: 'flex', alignItems: 'center', gap: '0.875rem' }}>
-{/* Todoist logomark */}
 <svg width="28" height="28" viewBox="0 0 24 24" fill="none" aria-label="Todoist">
 <rect width="24" height="24" rx="6" fill="#DB4035"/>
 <path d="M6 8.5L11 13l7-7" stroke="#fff" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round"/>
@@ -130,7 +131,65 @@ function ConnectPageInner() {
 )}
 </div>
-{todoistConnected && (
+{/* Google Health card */}
+<div style={{
+border: '1px solid rgba(255,255,255,0.1)',
+borderRadius: '0.75rem',
+padding: '1.25rem 1.5rem',
+display: 'flex',
+alignItems: 'center',
+justifyContent: 'space-between',
+marginBottom: '1rem',
+}}>
+<div style={{ display: 'flex', alignItems: 'center', gap: '0.875rem' }}>
+<svg width="28" height="28" viewBox="0 0 24 24" fill="none" aria-label="Google Health">
+<rect width="24" height="24" rx="6" fill="#EA4335"/>
+<path d="M12 6.5c0-1.1.9-2 2-2s2 .9 2 2-.9 2-2 2-2-.9-2-2z" fill="#fff"/>
+<path d="M8 10.5c0-1.1.9-2 2-2s2 .9 2 2-.9 2-2 2-2-.9-2-2z" fill="#fff" opacity=".7"/>
+<path d="M12 14.5c0 2.2-1.8 4-4 4s-4-1.8-4-4 1.8-4 4-4 4 1.8 4 4z" fill="#fff" opacity=".4"/>
+<path d="M13 13.5c.5-1 1.5-1.7 2.5-1.7 1.7 0 3 1.3 3 3s-1.3 3-3 3c-1 0-1.9-.5-2.5-1.3" stroke="#fff" strokeWidth="1.5" strokeLinecap="round" fill="none"/>
+</svg>
+<div>
+<div style={{ fontWeight: 500, fontSize: '0.9rem' }}>Google Health</div>
+<div style={{ color: 'var(--gray)', fontSize: '0.75rem', marginTop: '0.1rem' }}>
+{googleHealthConnected ? 'Connected' : 'Steps, sleep & activity'}
+</div>
+</div>
+</div>
+{googleHealthConnected ? (
+<button
+onClick={() => handleDisconnect('google-health')}
+disabled={disconnecting === 'google-health'}
+style={{
+background: 'transparent',
+border: '1px solid rgba(255,255,255,0.15)',
+color: 'var(--gray)',
+borderRadius: '0.375rem',
+padding: '0.375rem 0.875rem',
+fontSize: '0.8rem',
+}}
+>
+{disconnecting === 'google-health' ? '…' : 'Disconnect'}
+</button>
+) : (
+<a
+href="/api/integrations/google-health/connect?redirectTo=/connect"
+style={{
+background: 'var(--white)',
+color: 'var(--black)',
+borderRadius: '0.375rem',
+padding: '0.375rem 0.875rem',
+fontSize: '0.8rem',
+fontWeight: 500,
+}}
+>
+Connect
+</a>
+)}
+</div>
+{anyConnected && (
 <div style={{ marginTop: '3rem' }}>
 <a
 href="/tip"

@@ -29,6 +29,7 @@ export default function TipPage() {
 const [visible, setVisible] = useState(false);
 const holdTimer = useRef<ReturnType<typeof setTimeout> | null>(null);
 const [pressed, setPressed] = useState(false);
+const [showReasoning, setShowReasoning] = useState(false);
 useEffect(() => {
 if (state === 'loading' || state === 'done') {
@@ -39,16 +40,17 @@ export default function TipPage() {
 }
 }, [state]);
-const loadTip = useCallback(async () => {
+const loadTip = useCallback(async (recentTip?: string) => {
 setVisible(false);
 setState('loading');
 try {
-const rec = await getRecommendation();
+const rec = await getRecommendation(recentTip);
 if (!rec) {
 setState('empty');
 return;
 }
 setTip(rec.tip);
+setShowReasoning(false);
 setState('tip');
 } catch (err: any) {
 console.error('[tip] loadTip error', err?.status, err?.message);
@@ -60,10 +62,11 @@ export default function TipPage() {
 const react = async (action: 'done' | 'dismiss' | 'snooze') => {
 if (!tip) return;
+const snoozedContent = action === 'snooze' ? tip.content : undefined;
 setVisible(false);
 setState('done');
 await sendFeedback(tip.id, { action });
-setTimeout(() => loadTip(), 700);
+setTimeout(() => loadTip(snoozedContent), 700);
 };
 const onPointerDown = () => {
@@ -168,7 +171,7 @@ export default function TipPage() {
 All clear.
 </p>
 <button
-onClick={loadTip}
+onClick={() => loadTip()}
 style={{
 marginTop: '2rem',
 background: 'transparent',
@@ -235,6 +238,81 @@ export default function TipPage() {
 </>
 )}
+{/* Reasoning overlay */}
+{showReasoning && tip?.rationale && (
+<div
+onClick={(e) => { e.stopPropagation(); setShowReasoning(false); }}
+style={{
+position: 'fixed',
+inset: 0,
+display: 'flex',
+alignItems: 'flex-end',
+justifyContent: 'center',
+zIndex: 20,
+padding: '0 0 5rem',
+}}
+>
+<div
+onClick={(e) => e.stopPropagation()}
+style={{
+background: 'rgba(20,20,20,0.96)',
+border: '1px solid rgba(255,255,255,0.08)',
+borderRadius: '0.875rem',
+padding: '1.25rem 1.5rem',
+maxWidth: '360px',
+width: 'calc(100% - 3rem)',
+}}
+>
+<p style={{
+margin: 0,
+fontSize: '0.7rem',
+letterSpacing: '0.1em',
+textTransform: 'uppercase',
+color: 'rgba(255,255,255,0.3)',
+marginBottom: '0.625rem',
+}}>
+Why this tip
+</p>
+<p style={{
+margin: 0,
+fontSize: '0.9rem',
+fontWeight: 300,
+lineHeight: 1.5,
+color: 'rgba(255,255,255,0.75)',
+}}>
+{tip.rationale}
+</p>
+</div>
+</div>
+)}
+{/* ? button — bottom left, shows reasoning */}
+{(state === 'tip' || state === 'actions') && tip?.rationale && (
+<button
+onClick={(e) => { e.stopPropagation(); setShowReasoning((v) => !v); }}
+aria-label="Why this tip"
+style={{
+position: 'fixed',
+bottom: '1.5rem',
+left: '1.5rem',
+background: 'transparent',
+border: 'none',
+color: showReasoning ? 'rgba(255,255,255,0.5)' : 'rgba(255,255,255,0.15)',
+fontSize: '0.85rem',
+fontWeight: 400,
+lineHeight: 1,
+padding: '0.5rem',
+cursor: 'pointer',
+pointerEvents: 'auto',
+zIndex: 10,
+transition: 'color 0.2s ease',
+fontFamily: 'inherit',
+}}
+>
+?
+</button>
+)}
 {/* Settings gear — bottom right */}
 <a
 href="/config"

@@ -23,9 +23,12 @@ export async function getSession() {
 return apiFetch<{ user: { id: string; email: string; name?: string; image?: string } | null }>('/auth/session');
 }
-export async function getRecommendation(): Promise<RecommendResponse | null> {
+export async function getRecommendation(recentTip?: string): Promise<RecommendResponse | null> {
 try {
-return await apiFetch<RecommendResponse>('/recommend', { method: 'POST' });
+return await apiFetch<RecommendResponse>('/recommend', {
+method: 'POST',
+body: JSON.stringify(recentTip ? { recent_tip: recentTip } : {}),
+});
 } catch (e: any) {
 if (e.status === 204 || e.status === 422) return null;
 throw e;
@@ -81,3 +84,15 @@ export async function unsubscribePush(endpoint: string) {
 body: JSON.stringify({ endpoint }),
 });
 }
+export async function getOrchestatorPrefs(): Promise<Record<string, unknown>> {
+const data = await apiFetch<{ prefs: Record<string, Record<string, unknown>> }>('/profile');
+return data.prefs?.orchestrator ?? {};
+}
+export async function updateOrchestratorPref(key: string, value: unknown) {
+return apiFetch<{ ok: boolean }>('/profile/prefs/orchestrator', {
+method: 'PATCH',
+body: JSON.stringify({ [key]: value }),
+});
+}

@@ -1,7 +1,7 @@
 # ADR-0007: ε-greedy v1 as the active recommendation policy
 ## Status
-Accepted — 2026-04-16
+Superseded by ADR-0013 — 2026-05-01
 ## Context

@@ -1,6 +1,6 @@
 # ADR-0012 — ε-greedy v2: profile features in the bandit (D=7→12)
-**Status:** Promoted
+**Status:** Superseded by ADR-0013 — 2026-05-01
 **Date:** 2026-04-25 (accepted) / 2026-04-26 (promoted)
 **Issue:** #99

@@ -0,0 +1,230 @@
# ADR-0014 — Unified Profile model + agent registry
**Status:** Proposed
**Date:** 2026-05-05
**Issues:** #30, #111, #112, #113, #114, #115, #116
**Supersedes (data model):** ADR-0013 (the agent set stands; this ADR replaces the implicit assumption that prefs/contexts/consents are hardcoded on `users`).
## Context
ADR-0013 introduced the multi-agent pipeline: N pre-compute agents emit
prompt snippets, an orchestrator LLM assembles them into a tip. The ADR
specified the `agent_outputs` table and the orchestrator contract, but
left several questions open:
1. **Where do user preferences live?** `users.consentGiven` is a single
boolean. There is no place for quiet hours, tone, allowed tip kinds,
or per-integration consent. Each new preference would mean another
typed column on `users` — and worse, every new agent needs its own
tunable parameters (focus areas, momentum baseline, lateness tolerance)
that are clearly per-agent state, not global user state.
2. **How are agents discovered?** The orchestrator currently iterates a
hardcoded list. Adding an agent means touching the recommender, the
admin UI, and the prefs schema in three places.
3. **How does context (work / home / vacation) interact with agents?**
Some agents should be silenced in some contexts. There is no model.
4. **How is per-user agent configuration learned?** Issues #112–#116
each want to auto-infer parameters (quiet hours, focus areas, etc.)
from history. Without a shared substrate they each reinvent storage,
recompute cadence, and cold-start fallback.
The current ADR-0013 design works for five agents. It will not work for
twenty without becoming a tangle.
## Decision
Three changes, designed to compose:
### 1. Agents are plugins with declared schemas
Every agent ships a manifest (Python, lives next to its code in
`ml/agents/<id>/manifest.py`):
```python
class AgentManifest:
    id: str                               # 'time-of-day'
    version: str                          # bump invalidates cached outputs + inferences
    pref_schema: dict                     # JSON Schema for user-tunable knobs
    context_schema: list[str]             # signals it reads, e.g. ['todoist.tasks']
    required_consents: list[str]          # ['data:todoist', 'agent:time-of-day']
    output_contract: dict                 # snippet shape (free text + optional tags)
    ttl_sec: int                          # snippet freshness for agent_outputs
    inferred_params: list[InferredParam]  # see §3
```
The manifest is the **single point of registration**. The orchestrator,
admin UI, and inference framework all read from it. Adding an agent is
adding one directory in `ml/agents/` — no edits elsewhere.
A `GET /api/agents/registry` endpoint (TS recommender → Python proxy)
exposes manifests so the admin app can auto-render configuration UI from
each `pref_schema`.
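
To make the single-point-of-registration idea concrete, here is a minimal sketch of an in-memory registry and the payload a registry endpoint might return. Only the manifest field names come from the text above; `REGISTRY`, `register`, and `registry_payload` are hypothetical names, and `output_contract` and `inferred_params` are left out for brevity.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class AgentManifest:
    # Field names follow the manifest above; defaults are assumptions for this sketch.
    id: str
    version: str
    pref_schema: dict = field(default_factory=dict)
    context_schema: list = field(default_factory=list)
    required_consents: list = field(default_factory=list)
    ttl_sec: int = 3600


# Hypothetical in-memory registry keyed by agent id.
REGISTRY: dict = {}


def register(manifest: AgentManifest) -> None:
    """Add a manifest; reject duplicate ids so lookups stay unambiguous."""
    if manifest.id in REGISTRY:
        raise ValueError(f"duplicate agent id: {manifest.id}")
    REGISTRY[manifest.id] = manifest


def registry_payload() -> list:
    """Return JSON-serializable manifests, sorted by id for stable output."""
    return [asdict(REGISTRY[agent_id]) for agent_id in sorted(REGISTRY)]


register(AgentManifest(id="time-of-day", version="1", required_consents=["data:core"]))
register(AgentManifest(id="momentum", version="1", required_consents=["data:core"]))
```

Because every consumer reads the same payload, an admin form generator would only need to walk each entry's `pref_schema`.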
### 2. Unified Profile data model
Three new tables replace the implicit "fields-on-users" pattern.
`users.consentGiven` collapses into `user_consents` (one row,
`consent_key='data:core'`); existing data migrates in a single
backfill.
```sql
-- Hybrid: typed columns where stable, KV where open-ended.
-- Stable globals stay on users (added in this ADR):
ALTER TABLE users ADD COLUMN tone TEXT;            -- 'direct'|'gentle'|'motivational'
ALTER TABLE users ADD COLUMN tip_kinds_json TEXT;  -- JSON: allowed tip kinds

-- Open-ended per-agent prefs land here:
CREATE TABLE user_preferences (
  user_id     TEXT NOT NULL REFERENCES users(id),
  scope       TEXT NOT NULL,                 -- 'orchestrator' | 'agent:<id>'
  key         TEXT NOT NULL,                 -- e.g. 'quietStart', 'focusAreas'
  value_json  TEXT NOT NULL,                 -- agent validates against its pref_schema on read
  updated_at  TEXT NOT NULL,
  source      TEXT NOT NULL DEFAULT 'user',  -- 'user' | 'inferred'
  PRIMARY KEY (user_id, scope, key)
);

CREATE TABLE user_consents (
  user_id      TEXT NOT NULL REFERENCES users(id),
  consent_key  TEXT NOT NULL,  -- 'data:todoist' | 'data:calendar' | 'agent:focus-area'
  granted_at   TEXT NOT NULL,
  revoked_at   TEXT,           -- null = currently active
  PRIMARY KEY (user_id, consent_key)
);

CREATE TABLE user_contexts (
  user_id        TEXT NOT NULL REFERENCES users(id),
  name           TEXT NOT NULL,               -- 'work' | 'home' | 'vacation' | user-named
  active         INTEGER NOT NULL DEFAULT 0,  -- boolean
  schedule_json  TEXT,                        -- optional: when this context is active
  created_at     TEXT NOT NULL,
  PRIMARY KEY (user_id, name)
);
```
Why hybrid (typed for stable globals, KV for per-agent):
- `tone` and allowed tip kinds are referenced by every recommendation —
putting them in JSON imposes a parse on every read.
- Per-agent prefs are open-ended (each agent declares its own keys) and
validated on read against the agent's `pref_schema`, so KV is correct.
`user_preferences.source = 'user' | 'inferred'` keeps explicit user
overrides distinguishable from inferred values (the inference framework
never overwrites a `source='user'` row).
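
The precedence rule can be pushed into the write itself. The sketch below assumes SQLite 3.24+ (for `ON CONFLICT ... DO UPDATE ... WHERE`) and uses an in-memory copy of `user_preferences` without the `users` foreign key; `write_pref` and `read_pref` are hypothetical helpers, not the framework's actual API.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# Same columns as user_preferences above; the users FK is dropped for this sketch.
conn.execute(
    """
    CREATE TABLE user_preferences (
        user_id TEXT NOT NULL,
        scope TEXT NOT NULL,
        key TEXT NOT NULL,
        value_json TEXT NOT NULL,
        updated_at TEXT NOT NULL,
        source TEXT NOT NULL DEFAULT 'user',
        PRIMARY KEY (user_id, scope, key)
    )
    """
)


def write_pref(user_id, scope, key, value, source, now):
    """Upsert a preference; an inferred write never replaces a source='user' row."""
    conn.execute(
        """
        INSERT INTO user_preferences (user_id, scope, key, value_json, updated_at, source)
        VALUES (?, ?, ?, ?, ?, ?)
        ON CONFLICT (user_id, scope, key) DO UPDATE SET
            value_json = excluded.value_json,
            updated_at = excluded.updated_at,
            source = excluded.source
        WHERE excluded.source = 'user' OR user_preferences.source = 'inferred'
        """,
        (user_id, scope, key, json.dumps(value), now, source),
    )


def read_pref(user_id, scope, key):
    """Return (value, source) for a preference, or None if it is not set."""
    row = conn.execute(
        "SELECT value_json, source FROM user_preferences"
        " WHERE user_id = ? AND scope = ? AND key = ?",
        (user_id, scope, key),
    ).fetchone()
    return (json.loads(row[0]), row[1]) if row else None
```

Guarding in the `WHERE` clause keeps the rule atomic: a scheduler recompute that races with a user edit cannot clobber the explicit value.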
`user_contexts` ships in this ADR with **manual toggle only**.
Auto-inference per agent type is tracked in #112–#116; cross-agent
calendar/geo inference is out of scope.
### 3. Shared context-inference framework
Each `InferredParam` in a manifest declares:
```python
@dataclass
class InferredParam:
    key: str                             # 'quietStart'
    ttl_sec: int                         # how often to recompute
    cold_start_default: Any              # value used until enough history exists
    min_history: int                     # event count threshold
    infer: Callable[[UserHistory], Any]  # pure function
```
The framework (`ml/agents/inference/`) owns:
- Scheduling (recomputes per-param via the existing pre-compute scheduler).
- Reading history from `tip_views` / `tip_feedback` / `agent_outputs`.
- Writing results to `user_preferences` with `source='inferred'`.
- Cold-start: returns `cold_start_default` until `min_history` is met.
- Versioning: bumping `agent.version` invalidates inferred rows for that agent.
- Observability: structured log per recompute (window size, output diff, latency).
Each per-agent issue (#112–#116) implements only its `infer()` functions;
everything else is the framework.
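
The cold-start rule is small enough to sketch. Here `UserHistory` is simplified to a plain list of event dicts, and `resolve` and `typical_dismiss_hour` are hypothetical names; the real `infer()` functions belong to the per-agent issues.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class InferredParam:
    key: str
    ttl_sec: int
    cold_start_default: Any
    min_history: int
    infer: Callable[[list], Any]  # history is a plain list of events in this sketch


def resolve(param: InferredParam, history: list) -> Any:
    """Return the cold-start default until min_history events exist, then infer."""
    if len(history) < param.min_history:
        return param.cold_start_default
    return param.infer(history)


def typical_dismiss_hour(history: list) -> int:
    """Toy inference: the hour at which the user most often dismisses tips."""
    hours = [e["hour"] for e in history if e["action"] == "dismiss"]
    if not hours:
        return 22  # same value as the cold-start default used below
    return Counter(hours).most_common(1)[0][0]


quiet_start = InferredParam(
    key="quietStart",
    ttl_sec=86_400,
    cold_start_default=22,
    min_history=3,
    infer=typical_dismiss_hour,
)
```

Keeping `infer` pure means the framework can rerun it on any history window without side effects, which is what makes version-bump invalidation safe.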
## Read-through API
Stays small as N grows because every endpoint is registry-driven:
```
GET   /api/profile               → { user, prefs (grouped by scope), contexts, consents, agents[] }
PATCH /api/profile/prefs/:scope  → upserts user_preferences rows (source='user')
PATCH /api/profile/consents      → grant / revoke
PATCH /api/profile/contexts      → activate / deactivate / create
GET   /api/agents/registry       → manifests; admin UI auto-renders forms from pref_schema
```
`GET /api/profile` is the read-through used by `ml/serving` and the web
client; it's the single endpoint each consumer calls instead of reading
the DB directly.
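
As an illustration of "prefs grouped by scope", the sketch below folds `user_preferences` rows into the nested shape a client would read, for example `prefs["orchestrator"]["science_destiny"]`. The `group_prefs` helper and the row tuple layout are assumptions, not the API's actual implementation.

```python
import json
from collections import defaultdict


def group_prefs(rows):
    """Turn (scope, key, value_json) rows into {scope: {key: value}}."""
    grouped = defaultdict(dict)
    for scope, key, value_json in rows:
        grouped[scope][key] = json.loads(value_json)
    return dict(grouped)


rows = [
    ("orchestrator", "science_destiny", "70"),
    ("agent:time-of-day", "quietStart", '"22:00"'),
    ("agent:time-of-day", "enabled", "true"),
]
profile_prefs = group_prefs(rows)
```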
## Orchestrator flow under this ADR
```
1. Load Profile = { user, prefs, active context, consents } via /api/profile.
2. From agent registry, filter eligible agents:
   - required consents granted
   - not silenced by active context (declared per-agent)
   - enabled in user_preferences (default: enabled)
3. Pull latest non-expired agent_outputs for the eligible set.
4. Build orchestrator prompt:
   - global prefs (tone, allowed tip kinds)
   - active context name as hint
   - agent snippets in eligibility order
5. LLM → tip.
```
No hardcoded agent list anywhere in the recommender. The orchestrator
prompt template (`v4-orchestrator`) iterates whatever it was handed.
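
Steps 3 and 4 of the flow can be sketched as two small functions. The `agent_outputs` schema is not shown in this ADR, so the field names `agent_id`, `snippet`, `created_at`, and `ttl_sec` (with epoch-second timestamps) are assumptions, as is the plain-text prompt layout.

```python
def latest_fresh_outputs(outputs, eligible_ids, now):
    """Pick the newest non-expired snippet per eligible agent, in eligibility order."""
    best = {}
    for out in outputs:
        agent_id = out["agent_id"]
        if agent_id not in eligible_ids:
            continue
        if out["created_at"] + out["ttl_sec"] <= now:
            continue  # expired snippet
        if agent_id not in best or out["created_at"] > best[agent_id]["created_at"]:
            best[agent_id] = out
    return [best[a] for a in eligible_ids if a in best]


def build_prompt(tone, tip_kinds, context_name, snippets):
    """Assemble global prefs, the context hint, and agent snippets into one prompt."""
    lines = [f"Tone: {tone}", f"Allowed tip kinds: {', '.join(tip_kinds)}"]
    if context_name:
        lines.append(f"Active context: {context_name}")
    lines += [f"[{s['agent_id']}] {s['snippet']}" for s in snippets]
    return "\n".join(lines)
```

Neither function names a specific agent, which is the property the ADR is after: the prompt iterates whatever the eligibility filter handed it.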
## Migration plan
One PR per step; each independently deployable.
1. **Schema** — add the three tables; add `tone` and `tip_kinds_json` to `users`.
2. **Backfill** — write `users.consentGiven` rows into `user_consents` as `data:core`. Keep the column for one release, then drop.
3. **Manifest plumbing**: `ml/agents/<id>/manifest.py` for the existing five; `GET /api/agents/registry` proxy.
4. **Read-through API**: `/api/profile` + sub-endpoints.
5. **Orchestrator cutover** — registry-driven eligibility filter.
6. **Inference framework** (#111) — land it; migrate `time-of-day` (#112) as the proof.
7. **Per-agent inference**: #113–#116 land independently against the framework.
8. **Drop `users.consentGiven`** after one release.
## Consequences
### Positive
- Adding an agent = one directory. Admin UI, prefs storage, consent
storage, and inference all auto-pick-up.
- Per-agent state lives next to the agent code; nothing global to edit.
- User-controlled prefs and inferred prefs use the same storage but stay
distinguishable (`source` column).
- Consent revocation is row-level and time-stamped; aligns with the
privacy stance in CLAUDE.md ("privacy is a feature, not a phase").
- Sets up cleanly for #27 (Calendar) and #28 (Health) — they register
their own consent keys without schema changes.
### Negative / risks
- **JSON validation on read** catches invalid per-agent prefs later than
typed columns would. Mitigated by validating in the manifest's load
function and failing closed (use the cold-start default if invalid).
- **Multi-source reads** for the orchestrator (registry + profile + outputs)
add latency. Cached profile read keeps it sub-ms in practice.
- **Migration window** during which `users.consentGiven` and
`user_consents` both exist. Reads must consult both for one release;
writes go to `user_consents` only.
- **Auto-inference can mislead.** A wrong-but-confident inferred quiet
window silences the user when they want pings. Mitigation: every
inferred param is overrideable in admin/settings (`source='user'`
takes precedence), and inferences only kick in past their
`min_history` threshold.
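
For the migration window, a dual read might look like the sketch below. The ADR only says that reads consult both sources and that writes go to `user_consents`; letting a `user_consents` row win over the legacy boolean is an assumption that follows from that, and `has_core_consent` is a hypothetical helper.

```python
def has_core_consent(legacy_consent_given, consent_rows):
    """Resolve data:core during the window when both sources of truth exist.

    consent_rows: dicts with 'consent_key' and 'revoked_at' from user_consents.
    """
    for row in consent_rows:
        if row["consent_key"] == "data:core":
            # New writes land here only, so this row (including a revocation)
            # takes precedence over the legacy users.consentGiven flag.
            return row["revoked_at"] is None
    # No row yet: the user predates the backfill, so fall back to the old column.
    return bool(legacy_consent_given)
```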
## What this does NOT change
- ADR-0013's agent set, snippet contract, or `agent_outputs` table.
- ADR-0011's `userProfileFeatures` (ML-derived features, not user prefs).
- ADR-0008's LiteLLM gateway pattern.
- The orchestrator prompt template name (`v4-orchestrator`); the assembly
rule changes, the contract does not.

@@ -0,0 +1,44 @@
# ADR-0015 — Data-source consents only; drop per-agent consent gate
**Date:** 2026-05-11
**Status:** Accepted
**Supersedes:** ADR-0014 §3 (consent model)
## Context
ADR-0014 introduced `required_consents` on agent manifests. In practice two
unrelated concepts were mixed into that field:
- `data:<source>` — which data source the agent reads.
- `agent:<id>` — whether the user opted into this specific agent.
No UI ever granted `agent:<id>` consents, so the eligibility filter at
`services/api/src/profile/eligibility.ts` dropped every agent for every real
user. The symptom was confirmed by MLflow trace
`tr-591449ea8a72af8e81b6a585234a86ab`: user `ODGp4Gkr7JWemMsqcMLMn` had five
fresh `agent_outputs` rows but the orchestrator received `agent_ids: []`.
## Decision
Collapse to a single consent dimension: **data source**.
1. `required_consents` entries must all start with `data:`. Agent manifests no
longer list `agent:<id>` entries.
2. Connecting a data source via the OAuth flow automatically grants
`data:<provider>` in `user_consents`. Disconnecting sets `revoked_at`.
3. `data:core` continues to be auto-granted on signup.
4. Per-agent control becomes a **preference** (`user_preferences[scope='agent:<id>', key='enabled']`), not a consent. The eligibility filter already honours this — the only change is removing the `agent:*` consent check that was always failing.
5. Eligibility rule (final): an agent is eligible iff every `data:*` it
declares is granted and not revoked, no active context is in
`silenced_in_contexts`, and the `enabled` preference is not `false`.
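
The final eligibility rule in item 5 can be written as one predicate. The sketch below is illustrative only: the real filter lives in `services/api/src/profile/eligibility.ts` (TypeScript), and `Manifest`, `is_eligible`, and the `consents` mapping shape are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Manifest:
    """Hypothetical, trimmed-down stand-in for an agent manifest."""
    id: str
    required_consents: list[str]
    silenced_in_contexts: list[str] = field(default_factory=list)


def is_eligible(manifest: Manifest,
                consents: dict[str, Optional[str]],
                active_contexts: set[str],
                enabled_pref: Optional[bool]) -> bool:
    """consents maps consent_key -> revoked_at (None means currently active)."""
    # Every declared data:* consent must be granted and not revoked.
    for key in manifest.required_consents:
        if key not in consents or consents[key] is not None:
            return False
    # No active context may silence the agent.
    if any(ctx in manifest.silenced_in_contexts for ctx in active_contexts):
        return False
    # A missing 'enabled' preference counts as enabled; only an explicit False disables.
    return enabled_pref is not False
```

Treating a missing `enabled` preference as "on" matches the wording "is not `false`": agents are opt-out, not opt-in.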
## Consequences
- Agents that only require `data:core` (time-of-day, momentum, recent-patterns)
become eligible immediately after signup.
- Agents requiring `data:todoist` or `data:google-health` become eligible as
soon as the user connects the integration — no extra consent step.
- A backfill migration grants `data:<provider>` for every existing active
`integration_tokens` row, unblocking users who connected before this change.
- `ml/agents/tests/test_manifest.py` asserts all `required_consents` start
with `data:`, preventing regression.
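
A regression test of the kind described in the last bullet might look like the sketch below. The contents of `ml/agents/tests/test_manifest.py` are not shown in this diff, so `EXAMPLE_MANIFESTS` and `offending_consents` are hypothetical. The two consent lists are copied from the `focus-area` and `health-vitals` manifests.

```python
# Hypothetical manifest data; a real test would iterate the agent registry instead.
EXAMPLE_MANIFESTS = {
    "focus-area": ["data:core", "data:todoist"],
    "health-vitals": ["data:core", "data:google-health"],
}


def offending_consents(manifests: dict[str, list[str]]) -> dict[str, list[str]]:
    """Return, per agent, any consent keys that do not start with 'data:'."""
    bad: dict[str, list[str]] = {}
    for agent_id, keys in manifests.items():
        wrong = [k for k in keys if not k.startswith("data:")]
        if wrong:
            bad[agent_id] = wrong
    return bad


def test_required_consents_are_data_only():
    # Fails with the offending agent ids in the assertion diff.
    assert offending_consents(EXAMPLE_MANIFESTS) == {}
```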


@@ -25,12 +25,37 @@ Session auth
expires_at expires_at
revoked_at? revoked_at?
Profile profile User (extended) profile ADR-0014
user_id (pk) + tone 'direct' | 'gentle' | 'motivational'
timezone + tip_kinds_json jsonb: allowed tip kinds (stable globals)
quiet_hours jsonb: [{start,end,days}]
contexts jsonb: [{name,predicate}] introduced in Phase 2 UserPreference profile ADR-0014
consents jsonb: {integration: {read,write,retain_days}} user_id, scope, key (pk)
scope 'orchestrator' | 'agent:<id>'
value_json open-ended; agent validates against its pref_schema on read
source 'user' | 'inferred' (inferred never overwrites user)
updated_at
UserConsent profile ADR-0014
user_id, consent_key (pk)
consent_key 'data:todoist' | 'data:calendar' | 'agent:focus-area' | ...
granted_at
revoked_at? null = currently active
UserContext profile ADR-0014
user_id, name (pk) 'work' | 'home' | 'vacation' | user-named
active manual toggle in M2; auto-inference per agent in #112-#116
schedule_json? optional: when this context is active
created_at
AgentOutput recommender ADR-0013
id (pk)
user_id
agent_id e.g. 'overdue-task' (matches a manifest)
prompt_text snippet for the orchestrator prompt
signals_snapshot jsonb: inputs the agent consumed
computed_at, expires_at computed_at + manifest.ttl_sec
agent_version bump to invalidate cached outputs on logic changes
Credential integrations Credential integrations
user_id user_id
@@ -53,10 +78,10 @@ Event events
TipInstance recommender TipInstance recommender
tip_id (ulid) tip_id (ulid)
user_id user_id
policy_name "random" | "bandit.linucb" | "remote:v3" policy_name "v4-orchestrator" (ADR-0013) | legacy bandit names retained for history
policy_version policy_version
candidate_source "todoist" | "advice.library" | ... candidate_source "todoist" | "advice.library" | "agent-orchestrator" | ...
context_snapshot jsonb: features seen at decision time context_snapshot jsonb: features + agent snippets seen at decision time
tip jsonb: {kind,title,body,source,deep_link,meta} tip jsonb: {kind,title,body,source,deep_link,meta}
created_at created_at
shown_at? set when the client reports render shown_at? set when the client reports render


@@ -48,6 +48,8 @@ User reactions (done / snooze / dismiss) are events too. They close the loop as
- **Feast** for feature store when we get there; homegrown adapter until then (Phase 1 seam). - **Feast** for feature store when we get there; homegrown adapter until then (Phase 1 seam).
- **MLflow** for model registry and experiment tracking; deployed at `o.alogins.net/mlflow`. - **MLflow** for model registry and experiment tracking; deployed at `o.alogins.net/mlflow`.
- **Auth.js** embedded behind an OIDC-shaped boundary (ADR-0004). Swap to a standalone OIDC provider when mobile ships. - **Auth.js** embedded behind an OIDC-shaped boundary (ADR-0004). Swap to a standalone OIDC provider when mobile ships.
- **Multi-agent recommendation** (ADR-0013) — pre-compute agents emit prompt snippets, an orchestrator LLM produces the tip. Replaced the ε-greedy bandit (ADR-0007/0012) for explainability, cold-start, and decoupling generation from selection.
- **Registry-driven agents + unified Profile** (ADR-0014) — agents are plugins with declared manifests; per-user prefs, contexts, and per-key consents live in shared tables; auto-inferred parameters share a common framework. Adding an agent is a manifest change.
- **k3s** as the first step beyond docker-compose — no "compose → full k8s" cliff. - **k3s** as the first step beyond docker-compose — no "compose → full k8s" cliff.
## AI stack ## AI stack
@@ -59,30 +61,43 @@ All LLM inference routes through **LiteLLM** (`llm.alogins.net`) backed by **Oll
**OpenWebUI** (`ai.alogins.net`) is the human-facing interface for prompt iteration and model testing during development. **OpenWebUI** (`ai.alogins.net`) is the human-facing interface for prompt iteration and model testing during development.
## Decision flow for a new tip (Phase 2 target) ## Decision flow for a new tip (M2, ADR-0013 + ADR-0014)
``` ```
┌────────────────────────────────────────────────┐
│ Pre-compute (every 15 min, per registered agent) │
│ ml/agents/<id> → prompt snippet → agent_outputs │
│ TTL per manifest; agent_version invalidates │
└────────────────────────────────────────────────┘
client ─► gateway ─► recommender (TS) client ─► gateway ─► recommender (TS)
├─► profile: GET /api/profile
│ (user, prefs, active context, consents)
├─► registry: GET /api/agents/registry
│ (manifests; eligibility filter inputs)
├─► outputs: pull freshest non-expired agent_outputs
│ for eligible agents (consents granted,
│ not silenced by active context, enabled)
ml/serving (Python) ml/serving (Python)
├─► context: ml/features/context.py ├─► assemble: v4-orchestrator prompt
(tasks + reactions + time patterns → prompt) = global prefs + active context + snippets
├─► generate: LiteLLM → Ollama ├─► generate: LiteLLM → Ollama → one tip
│ → N TipCandidates {content, kind, model, prompt_version}
─► score: bandit policy scores each candidate ─► persist: tip_scores {tip, contributing agents,
prompt_version, llm_model, latency}
├─► shadows: shadow policies log picks without serving ◄─ tip
└─► persist: tip_scores {candidate, policy, features, latency}
◄─ best TipCandidate
``` ```
**Phase 1 (shipped M1):** candidates come from Todoist task list, no LLM. The bandit scores tasks directly. **Evolution:**
- **Phase 1 (M1):** candidates from Todoist; ε-greedy bandit scored tasks directly (ADR-0007, ADR-0012). Superseded.
- **Phase 2 early (M2):** LLM-generated candidates ranked by bandit. Superseded mid-milestone.
- **Phase 2 current (M2):** multi-agent pipeline (ADR-0013), registry-driven and registry-extensible (ADR-0014). No bandit; the orchestrator LLM reasons over named agent snippets.
**Phase 2 (shipped M2):** LLM candidates are generated in parallel with Todoist fetch. Both pools are merged, scored by the bandit, and the winner served. `tip_scores` tracks `prompt_version`, `llm_model`, and `tip_kind` for every row. Feedback: `POST /feedback → events.emit(reaction)`. No online ML reward loop (ADR-0013 §Consequences); reactions are logged in `tip_feedback` for observability and potential future supervised learning.
Feedback: `POST /feedback → events.emit(reaction)` → online bandit update + `prompt_version` tracked for A/B analysis.


@@ -26,7 +26,7 @@ User taps "Delete account" in settings → hard confirm → `User.deleted_at` se
## Scope boundaries ## Scope boundaries
Each integration declares the scopes it requests and the features it derives. The `Profile.consents` column is the source of truth; a scope removed from consent short-circuits derived-feature computation at the feature store. Each integration and each agent declares the consent keys it requires (`data:todoist`, `agent:focus-area`, ...) in its manifest. The `user_consents` table is the source of truth (per-key rows, revocation is a `revoked_at` write — never a delete, so audits stay clean). A revoked consent short-circuits derived-feature computation at the feature store and removes the dependent agent from the orchestrator's eligible set on the next tip. See ADR-0014.
## Audit ## Audit


@@ -1,7 +1,8 @@
# syntax=docker/dockerfile:1.7 # syntax=docker/dockerfile:1.7
FROM node:22-slim AS base FROM node:22-slim AS base
RUN apt-get update && apt-get install -y --no-install-recommends ca-certificates \ RUN apt-get update && apt-get install -y --no-install-recommends \
python3 make g++ ca-certificates \
&& rm -rf /var/lib/apt/lists/* \ && rm -rf /var/lib/apt/lists/* \
&& npm install -g pnpm && npm install -g pnpm
ENV CI=true \ ENV CI=true \


@@ -16,7 +16,7 @@ COPY pnpm-lock.yaml ./
RUN --mount=type=cache,id=pnpm,target=/pnpm/store pnpm fetch RUN --mount=type=cache,id=pnpm,target=/pnpm/store pnpm fetch
COPY . . COPY . .
RUN --mount=type=cache,id=pnpm,target=/pnpm/store \ RUN --mount=type=cache,id=pnpm,target=/pnpm/store \
pnpm install --frozen-lockfile --offline \ pnpm install --frozen-lockfile \
--filter @oo/api... --filter @oo/shared-types --filter @oo/api... --filter @oo/shared-types
RUN pnpm --filter @oo/shared-types build RUN pnpm --filter @oo/shared-types build
RUN pnpm --filter @oo/api build RUN pnpm --filter @oo/api build


@@ -1,6 +1,11 @@
FROM python:3.12-slim FROM python:3.12-slim
WORKDIR /app WORKDIR /app/ml/serving
RUN apt-get update \
&& apt-get install -y --no-install-recommends build-essential \
&& rm -rf /var/lib/apt/lists/*
COPY ml/serving/requirements.txt . COPY ml/serving/requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt RUN pip install --no-cache-dir -r requirements.txt
COPY ml/serving/*.py . COPY ml/ /app/ml/
# PYTHONPATH=/app lets 'import ml.agents.*' resolve from /app/ml/agents/
ENV PYTHONPATH=/app
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"] CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]


@@ -13,6 +13,7 @@ WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules COPY --from=deps /app/node_modules ./node_modules
COPY --from=deps /app/packages/shared-types/node_modules ./packages/shared-types/node_modules COPY --from=deps /app/packages/shared-types/node_modules ./packages/shared-types/node_modules
COPY --from=deps /app/apps/web/node_modules ./apps/web/node_modules COPY --from=deps /app/apps/web/node_modules ./apps/web/node_modules
COPY package.json pnpm-workspace.yaml pnpm-lock.yaml ./
COPY tsconfig.base.json ./ COPY tsconfig.base.json ./
COPY packages/shared-types ./packages/shared-types COPY packages/shared-types ./packages/shared-types
COPY apps/web ./apps/web COPY apps/web ./apps/web


@@ -71,6 +71,7 @@ services:
environment: environment:
LITELLM_URL: ${LITELLM_URL:-http://host.docker.internal:4000} LITELLM_URL: ${LITELLM_URL:-http://host.docker.internal:4000}
OLLAMA_URL: ${OLLAMA_URL:-http://host.docker.internal:11434} OLLAMA_URL: ${OLLAMA_URL:-http://host.docker.internal:11434}
MLFLOW_TRACKING_URI: ${MLFLOW_TRACKING_URI:-http://mlflow:5000}
extra_hosts: extra_hosts:
- "host.docker.internal:host-gateway" - "host.docker.internal:host-gateway"
ports: ports:
@@ -81,6 +82,46 @@ services:
timeout: 5s timeout: 5s
retries: 5 retries: 5
# ── ai profile — Ollama + LiteLLM for local dev ──────────────────────────
# Start: docker compose --profile ai up
# Use when the Agap shared Ollama/LiteLLM services are not available locally.
# Set LITELLM_URL=http://localhost:4000 and OLLAMA_URL=http://localhost:11434
# in .env.local to point ml-serving at these containers instead of Agap.
ollama:
image: ollama/ollama:latest
profiles: [ai]
volumes:
- ollama-models:/root/.ollama
ports:
- "127.0.0.1:11434:11434"
healthcheck:
test: ["CMD", "curl", "-sf", "http://localhost:11434/api/tags"]
interval: 15s
timeout: 5s
retries: 10
litellm:
image: ghcr.io/berriai/litellm:main-latest
profiles: [ai]
environment:
LITELLM_MASTER_KEY: ${LITELLM_MASTER_KEY:-sk-local-dev}
command: >
--model ollama/qwen2.5:1.5b
--model ollama/nomic-embed-text
--api_base http://ollama:11434
--port 4000
ports:
- "127.0.0.1:4000:4000"
depends_on:
ollama:
condition: service_healthy
healthcheck:
test: ["CMD", "curl", "-sf", "http://localhost:4000/health"]
interval: 10s
timeout: 5s
retries: 5
# ── mlops profile — MLflow ──────────────────────────────────────────────── # ── mlops profile — MLflow ────────────────────────────────────────────────
# Start: docker compose --profile mlops up # Start: docker compose --profile mlops up
# MLflow UI: http://localhost:5000 or https://o.alogins.net/mlflow # MLflow UI: http://localhost:5000 or https://o.alogins.net/mlflow
@@ -111,11 +152,13 @@ services:
command: > command: >
mlflow server mlflow server
--backend-store-uri sqlite:////mlflow/mlflow.db --backend-store-uri sqlite:////mlflow/mlflow.db
--default-artifact-root /mlflow/artifacts --artifacts-destination /mlflow/artifacts
--serve-artifacts
--default-artifact-root mlflow-artifacts:/
--host 0.0.0.0 --host 0.0.0.0
--port 5000 --port 5000
--static-prefix /mlflow --static-prefix /mlflow
--allowed-hosts o.alogins.net,localhost --allowed-hosts o.alogins.net,localhost,localhost:5000,mlflow,mlflow:5000
--cors-allowed-origins https://o.alogins.net --cors-allowed-origins https://o.alogins.net
volumes: volumes:
- /mnt/ssd/dbs/oo/mlflow:/mlflow - /mnt/ssd/dbs/oo/mlflow:/mlflow
@@ -126,3 +169,6 @@ services:
interval: 10s interval: 10s
timeout: 5s timeout: 5s
retries: 5 retries: 5
volumes:
ollama-models:

0
ml/__init__.py Normal file

4
ml/agents/__init__.py Normal file

@@ -0,0 +1,4 @@
from .base import BaseAgent, AgentInput, AgentOutput
from .registry import get_agent, all_agents
__all__ = ["BaseAgent", "AgentInput", "AgentOutput", "get_agent", "all_agents"]

61
ml/agents/base.py Normal file

@@ -0,0 +1,61 @@
"""Base class and shared data structures for all recommendation sub-agents."""
from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import ClassVar


@dataclass
class AgentInput:
    """Everything an agent may need to produce its prompt snippet."""

    user_id: str
    tasks: list[dict]  # task signal dicts (content, priority, is_overdue, …)
    profile: dict[str, float | None]  # profile feature values keyed by feature name
    feedback_history: list[dict] = field(default_factory=list)  # [{action, dwell_ms, created_at}, …]
    now: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    # Per-agent inferred/user prefs loaded from user_preferences (ADR-0014 §3).
    # Keys match the agent's pref_schema + inferred_params. 'user' source takes
    # precedence over 'inferred' source; the caller resolves priority before
    # passing this dict in.
    agent_prefs: dict = field(default_factory=dict)
    # Pre-fetched enrichment cache: {content_hash -> description}. Populated by
    # the TS caller from the task_enrichments DB table to avoid redundant LLM calls.
    enrichment_cache: dict = field(default_factory=dict)


@dataclass
class AgentOutput:
    """Result produced by an agent; persisted to agent_outputs table."""

    user_id: str
    agent_id: str
    prompt_text: str  # snippet passed to the orchestrator
    signals_snapshot: dict  # inputs consumed (for explainability / debugging)
    computed_at: str  # ISO 8601
    expires_at: str  # ISO 8601
    agent_version: str


class BaseAgent(ABC):
    agent_id: ClassVar[str]
    ttl_seconds: ClassVar[int]
    version: ClassVar[str]

    @abstractmethod
    def compute(self, inp: AgentInput) -> AgentOutput:
        """Analyse inp and return a prompt snippet describing what was found."""
        ...

    def _make_output(self, inp: AgentInput, prompt_text: str, snapshot: dict) -> AgentOutput:
        computed_at = inp.now.astimezone(timezone.utc).isoformat()
        expires_at = (inp.now.astimezone(timezone.utc) + timedelta(seconds=self.ttl_seconds)).isoformat()
        return AgentOutput(
            user_id=inp.user_id,
            agent_id=self.agent_id,
            prompt_text=prompt_text,
            signals_snapshot=snapshot,
            computed_at=computed_at,
            expires_at=expires_at,
            agent_version=self.version,
        )
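
The timestamp arithmetic in `_make_output` is easy to check in isolation. The sketch below mirrors it in a standalone helper. `expiry_window` is a hypothetical name used only for this illustration, and the sample time and 30-minute TTL are made-up inputs.

```python
from datetime import datetime, timedelta, timezone


def expiry_window(now: datetime, ttl_seconds: int) -> tuple[str, str]:
    """Normalise `now` to UTC, then return (computed_at, expires_at) as ISO 8601."""
    computed = now.astimezone(timezone.utc)
    expires = computed + timedelta(seconds=ttl_seconds)
    return computed.isoformat(), expires.isoformat()


# 07:30 at UTC+2 is 05:30 UTC; a 1800 s TTL expires the output at 06:00 UTC.
local_now = datetime(2026, 5, 15, 7, 30, tzinfo=timezone(timedelta(hours=2)))
computed_at, expires_at = expiry_window(local_now, 1800)
```

One caveat worth knowing: `astimezone` treats a naive `datetime` as local time, so callers that override `now` should pass a timezone-aware value.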

290
ml/agents/clustering.py Normal file

@@ -0,0 +1,290 @@
"""Semantic task clustering via nomic-embed-text (issue #97, #129).

Public API:
    cluster_tasks(tasks, enrichment_cache=None) -> tuple[list[Cluster], dict[str, str]]

The second element holds the LLM descriptions generated during this call, keyed
by content hash, for the caller to persist.

Each task dict should have a "content" key. Tasks without content are placed in a
fallback "other" bucket. If the embedding service is unreachable, falls back to
grouping by project_id so compute() always returns something useful.

Pipeline (ported from taskpile experiments/clustering_eval, prompt v1):
    1. Expand each raw title via LiteLLM `tip-generator` (qwen2.5:1.5b) into a
       3-sentence description. Cached in-memory by content hash within a compute
       cycle so duplicate titles cost one LLM call.
    2. Prefix the expanded text with "clustering: " (nomic-embed-text task prefix).
    3. Batch-embed via LiteLLM `embedder` (nomic-embed-text).

Falls back to embedding raw titles when LLM expansion fails, and to
project-based grouping when embeddings are unavailable.
"""
from __future__ import annotations

import hashlib
import logging
import math
import os
from dataclasses import dataclass, field

import httpx

log = logging.getLogger(__name__)

# Cosine similarity threshold for merging tasks into the same cluster.
_SIM_THRESHOLD = 0.72
# Cap on embedding-based clusters regardless of task count. cluster_tasks() may
# still append one extra "Other tasks" bucket for tasks without content.
_MAX_CLUSTERS = 6
_EMBED_TIMEOUT = 15.0
_ENRICH_TIMEOUT = 30.0

_ENRICH_PROMPT_V1 = (
    "You are helping categorize a personal task. "
    "Write exactly 3 sentences in English describing what the task likely involves, "
    "what context or skills it needs, and why it might matter. "
    "Be concise and specific. Do not use bullet points or numbering.\n"
    "Task: {title}\n"
    "Description:"
)


@dataclass
class Cluster:
    label: str  # display label: first task's content (max 60 chars) or a fallback name
    tasks: list[dict] = field(default_factory=list)

    @property
    def task_count(self) -> int:
        return len(self.tasks)

    @property
    def overdue_count(self) -> int:
        return sum(1 for t in self.tasks if t.get("is_overdue"))
# ---------------------------------------------------------------------------
# LLM enrichment
# ---------------------------------------------------------------------------


def _content_hash(text: str) -> str:
    return hashlib.md5(text.encode()).hexdigest()


def _enrich_title(title: str, litellm_url: str) -> str | None:
    """Expand a terse task title into a 3-sentence description via LiteLLM."""
    try:
        with httpx.Client(trust_env=False, timeout=_ENRICH_TIMEOUT) as c:
            r = c.post(
                f"{litellm_url}/chat/completions",
                json={
                    "model": "tip-generator",
                    "messages": [{"role": "user", "content": _ENRICH_PROMPT_V1.format(title=title)}],
                    "max_tokens": 120,
                    "temperature": 0.3,
                },
            )
            r.raise_for_status()
            return r.json()["choices"][0]["message"]["content"].strip()
    except Exception as exc:
        log.debug("enrich_failed title=%r error=%s", title[:40], exc)
        return None


def _enrich_batch(
    titles: list[str],
    persistent_cache: dict[str, str] | None = None,
) -> tuple[list[str], dict[str, str]]:
    """Return (descriptions, new_entries) for each title.

    Checks persistent_cache (pre-fetched from DB) first, then falls back to
    calling LiteLLM. new_entries contains only hashes generated this call —
    the caller should persist these to the DB.
    """
    litellm_url = os.getenv("LITELLM_URL")
    if not litellm_url:
        log.debug("enrich_batch: no LITELLM_URL, skipping enrichment")
        return titles, {}

    db_cache = persistent_cache or {}
    session_cache: dict[str, str] = {}  # dedup within this call
    new_entries: dict[str, str] = {}
    results = []
    for title in titles:
        h = _content_hash(title)
        if h in db_cache:
            results.append(db_cache[h])
        elif h in session_cache:
            results.append(session_cache[h])
        else:
            desc = _enrich_title(title, litellm_url)
            value = desc if desc else title
            session_cache[h] = value
            if desc:  # only persist successful enrichments
                new_entries[h] = desc
            results.append(value)
    return results, new_entries
# ---------------------------------------------------------------------------
# Embedding
# ---------------------------------------------------------------------------


def _embed_via_litellm(texts: list[str], litellm_url: str) -> list[list[float]] | None:
    """Batch embed via LiteLLM OpenAI-compatible /embeddings endpoint."""
    try:
        with httpx.Client(trust_env=False, timeout=_EMBED_TIMEOUT) as c:
            r = c.post(
                f"{litellm_url}/embeddings",
                json={"model": "embedder", "input": texts},
            )
            r.raise_for_status()
            data = r.json().get("data", [])
            ordered = sorted(data, key=lambda x: x["index"])
            return [item["embedding"] for item in ordered]
    except Exception as exc:
        log.debug("litellm_embed_failed error=%s", exc)
        return None


def _embed_via_ollama(texts: list[str], ollama_url: str) -> list[list[float]] | None:
    """Embed texts one request at a time via the Ollama /api/embed endpoint."""
    try:
        results = []
        with httpx.Client(trust_env=False, timeout=_EMBED_TIMEOUT) as c:
            for text in texts:
                r = c.post(
                    f"{ollama_url}/api/embed",
                    json={"model": "nomic-embed-text", "input": text},
                )
                r.raise_for_status()
                body = r.json()
                # /api/embed returns {"embeddings": [[...]]}
                embeddings = body.get("embeddings")
                if not embeddings:
                    return None
                results.append(embeddings[0])
        return results
    except Exception as exc:
        log.debug("ollama_embed_failed error=%s", exc)
        return None


def _embed_batch(texts: list[str]) -> list[list[float]] | None:
    """Embed a list of texts, preferring LiteLLM over direct Ollama."""
    litellm_url = os.getenv("LITELLM_URL")
    if litellm_url:
        vecs = _embed_via_litellm(texts, litellm_url)
        if vecs is not None:
            return vecs
        log.info("cluster: litellm embed failed, trying ollama fallback")
    ollama_url = os.getenv("OLLAMA_URL", "http://host.docker.internal:11434")
    return _embed_via_ollama(texts, ollama_url)
# ---------------------------------------------------------------------------
# Clustering
# ---------------------------------------------------------------------------


def _cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    if na == 0 or nb == 0:
        return 0.0
    return dot / (na * nb)


def _greedy_cluster(items: list[tuple[dict, list[float]]]) -> list[Cluster]:
    """Single-pass greedy clustering: each item joins the most similar existing
    cluster whose centroid similarity reaches _SIM_THRESHOLD, else starts a new
    one. Once _MAX_CLUSTERS exist, unmatched items join the closest cluster."""
    clusters: list[tuple[list[float], Cluster]] = []  # (centroid, cluster)
    for task, vec in items:
        best_idx = -1
        best_sim = _SIM_THRESHOLD - 1e-9
        for i, (centroid, _) in enumerate(clusters):
            sim = _cosine(centroid, vec)
            if sim > best_sim:
                best_sim = sim
                best_idx = i
        # A match above the threshold joins its cluster even when the cluster
        # cap has been reached, so the centroid keeps tracking its members.
        if best_idx >= 0:
            centroid, cluster = clusters[best_idx]
            cluster.tasks.append(task)
            # Update centroid as running mean.
            n = len(cluster.tasks)
            new_centroid = [(c * (n - 1) + v) / n for c, v in zip(centroid, vec)]
            clusters[best_idx] = (new_centroid, cluster)
        elif len(clusters) < _MAX_CLUSTERS:
            label = task.get("content", "Tasks")[:60]
            cluster = Cluster(label=label, tasks=[task])
            clusters.append((vec, cluster))
        else:
            # Overflow: append to closest cluster even below threshold.
            best_i = max(range(len(clusters)), key=lambda i: _cosine(clusters[i][0], vec))
            clusters[best_i][1].tasks.append(task)
    return [c for _, c in clusters]


def _fallback_by_project(tasks: list[dict]) -> list[Cluster]:
    """Group by project_id when embeddings are unavailable."""
    buckets: dict[str, Cluster] = {}
    for task in tasks:
        pid = task.get("project_id") or task.get("project") or "default"
        if pid not in buckets:
            label = pid if pid != "default" else "Tasks"
            buckets[pid] = Cluster(label=label)
        buckets[pid].tasks.append(task)
    return list(buckets.values())


def cluster_tasks(
    tasks: list[dict],
    ollama_url: str | None = None,  # kept for test compatibility; env vars take precedence
    enrichment_cache: dict[str, str] | None = None,
) -> tuple[list[Cluster], dict[str, str]]:
    """Cluster tasks by semantic similarity.

    Returns (clusters, new_enrichments). new_enrichments contains LLM-generated
    descriptions produced this call that were not in the persistent cache — the
    caller should persist these. Falls back to project-based grouping if the
    embedding service is unavailable or tasks have no content.
    """
    if not tasks:
        return [], {}

    # Separate tasks with usable content from those without.
    with_content = [(t, t.get("content", "").strip()) for t in tasks]
    embeddable = [(t, c) for t, c in with_content if c]
    no_content = [t for t, c in with_content if not c]
    if not embeddable:
        return _fallback_by_project(tasks), {}

    task_objs = [t for t, _ in embeddable]
    raw_titles = [c for _, c in embeddable]

    # Step 1: LLM-enrich titles → richer semantic signal before embedding.
    descriptions, new_enrichments = _enrich_batch(raw_titles, persistent_cache=enrichment_cache)
    # Attach enriched description to each task dict so consumers (e.g. focus-area)
    # can show the expanded text instead of the terse raw title.
    for task, desc in zip(task_objs, descriptions):
        task["enriched_description"] = desc

    # Step 2: Prefix with nomic-embed-text task prefix, then batch-embed.
    prefixed = [f"clustering: {d}" for d in descriptions]
    vecs = _embed_batch(prefixed)
    if vecs is None or len(vecs) != len(prefixed):
        log.info("cluster_tasks: embedding unavailable, falling back to project grouping")
        return _fallback_by_project(tasks), new_enrichments

    embedded = list(zip(task_objs, vecs))
    clusters = _greedy_cluster(embedded)
    if no_content:
        clusters.append(Cluster(label="Other tasks", tasks=no_content))
    return clusters, new_enrichments
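
To see how the similarity threshold drives grouping, the sketch below runs the same greedy idea on toy 2-D vectors instead of real embeddings. It is a simplified illustration, not the module's code: `greedy_assign` is a hypothetical helper that returns cluster indices and omits the cluster cap. Only the 0.72 threshold and the running-mean centroid update are taken from `clustering.py`.

```python
import math

SIM_THRESHOLD = 0.72  # same value as _SIM_THRESHOLD in clustering.py


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return 0.0 if na == 0 or nb == 0 else dot / (na * nb)


def greedy_assign(vectors: list[list[float]]) -> list[int]:
    """Return a cluster index per vector (no cluster cap, for brevity)."""
    centroids: list[list[float]] = []
    counts: list[int] = []
    assignment: list[int] = []
    for vec in vectors:
        sims = [cosine(c, vec) for c in centroids]
        best = max(range(len(sims)), key=sims.__getitem__, default=-1)
        if best >= 0 and sims[best] >= SIM_THRESHOLD:
            # Join the most similar cluster and update its centroid as a running mean.
            counts[best] += 1
            n = counts[best]
            centroids[best] = [(c * (n - 1) + v) / n for c, v in zip(centroids[best], vec)]
            assignment.append(best)
        else:
            # Nothing is similar enough: start a new cluster seeded with this vector.
            centroids.append(list(vec))
            counts.append(1)
            assignment.append(len(centroids) - 1)
    return assignment


# Two vectors near the x-axis and two near the y-axis form two clusters.
toy = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = greedy_assign(toy)
```

Because the pass is single and order-dependent, feeding the same vectors in a different order can yield different groupings. That is the usual trade-off of greedy clustering against a full pairwise method.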

70
ml/agents/focus_area.py Normal file

@@ -0,0 +1,70 @@
from __future__ import annotations

from typing import ClassVar

from .base import BaseAgent, AgentInput, AgentOutput
from .clustering import cluster_tasks
from .manifest import AgentManifest

MANIFEST = AgentManifest(
    id="focus-area",
    version="3.0.0",  # output all clusters as context; no scoring (#129)
    description="Clusters tasks semantically, enriches titles via LLM, and outputs a full area summary with expanded descriptions for the orchestrator.",
    pref_schema={"type": "object", "additionalProperties": False, "properties": {}},
    context_schema=["todoist.tasks"],
    required_consents=["data:core", "data:todoist"],
    output_contract={"type": "snippet", "format": "free_text"},
    ttl_sec=86_400,
    inferred_params=[],
)


class FocusAreaAgent(BaseAgent):
    """Clusters tasks and outputs a full area summary for the orchestrator."""

    agent_id: ClassVar[str] = MANIFEST.id
    ttl_seconds: ClassVar[int] = MANIFEST.ttl_sec
    version: ClassVar[str] = MANIFEST.version  # 3.0.0

    def compute(self, inp: AgentInput) -> AgentOutput:
        if not inp.tasks:
            return self._make_output(
                inp,
                "No tasks available to identify focus areas.",
                {"cluster_count": 0},
            )

        clusters, new_enrichments = cluster_tasks(inp.tasks, enrichment_cache=inp.enrichment_cache)
        if not clusters:
            return self._make_output(
                inp,
                "No tasks available to identify focus areas.",
                {"cluster_count": 0},
            )

        lines = [f"The user's tasks are grouped into {len(clusters)} area(s):"]
        for i, cluster in enumerate(clusters, 1):
            descs = [
                t.get("enriched_description") or t.get("content", "")
                for t in cluster.tasks
                if t.get("content")
            ]
            descs = [d.strip() for d in descs if d.strip()]
            descs_str = "; ".join(f'"{d}"' for d in descs[:8])
            if len(descs) > 8:
                descs_str += f" (and {len(descs) - 8} more)"
            # Separate the label from the count so they do not run together.
            lines.append(f"{i}. {cluster.label} ({cluster.task_count} task(s)): {descs_str}")
        lines.append("(Task titles may be in any language — always write the tip in English.)")

        snapshot = {
            "cluster_count": len(clusters),
            "clusters": [
                {"label": c.label, "task_count": c.task_count,
                 "tasks": [t.get("content", "") for t in c.tasks]}
                for c in clusters
            ],
            "_new_enrichments": new_enrichments,
        }
        return self._make_output(inp, "\n".join(lines), snapshot)
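
The agent hands freshly generated descriptions back through `signals_snapshot["_new_enrichments"]`, and the caller is expected to persist them so the next run can pass them in as `enrichment_cache`. That round trip can be sketched as below. The persistence side is not part of this diff, so `merge_enrichments` and the sample descriptions are hypothetical. Only the MD5 content hash and the `_new_enrichments` key come from the code above.

```python
import hashlib


def content_hash(text: str) -> str:
    """Same hashing scheme as _content_hash in clustering.py."""
    return hashlib.md5(text.encode()).hexdigest()


def merge_enrichments(persistent: dict[str, str], snapshot: dict) -> dict[str, str]:
    """Return a new cache with this run's fresh descriptions folded in."""
    fresh = snapshot.get("_new_enrichments", {})
    return {**persistent, **fresh}


# Cache loaded before the run (stand-in for the task_enrichments table).
persistent = {content_hash("Buy milk"): "Stored description for buying milk."}
# Snapshot returned by the agent after one new title was enriched.
snapshot = {
    "cluster_count": 1,
    "_new_enrichments": {content_hash("Call bank"): "Fresh description for calling the bank."},
}
next_cache = merge_enrichments(persistent, snapshot)
```

Keying by content hash instead of task id means a renamed task is re-enriched, while duplicate titles across tasks share one cached description.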

134
ml/agents/health_vitals.py Normal file

@@ -0,0 +1,134 @@
from __future__ import annotations

from typing import ClassVar

from .base import BaseAgent, AgentInput, AgentOutput
from .manifest import AgentManifest, InferredParam
from .inference.history import UserHistory


def _infer_step_goal(history: UserHistory) -> int:
    """Return the step-goal baseline.

    Placeholder: always returns 7,000. A median over daily step counts is not
    implemented yet because step history is not available here.
    """
    if not history.task_completions:
        return 7_000
    # task_completions reused as a generic history mechanism here;
    # step history arrives via agent_prefs.step_history when available.
    return 7_000


MANIFEST = AgentManifest(
    id="health-vitals",
    version="1.0.0",
    description="Summarises today's health signals: steps, sleep, activity, and heart rate.",
    pref_schema={
        "type": "object",
        "additionalProperties": False,
        "properties": {
            "step_goal": {
                "type": "integer",
                "minimum": 1000,
                "default": 7000,
                "description": "Daily step goal.",
            },
            "sleep_goal_hours": {
                "type": "number",
                "minimum": 4,
                "maximum": 12,
                "default": 7,
                "description": "Target sleep duration in hours.",
            },
        },
    },
    context_schema=["google-health.steps", "google-health.sleep", "google-health.activity", "google-health.heart_rate"],
    required_consents=["data:core", "data:google-health"],
    output_contract={"type": "snippet", "format": "free_text"},
    ttl_sec=1800,  # refresh every 30 min — health data changes during the day
    silenced_in_contexts=[],
    inferred_params=[
        InferredParam(
            key="step_goal",
            ttl_sec=7 * 86_400,
            cold_start_default=7000,
            min_history=0,
            infer=lambda h: 7000,  # static default; override via user pref
        ),
    ],
)


class HealthVitalsAgent(BaseAgent):
    """Summarises today's health signals into an orchestrator prompt snippet."""

    agent_id: ClassVar[str] = MANIFEST.id
    ttl_seconds: ClassVar[int] = MANIFEST.ttl_sec
    version: ClassVar[str] = MANIFEST.version

    def compute(self, inp: AgentInput) -> AgentOutput:
        step_goal = int(inp.agent_prefs.get("step_goal", 7000))
        sleep_goal = float(inp.agent_prefs.get("sleep_goal_hours", 7.0))

        health = [t for t in inp.tasks if t.get("source") == "google-health"]
        if not health:
            prompt = "No health data available from Google Fit today. (Always write the tip in English.)"
            return self._make_output(inp, prompt, {"no_data": True})

        steps_sig = next((t for t in health if str(t.get("id", "")).endswith(":steps")), None)
        sleep_sig = next((t for t in health if str(t.get("id", "")).endswith(":sleep")), None)
        activity_sig = next((t for t in health if str(t.get("id", "")).endswith(":activity")), None)
        hr_sig = next((t for t in health if str(t.get("id", "")).endswith(":heart_rate")), None)

        insights: list[str] = []
        snapshot: dict = {}

        if steps_sig is not None:
            steps = int(steps_sig.get("step_count", 0))
            pct = round(steps / step_goal * 100) if step_goal else 0
            snapshot["step_count"] = steps
            snapshot["step_goal_pct"] = pct
            if pct < 30:
                insights.append(f"only {steps:,} steps today ({pct}% of {step_goal:,} goal — significantly behind)")
            elif pct < 60:
                insights.append(f"{steps:,} steps today ({pct}% of {step_goal:,} goal)")
            elif pct >= 100:
                insights.append(f"{steps:,} steps today (daily goal reached!)")
            else:
                insights.append(f"{steps:,} steps today ({pct}% of goal)")

        if sleep_sig is not None:
            hours = float(sleep_sig.get("sleep_hours", 0))
            deficit = max(0.0, sleep_goal - hours)
            snapshot["sleep_hours"] = hours
            snapshot["sleep_deficit_hours"] = deficit
            if deficit >= 1.5:
                insights.append(f"only {hours:.1f}h sleep last night ({deficit:.1f}h below the {sleep_goal:.0f}h goal)")
            elif deficit > 0:
                insights.append(f"{hours:.1f}h sleep last night (slightly below {sleep_goal:.0f}h goal)")
            else:
                insights.append(f"{hours:.1f}h sleep last night (goal met)")

        if activity_sig is not None:
            active_mins = int(activity_sig.get("active_minutes", 0))
            calories = int(activity_sig.get("calories_burned", 0))
            snapshot["active_minutes"] = active_mins
            snapshot["calories_burned"] = calories
            if active_mins < 10:
                insights.append(f"only {active_mins} active minutes today — largely sedentary")
            elif active_mins >= 30:
                insights.append(f"{active_mins} active minutes and {calories} kcal burned today")

        if hr_sig is not None:
            bpm = int(hr_sig.get("resting_bpm", 0))
            snapshot["resting_bpm"] = bpm
            if bpm > 90:
                insights.append(f"elevated resting heart rate: {bpm} bpm")
            elif bpm > 0:
                insights.append(f"resting heart rate: {bpm} bpm")

        if not insights:
            prompt = "Health data is available but no notable signals today. (Always write the tip in English.)"
        else:
            body = "; ".join(insights)
            prompt = f"Health snapshot: {body}. (Always write the tip in English.)"
        return self._make_output(inp, prompt, snapshot)

@@ -0,0 +1,9 @@
"""Shared context-inference framework (ADR-0014 §3, issue #111).
Each agent's manifest declares InferredParams; this package owns the
scheduling contract, history data model, and write path to user_preferences.
"""
from .framework import run_inference
from .history import FeedbackEvent, TaskCompletion, UserHistory
__all__ = ["run_inference", "FeedbackEvent", "TaskCompletion", "UserHistory"]

@@ -0,0 +1,59 @@
"""run_inference — core of the context-inference framework (ADR-0014 §3).

Contract:
    run_inference(manifest, history) → dict[key, value]

Semantics:
    - For each InferredParam in manifest.inferred_params:
        - If param.infer is None, or len(history.events) < param.min_history
          → emit cold_start_default.
        - Otherwise → call param.infer(history) and emit the result; if the
          call raises, log a warning and fall back to cold_start_default.
    - Returns {key: value} ready for the caller to persist to user_preferences
      with source='inferred'.
    - User overrides (source='user') are handled by the caller's upsert logic;
      this function has no DB access.
"""
from __future__ import annotations
import logging
import time
from typing import Any
from ..manifest import AgentManifest
from .history import UserHistory
log = logging.getLogger(__name__)
def run_inference(manifest: AgentManifest, history: UserHistory) -> dict[str, Any]:
"""Evaluate all InferredParams for an agent and return {key: inferred_value}."""
result: dict[str, Any] = {}
n = len(history.events)
for param in manifest.inferred_params:
t0 = time.monotonic()
if param.infer is None:
result[param.key] = param.cold_start_default
continue
if n < param.min_history:
value = param.cold_start_default
source = "cold_start"
else:
try:
value = param.infer(history)
source = "inferred"
except Exception as exc:
log.warning(
"inference_error agent=%s param=%s error=%s — using cold_start_default",
manifest.id, param.key, exc,
)
value = param.cold_start_default
source = "error_fallback"
latency_ms = round((time.monotonic() - t0) * 1000, 1)
log.info(
"inference_param agent=%s param=%s source=%s value=%r history_len=%d latency_ms=%s",
manifest.id, param.key, source, value, n, latency_ms,
)
result[param.key] = value
return result
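
A minimal, self-contained sketch of the three paths `run_inference` takes (no `infer` callable or too little history, successful inference, and error fallback). `Param`, `History`, and `infer_all` are hypothetical stand-ins for `InferredParam`, `UserHistory`, and `run_inference`; logging and latency measurement are left out for brevity.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class Param:  # hypothetical stand-in for InferredParam
    key: str
    cold_start_default: Any
    min_history: int
    infer: Optional[Callable[[Any], Any]] = None


@dataclass
class History:  # hypothetical stand-in for UserHistory
    events: list = field(default_factory=list)


def infer_all(params: list, history: History) -> dict:
    """Return {key: value}, falling back to the cold-start default when needed."""
    out: dict = {}
    for p in params:
        # Path 1: nothing to call, or not enough history yet.
        if p.infer is None or len(history.events) < p.min_history:
            out[p.key] = p.cold_start_default
            continue
        try:
            # Path 2: enough history, so use the inferred value.
            out[p.key] = p.infer(history)
        except Exception:
            # Path 3: the infer function failed, so fall back to the default.
            out[p.key] = p.cold_start_default
    return out


params = [
    Param("count", cold_start_default=0, min_history=2, infer=lambda h: len(h.events)),
    Param("needs_more", cold_start_default="n/a", min_history=10, infer=lambda h: "ready"),
    Param("broken", cold_start_default=-1, min_history=0, infer=lambda h: 1 / 0),
]
result = infer_all(params, History(events=["a", "b", "c"]))
print(result)
```

With three events, only `count` clears its history threshold and infers cleanly; the other two keys keep their cold-start defaults.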

@@ -0,0 +1,49 @@
"""UserHistory — normalised view of a user's feedback events for inference."""
from __future__ import annotations
from dataclasses import dataclass, field
from datetime import datetime, timezone
@dataclass
class FeedbackEvent:
action: str # 'done' | 'dismiss' | 'snooze' | 'helpful' | 'not_helpful'
dwell_ms: int | None
created_at: str # ISO 8601
@property
def hour(self) -> int:
"""Hour of day (0-23) when the feedback was recorded."""
try:
dt = datetime.fromisoformat(self.created_at.replace("Z", "+00:00"))
except ValueError:
return 12
if dt.tzinfo is None:
dt = dt.replace(tzinfo=timezone.utc)
return dt.hour
@dataclass
class TaskCompletion:
"""A completed task that had a due date — used for lateness inference."""
project_id: str | None
completed_at: str # ISO 8601
due_at: str # ISO 8601
@property
def lateness_days(self) -> float:
"""Days between due_at and completed_at. Negative = completed early."""
try:
def _parse(s: str) -> datetime:
dt = datetime.fromisoformat(s.replace("Z", "+00:00"))
return dt if dt.tzinfo else dt.replace(tzinfo=timezone.utc)
return (_parse(self.completed_at) - _parse(self.due_at)).total_seconds() / 86_400
except ValueError:
return 0.0
@dataclass
class UserHistory:
user_id: str
events: list[FeedbackEvent] = field(default_factory=list)
task_completions: list[TaskCompletion] = field(default_factory=list)
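
As a worked illustration of the lateness arithmetic in `TaskCompletion.lateness_days`, here is a standalone sketch. The free function `lateness_days` and the sample timestamps are illustrative assumptions, and the `ValueError` fallback of the real property is omitted.

```python
from datetime import datetime, timezone


def lateness_days(completed_at: str, due_at: str) -> float:
    """Days between due_at and completed_at; negative means completed early."""
    def parse(s: str) -> datetime:
        dt = datetime.fromisoformat(s.replace("Z", "+00:00"))
        # Naive timestamps are treated as UTC, as in the history module.
        return dt if dt.tzinfo else dt.replace(tzinfo=timezone.utc)

    return (parse(completed_at) - parse(due_at)).total_seconds() / 86_400


# Completed exactly two days after the due date.
late = lateness_days("2026-05-12T09:00:00Z", "2026-05-10T09:00:00Z")
# Completed twelve hours before the due date (naive timestamp, assumed UTC).
early = lateness_days("2026-05-09T21:00:00", "2026-05-10T09:00:00Z")
print(late, early)
```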

ml/agents/manifest.py

@@ -0,0 +1,70 @@
"""Agent manifest dataclass (ADR-0014).
A manifest is the single point of registration for an agent. The orchestrator,
admin UI, registry endpoint, and inference framework all read from it. Adding
an agent is adding a manifest + agent class — never editing a list elsewhere.
The manifest lives next to the agent code (each agent module in ml/agents/
exposes a module-level `MANIFEST` constant). The registry surfaces both the
agent instance and its manifest.
"""
from __future__ import annotations
from dataclasses import dataclass, field
from typing import Any, Callable
@dataclass(frozen=True)
class InferredParam:
"""One auto-inferred preference key (#111-#116).
The inference framework owns scheduling, history reads, persistence, and
cold-start. Each agent's `inferred_params` list declares what to infer and
how, leaving each agent to implement just `infer()`.
"""
key: str # e.g. 'quietStart'
ttl_sec: int # how often to recompute
cold_start_default: Any # value used until min_history is met
min_history: int # event count threshold
# Pure function: given a UserHistory snapshot, return the inferred value.
# Typed as a generic callable here; concrete signature lives in the framework.
infer: Callable[[Any], Any] | None = None
@dataclass(frozen=True)
class AgentManifest:
"""Declarative description of an agent — see ADR-0014 §1."""
id: str # 'time-of-day'
version: str # bump invalidates cached outputs + inferences
description: str # one-line human summary for admin UI
pref_schema: dict # JSON Schema for user-tunable knobs
context_schema: list[str] # signals it reads, e.g. ['todoist.tasks']
required_consents: list[str] # ['data:todoist', 'agent:time-of-day']
output_contract: dict # snippet shape (free text + optional tags)
ttl_sec: int # snippet freshness for agent_outputs
silenced_in_contexts: list[str] = field(default_factory=list) # active context names that suppress this agent
inferred_params: list[InferredParam] = field(default_factory=list)
def to_dict(self) -> dict:
"""Serialise for the registry endpoint. `inferred_params` drops `infer`
(callable) since the wire format only carries metadata."""
return {
"id": self.id,
"version": self.version,
"description": self.description,
"pref_schema": self.pref_schema,
"context_schema": self.context_schema,
"required_consents": self.required_consents,
"output_contract": self.output_contract,
"ttl_sec": self.ttl_sec,
"silenced_in_contexts": list(self.silenced_in_contexts),
"inferred_params": [
{
"key": p.key,
"ttl_sec": p.ttl_sec,
"cold_start_default": p.cold_start_default,
"min_history": p.min_history,
}
for p in self.inferred_params
],
}

ml/agents/momentum.py

@@ -0,0 +1,249 @@
from __future__ import annotations
import math
import statistics
from collections import defaultdict
from datetime import datetime, timedelta, timezone
from typing import ClassVar
from .base import BaseAgent, AgentInput, AgentOutput
from .inference.history import UserHistory
from .manifest import AgentManifest, InferredParam
def _parse_dt(iso: str) -> datetime:
try:
dt = datetime.fromisoformat(iso.replace("Z", "+00:00"))
if dt.tzinfo is None:
dt = dt.replace(tzinfo=timezone.utc)
return dt
except ValueError:
return datetime.min.replace(tzinfo=timezone.utc)
def _daily_done_counts(history: UserHistory, window_days: int = 28) -> list[int]:
"""Count done-action events per calendar day over the last window_days days."""
if not history.events:
return []
latest = max(_parse_dt(e.created_at) for e in history.events)
cutoff = latest - timedelta(days=window_days)
by_day: dict[tuple[int, int, int], int] = defaultdict(int)
for e in history.events:
if e.action == "done":
dt = _parse_dt(e.created_at)
if dt >= cutoff:
by_day[(dt.year, dt.month, dt.day)] += 1
# Return counts for every day in the window, including zero-completion days.
counts = []
for offset in range(window_days):
day = (latest - timedelta(days=offset)).date()
counts.append(by_day.get((day.year, day.month, day.day), 0))
return counts
def _infer_baseline_completions_per_day(history: UserHistory) -> float:
counts = _daily_done_counts(history)
return statistics.mean(counts) if counts else 1.0
def _infer_stdev(history: UserHistory) -> float:
counts = _daily_done_counts(history)
if len(counts) < 2:
return 1.0
sd = statistics.stdev(counts)
return max(sd, 0.1) # floor so we never divide by zero in z-score
def _infer_engagement_trend(history: UserHistory) -> str:
    """Compare done-rate in the most recent 7 days vs the 7 days before that."""
    events = sorted(history.events, key=lambda e: e.created_at)
    if not events:
        return "stable"
    try:
        latest = datetime.fromisoformat(events[-1].created_at.replace("Z", "+00:00"))
    except ValueError:
        return "stable"
    if latest.tzinfo is None:
        # Treat naive timestamps as UTC so the comparisons with _parse_dt()
        # results below never mix naive and aware datetimes.
        latest = latest.replace(tzinfo=timezone.utc)
    cutoff_recent = latest - timedelta(days=7)
    cutoff_older = latest - timedelta(days=14)
    recent = [e for e in events if _parse_dt(e.created_at) >= cutoff_recent]
    older = [e for e in events if cutoff_older <= _parse_dt(e.created_at) < cutoff_recent]
    if len(older) < 3:
        # Too few events in the comparison week to call a trend.
        return "stable"
    recent_rate = sum(1 for e in recent if e.action == "done") / max(len(recent), 1)
    older_rate = sum(1 for e in older if e.action == "done") / max(len(older), 1)
    delta = recent_rate - older_rate
    if delta > 0.10:
        return "up"
    if delta < -0.10:
        return "down"
    return "stable"
MANIFEST = AgentManifest(
id="momentum",
version="1.2.0", # #114: baseline + stdev inferred params; z-score snippet language
description="Characterises the user's recent engagement trend from profile features.",
pref_schema={
"type": "object",
"additionalProperties": False,
"properties": {
"low_engagement_threshold_pct": {
"type": "integer",
"minimum": 0,
"maximum": 100,
"default": 25,
"description": "Completion rate below which momentum hints at low engagement.",
},
"baseline_completions_per_day": {
"type": "number",
"minimum": 0,
"default": 1.0,
"description": "User's normal daily done-task rate (inferred from 28d history).",
},
"stdev": {
"type": "number",
"minimum": 0,
"default": 1.0,
"description": "Stdev of daily completion counts; used for z-score normalisation.",
},
"momentum_window": {
"type": "integer",
"minimum": 1,
"default": 7,
"description": "Days of recent history to measure current momentum against baseline.",
},
},
},
context_schema=["profile.features"],
required_consents=["data:core"],
output_contract={"type": "snippet", "format": "free_text"},
ttl_sec=21_600,
inferred_params=[
InferredParam(
key="engagement_trend",
ttl_sec=21_600,
cold_start_default="stable",
min_history=10,
infer=_infer_engagement_trend,
),
InferredParam(
key="baseline_completions_per_day",
ttl_sec=7 * 86_400,
cold_start_default=1.0,
min_history=14,
infer=_infer_baseline_completions_per_day,
),
InferredParam(
key="stdev",
ttl_sec=7 * 86_400,
cold_start_default=1.0,
min_history=14,
infer=_infer_stdev,
),
],
)
def _z_score_label(z: float) -> str | None:
"""Map z-score to a human-readable momentum label, or None if within normal range."""
if z >= 2.0:
return "well above your usual pace"
if z >= 1.0:
return "above your usual pace"
if z <= -2.0:
return "well below your usual pace"
if z <= -1.0:
return "below your usual pace"
return None
class MomentumAgent(BaseAgent):
"""Characterises the user's recent engagement trend from profile features."""
agent_id: ClassVar[str] = MANIFEST.id
ttl_seconds: ClassVar[int] = MANIFEST.ttl_sec
version: ClassVar[str] = MANIFEST.version
def compute(self, inp: AgentInput) -> AgentOutput:
completion = inp.profile.get("completion_rate_30d")
dismiss = inp.profile.get("dismiss_rate_30d")
volume = inp.profile.get("tip_volume_30d")
trend: str = inp.agent_prefs.get("engagement_trend", "stable")
baseline: float = float(inp.agent_prefs.get("baseline_completions_per_day", 1.0))
stdev: float = max(float(inp.agent_prefs.get("stdev", 1.0)), 0.1)
window: int = max(1, int(inp.agent_prefs.get("momentum_window", 7)))  # never divide by a zero-day window
# Count done events in the recent window from feedback_history.
now = inp.now.astimezone(timezone.utc)
cutoff = now - timedelta(days=window)
recent_done = sum(
1 for e in inp.feedback_history
if e.get("action") == "done" and _parse_dt(e.get("created_at", "")) >= cutoff
)
recent_rate = recent_done / window # completions/day over the window
z = (recent_rate - baseline) / stdev
z_label = _z_score_label(z)
parts: list[str] = []
if completion is not None:
pct = round(completion * 100)
if pct >= 50:
parts.append(f"The user completes {pct}% of tips (strong engagement).")
elif pct >= 25:
parts.append(f"The user completes {pct}% of tips (moderate engagement).")
else:
parts.append(
f"The user completes {pct}% of tips "
f"(low engagement — prefer simple, immediately actionable tips)."
)
else:
parts.append("No completion-rate data yet (new user).")
if dismiss is not None:
dpct = round(dismiss * 100)
if dpct >= 40:
parts.append(f"Dismiss rate is high ({dpct}%) — avoid repetitive or irrelevant tips.")
elif dpct <= 10:
parts.append(f"Dismiss rate is low ({dpct}%).")
if volume is not None and int(volume) < 5:
parts.append("Very few tips served so far — this is an early-stage user.")
# Z-score takes precedence over trend label when we have a baseline.
if z_label:
if z > 0:
parts.append(
f"Completion pace is {z_label} "
f"({recent_done} done in the last {window}d vs "
f"~{baseline * window:.1f} expected) — build on the momentum."
)
else:
parts.append(
f"Completion pace is {z_label} "
f"({recent_done} done in the last {window}d vs "
f"~{baseline * window:.1f} expected) — a motivational or easy-win tip may help."
)
elif trend == "up":
parts.append("Engagement is trending up compared to last week — build on the momentum.")
elif trend == "down":
parts.append("Engagement is trending down — a motivational or easy-win tip may help.")
prompt = " ".join(parts) if parts else "No engagement data available yet."
snapshot = {
"completion_rate_30d": completion,
"dismiss_rate_30d": dismiss,
"tip_volume_30d": volume,
"engagement_trend": trend,
"baseline_completions_per_day": baseline,
"stdev": stdev,
"momentum_window": window,
"recent_done_count": recent_done,
"z_score": round(z, 2),
}
return self._make_output(inp, prompt, snapshot)
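
The z-score step in `MomentumAgent.compute` can be traced by hand. The sketch below repeats the same arithmetic and label thresholds in isolation; the helper names `z_score` and `pace_label` and the sample numbers are illustrative assumptions, not part of the agent.

```python
from typing import Optional


def z_score(recent_done: int, window_days: int, baseline: float, stdev: float) -> float:
    """Completions/day over the window, in standard deviations from the baseline."""
    recent_rate = recent_done / window_days
    return (recent_rate - baseline) / max(stdev, 0.1)  # same 0.1 floor as the agent


def pace_label(z: float) -> Optional[str]:
    """Map a z-score to a label, or None when the pace is within the normal range."""
    if z >= 2.0:
        return "well above your usual pace"
    if z >= 1.0:
        return "above your usual pace"
    if z <= -2.0:
        return "well below your usual pace"
    if z <= -1.0:
        return "below your usual pace"
    return None


# 21 done in 7 days is 3.0/day against a 1.0/day baseline.
busy = z_score(recent_done=21, window_days=7, baseline=1.0, stdev=1.0)
# 7 done in 7 days matches the baseline exactly.
typical = z_score(recent_done=7, window_days=7, baseline=1.0, stdev=1.0)
# Nothing done, with a tight stdev, lands two deviations below.
quiet = z_score(recent_done=0, window_days=7, baseline=1.0, stdev=0.5)

for z in (busy, typical, quiet):
    print(z, pace_label(z))
```

When the label is `None`, the agent falls back to the coarser `engagement_trend` value.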

ml/agents/overdue_task.py

@@ -0,0 +1,165 @@
from __future__ import annotations
import statistics
from typing import ClassVar
from .base import BaseAgent, AgentInput, AgentOutput
from .inference.history import UserHistory
from .manifest import AgentManifest, InferredParam
def _infer_lateness_tolerance(history: UserHistory) -> float:
"""p50 lateness (days) across completed tasks that had a due date, clipped at 0.
Negative lateness (finished early) pulls the percentile down; we clip at 0
so punctual users always get tolerance=0, never a negative offset.
"""
lateness = [c.lateness_days for c in history.task_completions]
if not lateness:
return 0.0
return max(0.0, statistics.median(lateness))
def _infer_project_realness(history: UserHistory) -> dict[str, float]:
    """Per-project realness: 1 - (median project lateness / global median lateness).

    Projects whose tasks are consistently completed on time get realness ≈ 1.
    Aspirational projects (chronic lateness) get realness closer to 0.
    """
    completions = [c for c in history.task_completions if c.project_id]
    if not completions:
        return {}
    global_median = statistics.median(c.lateness_days for c in completions)
    if global_median <= 0:
        # Everyone finishes early, so no project is less real than another.
        return {pid: 1.0 for pid in {c.project_id for c in completions}}  # type: ignore[misc]
    by_project: dict[str, list[float]] = {}
    for c in completions:
        by_project.setdefault(c.project_id, []).append(c.lateness_days)  # type: ignore[index]
    result: dict[str, float] = {}
    for pid, days in by_project.items():
        project_median = statistics.median(days)
        realness = 1.0 - (project_median / global_median)
        # Clamp to [0, 1] and round for stable storage.
        result[pid] = round(max(0.0, min(1.0, realness)), 3)
    return result
MANIFEST = AgentManifest(
id="overdue-task",
version="1.2.0", # #115: p50-lateness tolerance + per-project realness
description="Reports the user's overdue tasks by count and age.",
pref_schema={
"type": "object",
"additionalProperties": False,
"properties": {
"lateness_tolerance_days": {
"type": "number",
"minimum": 0,
"default": 0,
"description": "Days past due before a task is flagged. p50 of historical lateness.",
},
"project_realness": {
"type": "object",
"additionalProperties": {"type": "number", "minimum": 0, "maximum": 1},
"default": {},
"description": "Per-project realness score [0,1]. Low = aspirational due dates.",
},
},
},
context_schema=["todoist.tasks"],
required_consents=["data:core", "data:todoist"],
output_contract={"type": "snippet", "format": "free_text"},
ttl_sec=3600,
silenced_in_contexts=["vacation"],
inferred_params=[
InferredParam(
key="lateness_tolerance_days",
ttl_sec=7 * 86_400, # recompute weekly — lateness habits shift slowly
cold_start_default=0.0,
min_history=10,
infer=_infer_lateness_tolerance,
),
InferredParam(
key="project_realness",
ttl_sec=7 * 86_400,
cold_start_default={},
min_history=10,
infer=_infer_project_realness,
),
],
)
def _realness(project_id: str | None, project_realness: dict[str, float]) -> float:
"""Return realness for a project, defaulting to 1.0 (treat as real)."""
if not project_id or not project_realness:
return 1.0
return project_realness.get(project_id, 1.0)
def _format_task(task: dict, project_realness: dict[str, float]) -> str:
content = task["content"]
age = round(task.get("task_age_days", 0))
pid = task.get("project_id")
r = _realness(pid, project_realness)
unit = "day" if age == 1 else "days"
if r < 0.4:
return f'"{content}" ({age} {unit} past target date)'
return f'"{content}" ({age} {unit} overdue)'
class OverdueTaskAgent(BaseAgent):
"""Reports the user's overdue tasks by count and age."""
agent_id: ClassVar[str] = MANIFEST.id
ttl_seconds: ClassVar[int] = MANIFEST.ttl_sec
version: ClassVar[str] = MANIFEST.version
def compute(self, inp: AgentInput) -> AgentOutput:
tolerance = max(0.0, float(inp.agent_prefs.get("lateness_tolerance_days", 0)))
project_realness: dict[str, float] = inp.agent_prefs.get("project_realness", {})
overdue = [
t for t in inp.tasks
if t.get("is_overdue") and t.get("task_age_days", 0) >= tolerance
]
top = sorted(overdue, key=lambda t: -t.get("task_age_days", 0))[:3]
if not overdue:
prompt = "The user has no overdue tasks at this time. (Always write the tip in English.)"
elif len(overdue) == 1:
t = top[0]
r = _realness(t.get("project_id"), project_realness)
item = _format_task(t, project_realness)
if r < 0.4:
prompt = f"The user has 1 task past its target date: {item}. (Task titles may be in any language — always write the tip in English.)"
else:
prompt = f"The user has 1 overdue task: {item}. (Task titles may be in any language — always write the tip in English.)"
else:
items = ", ".join(_format_task(t, project_realness) for t in top)
avg_realness = (
sum(_realness(t.get("project_id"), project_realness) for t in overdue)
/ len(overdue)
)
label = "tasks past their target dates" if avg_realness < 0.4 else "overdue tasks"
prompt = (
f"The user has {len(overdue)} {label}. "
f"Top {len(top)}: {items}. (Task titles may be in any language — always write the tip in English.)"
)
snapshot = {
"overdue_count": len(overdue),
"lateness_tolerance_days": tolerance,
"top_overdue": [
{
"content": t["content"],
"task_age_days": t.get("task_age_days", 0),
"project_id": t.get("project_id"),
"realness": _realness(t.get("project_id"), project_realness),
}
for t in top
],
}
return self._make_output(inp, prompt, snapshot)
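
To make the realness formula concrete, here is a small standalone sketch that works from plain lateness lists instead of `TaskCompletion` objects. The function name `project_realness`, the project ids, and the numbers are illustrative assumptions; each project is assumed to have at least one completion.

```python
import statistics


def project_realness(lateness_by_project: dict) -> dict:
    """Realness = 1 - (project median lateness / global median lateness), in [0, 1]."""
    all_days = [d for days in lateness_by_project.values() for d in days]
    if not all_days:
        return {}
    global_median = statistics.median(all_days)
    if global_median <= 0:
        # Nobody is late overall, so every project counts as fully real.
        return {pid: 1.0 for pid in lateness_by_project}
    return {
        pid: round(max(0.0, min(1.0, 1.0 - statistics.median(days) / global_median)), 3)
        for pid, days in lateness_by_project.items()
    }


scores = project_realness({
    "work": [1.0, 1.0, 1.0],     # reliably about a day late
    "someday": [4.0, 6.0, 5.0],  # chronically late
})
print(scores)
```

The global median here is 2.5 days, so `work` scores 1 - 1/2.5 = 0.6 while `someday` clamps to 0.0 and would be described as "past target date" rather than "overdue".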

@@ -0,0 +1,271 @@
from __future__ import annotations
import math
from collections import Counter
from datetime import datetime, timezone
from typing import ClassVar
from .base import BaseAgent, AgentInput, AgentOutput
from .inference.history import UserHistory
from .manifest import AgentManifest, InferredParam
_DOW_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
def _parse_dt(iso: str) -> datetime:
try:
dt = datetime.fromisoformat(iso.replace("Z", "+00:00"))
if dt.tzinfo is None:
dt = dt.replace(tzinfo=timezone.utc)
return dt
except ValueError:
return datetime.min.replace(tzinfo=timezone.utc)
def _infer_lookback_days(history: UserHistory) -> int:
"""Find the minimum window (days) that captures ≥30 done events, capped at 30.
Sorts done events newest-first, then measures the span to the 30th event.
If fewer than 30 done events exist, returns 30 (use the full cap).
"""
done = sorted(
[e for e in history.events if e.action == "done"],
key=lambda e: e.created_at,
reverse=True,
)
if len(done) < 30:
return 30
latest = _parse_dt(done[0].created_at)
thirtieth = _parse_dt(done[29].created_at)
span = (latest - thirtieth).total_seconds() / 86_400
return max(1, min(30, math.ceil(span)))
def _infer_weekly_cycle(history: UserHistory) -> list[dict]:
"""Peak-to-mean ratio of done events per day-of-week (0=Monday … 6=Sunday).
Returns all 7 DOW entries so the caller can filter by strength threshold.
"""
by_dow: Counter[int] = Counter(
_parse_dt(e.created_at).weekday()
for e in history.events
if e.action == "done"
)
total = sum(by_dow.values())
if total == 0:
return []
mean = total / 7
return [
{
"dow": dow,
"strength": round(by_dow.get(dow, 0) / mean, 3),
"sample": f"completes most {_DOW_NAMES[dow]}s",
}
for dow in range(7)
]
def _infer_daily_cycle(history: UserHistory) -> list[dict]:
    """Peak-to-mean ratio of done events per hour-of-day (0-23).

    Returns entries for hours that have at least one done event.
    """
    by_hour: Counter[int] = Counter(
        _parse_dt(e.created_at).hour
        for e in history.events
        if e.action == "done"
    )
    total = sum(by_hour.values())
    if total == 0:
        return []
    mean = total / 24
    return [
        {
            "hour": hour,
            "strength": round(by_hour[hour] / mean, 3),
        }
        for hour in sorted(by_hour)
    ]
MANIFEST = AgentManifest(
id="recent-patterns",
version="1.2.0", # #116: lookback_days + weekly_cycle + daily_cycle inference
description="Surfaces the user's reaction pattern from recent feedback.",
pref_schema={
"type": "object",
"additionalProperties": False,
"properties": {
"lookback_days": {
"type": "integer",
"minimum": 1,
"maximum": 30,
"default": 7,
"description": "Lookback window sized to capture ≥30 done events.",
},
"weekly_cycle": {
"type": "array",
"items": {
"type": "object",
"properties": {
"dow": {"type": "integer"},
"strength": {"type": "number"},
"sample": {"type": "string"},
},
},
"default": [],
"description": "Per-DOW completion strength (peak-to-mean ratio).",
},
"daily_cycle": {
"type": "array",
"items": {
"type": "object",
"properties": {
"hour": {"type": "integer"},
"strength": {"type": "number"},
},
},
"default": [],
"description": "Per-hour completion strength (peak-to-mean ratio).",
},
},
},
context_schema=["tip_feedback", "profile.features"],
required_consents=["data:core"],
output_contract={"type": "snippet", "format": "free_text"},
ttl_sec=86_400,
inferred_params=[
InferredParam(
key="lookback_days",
ttl_sec=86_400,
cold_start_default=7,
min_history=5,
infer=_infer_lookback_days,
),
InferredParam(
key="weekly_cycle",
ttl_sec=86_400,
cold_start_default=[],
min_history=21, # need ≥3 weeks to see a weekly signal
infer=_infer_weekly_cycle,
),
InferredParam(
key="daily_cycle",
ttl_sec=86_400,
cold_start_default=[],
min_history=14,
infer=_infer_daily_cycle,
),
],
)
_STRENGTH_THRESHOLD = 0.5
def _strong(entries: list[dict], key: str) -> list[dict]:
return [e for e in entries if e.get("strength", 0) > _STRENGTH_THRESHOLD]
def _hour_label(hour: int) -> str:
if hour == 0:
return "midnight"
if hour < 12:
return f"{hour}am"
if hour == 12:
return "noon"
return f"{hour - 12}pm"
class RecentPatternsAgent(BaseAgent):
"""Surfaces the user's reaction pattern from recent feedback."""
agent_id: ClassVar[str] = MANIFEST.id
ttl_seconds: ClassVar[int] = MANIFEST.ttl_sec
version: ClassVar[str] = MANIFEST.version
def compute(self, inp: AgentInput) -> AgentOutput:
# Support legacy window_days pref key for backward compat.
lookback_days = max(
1,
int(inp.agent_prefs.get("lookback_days", inp.agent_prefs.get("window_days", 7))),
)
weekly_cycle: list[dict] = inp.agent_prefs.get("weekly_cycle", [])
daily_cycle: list[dict] = inp.agent_prefs.get("daily_cycle", [])
window_s = lookback_days * 86_400
now_ts = inp.now.timestamp()
recent = [
f for f in inp.feedback_history
if self._age_s(f.get("created_at", ""), now_ts) <= window_s
]
counts: Counter[str] = Counter(f.get("action") for f in recent)
total = len(recent)
dwell_ms = inp.profile.get("mean_dwell_ms_30d")
parts: list[str] = []
if total == 0:
parts.append(f"No tip reactions recorded in the last {lookback_days} days.")
else:
done = counts.get("done", 0)
dismissed = counts.get("dismiss", 0)
snoozed = counts.get("snooze", 0)
parts.append(
f"Last {lookback_days} days: {total} tip reaction{'s' if total != 1 else ''} "
f"({done} completed, {dismissed} dismissed, {snoozed} snoozed)."
)
if dwell_ms is not None:
dwell_s = round(dwell_ms / 1000)
if dwell_s < 15:
parts.append(
"Average dwell is very short — user may be acting on auto-pilot; vary tip content."
)
elif dwell_s < 60:
parts.append(f"Average dwell {dwell_s}s — tips are being read.")
else:
parts.append(
f"Average dwell {dwell_s}s — user deliberates; prefer tips that reward reflection."
)
# Cycle hints — only when strength > threshold.
strong_weekly = _strong(weekly_cycle, "strength")
if strong_weekly:
# Pluralise each day name up front so lists read "Mondays and Thursdays".
day_names = [f"{_DOW_NAMES[e['dow']]}s" for e in strong_weekly]
if len(day_names) == 1:
parts.append(f"User tends to complete tips on {day_names[0]}.")
else:
joined = ", ".join(day_names[:-1]) + f" and {day_names[-1]}"
parts.append(f"User tends to complete tips on {joined}.")
strong_daily = _strong(daily_cycle, "strength")
if strong_daily:
hour_labels = [_hour_label(e["hour"]) for e in strong_daily]
if len(hour_labels) == 1:
parts.append(f"User is most active around {hour_labels[0]}.")
else:
joined = ", ".join(hour_labels[:-1]) + f" and {hour_labels[-1]}"
parts.append(f"User is most active around {joined}.")
prompt = " ".join(parts) if parts else "No engagement data available yet."
snapshot = {
"lookback_days": lookback_days,
"recent_total": total,
"action_counts": dict(counts),
"mean_dwell_ms_30d": dwell_ms,
"strong_weekly_days": [e["dow"] for e in strong_weekly],
"strong_daily_hours": [e["hour"] for e in strong_daily],
}
return self._make_output(inp, prompt, snapshot)
@staticmethod
def _age_s(iso: str, now_ts: float) -> float:
if not iso:
return float("inf")
try:
dt = datetime.fromisoformat(iso.replace("Z", "+00:00"))
if dt.tzinfo is None:
dt = dt.replace(tzinfo=timezone.utc)
return now_ts - dt.timestamp()
except Exception:
return float("inf")
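
The weekly-cycle strength is a peak-to-mean ratio, so a value of 1.0 means "exactly an average day". The sketch below recomputes it from a handful of timestamps; the function name `weekly_strength` and the sample dates are illustrative assumptions, and only the 0.5 threshold is taken from `_STRENGTH_THRESHOLD`.

```python
from collections import Counter
from datetime import datetime


def weekly_strength(done_timestamps: list) -> dict:
    """Done events per weekday divided by the per-weekday mean (0 = Monday)."""
    by_dow = Counter(datetime.fromisoformat(ts).weekday() for ts in done_timestamps)
    total = sum(by_dow.values())
    if total == 0:
        return {}
    mean = total / 7
    return {dow: round(by_dow.get(dow, 0) / mean, 3) for dow in range(7)}


# Four Mondays and three Thursdays in January 2026: seven events, mean 1.0 per weekday.
done = [
    "2026-01-05T09:00:00", "2026-01-12T09:00:00",
    "2026-01-19T09:00:00", "2026-01-26T09:00:00",
    "2026-01-01T18:00:00", "2026-01-08T18:00:00", "2026-01-15T18:00:00",
]
strengths = weekly_strength(done)
strong_days = [dow for dow, s in strengths.items() if s > 0.5]
print(strengths, strong_days)
```

Because the mean maps to 1.0, a 0.5 threshold also admits days that are below average; a stricter cut-off above 1.0 would be a design choice for the authors to weigh, not something the source prescribes.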

ml/agents/registry.py

@@ -0,0 +1,64 @@
"""Agent registry — single point of registration for sub-agents (ADR-0014).
Each agent module contributes:
- a `BaseAgent` subclass instance
- a module-level `MANIFEST: AgentManifest`
The orchestrator, registry endpoint, and inference framework all read from
here. Adding an agent is: add a module, register it once below.
"""
from __future__ import annotations
from .base import BaseAgent
from .manifest import AgentManifest
from .overdue_task import OverdueTaskAgent, MANIFEST as OVERDUE_TASK_MANIFEST
from .momentum import MomentumAgent, MANIFEST as MOMENTUM_MANIFEST
from .time_of_day import TimeOfDayAgent, MANIFEST as TIME_OF_DAY_MANIFEST
from .recent_patterns import RecentPatternsAgent, MANIFEST as RECENT_PATTERNS_MANIFEST
from .focus_area import FocusAreaAgent, MANIFEST as FOCUS_AREA_MANIFEST
from .health_vitals import HealthVitalsAgent, MANIFEST as HEALTH_VITALS_MANIFEST
from .tarot import TarotAgent, MANIFEST as TAROT_MANIFEST
from .stars import StarsAgent, MANIFEST as STARS_MANIFEST
_REGISTERED: list[tuple[BaseAgent, AgentManifest]] = [
(OverdueTaskAgent(), OVERDUE_TASK_MANIFEST),
(MomentumAgent(), MOMENTUM_MANIFEST),
(TimeOfDayAgent(), TIME_OF_DAY_MANIFEST),
(RecentPatternsAgent(), RECENT_PATTERNS_MANIFEST),
(FocusAreaAgent(), FOCUS_AREA_MANIFEST),
(HealthVitalsAgent(), HEALTH_VITALS_MANIFEST),
(TarotAgent(), TAROT_MANIFEST),
(StarsAgent(), STARS_MANIFEST),
]
# Sanity check — agent_id and manifest.id must agree, otherwise the registry
# becomes inconsistent across endpoints.
for _agent, _manifest in _REGISTERED:
if _agent.agent_id != _manifest.id:
raise RuntimeError(
f"Manifest mismatch: {_agent.__class__.__name__}.agent_id={_agent.agent_id!r} "
f"≠ MANIFEST.id={_manifest.id!r}"
)
_AGENTS: dict[str, BaseAgent] = {a.agent_id: a for a, _ in _REGISTERED}
_MANIFESTS: dict[str, AgentManifest] = {m.id: m for _, m in _REGISTERED}
def get_agent(agent_id: str) -> BaseAgent:
if agent_id not in _AGENTS:
raise KeyError(f"Unknown agent: {agent_id!r}. Known: {sorted(_AGENTS)}")
return _AGENTS[agent_id]
def all_agents() -> list[BaseAgent]:
return list(_AGENTS.values())
def get_manifest(agent_id: str) -> AgentManifest:
if agent_id not in _MANIFESTS:
raise KeyError(f"Unknown agent: {agent_id!r}. Known: {sorted(_MANIFESTS)}")
return _MANIFESTS[agent_id]
def all_manifests() -> list[AgentManifest]:
return list(_MANIFESTS.values())

ml/agents/stars.py

@@ -0,0 +1,233 @@
"""Stars agent — astrological transit predictions via pyswisseph.
Requires birth_date in agent_prefs (ISO 8601 date string, e.g. '1990-06-15').
Populated from a connected data source (Google profile / Google Health).
If birth_date is absent the agent returns a no-data snippet and the
eligibility filter will silence it once the consent / pref check catches up.
Computes today's Sun, Moon, Mercury, Venus, Mars, Jupiter, Saturn positions
and finds notable transits (conjunctions, oppositions, squares, trines, sextiles)
between today's sky and the user's natal chart. Passes a concise prediction
+ interpretation to the orchestrator.
"""
from __future__ import annotations
import math
from datetime import date, datetime, timezone
from typing import ClassVar
from .base import BaseAgent, AgentInput, AgentOutput
from .manifest import AgentManifest, InferredParam
try:
import swisseph as swe # type: ignore
_SWE_AVAILABLE = True
except ImportError: # pragma: no cover — present in container, absent in dev
_SWE_AVAILABLE = False
# ---------------------------------------------------------------------------
# Planet catalogue
# ---------------------------------------------------------------------------
_PLANETS: list[tuple[int, str]] = []
if _SWE_AVAILABLE:
_PLANETS = [
(swe.SUN, "Sun"),
(swe.MOON, "Moon"),
(swe.MERCURY, "Mercury"),
(swe.VENUS, "Venus"),
(swe.MARS, "Mars"),
(swe.JUPITER, "Jupiter"),
(swe.SATURN, "Saturn"),
]
# Aspect definitions: (angle, orb, name, nature)
_ASPECTS: list[tuple[float, float, str, str]] = [
(0.0, 8.0, "conjunction", "intensifying"),
(60.0, 6.0, "sextile", "harmonious"),
(90.0, 7.0, "square", "challenging"),
(120.0, 8.0, "trine", "flowing"),
(180.0, 8.0, "opposition", "tension"),
]
_ZODIAC = [
"Aries", "Taurus", "Gemini", "Cancer", "Leo", "Virgo",
"Libra", "Scorpio", "Sagittarius", "Capricorn", "Aquarius", "Pisces",
]
# Interpretive keywords per planet for transit readings
_PLANET_THEMES: dict[str, str] = {
"Sun": "identity, vitality, core purpose",
"Moon": "emotions, intuition, comfort needs",
"Mercury": "communication, thinking, decisions",
"Venus": "relationships, values, pleasure",
"Mars": "energy, drive, conflict",
"Jupiter": "growth, opportunity, expansion",
"Saturn": "discipline, responsibility, long-term structure",
}

def _zodiac_sign(lon: float) -> str:
    # Each sign spans 30 degrees of ecliptic longitude, starting at Aries.
    return _ZODIAC[int(lon / 30) % 12]


def _jd_from_date(d: date) -> float:
    """Julian Day Number for noon UTC on the given date."""
    assert _SWE_AVAILABLE
    return swe.julday(d.year, d.month, d.day, 12.0)


def _planet_positions(jd: float) -> dict[str, float]:
    assert _SWE_AVAILABLE
    positions: dict[str, float] = {}
    for pid, name in _PLANETS:
        result, _ = swe.calc_ut(jd, pid)
        positions[name] = result[0]  # ecliptic longitude
    return positions


def _angular_diff(a: float, b: float) -> float:
    """Smallest angle between two ecliptic longitudes (0-180 degrees)."""
    diff = abs(a - b) % 360
    return diff if diff <= 180 else 360 - diff


def _find_transits(natal: dict[str, float], today: dict[str, float]) -> list[dict]:
    """Return list of active transits between today's sky and natal chart."""
    transits: list[dict] = []
    for t_name, t_lon in today.items():
        for n_name, n_lon in natal.items():
            diff = _angular_diff(t_lon, n_lon)
            for angle, orb, aspect_name, nature in _ASPECTS:
                if abs(diff - angle) <= orb:
                    transits.append({
                        "transit_planet": t_name,
                        "natal_planet": n_name,
                        "aspect": aspect_name,
                        "nature": nature,
                        "orb": round(abs(diff - angle), 2),
                    })
    # Sort by tightness of orb
    transits.sort(key=lambda x: x["orb"])
    return transits


def _format_transit(t: dict) -> str:
    tp, np, asp, nat = t["transit_planet"], t["natal_planet"], t["aspect"], t["nature"]
    tp_theme = _PLANET_THEMES.get(tp, "")
    np_theme = _PLANET_THEMES.get(np, "")
    return (
        f"Transiting {tp} ({tp_theme}) {asp} natal {np} ({np_theme}) "
        f"— a {nat} influence"
    )


# ---------------------------------------------------------------------------
# Manifest
# ---------------------------------------------------------------------------
MANIFEST = AgentManifest(
    id="stars",
    version="1.0.0",
    description="Astrological transit predictions based on the user's birth date and today's planetary positions.",
    pref_schema={
        "type": "object",
        "additionalProperties": False,
        "properties": {
            "birth_date": {
                "type": "string",
                "pattern": r"^\d{4}-\d{2}-\d{2}$",
                "description": "ISO 8601 birth date (YYYY-MM-DD). Populated from connected data source.",
            },
        },
    },
    context_schema=["profile.birth_date"],
    # Requires a connected Google source that supplies birth date.
    # data:google-health is the current carrier; when Google profile is a
    # separate consent key, add it here.
    required_consents=["data:core", "data:google-health"],
    output_contract={"type": "snippet", "format": "free_text"},
    ttl_sec=3_600 * 6,  # planetary positions change slowly — 6 h is fine
    silenced_in_contexts=[],
    inferred_params=[
        InferredParam(
            key="birth_date",
            ttl_sec=365 * 86_400,  # effectively permanent once known
            cold_start_default=None,
            min_history=999_999,  # never inferred from events — sourced externally
            infer=None,
        ),
    ],
)


class StarsAgent(BaseAgent):
    """Produces astrological transit predictions for the user's birth chart."""

    agent_id: ClassVar[str] = MANIFEST.id
    ttl_seconds: ClassVar[int] = MANIFEST.ttl_sec
    version: ClassVar[str] = MANIFEST.version

    def compute(self, inp: AgentInput) -> AgentOutput:
        birth_date_str: str | None = inp.agent_prefs.get("birth_date")
        if not birth_date_str:
            prompt = (
                "Birth date is not available — astrological reading skipped. "
                "(Always write the tip in English.)"
            )
            return self._make_output(inp, prompt, {"no_birth_date": True})

        if not _SWE_AVAILABLE:
            prompt = (
                "Astrological library unavailable — reading skipped. "
                "(Always write the tip in English.)"
            )
            return self._make_output(inp, prompt, {"swe_unavailable": True})

        try:
            birth_date = date.fromisoformat(birth_date_str)
        except ValueError:
            prompt = "Birth date format invalid — astrological reading skipped."
            return self._make_output(inp, prompt, {"invalid_birth_date": birth_date_str})

        today_date = inp.now.date()
        natal_jd = _jd_from_date(birth_date)
        today_jd = _jd_from_date(today_date)
        natal_pos = _planet_positions(natal_jd)
        today_pos = _planet_positions(today_jd)
        transits = _find_transits(natal_pos, today_pos)
        top = transits[:3]  # most exact transits only

        today_sun_sign = _zodiac_sign(today_pos["Sun"])
        natal_sun_sign = _zodiac_sign(natal_pos["Sun"])
        natal_moon_sign = _zodiac_sign(natal_pos["Moon"])

        snapshot = {
            "birth_date": birth_date_str,
            "today": today_date.isoformat(),
            "natal_sun": natal_sun_sign,
            "natal_moon": natal_moon_sign,
            "today_sun": today_sun_sign,
            "active_transits": transits[:5],
        }

        if not top:
            prompt = (
                f"Natal chart: Sun in {natal_sun_sign}, Moon in {natal_moon_sign}. "
                f"Today's Sun is in {today_sun_sign}. "
                "No exact transits today — a quiet, stable day energetically. "
                "(Always write the tip in English.)"
            )
        else:
            transit_lines = "; ".join(_format_transit(t) for t in top)
            prompt = (
                f"Natal chart: Sun in {natal_sun_sign}, Moon in {natal_moon_sign}. "
                f"Today's Sun is in {today_sun_sign}. "
                f"Active transits: {transit_lines}. "
                "Use these planetary themes to colour the tip — "
                "keep it grounded and actionable, not predictive or fatalistic. "
                "(Always write the tip in English.)"
            )
        return self._make_output(inp, prompt, snapshot)

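Review note: the transit search in stars.py comes down to two steps, a wrap-safe angular separation and an orb check against the aspect table. The sketch below isolates those steps so they can be run without pyswisseph. It assumes hand-picked longitudes rather than ephemeris output, and `find_aspect` is a hypothetical helper that is not part of the module.

```python
# Aspect table copied from the diff, without the "nature" column: (angle, orb, name).
ASPECTS = [
    (0.0, 8.0, "conjunction"),
    (60.0, 6.0, "sextile"),
    (90.0, 7.0, "square"),
    (120.0, 8.0, "trine"),
    (180.0, 8.0, "opposition"),
]


def angular_diff(a: float, b: float) -> float:
    """Smallest separation between two ecliptic longitudes, in [0, 180]."""
    diff = abs(a - b) % 360
    return diff if diff <= 180 else 360 - diff


def find_aspect(transit_lon: float, natal_lon: float):
    """Return (name, orb) for the matching aspect, or None (hypothetical helper)."""
    sep = angular_diff(transit_lon, natal_lon)
    for angle, max_orb, name in ASPECTS:
        orb = abs(sep - angle)
        if orb <= max_orb:
            return name, round(orb, 2)
    return None


# Illustrative longitudes in degrees, not real ephemeris values.
print(find_aspect(358.0, 3.0))    # separation wraps across 0 degrees to 5
print(find_aspect(100.0, 192.5))  # separation 92.5, inside the square's orb
print(find_aspect(10.0, 50.0))    # separation 40, matches nothing
```

The orb windows in the table do not overlap (for example, the sextile ends at 66 degrees and the square starts at 83), so a planet pair can match at most one aspect.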
110
ml/agents/tarot.py Normal file

@@ -0,0 +1,110 @@
"""TAROT agent — three-card draw (situation / action / outcome).
Draws cards deterministically from a daily seed so the reading stays
stable for the day (same cards whether the agent runs at 08:00 or 14:00).
Card meanings are precomputed here and passed as a structured snippet to
the orchestrator, which weaves them into a grounded, actionable tip.
"""
from __future__ import annotations
import hashlib
from typing import ClassVar
from .base import BaseAgent, AgentInput, AgentOutput
from .manifest import AgentManifest
# ---------------------------------------------------------------------------
# Card definitions — Major Arcana only (22 cards, indices 0-21)
# Each entry: (name, upright_meaning, action_hint)
# ---------------------------------------------------------------------------
_CARDS: list[tuple[str, str, str]] = [
    ("The Fool", "new beginnings, spontaneity, a leap of faith", "start something without overthinking"),
    ("The Magician", "skill, willpower, resourcefulness", "use what you already have"),
    ("The High Priestess", "intuition, inner knowing, patience", "listen to what you already sense is true"),
    ("The Empress", "abundance, creativity, nurturing", "invest energy in something generative"),
    ("The Emperor", "structure, authority, discipline", "set a boundary or impose order"),
    ("The Hierophant", "tradition, guidance, shared values", "seek or offer mentorship"),
    ("The Lovers", "alignment, choice, commitment", "make a decision you have been avoiding"),
    ("The Chariot", "determination, focus, forward motion", "push through the resistance"),
    ("Strength", "inner courage, patience, gentle persistence", "stay the course with compassion"),
    ("The Hermit", "solitude, reflection, inner guidance", "step back and think before acting"),
    ("Wheel of Fortune", "cycles, turning points, inevitable change", "acknowledge what is shifting around you"),
    ("Justice", "fairness, truth, cause and effect", "audit a recent decision for its real consequences"),
    ("The Hanged Man", "pause, surrender, new perspective", "release your grip on the outcome"),
    ("Death", "endings, transformation, release", "let go of what no longer serves you"),
    ("Temperance", "balance, moderation, patience", "blend two competing demands"),
    ("The Devil", "attachment, habit, shadow patterns", "name a loop you are stuck in"),
    ("The Tower", "sudden disruption, revelation, necessary collapse", "accept the thing that already broke"),
    ("The Star", "hope, renewal, calm after the storm", "trust that recovery is already underway"),
    ("The Moon", "uncertainty, illusion, the unconscious", "sit with ambiguity rather than forcing clarity"),
    ("The Sun", "clarity, vitality, success", "act from your most energised self"),
    ("Judgement", "reflection, reckoning, a call to rise", "respond to a long-deferred summons"),
    ("The World", "completion, integration, a cycle closing", "acknowledge what you have finished"),
]

_POSITIONS = ("situation", "action", "outcome")


def _daily_draw(user_id: str, date_str: str) -> list[int]:
    """Return three distinct card indices seeded by (user_id, date)."""
    seed = hashlib.sha256(f"{user_id}:{date_str}".encode()).digest()
    indices: list[int] = []
    offset = 0
    while len(indices) < 3:
        val = int.from_bytes(seed[offset:offset + 2], "big") % len(_CARDS)
        if val not in indices:
            indices.append(val)
        offset = (offset + 2) % (len(seed) - 1)
    return indices


MANIFEST = AgentManifest(
    id="tarot",
    version="1.0.0",
    description="Daily three-card draw (situation/action/outcome) that frames the tip as a symbolic reflection.",
    pref_schema={
        "type": "object",
        "additionalProperties": False,
        "properties": {
            "enabled": {
                "type": "boolean",
                "default": True,
                "description": "Set false to disable the tarot agent for this user.",
            },
        },
    },
    context_schema=[],
    required_consents=["data:core"],
    output_contract={"type": "snippet", "format": "free_text"},
    ttl_sec=3_600 * 6,  # stable for 6 h; refreshes mid-day at most twice
    silenced_in_contexts=[],
    inferred_params=[],
)


class TarotAgent(BaseAgent):
    """Produces a three-card reading as a prompt snippet."""

    agent_id: ClassVar[str] = MANIFEST.id
    ttl_seconds: ClassVar[int] = MANIFEST.ttl_sec
    version: ClassVar[str] = MANIFEST.version

    def compute(self, inp: AgentInput) -> AgentOutput:
        date_str = inp.now.strftime("%Y-%m-%d")
        indices = _daily_draw(inp.user_id, date_str)
        reading: list[dict] = []
        parts: list[str] = [f"Today's tarot reading ({date_str}):"]
        for pos, idx in zip(_POSITIONS, indices):
            name, meaning, hint = _CARDS[idx]
            reading.append({"position": pos, "card": name, "meaning": meaning, "hint": hint})
            # Separate the position label from the card name so they do not run together.
            parts.append(f" {pos.capitalize()} - {name}: {meaning}. Hint: {hint}.")
        parts.append(
            "Weave these symbolic themes lightly into the tip — "
            "ground them in practical, specific action. "
            "Do not explain the cards; let their meaning shape the advice."
        )
        prompt = "\n".join(parts)
        snapshot = {"date": date_str, "reading": reading}
        return self._make_output(inp, prompt, snapshot)

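Review note: `_daily_draw` walks two-byte windows of a SHA-256 digest until it has three distinct indices. A possible alternative, sketched here as an assumption and not as what the diff does, seeds a private `random.Random` from the same digest and samples without replacement. `daily_draw_sampled` and `DECK_SIZE` are hypothetical names, and the cards it picks will differ from those of `_daily_draw`.

```python
import hashlib
import random

DECK_SIZE = 22  # Major Arcana only, matching the card table in the diff


def daily_draw_sampled(user_id: str, date_str: str, k: int = 3) -> list[int]:
    """Hypothetical alternative: k distinct indices from a (user, date) seed."""
    digest = hashlib.sha256(f"{user_id}:{date_str}".encode()).digest()
    # A dedicated Random instance keeps the global random state untouched.
    rng = random.Random(int.from_bytes(digest, "big"))
    return rng.sample(range(DECK_SIZE), k)


first = daily_draw_sampled("u1", "2026-05-01")
again = daily_draw_sampled("u1", "2026-05-01")
print(first == again)  # identical seed, identical cards
```

One trade-off to weigh: the digest-window approach in the diff depends only on SHA-256, whereas the sampling algorithm behind `random.sample` is not guaranteed to stay identical across Python versions, so a runtime upgrade could change a user's draw for the day.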

@@ -0,0 +1,370 @@
"""Unit tests for all sub-agents and the registry."""
from __future__ import annotations
import sys, os
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "..", ".."))
from datetime import datetime, timezone
import pytest
from ml.agents.base import AgentInput, AgentOutput
from ml.agents.overdue_task import OverdueTaskAgent
from ml.agents.momentum import MomentumAgent
from ml.agents.time_of_day import TimeOfDayAgent
from ml.agents.recent_patterns import RecentPatternsAgent
from ml.agents.focus_area import FocusAreaAgent
from ml.agents.tarot import TarotAgent, _daily_draw, _CARDS, _POSITIONS
from ml.agents.stars import StarsAgent, _SWE_AVAILABLE
from ml.agents.registry import get_agent, all_agents
_NOW = datetime(2026, 5, 1, 9, 0, 0, tzinfo=timezone.utc)  # Friday 09:00 UTC


def _inp(**kwargs) -> AgentInput:
    defaults = dict(
        user_id="u1",
        tasks=[],
        profile={},
        feedback_history=[],
        now=_NOW,
    )
    defaults.update(kwargs)
    return AgentInput(**defaults)


def _task(content="Do thing", is_overdue=False, task_age_days=0.0, priority=1, project_id=None):
    t = {"id": "t1", "content": content, "is_overdue": is_overdue,
         "task_age_days": task_age_days, "priority": priority}
    if project_id:
        t["project_id"] = project_id
    return t


# ── helpers ──────────────────────────────────────────────────────────────────
def _check_output(out: AgentOutput, agent) -> None:
    assert isinstance(out, AgentOutput)
    assert out.user_id == "u1"
    assert out.agent_id == agent.agent_id
    assert out.prompt_text
    assert out.computed_at
    assert out.expires_at > out.computed_at
    assert out.agent_version == agent.version


# ── OverdueTaskAgent ──────────────────────────────────────────────────────────
class TestOverdueTaskAgent:
    agent = OverdueTaskAgent()

    def test_no_overdue(self):
        out = self.agent.compute(_inp(tasks=[_task("Read book")]))
        _check_output(out, self.agent)
        assert "no overdue" in out.prompt_text.lower()
        assert out.signals_snapshot["overdue_count"] == 0

    def test_single_overdue(self):
        out = self.agent.compute(_inp(tasks=[_task("Call dentist", is_overdue=True, task_age_days=3)]))
        _check_output(out, self.agent)
        assert "1 overdue" in out.prompt_text
        assert "Call dentist" in out.prompt_text
        assert "3 day" in out.prompt_text

    def test_multiple_overdue_top3(self):
        tasks = [
            _task(f"Task {i}", is_overdue=True, task_age_days=float(i))
            for i in range(1, 6)
        ]
        out = self.agent.compute(_inp(tasks=tasks))
        _check_output(out, self.agent)
        assert "5 overdue" in out.prompt_text
        assert out.signals_snapshot["overdue_count"] == 5
        assert len(out.signals_snapshot["top_overdue"]) == 3
        # Top 3 should be highest age: 5, 4, 3
        ages = [t["task_age_days"] for t in out.signals_snapshot["top_overdue"]]
        assert ages == sorted(ages, reverse=True)

    def test_ttl_respected(self):
        out = self.agent.compute(_inp())
        assert out.expires_at > out.computed_at


# ── MomentumAgent ─────────────────────────────────────────────────────────────
class TestMomentumAgent:
    agent = MomentumAgent()

    def test_no_profile(self):
        out = self.agent.compute(_inp(profile={}))
        _check_output(out, self.agent)
        assert "new user" in out.prompt_text.lower() or "no " in out.prompt_text.lower()

    def test_strong_engagement(self):
        out = self.agent.compute(_inp(profile={"completion_rate_30d": 0.65, "dismiss_rate_30d": 0.05}))
        assert "strong engagement" in out.prompt_text

    def test_low_completion_warns(self):
        out = self.agent.compute(_inp(profile={"completion_rate_30d": 0.1}))
        assert "low engagement" in out.prompt_text
        assert "actionable" in out.prompt_text

    def test_high_dismiss_warns(self):
        out = self.agent.compute(_inp(profile={"completion_rate_30d": 0.3, "dismiss_rate_30d": 0.5}))
        assert "dismiss rate is high" in out.prompt_text.lower()

    def test_early_stage_user(self):
        out = self.agent.compute(_inp(profile={"tip_volume_30d": 2.0}))
        assert "early-stage" in out.prompt_text


# ── TimeOfDayAgent ────────────────────────────────────────────────────────────
class TestTimeOfDayAgent:
    agent = TimeOfDayAgent()

    def test_morning_label(self):
        inp = _inp(now=datetime(2026, 5, 1, 8, 0, tzinfo=timezone.utc))  # Friday
        out = self.agent.compute(inp)
        assert "morning" in out.prompt_text
        assert "08:00" in out.prompt_text

    def test_weekend_note(self):
        inp = _inp(now=datetime(2026, 5, 2, 10, 0, tzinfo=timezone.utc))  # Saturday
        out = self.agent.compute(inp)
        assert "weekend" in out.prompt_text.lower()

    def test_peak_hour_exact(self):
        inp = _inp(
            now=datetime(2026, 5, 1, 10, 0, tzinfo=timezone.utc),
            profile={"preferred_hour": 10.0},
        )
        out = self.agent.compute(inp)
        assert "peak productivity hour" in out.prompt_text

    def test_approaching_peak(self):
        inp = _inp(
            now=datetime(2026, 5, 1, 9, 0, tzinfo=timezone.utc),
            profile={"preferred_hour": 10.0},
        )
        out = self.agent.compute(inp)
        assert "approaching" in out.prompt_text.lower()

    def test_no_preferred_hour(self):
        out = self.agent.compute(_inp())
        assert "no preferred-hour" in out.prompt_text.lower()

    def test_snapshot_keys(self):
        out = self.agent.compute(_inp())
        assert {"hour", "day_of_week", "preferred_hour", "quiet_start", "quiet_end",
                "peak_hours", "in_quiet", "in_peak", "tz"} == set(out.signals_snapshot)


# ── RecentPatternsAgent ───────────────────────────────────────────────────────
class TestRecentPatternsAgent:
    agent = RecentPatternsAgent()

    def test_no_feedback(self):
        out = self.agent.compute(_inp())
        assert "no tip reactions" in out.prompt_text.lower()

    def test_recent_feedback_summary(self):
        now_iso = _NOW.isoformat()
        feedback = [
            {"action": "done", "dwell_ms": 30000, "created_at": now_iso},
            {"action": "done", "dwell_ms": 45000, "created_at": now_iso},
            {"action": "dismiss", "dwell_ms": 2000, "created_at": now_iso},
        ]
        out = self.agent.compute(_inp(feedback_history=feedback))
        assert "3 tip reactions" in out.prompt_text
        assert "2 completed" in out.prompt_text
        assert "1 dismissed" in out.prompt_text

    def test_old_feedback_excluded(self):
        # 10 days ago — should be excluded from 7-day window
        old_iso = "2026-04-21T09:00:00+00:00"
        feedback = [{"action": "done", "dwell_ms": 5000, "created_at": old_iso}]
        out = self.agent.compute(_inp(feedback_history=feedback))
        assert "no tip reactions" in out.prompt_text.lower()

    def test_short_dwell_note(self):
        now_iso = _NOW.isoformat()
        feedback = [{"action": "done", "dwell_ms": 5000, "created_at": now_iso}]
        out = self.agent.compute(_inp(
            feedback_history=feedback,
            profile={"mean_dwell_ms_30d": 5000.0},
        ))
        assert "auto-pilot" in out.prompt_text.lower() or "short" in out.prompt_text.lower()

    def test_long_dwell_note(self):
        now_iso = _NOW.isoformat()
        feedback = [{"action": "done", "dwell_ms": 90000, "created_at": now_iso}]
        out = self.agent.compute(_inp(
            feedback_history=feedback,
            profile={"mean_dwell_ms_30d": 90000.0},
        ))
        assert "deliberate" in out.prompt_text.lower() or "reflection" in out.prompt_text.lower()


# ── FocusAreaAgent ────────────────────────────────────────────────────────────
class TestFocusAreaAgent:
    agent = FocusAreaAgent()

    def test_no_tasks(self):
        out = self.agent.compute(_inp())
        assert "no tasks" in out.prompt_text.lower()

    def test_lists_all_clusters(self):
        tasks = (
            [_task(f"W{i}", project_id="Work") for i in range(3)]
            + [_task(f"H{i}", project_id="Home") for i in range(2)]
        )
        out = self.agent.compute(_inp(tasks=tasks))
        assert "Work" in out.prompt_text
        assert "Home" in out.prompt_text

    def test_includes_task_titles(self):
        tasks = [_task("Buy milk", project_id="Personal"), _task("Write report", project_id="Personal")]
        out = self.agent.compute(_inp(tasks=tasks))
        assert '"Buy milk"' in out.prompt_text
        assert '"Write report"' in out.prompt_text

    def test_task_count_in_output(self):
        tasks = [_task(f"T{i}", project_id="Work") for i in range(3)]
        out = self.agent.compute(_inp(tasks=tasks))
        assert "3 task" in out.prompt_text

    def test_default_project_fallback(self):
        out = self.agent.compute(_inp(tasks=[_task("No project task")]))
        assert "Tasks" in out.prompt_text

    def test_snapshot_keys(self):
        out = self.agent.compute(_inp(tasks=[_task("T1", project_id="A")]))
        public_keys = {k for k in out.signals_snapshot if not k.startswith("_")}
        assert {"cluster_count", "clusters"} == public_keys

    def test_snapshot_clusters_shape(self):
        tasks = [_task("Buy milk", project_id="P1"), _task("Fix bug", project_id="P2")]
        out = self.agent.compute(_inp(tasks=tasks))
        clusters = out.signals_snapshot["clusters"]
        assert isinstance(clusters, list)
        assert all("label" in c and "task_count" in c and "tasks" in c for c in clusters)


# ── TarotAgent ────────────────────────────────────────────────────────────────
class TestTarotAgent:
    agent = TarotAgent()

    def test_basic_output(self):
        out = self.agent.compute(_inp())
        _check_output(out, self.agent)
        assert "situation" in out.prompt_text.lower()
        assert "action" in out.prompt_text.lower()
        assert "outcome" in out.prompt_text.lower()
        assert out.signals_snapshot["date"] == "2026-05-01"
        assert len(out.signals_snapshot["reading"]) == 3

    def test_three_distinct_cards(self):
        out = self.agent.compute(_inp())
        cards = [r["card"] for r in out.signals_snapshot["reading"]]
        assert len(set(cards)) == 3

    def test_positions_labelled(self):
        out = self.agent.compute(_inp())
        positions = [r["position"] for r in out.signals_snapshot["reading"]]
        assert positions == list(_POSITIONS)

    def test_daily_stability(self):
        out1 = self.agent.compute(_inp(now=datetime(2026, 5, 1, 8, 0, 0, tzinfo=timezone.utc)))
        out2 = self.agent.compute(_inp(now=datetime(2026, 5, 1, 20, 0, 0, tzinfo=timezone.utc)))
        assert out1.signals_snapshot["reading"] == out2.signals_snapshot["reading"]

    def test_different_days_different_draw(self):
        out1 = self.agent.compute(_inp(now=datetime(2026, 5, 1, 9, 0, 0, tzinfo=timezone.utc)))
        out2 = self.agent.compute(_inp(now=datetime(2026, 5, 2, 9, 0, 0, tzinfo=timezone.utc)))
        assert out1.signals_snapshot["reading"] != out2.signals_snapshot["reading"]

    def test_different_users_different_draw(self):
        out1 = self.agent.compute(_inp(user_id="user-A"))
        out2 = self.agent.compute(_inp(user_id="user-B"))
        assert out1.signals_snapshot["reading"] != out2.signals_snapshot["reading"]

    def test_daily_draw_returns_valid_indices(self):
        indices = _daily_draw("u1", "2026-05-01")
        assert len(indices) == 3
        assert len(set(indices)) == 3
        assert all(0 <= i < len(_CARDS) for i in indices)


# ── StarsAgent ────────────────────────────────────────────────────────────────
class TestStarsAgent:
    agent = StarsAgent()

    def test_no_birth_date(self):
        out = self.agent.compute(_inp())
        _check_output(out, self.agent)
        assert out.signals_snapshot.get("no_birth_date") is True
        assert "birth date" in out.prompt_text.lower()

    @pytest.mark.skipif(not _SWE_AVAILABLE, reason="pyswisseph not installed")
    def test_invalid_birth_date(self):
        out = self.agent.compute(_inp(agent_prefs={"birth_date": "not-a-date"}))
        _check_output(out, self.agent)
        assert out.signals_snapshot.get("invalid_birth_date") == "not-a-date"

    @pytest.mark.skipif(not _SWE_AVAILABLE, reason="pyswisseph not installed")
    def test_with_birth_date(self):
        out = self.agent.compute(_inp(agent_prefs={"birth_date": "1990-06-15"}))
        _check_output(out, self.agent)
        assert "natal" in out.prompt_text.lower()
        assert out.signals_snapshot["birth_date"] == "1990-06-15"
        assert "natal_sun" in out.signals_snapshot
        assert "natal_moon" in out.signals_snapshot

    @pytest.mark.skipif(not _SWE_AVAILABLE, reason="pyswisseph not installed")
    def test_transit_snapshot_structure(self):
        out = self.agent.compute(_inp(agent_prefs={"birth_date": "1985-03-21"}))
        snap = out.signals_snapshot
        assert "active_transits" in snap
        for t in snap["active_transits"]:
            assert {"transit_planet", "natal_planet", "aspect", "nature", "orb"} <= t.keys()

    def test_swe_unavailable_path(self, monkeypatch):
        import ml.agents.stars as stars_mod
        monkeypatch.setattr(stars_mod, "_SWE_AVAILABLE", False)
        agent = StarsAgent()
        out = agent.compute(_inp(agent_prefs={"birth_date": "1990-06-15"}))
        _check_output(out, agent)
        assert out.signals_snapshot.get("swe_unavailable") is True


# ── Registry ─────────────────────────────────────────────────────────────────
class TestRegistry:
    def test_all_agents_present(self):
        agents = all_agents()
        ids = {a.agent_id for a in agents}
        assert ids == {"overdue-task", "momentum", "time-of-day", "recent-patterns", "focus-area", "health-vitals", "tarot", "stars"}

    def test_get_agent(self):
        a = get_agent("momentum")
        assert a.agent_id == "momentum"

    def test_get_unknown_raises(self):
        with pytest.raises(KeyError, match="Unknown agent"):
            get_agent("nonexistent")

    def test_all_agents_compute(self):
        inp = _inp(
            tasks=[_task("Buy milk", is_overdue=True, task_age_days=2, project_id="Personal")],
            profile={"completion_rate_30d": 0.4, "tip_volume_30d": 10.0, "preferred_hour": 9.0},
            feedback_history=[
                {"action": "done", "dwell_ms": 25000, "created_at": _NOW.isoformat()}
            ],
        )
        for agent in all_agents():
            out = agent.compute(inp)
            _check_output(out, agent)


@@ -0,0 +1,209 @@
"""Unit tests for ml.agents.clustering (issue #97, #129).
LLM and embedding calls are mocked so tests run without Ollama or LiteLLM.
"""
from __future__ import annotations
import sys, os
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "..", ".."))
from unittest.mock import patch
from ml.agents.clustering import cluster_tasks, Cluster, _greedy_cluster, _cosine, _embed_batch, _enrich_batch

# ── helpers ──────────────────────────────────────────────────────────────────
def _task(content: str, project_id: str | None = None, is_overdue: bool = False) -> dict:
    t: dict = {"content": content, "is_overdue": is_overdue}
    if project_id:
        t["project_id"] = project_id
    return t


def _embed_seq(*vecs):
    """Return a side_effect list so successive _embed calls return these vectors."""
    return list(vecs)


# ── Cluster dataclass ─────────────────────────────────────────────────────────
class TestCluster:
    def test_task_count(self):
        c = Cluster(label="X", tasks=[_task("a"), _task("b")])
        assert c.task_count == 2

    def test_overdue_count(self):
        c = Cluster(label="X", tasks=[_task("a", is_overdue=True), _task("b")])
        assert c.overdue_count == 1


# ── cosine similarity ─────────────────────────────────────────────────────────
class TestCosine:
    def test_identical_vectors(self):
        v = [1.0, 0.0, 0.0]
        assert _cosine(v, v) == 1.0

    def test_orthogonal_vectors(self):
        assert _cosine([1.0, 0.0], [0.0, 1.0]) == 0.0

    def test_zero_vector(self):
        assert _cosine([0.0, 0.0], [1.0, 0.0]) == 0.0


# ── greedy clustering ─────────────────────────────────────────────────────────
class TestGreedyClustering:
    def _similar_vec(self, base: list[float], noise: float = 0.01) -> list[float]:
        return [x + noise for x in base]

    def test_similar_tasks_grouped(self):
        v = [1.0, 0.0, 0.0]
        v2 = [0.999, 0.001, 0.0]
        items = [
            (_task("A"), v),
            (_task("B"), v2),
        ]
        clusters = _greedy_cluster(items)
        assert len(clusters) == 1
        assert clusters[0].task_count == 2

    def test_dissimilar_tasks_separate(self):
        v1 = [1.0, 0.0, 0.0]
        v2 = [0.0, 1.0, 0.0]
        items = [(_task("A"), v1), (_task("B"), v2)]
        clusters = _greedy_cluster(items)
        assert len(clusters) == 2

    def test_label_from_first_task(self):
        v = [1.0, 0.0]
        clusters = _greedy_cluster([(_task("Write report"), v)])
        assert clusters[0].label == "Write report"


# ── enrichment ───────────────────────────────────────────────────────────────
class TestEnrichBatch:
    def test_falls_back_to_raw_when_no_litellm_url(self, monkeypatch):
        monkeypatch.delenv("LITELLM_URL", raising=False)
        result, new = _enrich_batch(["Buy milk", "Fix bug"])
        assert result == ["Buy milk", "Fix bug"] and new == {}

    def test_uses_description_when_litellm_available(self, monkeypatch):
        monkeypatch.setenv("LITELLM_URL", "http://fake-litellm")
        with patch("ml.agents.clustering._enrich_title", return_value="Expanded description."):
            result, new = _enrich_batch(["Buy milk"])
        assert result == ["Expanded description."]
        assert len(new) == 1

    def test_falls_back_to_raw_title_on_enrich_failure(self, monkeypatch):
        monkeypatch.setenv("LITELLM_URL", "http://fake-litellm")
        with patch("ml.agents.clustering._enrich_title", return_value=None):
            result, new = _enrich_batch(["Buy milk"])
        assert result == ["Buy milk"]
        assert new == {}  # failed enrichments are not persisted

    def test_deduplicates_identical_titles(self, monkeypatch):
        monkeypatch.setenv("LITELLM_URL", "http://fake-litellm")
        call_count = {"n": 0}

        def fake_enrich(title, url):
            call_count["n"] += 1
            return f"desc:{title}"

        with patch("ml.agents.clustering._enrich_title", side_effect=fake_enrich):
            result, new = _enrich_batch(["Buy milk", "Buy milk", "Fix bug"])
        assert call_count["n"] == 2  # only 2 unique titles
        assert result == ["desc:Buy milk", "desc:Buy milk", "desc:Fix bug"]

    def test_uses_persistent_cache(self, monkeypatch):
        monkeypatch.setenv("LITELLM_URL", "http://fake-litellm")
        from ml.agents.clustering import _content_hash
        h = _content_hash("Buy milk")
        call_count = {"n": 0}

        def fake_enrich(title, url):
            call_count["n"] += 1
            return "new desc"

        with patch("ml.agents.clustering._enrich_title", side_effect=fake_enrich):
            result, new = _enrich_batch(["Buy milk"], persistent_cache={h: "cached desc"})
        assert call_count["n"] == 0  # cache hit, no LLM call
        assert result == ["cached desc"]
        assert new == {}


# ── cluster_tasks integration ─────────────────────────────────────────────────
class TestClusterTasks:
    def _no_enrich(self, titles, persistent_cache=None):
        return titles, {}

    def test_empty_tasks(self):
        clusters, new = cluster_tasks([])
        assert clusters == [] and new == {}

    def test_fallback_when_embed_unavailable(self):
        with patch("ml.agents.clustering._enrich_batch", side_effect=self._no_enrich), \
             patch("ml.agents.clustering._embed_batch", return_value=None):
            tasks = [_task("A", "p1"), _task("B", "p2"), _task("C", "p1")]
            clusters, _ = cluster_tasks(tasks)
        assert len(clusters) == 2
        labels = {c.label for c in clusters}
        assert "p1" in labels and "p2" in labels

    def test_fallback_groups_by_project(self):
        with patch("ml.agents.clustering._enrich_batch", side_effect=self._no_enrich), \
             patch("ml.agents.clustering._embed_batch", return_value=None):
            tasks = [_task("A", "work")] * 3 + [_task("B", "home")] * 2
            clusters, _ = cluster_tasks(tasks)
        by_label = {c.label: c.task_count for c in clusters}
        assert by_label["work"] == 3
        assert by_label["home"] == 2

    def test_tasks_without_content_go_to_other(self):
        v = [1.0, 0.0]
        with patch("ml.agents.clustering._enrich_batch", side_effect=self._no_enrich), \
             patch("ml.agents.clustering._embed_batch", return_value=[v]):
            tasks = [_task("Has content"), {"is_overdue": False}]
            clusters, _ = cluster_tasks(tasks)
        labels = {c.label for c in clusters}
        assert "Other tasks" in labels

    def test_semantic_clustering_groups_similar(self):
        v_work = [1.0, 0.0, 0.0]
        v_home = [0.0, 1.0, 0.0]
        batch_result = [v_work, v_work, v_home, v_home]
        with patch("ml.agents.clustering._enrich_batch", side_effect=self._no_enrich), \
             patch("ml.agents.clustering._embed_batch", return_value=batch_result):
            tasks = [
                _task("Write report"),
                _task("Review PR"),
                _task("Buy groceries"),
                _task("Cook dinner"),
            ]
            clusters, _ = cluster_tasks(tasks)
        assert len(clusters) == 2
        assert all(c.task_count == 2 for c in clusters)

    def test_all_tasks_no_content_fallback_by_project(self):
        tasks = [{"project_id": "p1", "is_overdue": False},
                 {"project_id": "p2", "is_overdue": False}]
        clusters, new = cluster_tasks(tasks)
        assert len(clusters) == 2 and new == {}

    def test_enrich_called_before_embed(self):
        """Verify enrichment output (not raw title) is what gets embedded."""
        v = [1.0, 0.0]
        captured = {}

        def fake_embed(texts):
            captured["texts"] = texts
            return [v] * len(texts)

        with patch("ml.agents.clustering._enrich_batch", return_value=(["Expanded desc."], {})), \
             patch("ml.agents.clustering._embed_batch", side_effect=fake_embed):
            cluster_tasks([_task("Buy milk")])
        assert captured["texts"] == ["clustering: Expanded desc."]

    def test_new_enrichments_returned(self):
        v = [1.0, 0.0]
        with patch("ml.agents.clustering._enrich_batch", return_value=(["desc"], {"abc123": "desc"})), \
             patch("ml.agents.clustering._embed_batch", return_value=[v]):
            _, new = cluster_tasks([_task("Buy milk")])
        assert new == {"abc123": "desc"}

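Review note: ml/agents/clustering.py is not part of this excerpt, so the behaviour these tests pin down (cosine of a zero vector is 0.0, similar vectors merge, the label comes from the first task) has to be read off the assertions. The sketch below is one way such a greedy pass could look. Everything in it is an assumption: the function names, the dict-based cluster shape, and the 0.8 threshold are hypothetical and may not match the real module.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; defined as 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def greedy_cluster(items, threshold: float = 0.8):
    """Assign each (task, vector) to the first cluster whose anchor is close enough.

    The threshold value is an assumption made for this sketch.
    """
    clusters = []  # each cluster: {"label", "anchor", "tasks"}
    for task, vec in items:
        for cluster in clusters:
            if cosine(vec, cluster["anchor"]) >= threshold:
                cluster["tasks"].append(task)
                break
        else:
            # No existing cluster matched: start a new one labelled by this task.
            clusters.append({"label": task["content"], "anchor": vec, "tasks": [task]})
    return clusters


groups = greedy_cluster([
    ({"content": "Write report"}, [1.0, 0.0, 0.0]),
    ({"content": "Review PR"}, [0.999, 0.001, 0.0]),
    ({"content": "Buy groceries"}, [0.0, 1.0, 0.0]),
])
print([(g["label"], len(g["tasks"])) for g in groups])
```

A single greedy pass is order-dependent: the first task in a group fixes both the anchor vector and the label, which is consistent with `test_label_from_first_task` above.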

@@ -0,0 +1,120 @@
"""Tests for the inference framework and time-of-day #112 proof."""
from __future__ import annotations
import sys, os
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "..", ".."))
import pytest
from datetime import datetime, timezone
from ml.agents.inference.history import FeedbackEvent, UserHistory
from ml.agents.inference.framework import run_inference
from ml.agents.time_of_day import TimeOfDayAgent, MANIFEST as TOD_MANIFEST, MANIFEST
from ml.agents.base import AgentInput
_NOW = datetime(2026, 5, 1, 14, 0, 0, tzinfo=timezone.utc)  # Friday 14:00


def _inp(**kwargs) -> AgentInput:
    defaults = dict(user_id="u1", tasks=[], profile={}, now=_NOW, agent_prefs={})
    defaults.update(kwargs)
    return AgentInput(**defaults)


def _event(action: str, hour: int) -> FeedbackEvent:
    ts = f"2026-05-01T{hour:02d}:00:00+00:00"
    return FeedbackEvent(action=action, dwell_ms=60_000 if action == "done" else 500, created_at=ts)

class TestRunInference:
def test_cold_start_when_below_min_history(self):
history = UserHistory(user_id="u1", events=[_event("done", 9)] * 5) # only 5 < 10
result = run_inference(TOD_MANIFEST, history)
assert result["preferred_hour"] is None # cold_start_default
def test_infers_preferred_hour_as_mode(self):
# 7 events at 09:00, 3 at 17:00 → preferred_hour should be 9
events = [_event("done", 9)] * 7 + [_event("done", 17)] * 3
history = UserHistory(user_id="u1", events=events)
result = run_inference(TOD_MANIFEST, history)
assert result["preferred_hour"] == 9
def test_infers_preferred_hour_from_majority_hour(self):
events = [_event("done", 20)] * 6 + [_event("done", 8)] * 4
history = UserHistory(user_id="u1", events=events)
result = run_inference(TOD_MANIFEST, history)
assert result["preferred_hour"] == 20
def test_no_inferred_params_returns_empty(self):
from ml.agents.manifest import AgentManifest
bare = AgentManifest(
id="bare", version="1.0.0", description="", pref_schema={},
context_schema=[], required_consents=[], output_contract={}, ttl_sec=300,
)
history = UserHistory(user_id="u1", events=[_event("done", 9)] * 20)
result = run_inference(bare, history)
assert result == {}
def test_cold_start_fallback_on_infer_error(self):
"""infer() raising should fall back to cold_start_default, not crash."""
from ml.agents.manifest import InferredParam, AgentManifest
def _bad_infer(h):
raise RuntimeError("oops")
m = AgentManifest(
id="boom", version="1.0.0", description="", pref_schema={},
context_schema=[], required_consents=[], output_contract={}, ttl_sec=300,
inferred_params=[InferredParam(key="x", ttl_sec=60, cold_start_default=42, min_history=1, infer=_bad_infer)],
)
history = UserHistory(user_id="u1", events=[_event("done", 9)] * 5)
result = run_inference(m, history)
assert result["x"] == 42
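Review note: the tests above pin down two behaviours of `run_inference`: a parameter resolves to its `cold_start_default` when history is shorter than `min_history`, and also when `infer()` raises. The following is a minimal sketch of that contract, not the framework's actual code. `ParamSketch` and `run_inference_sketch` are hypothetical names, and history is reduced to a plain list of done-hours for brevity.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class ParamSketch:
    """Hypothetical stand-in for InferredParam (not the real class)."""
    key: str
    cold_start_default: Any
    min_history: int
    infer: Callable[[list[int]], Any]


def run_inference_sketch(params: list[ParamSketch], done_hours: list[int]) -> dict[str, Any]:
    """Resolve each param, using its default on thin history or infer() failure."""
    result: dict[str, Any] = {}
    for p in params:
        if len(done_hours) < p.min_history:
            result[p.key] = p.cold_start_default  # cold start: too little history
            continue
        try:
            result[p.key] = p.infer(done_hours)
        except Exception:
            result[p.key] = p.cold_start_default  # a failing infer() must not crash the caller
    return result


def _mode_hour(hours: list[int]) -> int:
    # Most common hour, mirroring the "mode" behaviour the tests expect.
    return Counter(hours).most_common(1)[0][0]


def _bad_infer(hours: list[int]) -> int:
    raise RuntimeError("oops")


params = [
    ParamSketch("preferred_hour", None, 10, _mode_hour),
    ParamSketch("x", 42, 1, _bad_infer),
]
out = run_inference_sketch(params, [9] * 7 + [17] * 3)
```

With ten events (seven at 09:00, three at 17:00) `preferred_hour` resolves to the mode, while `x` falls back to its default because its `infer` raises.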
class TestTimeOfDayAgentWithInference:
agent = TimeOfDayAgent()
def test_uses_preferred_hour_from_agent_prefs(self):
inp = _inp(agent_prefs={"preferred_hour": 9}, now=datetime(2026, 5, 1, 9, 0, 0, tzinfo=timezone.utc))
out = self.agent.compute(inp)
assert "peak productivity hour" in out.prompt_text.lower() or "peak" in out.prompt_text
def test_quiet_window_at_night_suppressed(self):
inp = _inp(
agent_prefs={"quiet_start": "22:00", "quiet_end": "07:00"},
now=datetime(2026, 5, 1, 23, 0, 0, tzinfo=timezone.utc),
)
out = self.agent.compute(inp)
assert "quiet window" in out.prompt_text
def test_quiet_window_not_in_window(self):
inp = _inp(
agent_prefs={"quiet_start": "22:00", "quiet_end": "07:00"},
now=datetime(2026, 5, 1, 14, 0, 0, tzinfo=timezone.utc),
)
out = self.agent.compute(inp)
assert "quiet window" not in out.prompt_text
def test_agent_prefs_override_profile(self):
# agent_prefs.preferred_hour wins over profile.preferred_hour
inp = _inp(
profile={"preferred_hour": 8},
agent_prefs={"preferred_hour": 14},
now=datetime(2026, 5, 1, 14, 0, 0, tzinfo=timezone.utc),
)
out = self.agent.compute(inp)
assert "peak productivity hour (14:00)" in out.prompt_text
def test_no_prefs_falls_back_to_profile(self):
inp = _inp(profile={"preferred_hour": 10}, now=datetime(2026, 5, 1, 10, 0, 0, tzinfo=timezone.utc))
out = self.agent.compute(inp)
assert "peak" in out.prompt_text
def test_version_bumped(self):
assert MANIFEST.version == "1.2.0"
def test_manifest_has_preferred_hour_param(self):
keys = {p.key for p in MANIFEST.inferred_params}
assert "preferred_hour" in keys


@@ -0,0 +1,68 @@
"""Manifest registry tests (ADR-0014).
Each agent module exports a `MANIFEST: AgentManifest` whose id and version
must agree with the agent class. The registry exposes both, and `to_dict()`
must drop the `infer` callable so the wire payload is JSON-serialisable.
"""
from __future__ import annotations
import json
import os
import sys
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "..", ".."))
import pytest # noqa: E402
from ml.agents.manifest import AgentManifest, InferredParam # noqa: E402
from ml.agents.registry import ( # noqa: E402
all_agents,
all_manifests,
get_agent,
get_manifest,
)
def test_every_agent_has_a_matching_manifest():
agents = {a.agent_id: a for a in all_agents()}
manifests = {m.id: m for m in all_manifests()}
assert agents.keys() == manifests.keys(), "agent / manifest registries diverged"
for aid in agents:
assert agents[aid].version == manifests[aid].version, (
f"version mismatch for {aid}: agent={agents[aid].version!r} "
f"manifest={manifests[aid].version!r}"
)
@pytest.mark.parametrize("agent_id", [
"overdue-task", "momentum", "time-of-day", "recent-patterns", "focus-area",
])
def test_manifest_required_fields(agent_id: str):
m = get_manifest(agent_id)
assert m.id == agent_id
assert m.version
assert m.description
assert isinstance(m.pref_schema, dict) and m.pref_schema.get("type") == "object"
assert isinstance(m.required_consents, list) and m.required_consents
assert "data:core" in m.required_consents, "every agent should require data:core"
assert all(c.startswith("data:") for c in m.required_consents), "only data: consents allowed; agent: consents have been removed"
assert m.ttl_sec == get_agent(agent_id).ttl_seconds, "ttl divergence"
def test_to_dict_is_json_serialisable_and_drops_infer_callable():
m = AgentManifest(
id="x", version="1.0.0", description="d",
pref_schema={"type": "object"}, context_schema=[], required_consents=["data:core"],
output_contract={"type": "snippet"}, ttl_sec=60,
inferred_params=[InferredParam(key="k", ttl_sec=60, cold_start_default=0, min_history=10, infer=lambda h: 0)],
)
payload = m.to_dict()
# Round-trip through json to confirm no callables / non-JSON types leaked.
data = json.loads(json.dumps(payload))
assert data["inferred_params"][0]["key"] == "k"
assert "infer" not in data["inferred_params"][0]
def test_get_manifest_unknown_raises():
with pytest.raises(KeyError):
get_manifest("not-an-agent")
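Review note: `test_to_dict_is_json_serialisable_and_drops_infer_callable` relies on `to_dict()` omitting the `infer` callable so the payload survives a JSON round-trip. One way this could be done is sketched below. `InferredParamSketch` is a hypothetical mirror of the fields used in the test; the real `InferredParam.to_dict()` may be implemented differently.

```python
import json
from dataclasses import dataclass, fields
from typing import Any, Callable


@dataclass
class InferredParamSketch:
    """Hypothetical mirror of the InferredParam fields used in the test above."""
    key: str
    ttl_sec: int
    cold_start_default: Any
    min_history: int
    infer: Callable[[Any], Any]

    def to_dict(self) -> dict[str, Any]:
        # Skip the callable; every other field is assumed to be JSON-friendly.
        return {f.name: getattr(self, f.name) for f in fields(self) if f.name != "infer"}


param = InferredParamSketch(
    key="k", ttl_sec=60, cold_start_default=0, min_history=10, infer=lambda h: 0
)
# Round-trip through json to confirm nothing non-serialisable leaked.
wire = json.loads(json.dumps(param.to_dict()))
```

Filtering by field name keeps the wire format in step with the dataclass: a newly added field is exported automatically, and only `infer` is excluded.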


@@ -0,0 +1,663 @@
"""Per-agent inference tests: momentum (#114), overdue-task (#115), recent-patterns (#116),
time-of-day (#112), and focus-area (#113) preferred_areas wiring."""
from __future__ import annotations
import sys, os
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "..", ".."))
from datetime import datetime, timezone
import pytest
from ml.agents.inference.history import FeedbackEvent, TaskCompletion, UserHistory
from ml.agents.inference.framework import run_inference
from ml.agents.momentum import MomentumAgent, MANIFEST as MOMENTUM_MANIFEST
from ml.agents.overdue_task import OverdueTaskAgent, MANIFEST as OVERDUE_MANIFEST
from ml.agents.recent_patterns import RecentPatternsAgent, MANIFEST as RECENT_MANIFEST
from ml.agents.time_of_day import TimeOfDayAgent, MANIFEST as TOD_MANIFEST
from ml.agents.focus_area import FocusAreaAgent
from ml.agents.base import AgentInput
_NOW = datetime(2026, 5, 8, 14, 0, 0, tzinfo=timezone.utc)
def _inp(**kwargs) -> AgentInput:
defaults = dict(user_id="u1", tasks=[], profile={}, now=_NOW, agent_prefs={})
defaults.update(kwargs)
return AgentInput(**defaults)
def _event(action: str, days_ago: float = 1.0) -> FeedbackEvent:
from datetime import timedelta
ts = (_NOW - timedelta(days=days_ago)).isoformat()
dwell = 60_000 if action == "done" else 500
return FeedbackEvent(action=action, dwell_ms=dwell, created_at=ts)
def _history(*events: FeedbackEvent, completions: list[TaskCompletion] | None = None) -> UserHistory:
return UserHistory(user_id="u1", events=list(events), task_completions=completions or [])
def _completion(project_id: str | None, lateness_days: float) -> TaskCompletion:
"""Build a TaskCompletion where completed_at is lateness_days after due_at."""
from datetime import timedelta
due = _NOW - timedelta(days=30)
completed = due + timedelta(days=lateness_days)
return TaskCompletion(
project_id=project_id,
completed_at=completed.isoformat(),
due_at=due.isoformat(),
)
# ── momentum helpers ─────────────────────────────────────────────────────────
def _neutral_prefs(**extra) -> dict:
"""Prefs that put z-score in the normal range so trend label can show."""
return {"baseline_completions_per_day": 0.0, "stdev": 1.0, "momentum_window": 7, **extra}
def _feedback_done(n: int, days_ago: float = 1.0) -> list[dict]:
from datetime import timedelta
ts = (_NOW - timedelta(days=days_ago)).isoformat()
return [{"action": "done", "dwell_ms": 60_000, "created_at": ts}] * n
# ── momentum: engagement_trend inference ─────────────────────────────────────
class TestMomentumTrendInference:
def test_cold_start_below_min_history(self):
history = _history(*[_event("done", days_ago=i) for i in range(5)])
result = run_inference(MOMENTUM_MANIFEST, history)
assert result["engagement_trend"] == "stable" # cold_start_default
def test_trend_up_when_recent_done_rate_higher(self):
recent = [_event("done", days_ago=i) for i in range(1, 9)]
older = [_event("dismiss", days_ago=i) for i in range(8, 15)]
older[0] = _event("done", days_ago=8)
history = _history(*recent, *older)
result = run_inference(MOMENTUM_MANIFEST, history)
assert result["engagement_trend"] == "up"
def test_trend_down_when_recent_done_rate_lower(self):
recent = [_event("dismiss", days_ago=i) for i in range(1, 8)]
older = [_event("done", days_ago=i) for i in range(8, 15)]
history = _history(*recent, *older)
result = run_inference(MOMENTUM_MANIFEST, history)
assert result["engagement_trend"] == "down"
def test_trend_stable_when_similar(self):
events = [_event("done" if i % 2 == 0 else "dismiss", days_ago=i) for i in range(1, 15)]
history = _history(*events)
result = run_inference(MOMENTUM_MANIFEST, history)
assert result["engagement_trend"] == "stable"
def test_trend_shown_when_z_score_normal(self):
# baseline=0 so z≈0 → no z label → trend label falls through
out = MomentumAgent().compute(_inp(agent_prefs=_neutral_prefs(engagement_trend="up")))
assert "trending up" in out.prompt_text
def test_trend_down_shown_when_z_score_normal(self):
out = MomentumAgent().compute(_inp(agent_prefs=_neutral_prefs(engagement_trend="down")))
assert "trending down" in out.prompt_text
def test_snapshot_includes_trend(self):
out = MomentumAgent().compute(_inp(agent_prefs=_neutral_prefs(engagement_trend="stable")))
assert "engagement_trend" in out.signals_snapshot
# ── momentum: baseline + stdev inference (#114) ───────────────────────────────
class TestMomentumBaselineInference:
def _events_n_per_day(self, done_per_day: int, n_days: int) -> list[FeedbackEvent]:
"""Generate done events spread across n_days."""
events = []
for d in range(n_days):
for _ in range(done_per_day):
events.append(_event("done", days_ago=d + 0.5))
return events
def test_cold_start_when_few_events(self):
history = _history(*[_event("done", days_ago=i) for i in range(5)])
result = run_inference(MOMENTUM_MANIFEST, history)
assert result["baseline_completions_per_day"] == 1.0
assert result["stdev"] == 1.0
def test_power_user_baseline_high(self):
# 5 done events per day for 20 days → 100 events; averaged over the 28d window (zeros fill the rest) → baseline ≈ 3.6/day
events = self._events_n_per_day(5, 20)
history = _history(*events)
result = run_inference(MOMENTUM_MANIFEST, history)
assert result["baseline_completions_per_day"] > 2.0
def test_casual_user_baseline_low(self):
# 1 done every 3 days + dismiss filler to clear min_history=14 → baseline ≈ 0.33/day
done_events = [_event("done", days_ago=d * 3 + 0.5) for d in range(7)]
filler = [_event("dismiss", days_ago=d + 0.5) for d in range(10)]
history = _history(*done_events, *filler)
result = run_inference(MOMENTUM_MANIFEST, history)
assert result["baseline_completions_per_day"] < 0.5
def test_stdev_reflects_variability(self):
# Alternating 0 and 4 done events → high stdev
events = []
for d in range(14):
if d % 2 == 0:
for _ in range(4):
events.append(_event("done", days_ago=d + 0.5))
history = _history(*events)
result = run_inference(MOMENTUM_MANIFEST, history)
assert result["stdev"] > 1.0
def test_consistent_user_lower_stdev_than_variable(self):
# Consistent 2/day for 28 days has lower stdev than alternating 0/4
consistent = self._events_n_per_day(2, 28)
variable = []
for d in range(14):
if d % 2 == 0:
for _ in range(4):
variable.append(_event("done", days_ago=d + 0.5))
else:
variable.append(_event("dismiss", days_ago=d + 0.5))
r_consistent = run_inference(MOMENTUM_MANIFEST, _history(*consistent))
r_variable = run_inference(MOMENTUM_MANIFEST, _history(*variable))
assert r_consistent["stdev"] < r_variable["stdev"]
# ── momentum: z-score snippet language ───────────────────────────────────────
class TestMomentumZScore:
def _prefs(self, baseline: float, stdev: float = 1.0) -> dict:
return {"baseline_completions_per_day": baseline, "stdev": stdev,
"momentum_window": 7, "engagement_trend": "stable"}
def test_power_user_above_baseline_says_above_usual(self):
# baseline=3/day, stdev=1.0, window=7 → expected rate=3; user did 35 → rate=5, z=2
prefs = self._prefs(baseline=3.0, stdev=1.0)
feedback = _feedback_done(35, days_ago=1.0)
out = MomentumAgent().compute(_inp(feedback_history=feedback, agent_prefs=prefs))
assert "above your usual" in out.prompt_text
def test_casual_user_slowing_down(self):
# baseline=1/day, user did 0 in 7d → z = (0 - 1) / 1 = -1 → below usual
prefs = self._prefs(baseline=1.0, stdev=1.0)
out = MomentumAgent().compute(_inp(feedback_history=[], agent_prefs=prefs))
assert "below your usual" in out.prompt_text
def test_returning_from_break_at_normal_rate(self):
# User just came back: 1 done, baseline=1/day, window=7 → z=(1/7-1)/1≈-0.86, within normal
prefs = self._prefs(baseline=1.0, stdev=1.0)
feedback = _feedback_done(1, days_ago=0.5)
out = MomentumAgent().compute(_inp(feedback_history=feedback, agent_prefs=prefs))
# z ≈ -0.86 → no z label, falls back to trend (stable → no extra sentence)
assert "above your usual" not in out.prompt_text
assert "below your usual" not in out.prompt_text
def test_snapshot_includes_z_score(self):
prefs = self._prefs(baseline=1.0)
out = MomentumAgent().compute(_inp(agent_prefs=prefs))
assert "z_score" in out.signals_snapshot
assert "recent_done_count" in out.signals_snapshot
def test_version_bumped(self):
assert MOMENTUM_MANIFEST.version == "1.2.0"
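Review note: the arithmetic in the `TestMomentumZScore` comments can be checked with a small sketch. It assumes the rate is `done_count / window_days` and that labels appear when `|z| >= 1`. Both are inferred from the test comments rather than read from `MomentumAgent`, and `momentum_z` and `z_label` are hypothetical names.

```python
from typing import Optional


def momentum_z(done_count: int, window_days: int, baseline: float, stdev: float) -> float:
    """z-score of the recent completion rate against the user's baseline."""
    rate = done_count / window_days
    # Guard against a zero stdev (assumption: treat it as "no deviation").
    return (rate - baseline) / stdev if stdev > 0 else 0.0


def z_label(z: float, threshold: float = 1.0) -> Optional[str]:
    """Map a z-score to snippet wording; the threshold of 1.0 is an assumption."""
    if z >= threshold:
        return "above your usual"
    if z <= -threshold:
        return "below your usual"
    return None


power_user = momentum_z(35, 7, baseline=3.0, stdev=1.0)  # rate 5/day vs baseline 3
idle_user = momentum_z(0, 7, baseline=1.0, stdev=1.0)    # rate 0/day vs baseline 1
returning = momentum_z(1, 7, baseline=1.0, stdev=1.0)    # rate 1/7 per day vs baseline 1
```

Under these assumptions the three test scenarios give z = 2, z = -1 and z ≈ -0.86, which map to "above your usual", "below your usual" and no label respectively.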
# ── overdue-task: lateness_tolerance_days + project_realness (#115) ──────────
class TestOverdueTaskInference:
# -- lateness_tolerance_days inference --
def test_cold_start_returns_zero_when_few_completions(self):
# Below min_history=10 task completions → cold start
cs = [_completion("p1", 2.0) for _ in range(5)]
history = _history(*[_event("done")] * 5, completions=cs)
result = run_inference(OVERDUE_MANIFEST, history)
assert result["lateness_tolerance_days"] == 0.0
def test_punctual_user_zero_tolerance(self):
# User always finishes early or on time (negative lateness) → tolerance 0
cs = [_completion("p1", -1.0) for _ in range(12)]
history = _history(*[_event("done")] * 12, completions=cs)
result = run_inference(OVERDUE_MANIFEST, history)
assert result["lateness_tolerance_days"] == 0.0
def test_chronic_late_user_positive_tolerance(self):
# User consistently finishes 5 days late → p50 = 5
cs = [_completion("p1", 5.0) for _ in range(12)]
history = _history(*[_event("done")] * 12, completions=cs)
result = run_inference(OVERDUE_MANIFEST, history)
assert result["lateness_tolerance_days"] == pytest.approx(5.0)
def test_mixed_lateness_uses_median(self):
# 6 tasks at +1d, 6 tasks at +3d → median = 2
cs = [_completion("p1", 1.0)] * 6 + [_completion("p1", 3.0)] * 6
history = _history(*[_event("done")] * 12, completions=cs)
result = run_inference(OVERDUE_MANIFEST, history)
assert result["lateness_tolerance_days"] == pytest.approx(2.0)
# -- project_realness inference --
def test_project_realness_cold_start_empty(self):
cs = [_completion("p1", 1.0) for _ in range(5)] # below min_history
history = _history(*[_event("done")] * 5, completions=cs)
result = run_inference(OVERDUE_MANIFEST, history)
assert result["project_realness"] == {}
def test_project_realness_punctual_project_scores_high(self):
# p1 always on time (0d late), p2 always 10d late → p1 should be realness ≈ 1
cs = [_completion("p1", 0.0)] * 6 + [_completion("p2", 10.0)] * 6
history = _history(*[_event("done")] * 12, completions=cs)
result = run_inference(OVERDUE_MANIFEST, history)
assert result["project_realness"]["p1"] > result["project_realness"]["p2"]
def test_project_realness_values_clipped_01(self):
cs = [_completion("p1", 0.0)] * 6 + [_completion("p2", 100.0)] * 6
history = _history(*[_event("done")] * 12, completions=cs)
result = run_inference(OVERDUE_MANIFEST, history)
for v in result["project_realness"].values():
assert 0.0 <= v <= 1.0
# -- compute() reads inferred prefs --
def test_tolerance_filters_tasks(self):
tasks = [
{"content": "Fresh overdue", "is_overdue": True, "task_age_days": 0.5},
{"content": "Old overdue", "is_overdue": True, "task_age_days": 3.0},
]
out = OverdueTaskAgent().compute(_inp(tasks=tasks, agent_prefs={"lateness_tolerance_days": 2}))
assert "1 overdue task" in out.prompt_text
assert "Old overdue" in out.prompt_text
def test_low_realness_softens_language(self):
tasks = [{"content": "Wishlist", "is_overdue": True, "task_age_days": 3.0,
"project_id": "aspirational"}]
prefs = {"lateness_tolerance_days": 0, "project_realness": {"aspirational": 0.2}}
out = OverdueTaskAgent().compute(_inp(tasks=tasks, agent_prefs=prefs))
assert "target date" in out.prompt_text
def test_high_realness_uses_overdue_language(self):
tasks = [{"content": "Critical", "is_overdue": True, "task_age_days": 3.0,
"project_id": "work"}]
prefs = {"lateness_tolerance_days": 0, "project_realness": {"work": 0.9}}
out = OverdueTaskAgent().compute(_inp(tasks=tasks, agent_prefs=prefs))
assert "overdue" in out.prompt_text
def test_snapshot_includes_realness(self):
tasks = [{"content": "T", "is_overdue": True, "task_age_days": 1.0, "project_id": "p1"}]
prefs = {"lateness_tolerance_days": 0, "project_realness": {"p1": 0.8}}
out = OverdueTaskAgent().compute(_inp(tasks=tasks, agent_prefs=prefs))
assert "realness" in out.signals_snapshot["top_overdue"][0]
def test_version_bumped(self):
assert OVERDUE_MANIFEST.version == "1.2.0"
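Review note: the `lateness_tolerance_days` tests are consistent with "median lateness, floored at zero". The sketch below illustrates that reading; it is an assumption drawn from the test expectations, not the agent's actual inference code, and `lateness_days` and `lateness_tolerance` are hypothetical helpers. The `project_realness` formula is not recoverable from the tests, so it is left out.

```python
import statistics
from datetime import datetime, timedelta, timezone


def lateness_days(completed_at: str, due_at: str) -> float:
    """Days between due date and completion; negative means finished early."""
    completed = datetime.fromisoformat(completed_at)
    due = datetime.fromisoformat(due_at)
    return (completed - due).total_seconds() / 86400


def lateness_tolerance(lateness: list[float]) -> float:
    """Median lateness (p50), never below zero."""
    return max(0.0, statistics.median(lateness))


due = datetime(2026, 4, 8, 14, 0, tzinfo=timezone.utc)
five_late = lateness_days((due + timedelta(days=5)).isoformat(), due.isoformat())

punctual = lateness_tolerance([-1.0] * 12)           # always a day early
chronic = lateness_tolerance([5.0] * 12)             # always five days late
mixed = lateness_tolerance([1.0] * 6 + [3.0] * 6)    # median of an even split
```

The median keeps a single very late task from inflating the tolerance, which a mean would not.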
# ── recent-patterns: lookback_days + weekly_cycle + daily_cycle (#116) ────────
def _done_at(days_ago: float, hour: int = 10) -> FeedbackEvent:
"""Done event at a specific hour, N days ago."""
from datetime import timedelta
ts = (_NOW - timedelta(days=days_ago)).replace(hour=hour, minute=0, second=0, microsecond=0)
return FeedbackEvent(action="done", dwell_ms=60_000, created_at=ts.isoformat())
class TestRecentPatternsLookbackInference:
def test_cold_start_below_min_history(self):
history = _history(*[_event("done") for _ in range(3)])
result = run_inference(RECENT_MANIFEST, history)
assert result["lookback_days"] == 7 # cold_start_default
def test_sparse_done_history_returns_30(self):
# Only 10 done events → fewer than 30 → returns cap of 30
history = _history(*[_event("done") for _ in range(10)])
result = run_inference(RECENT_MANIFEST, history)
assert result["lookback_days"] == 30
def test_dense_done_history_returns_short_window(self):
# 30 done events all within the last 2 days → lookback_days = 1 or 2
events = [_event("done", days_ago=i * 0.05) for i in range(30)]
history = _history(*events)
result = run_inference(RECENT_MANIFEST, history)
assert result["lookback_days"] <= 2
def test_spread_history_spans_window_correctly(self):
# 30 done events spread over 15 days (1 per 0.5d) → window should be ≈15
events = [_event("done", days_ago=i * 0.5) for i in range(30)]
history = _history(*events)
result = run_inference(RECENT_MANIFEST, history)
assert result["lookback_days"] <= 16
def test_agent_respects_lookback_days_pref(self):
from datetime import timedelta
feedback = [
{"action": "done", "dwell_ms": 60000,
"created_at": (_NOW - timedelta(days=10)).isoformat()}
] * 5
out_narrow = RecentPatternsAgent().compute(
_inp(feedback_history=feedback, agent_prefs={"lookback_days": 7})
)
out_wide = RecentPatternsAgent().compute(
_inp(feedback_history=feedback, agent_prefs={"lookback_days": 14})
)
assert "No tip reactions" in out_narrow.prompt_text
assert "5 tip reactions" in out_wide.prompt_text
def test_legacy_window_days_pref_still_works(self):
from datetime import timedelta
feedback = [
{"action": "done", "dwell_ms": 60000,
"created_at": (_NOW - timedelta(days=10)).isoformat()}
] * 5
out = RecentPatternsAgent().compute(
_inp(feedback_history=feedback, agent_prefs={"window_days": 14})
)
assert "5 tip reactions" in out.prompt_text
def test_snapshot_includes_lookback_days(self):
out = RecentPatternsAgent().compute(_inp(agent_prefs={"lookback_days": 14}))
assert out.signals_snapshot["lookback_days"] == 14
class TestRecentPatternsWeeklyCycle:
def test_cold_start_returns_empty(self):
history = _history(*[_event("done") for _ in range(5)]) # below min_history=21
result = run_inference(RECENT_MANIFEST, history)
assert result["weekly_cycle"] == []
def _events_on_dow(self, target_dow: int, count: int, n_weeks: int = 4) -> list[FeedbackEvent]:
"""Generate `count` done events per week on `target_dow` (0=Mon…6=Sun).
_NOW is Friday (weekday=4). days_back = (now_dow - target_dow) % 7
gives the offset to the most recent occurrence of target_dow.
"""
now_dow = _NOW.weekday() # 4 = Friday (2026-05-08)
days_back = (now_dow - target_dow) % 7
if days_back == 0:
days_back = 7 # avoid "today" — use the previous occurrence
events = []
for week in range(n_weeks):
offset = days_back + week * 7
for _ in range(count):
events.append(_done_at(offset + 0.1, hour=11))
return events
def _weekend_warrior_history(self) -> UserHistory:
"""Many done events on Sat/Sun (dow 5 & 6), few on Tuesday (dow 1)."""
events = []
events += self._events_on_dow(5, count=5) # Saturday
events += self._events_on_dow(6, count=5) # Sunday
events += self._events_on_dow(1, count=1) # Tuesday — one per week
return _history(*events)
def test_weekend_warrior_strong_on_weekends(self):
history = self._weekend_warrior_history()
result = run_inference(RECENT_MANIFEST, history)
by_dow = {e["dow"]: e["strength"] for e in result["weekly_cycle"]}
assert by_dow.get(5, 0) > 1.0 # Saturday
assert by_dow.get(6, 0) > 1.0 # Sunday
def test_weekday_only_low_weekend_strength(self):
events = []
for dow in range(5): # Monday-Friday
events += self._events_on_dow(dow, count=3)
# Saturday (5) and Sunday (6) get zero events
history = _history(*events)
result = run_inference(RECENT_MANIFEST, history)
by_dow = {e["dow"]: e["strength"] for e in result["weekly_cycle"]}
assert by_dow.get(5, 0) == 0.0 # Saturday
assert by_dow.get(6, 0) == 0.0 # Sunday
def test_snippet_includes_cycle_hint_when_strong(self):
# Inject a strong weekly_cycle pref directly
prefs = {
"lookback_days": 7,
"weekly_cycle": [{"dow": 1, "strength": 2.0, "sample": "completes most Tuesdays"}],
"daily_cycle": [],
}
out = RecentPatternsAgent().compute(_inp(agent_prefs=prefs))
assert "Tuesday" in out.prompt_text
def test_snippet_omits_cycle_hint_when_weak(self):
prefs = {
"lookback_days": 7,
"weekly_cycle": [{"dow": 1, "strength": 0.3, "sample": "completes most Tuesdays"}],
"daily_cycle": [],
}
out = RecentPatternsAgent().compute(_inp(agent_prefs=prefs))
assert "Tuesday" not in out.prompt_text
class TestRecentPatternsDailyCycle:
def test_cold_start_returns_empty(self):
history = _history(*[_event("done") for _ in range(5)]) # below min_history=14
result = run_inference(RECENT_MANIFEST, history)
assert result["daily_cycle"] == []
def _evening_person_history(self) -> UserHistory:
"""Many done events at 20:00-21:00, few in the morning."""
events = []
for d in range(20):
for _ in range(4):
events.append(_done_at(d + 0.5, hour=20))
events.append(_done_at(d + 0.5, hour=9))
return _history(*events)
def test_evening_person_strong_at_evening_hours(self):
history = self._evening_person_history()
result = run_inference(RECENT_MANIFEST, history)
by_hour = {e["hour"]: e["strength"] for e in result["daily_cycle"]}
assert by_hour.get(20, 0) > 1.0
assert by_hour.get(9, 0) < by_hour.get(20, 0)
def test_snippet_includes_daily_hint_when_strong(self):
prefs = {
"lookback_days": 7,
"weekly_cycle": [],
"daily_cycle": [{"hour": 20, "strength": 3.0}],
}
out = RecentPatternsAgent().compute(_inp(agent_prefs=prefs))
assert "8pm" in out.prompt_text
def test_snippet_omits_daily_hint_when_weak(self):
prefs = {
"lookback_days": 7,
"weekly_cycle": [],
"daily_cycle": [{"hour": 20, "strength": 0.4}],
}
out = RecentPatternsAgent().compute(_inp(agent_prefs=prefs))
assert "8pm" not in out.prompt_text
def test_no_pattern_user_no_hints(self):
# Uniform distribution across all hours: strength = count / mean, and every
# hourly count is equal, so every strength is exactly 1.0 and no hour stands out.
events = [_done_at(d + 0.5, hour=h) for d in range(3) for h in range(24)]
history = _history(*events)
result = run_inference(RECENT_MANIFEST, history)
# Allow a small tolerance above 1.0; a flat distribution must not produce a peak.
assert all(e["strength"] <= 1.1 for e in result["daily_cycle"])
def test_version_bumped(self):
assert RECENT_MANIFEST.version == "1.2.0"
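Review note: the cycle tests describe strength as `count / mean`. A minimal sketch of that ratio follows, assuming the mean is taken over every bucket (all 7 weekdays or all 24 hours, empty ones included). `cycle_strength` is a hypothetical helper, and the real inference may add smoothing or sample text that the tests do not reveal.

```python
from collections import Counter
from datetime import datetime
from typing import Callable


def cycle_strength(
    timestamps: list[str],
    bucket: Callable[[datetime], int],
    n_buckets: int,
) -> dict[int, float]:
    """Per-bucket strength = events in bucket / mean events per bucket."""
    counts = Counter(bucket(datetime.fromisoformat(ts)) for ts in timestamps)
    mean = sum(counts.values()) / n_buckets
    if mean == 0:
        return {}  # no events at all: nothing to report
    return {b: counts.get(b, 0) / mean for b in range(n_buckets)}


# Flat user: one done event at every hour on three consecutive days.
flat = [datetime(2026, 5, d, h).isoformat() for d in (1, 2, 3) for h in range(24)]
daily = cycle_strength(flat, lambda dt: dt.hour, 24)

# Weekend-only user: seven done events on Saturday 2026-05-02.
saturday = [datetime(2026, 5, 2, 11).isoformat()] * 7
weekly = cycle_strength(saturday, lambda dt: dt.weekday(), 7)
```

A strength of 1.0 means "exactly average", so the flat user shows no peak, while the Saturday-only user scores 7.0 on weekday 5 and 0.0 elsewhere.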
# ── time-of-day: quiet_start/end + peak_hours inference (#112) ───────────────
def _tod_event(action: str, hour: int, days_ago: float = 1.0) -> FeedbackEvent:
"""Feedback event at a specific hour N days ago."""
from datetime import timedelta
dt = (_NOW - timedelta(days=days_ago)).replace(hour=hour, minute=0, second=0, microsecond=0)
return FeedbackEvent(action=action, dwell_ms=60_000, created_at=dt.isoformat())
def _tod_history(*events: FeedbackEvent) -> UserHistory:
return UserHistory(user_id="u1", events=list(events))
class TestTimeOfDayQuietWindow:
def test_cold_start_below_min_history(self):
history = _tod_history(*[_tod_event("done", 10) for _ in range(10)])
result = run_inference(TOD_MANIFEST, history)
assert result["quiet_start"] == "22:00"
assert result["quiet_end"] == "07:00"
def _night_owl_history(self) -> UserHistory:
"""Active 20:00-23:00, quiet 02:00-14:00."""
events = []
for d in range(10):
for h in [20, 21, 22, 23, 0, 1]:
events.append(_tod_event("done", h, days_ago=d + 0.5))
# Sparse during day
events.append(_tod_event("done", 15, days_ago=d + 0.5))
return _tod_history(*events)
def _early_bird_history(self) -> UserHistory:
"""Active 06:00-10:00, quiet 21:00-05:00."""
events = []
for d in range(10):
for h in [6, 7, 8, 9, 10]:
events.append(_tod_event("done", h, days_ago=d + 0.5))
events.append(_tod_event("done", 14, days_ago=d + 0.5))
return _tod_history(*events)
def test_early_bird_quiet_in_evening(self):
history = self._early_bird_history()
result = run_inference(TOD_MANIFEST, history)
# Quiet window should be in the evening/night range
start_h = int(result["quiet_start"].split(":")[0])
end_h = int(result["quiet_end"].split(":")[0])
# Quiet window spans from some evening hour into morning
assert start_h >= 18 or end_h <= 10 # covers night
def test_quiet_window_wraps_midnight(self):
# Night owl: heavy activity in evening, quiet 02:00-14:00
history = self._night_owl_history()
result = run_inference(TOD_MANIFEST, history)
start_h = int(result["quiet_start"].split(":")[0])
end_h = int(result["quiet_end"].split(":")[0])
# The quiet window should span across midnight or be in daylight
# (start > end means wraps midnight)
is_wrapping = start_h > end_h
is_daytime = 2 <= start_h <= 14
assert is_wrapping or is_daytime
def test_format_is_hhmm(self):
history = self._early_bird_history()
result = run_inference(TOD_MANIFEST, history)
import re
assert re.match(r"^\d{2}:00$", result["quiet_start"])
assert re.match(r"^\d{2}:00$", result["quiet_end"])
class TestTimeOfDayPeakHours:
def _evening_person_history(self, n: int = 60) -> UserHistory:
"""Heavy done events at 19:00 and 20:00, light elsewhere."""
events = []
for i in range(n):
events.append(_tod_event("done", 19, days_ago=i * 0.5))
events.append(_tod_event("done", 20, days_ago=i * 0.5))
events.append(_tod_event("done", 10, days_ago=i * 0.5)) # low volume
return _tod_history(*events)
def test_cold_start_returns_default(self):
history = _tod_history(*[_tod_event("done", 10) for _ in range(5)])
result = run_inference(TOD_MANIFEST, history)
assert result["peak_hours"] == [9, 14, 20]
def test_evening_person_peak_hours_in_evening(self):
history = self._evening_person_history()
result = run_inference(TOD_MANIFEST, history)
assert 19 in result["peak_hours"] or 20 in result["peak_hours"]
def test_peak_hours_sorted(self):
history = self._evening_person_history()
result = run_inference(TOD_MANIFEST, history)
assert result["peak_hours"] == sorted(result["peak_hours"])
def test_shift_worker_peaks_at_unusual_hours(self):
"""Shift worker active at 02:00 and 03:00."""
events = [_tod_event("done", h, days_ago=i * 0.5)
for i in range(30) for h in [2, 3]]
events += [_tod_event("done", 14, days_ago=i * 0.5) for i in range(5)]
history = _tod_history(*events)
result = run_inference(TOD_MANIFEST, history)
assert 2 in result["peak_hours"] or 3 in result["peak_hours"]
class TestTimeOfDaySnippet:
agent = TimeOfDayAgent()
def _inp_at(self, hour: int, **prefs) -> AgentInput:
now = _NOW.replace(hour=hour)
return _inp(now=now, agent_prefs=prefs)
def test_in_peak_hour_says_peak(self):
out = self.agent.compute(self._inp_at(20, peak_hours=[20]))
assert "peak productivity hour" in out.prompt_text
def test_approaching_peak_says_approaching(self):
out = self.agent.compute(self._inp_at(18, peak_hours=[20]))
assert "approaching" in out.prompt_text.lower()
def test_quiet_window_overrides_peak(self):
# Even if hour is in peak_hours, quiet window wins
out = self.agent.compute(
self._inp_at(23, quiet_start="22:00", quiet_end="07:00", peak_hours=[23])
)
assert "quiet window" in out.prompt_text
def test_tz_shown_when_not_utc(self):
out = self.agent.compute(self._inp_at(10, tz="Europe/Moscow"))
assert "Europe/Moscow" in out.prompt_text
def test_snapshot_includes_peak_and_quiet(self):
out = self.agent.compute(self._inp_at(10, peak_hours=[10], quiet_start="22:00", quiet_end="07:00"))
assert "peak_hours" in out.signals_snapshot
assert "in_quiet" in out.signals_snapshot
assert "in_peak" in out.signals_snapshot
def test_version_bumped(self):
assert TOD_MANIFEST.version == "1.2.0"
def test_manifest_has_new_params(self):
keys = {p.key for p in TOD_MANIFEST.inferred_params}
assert {"quiet_start", "quiet_end", "peak_hours", "tz"}.issubset(keys)
# ── focus-area: cluster summary output ───────────────────────────────────────
class TestFocusAreaOutput:
agent = FocusAreaAgent()
def _task(self, content: str, project_id: str) -> dict:
return {"id": "t1", "content": content, "is_overdue": False,
"task_age_days": 2.0, "priority": 1, "project_id": project_id}
def test_version(self):
from ml.agents.focus_area import MANIFEST as FA_MANIFEST
assert FA_MANIFEST.version == "3.0.0"
def test_all_clusters_in_output(self):
tasks = [self._task("Work thing", "work"), self._task("Home thing", "home")]
out = self.agent.compute(_inp(tasks=tasks))
assert "work" in out.prompt_text.lower()
assert "home" in out.prompt_text.lower()
def test_task_titles_in_output(self):
tasks = [self._task("Buy milk", "personal")]
out = self.agent.compute(_inp(tasks=tasks))
assert '"Buy milk"' in out.prompt_text
def test_snapshot_shape(self):
tasks = [self._task("T", "work")]
out = self.agent.compute(_inp(tasks=tasks))
public_keys = {k for k in out.signals_snapshot if not k.startswith("_")}
assert public_keys == {"cluster_count", "clusters"}
assert isinstance(out.signals_snapshot["clusters"], list)
def test_no_inferred_params(self):
from ml.agents.focus_area import MANIFEST as FA_MANIFEST
assert FA_MANIFEST.inferred_params == []

266
ml/agents/time_of_day.py Normal file

@@ -0,0 +1,266 @@
from __future__ import annotations
import statistics
from collections import Counter
from typing import ClassVar
from .base import BaseAgent, AgentInput, AgentOutput
from .inference.history import UserHistory
from .manifest import AgentManifest, InferredParam
_DOW_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
# min_history required before quiet/peak inference is meaningful (issue #112)
_MIN_HISTORY = 50
def _infer_preferred_hour(history: UserHistory) -> int:
    """Mode hour of day across all 'done' feedback events; falls back to 9."""
    done_hours = [e.hour for e in history.events if e.action == "done"]
    if not done_hours:
        return 9
    return Counter(done_hours).most_common(1)[0][0]
def _quiet_window_hours(history: UserHistory) -> tuple[int, int]:
    """Return (start_hour, end_hour) of the longest below-baseline quiet window.

    Counts all engagement events by hour. Baseline = mean hourly count.
    Finds the longest contiguous run of below-baseline hours on the circular
    clock; that run defines the quiet window.
    """
    by_hour: Counter[int] = Counter(e.hour for e in history.events)
    total = sum(by_hour.values())
    baseline = total / 24
    # Mark each of the 24 hours as below-baseline (True = quiet)
    quiet: list[bool] = [by_hour.get(h, 0) < baseline for h in range(24)]
    # Find longest contiguous run in circular array
    best_start, best_len = 0, 0
    run_start, run_len = 0, 0
    # Double the sequence to handle wrap-around
    for i in range(48):
        h = i % 24
        if quiet[h]:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len > best_len:
                best_len = run_len
                best_start = run_start
        else:
            run_len = 0
    if best_len == 0:
        return (22, 7)  # fallback
    start = best_start % 24
    end = (best_start + best_len) % 24
    return (start, end)
def _infer_quiet_start(history: UserHistory) -> str:
start, _ = _quiet_window_hours(history)
return f"{start:02d}:00"
def _infer_quiet_end(history: UserHistory) -> str:
_, end = _quiet_window_hours(history)
return f"{end:02d}:00"
def _infer_peak_hours(history: UserHistory) -> list[int]:
"""Top-quartile hours by done-event count.
Computes done_count per hour, then returns hours at or above the 75th
percentile of non-zero hourly counts, sorted ascending.
"""
done_by_hour: Counter[int] = Counter(
e.hour for e in history.events if e.action == "done"
)
if not done_by_hour:
return [9, 14, 20]
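counts = list(done_by_hour.values())
if len(counts) < 2:
# statistics.quantiles() raises StatisticsError on a single data point before Python 3.13
return sorted(done_by_hour)
# method="inclusive" keeps the cut point inside the observed range; the default
# "exclusive" method can extrapolate above max(counts) and yield an empty list.
threshold = statistics.quantiles(counts, n=4, method="inclusive")[-1] # 75th percentile
return sorted(h for h, c in done_by_hour.items() if c >= threshold)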
MANIFEST = AgentManifest(
id="time-of-day",
version="1.2.0", # #112: quiet_start/end + peak_hours + tz inference
description="Frames the current moment relative to the user's productive peak and quiet hours.",
pref_schema={
"type": "object",
"additionalProperties": False,
"properties": {
"quiet_start": {
"type": "string",
"pattern": "^([01][0-9]|2[0-3]):[0-5][0-9]$",
"description": "HH:MM start of quiet hours (24h, user's local TZ).",
},
"quiet_end": {
"type": "string",
"pattern": "^([01][0-9]|2[0-3]):[0-5][0-9]$",
"description": "HH:MM end of quiet hours.",
},
"peak_hours": {
"type": "array",
"items": {"type": "integer", "minimum": 0, "maximum": 23},
"default": [9, 14, 20],
"description": "Hours (0-23) with top-quartile completion density.",
},
"tz": {
"type": "string",
"default": "UTC",
"description": "IANA timezone; populated from auth provider, fallback UTC.",
},
"preferred_hour": {
"type": "integer",
"minimum": 0,
"maximum": 23,
"description": "Mode done-hour (legacy; superseded by peak_hours).",
},
},
},
context_schema=["profile.features"],
required_consents=["data:core"],
output_contract={"type": "snippet", "format": "free_text"},
ttl_sec=900,
inferred_params=[
InferredParam(
key="preferred_hour",
ttl_sec=3_600,
cold_start_default=None,
min_history=10,
infer=_infer_preferred_hour,
),
InferredParam(
key="quiet_start",
ttl_sec=86_400,
cold_start_default="22:00",
min_history=_MIN_HISTORY,
infer=_infer_quiet_start,
),
InferredParam(
key="quiet_end",
ttl_sec=86_400,
cold_start_default="07:00",
min_history=_MIN_HISTORY,
infer=_infer_quiet_end,
),
InferredParam(
key="peak_hours",
ttl_sec=86_400,
cold_start_default=[9, 14, 20],
min_history=_MIN_HISTORY,
infer=_infer_peak_hours,
),
# tz is populated from the auth provider; no infer function.
InferredParam(
key="tz",
ttl_sec=86_400,
cold_start_default="UTC",
min_history=999_999, # effectively never inferred — always cold_start
infer=None,
),
],
)
class TimeOfDayAgent(BaseAgent):
"""Frames the current moment relative to the user's productive peak."""
agent_id: ClassVar[str] = MANIFEST.id
ttl_seconds: ClassVar[int] = MANIFEST.ttl_sec
version: ClassVar[str] = MANIFEST.version
def compute(self, inp: AgentInput) -> AgentOutput:
hour = inp.now.hour
dow = inp.now.weekday()
is_weekend = dow >= 5
preferred_raw = inp.agent_prefs.get("preferred_hour", inp.profile.get("preferred_hour"))
preferred = int(preferred_raw) if preferred_raw is not None else None
quiet_start: str | None = inp.agent_prefs.get("quiet_start")
quiet_end: str | None = inp.agent_prefs.get("quiet_end")
peak_hours: list[int] = inp.agent_prefs.get("peak_hours", [])
tz: str = inp.agent_prefs.get("tz", "UTC")
in_quiet = self._in_quiet_window(hour, quiet_start, quiet_end)
in_peak = hour in peak_hours
parts = [f"It is {hour:02d}:00 on {_DOW_NAMES[dow]} ({self._label(hour)})."]
if tz != "UTC":
parts[0] = f"It is {hour:02d}:00 ({tz}) on {_DOW_NAMES[dow]} ({self._label(hour)})."
if is_weekend:
parts.append("Weekend context — prefer personal or reflective tips over work tasks.")
if in_quiet:
parts.append(
f"User is in their quiet window ({quiet_start}-{quiet_end}) — "
"avoid urgent or demanding tips."
)
elif in_peak:
parts.append(
f"Hour {hour:02d}:00 is a peak productivity hour for this user — "
"a high-impact or challenging tip is appropriate."
)
elif peak_hours:
# Report nearest peak so orchestrator can time advice accordingly.
nearest = min(peak_hours, key=lambda p: min(abs(p - hour), 24 - abs(p - hour)))
delta = min(abs(nearest - hour), 24 - abs(nearest - hour))
if delta <= 2:
parts.append(f"Approaching peak productivity window ({nearest:02d}:00).")
elif preferred is not None:
delta = min(abs(hour - preferred), 24 - abs(hour - preferred))
if delta == 0:
parts.append(
f"This is the user's peak productivity hour ({preferred:02d}:00) — "
"a high-impact tip is appropriate."
)
elif delta <= 2:
parts.append(f"Approaching the user's peak productivity window ({preferred:02d}:00).")
else:
parts.append("No preferred-hour data yet.")
prompt = " ".join(parts)
snapshot = {
"hour": hour,
"day_of_week": dow,
"preferred_hour": preferred,
"quiet_start": quiet_start,
"quiet_end": quiet_end,
"peak_hours": peak_hours,
"in_quiet": in_quiet,
"in_peak": in_peak,
"tz": tz,
}
return self._make_output(inp, prompt, snapshot)
@staticmethod
def _in_quiet_window(hour: int, start: str | None, end: str | None) -> bool:
if not start or not end:
return False
try:
sh = int(start.split(":")[0])
eh = int(end.split(":")[0])
except (ValueError, IndexError):
return False
if sh <= eh:
return sh <= hour < eh
# wraps midnight e.g. 22:00-07:00
return hour >= sh or hour < eh
@staticmethod
def _label(hour: int) -> str:
if 5 <= hour < 12:
return "morning"
if 12 <= hour < 17:
return "afternoon"
if 17 <= hour < 21:
return "evening"
return "night"
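
Reviewer note: the quiet-window inference in this file scans the 24-hour clock twice so that a run of quiet hours can wrap past midnight. The sketch below is a minimal, standalone restatement of that idea for experimentation, not the module itself. The names `longest_quiet_run` and `in_window` and the sample history are hypothetical; the (22, 7) fallback mirrors the value used in the diff.

```python
from collections import Counter


def longest_quiet_run(by_hour: Counter) -> tuple[int, int]:
    """Return (start_hour, end_hour) of the longest below-mean run on a 24h clock."""
    baseline = sum(by_hour.values()) / 24
    quiet = [by_hour.get(h, 0) < baseline for h in range(24)]
    best_start, best_len = 0, 0
    run_start, run_len = 0, 0
    for i in range(48):  # walk the clock twice so a run may cross midnight
        if quiet[i % 24]:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0
    if best_len == 0:
        return (22, 7)  # same fallback as the diff
    return (best_start % 24, (best_start + best_len) % 24)


def in_window(hour: int, start: int, end: int) -> bool:
    """Half-open membership test that also handles windows wrapping midnight."""
    if start <= end:
        return start <= hour < end
    return hour >= start or hour < end


# Hypothetical history: ten events in every hour from 08:00 through 21:00.
events = Counter({h: 10 for h in range(8, 22)})
window = longest_quiet_run(events)
```

With this sample the quiet hours are 22, 23 and 0 through 7, so the second pass over the clock is what joins the two halves into one run. Note that a window whose start equals its end is treated as empty by `in_window`, which matches the `sh <= eh` branch in the diff.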


@@ -14,6 +14,8 @@ Feature-spec fields (issue #61):
ttl_sec — cache lifetime in seconds; mirrors ``ttlSec`` in registry.ts.
source — where the value originates.
fallback — raw value returned when the feature is unavailable (null stored).
invalidated_by — bus event subjects that trigger recompute for the affected user;
mirrors ``invalidatedBy`` in registry.ts. Empty = TTL-only refresh.
"""
from __future__ import annotations
@@ -37,6 +39,7 @@ class ProfileFeature:
ttl_sec: int
source: str
fallback: str
invalidated_by: tuple[str, ...] = ()
PROFILE_FEATURES: tuple[ProfileFeature, ...] = (
@@ -48,6 +51,7 @@ PROFILE_FEATURES: tuple[ProfileFeature, ...] = (
ttl_sec=6 * _HOUR,
source="profile_store",
fallback="0.0",
invalidated_by=("signals.tip.feedback",),
),
ProfileFeature(
name="dismiss_rate_30d",
@@ -57,6 +61,7 @@ PROFILE_FEATURES: tuple[ProfileFeature, ...] = (
ttl_sec=6 * _HOUR,
source="profile_store",
fallback="0.0",
invalidated_by=("signals.tip.feedback",),
),
ProfileFeature(
name="mean_dwell_ms_30d",
@@ -66,6 +71,7 @@ PROFILE_FEATURES: tuple[ProfileFeature, ...] = (
ttl_sec=6 * _HOUR,
source="profile_store",
fallback="null — serving normalises to 0.0",
invalidated_by=("signals.tip.feedback",),
),
ProfileFeature(
name="preferred_hour",
@@ -75,6 +81,7 @@ PROFILE_FEATURES: tuple[ProfileFeature, ...] = (
ttl_sec=_DAY,
source="profile_store",
fallback="null — serving normalises to 0.5 (neutral alignment)",
invalidated_by=("signals.tip.feedback",),
),
ProfileFeature(
name="tip_volume_30d",
@@ -84,6 +91,7 @@ PROFILE_FEATURES: tuple[ProfileFeature, ...] = (
ttl_sec=_HOUR,
source="profile_store",
fallback="0",
invalidated_by=("signals.tip.served",),
),
)


@@ -4,6 +4,8 @@ The TS registry in services/api/src/profile/registry.ts is the source of truth.
This test checks the names listed here match the registry by reading the TS
file and grepping for `name: '...'`. Crude but cheap, and it catches the
common rename/add-without-mirror failure mode.
Also verifies invalidated_by subjects mirror the TS invalidatedBy arrays (#61).
"""
from __future__ import annotations
import re
@@ -111,3 +113,37 @@ def test_profile_feature_source_is_profile_store():
def test_profile_feature_fallback_set():
for f in PROFILE_FEATURES:
assert f.fallback, f"{f.name}: fallback must not be empty"
def _ts_registry_invalidated_by() -> dict[str, list[str]]:
"""Parse invalidatedBy arrays from registry.ts.
Extracts subjects from blocks like:
invalidatedBy: ['signals.tip.feedback'],
Returns {feature_name: [subject, ...]}; features with no invalidatedBy get [].
"""
text = REGISTRY_PATH.read_text(encoding="utf-8")
result: dict[str, list[str]] = {}
for block in re.split(r"\{", text):
name_m = re.search(r"name:\s*'([a-zA-Z0-9_]+)'", block)
if not name_m:
continue
name = name_m.group(1)
inv_m = re.search(r"invalidatedBy:\s*\[([^\]]*)\]", block)
if inv_m:
subjects = re.findall(r"'([^']+)'", inv_m.group(1))
else:
subjects = []
result[name] = subjects
return result
def test_invalidated_by_matches_ts_registry():
ts_inv = _ts_registry_invalidated_by()
for f in PROFILE_FEATURES:
assert f.name in ts_inv, f"{f.name} not found in TS registry invalidatedBy parse"
expected = tuple(sorted(ts_inv[f.name]))
actual = tuple(sorted(f.invalidated_by))
assert actual == expected, (
f"{f.name}: Python invalidated_by={actual} != TS invalidatedBy={expected}"
)
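
Reviewer note: the parser above splits the TypeScript source on `{` and greps each block for `name:` and `invalidatedBy:`. The sketch below runs the same two regexes against an inline sample so the behaviour can be checked without the real file. The layout of `registry.ts` is an assumption here, and `SAMPLE_TS`, `parse_invalidated_by`, and the `static_feature` entry are hypothetical.

```python
import re

# Assumed shape of registry.ts entries; the real file may differ.
SAMPLE_TS = """
export const REGISTRY = [
  {
    name: 'completion_rate_30d',
    ttlSec: 21600,
    invalidatedBy: ['signals.tip.feedback'],
  },
  {
    name: 'tip_volume_30d',
    ttlSec: 3600,
    invalidatedBy: ['signals.tip.served'],
  },
  {
    name: 'static_feature',
    ttlSec: 86400,
  },
];
"""


def parse_invalidated_by(text: str) -> dict[str, list[str]]:
    """Map each feature name to its invalidatedBy subjects ([] when absent)."""
    result: dict[str, list[str]] = {}
    for block in re.split(r"\{", text):
        name_m = re.search(r"name:\s*'([a-zA-Z0-9_]+)'", block)
        if not name_m:
            continue  # preamble or a block without a name field
        inv_m = re.search(r"invalidatedBy:\s*\[([^\]]*)\]", block)
        subjects = re.findall(r"'([^']+)'", inv_m.group(1)) if inv_m else []
        result[name_m.group(1)] = subjects
    return result


parsed = parse_invalidated_by(SAMPLE_TS)
```

Because the split is on every `{`, an entry that contains a nested object literal before its `invalidatedBy` line would be cut into two blocks and parsed as having no subjects; that limitation is worth keeping in mind if the registry grows nested fields.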

File diff suppressed because it is too large

201
ml/serving/mlflow_client.py Normal file

@@ -0,0 +1,201 @@
"""Thin MLflow REST wrapper.
Why not the official ``mlflow`` SDK? Two reasons specific to the oO setup:
1. The MLflow server (3.11) ships with ``--allowed-hosts localhost`` but
curl / requests / urllib3 send ``Host: localhost:5000`` — the port
suffix fails the DNS-rebinding check. We override the Host header per
request, which the SDK doesn't expose.
2. The collect/judge phases only need ~6 endpoints (create/search/log).
Pulling a 200MB SDK transitively for that is excess weight.
All calls are synchronous httpx with explicit ``Host`` so the script can
run from the host shell or from inside docker without further config.
"""
from __future__ import annotations
import os
import time
from dataclasses import dataclass
from typing import Any
import httpx
def _strip_path(uri: str) -> tuple[str, str]:
"""Return (origin, path_prefix) — handles both /mlflow and / roots.
``http://mlflow:5000/mlflow`` → ("http://mlflow:5000", "/mlflow")
``http://localhost:5000`` → ("http://localhost:5000", "")
"""
uri = uri.rstrip("/")
if "/" not in uri.split("://", 1)[1]:
return uri, ""
scheme_host, _, rest = uri.partition("://")
host, _, path = rest.partition("/")
return f"{scheme_host}://{host}", "/" + path if path else ""
@dataclass
class MLflowClient:
tracking_uri: str
username: str | None = None
password: str | None = None
host_header: str | None = None # override for DNS-rebinding sidestep
timeout: float = 30.0
def __post_init__(self) -> None:
self._origin, self._ui_prefix = _strip_path(self.tracking_uri)
# MLflow 3.x exposes the REST API at the root, *not* under the
# ``/mlflow`` UI prefix. Empirically verified against the running
# ghcr.io/mlflow/mlflow:v3.11.1 container.
self._api = f"{self._origin}/api/2.0/mlflow"
self._auth = (self.username, self.password) if self.username else None
# If user did not pass a host header, derive from origin. Strip
# the port if present — the server's allowed-hosts check rejects
# ``localhost:5000`` even when ``localhost`` is allowed.
if self.host_header is None:
host = self._origin.split("://", 1)[1]
self.host_header = host.split(":", 1)[0]
@classmethod
def from_env(cls) -> "MLflowClient":
return cls(
tracking_uri=os.environ.get("MLFLOW_TRACKING_URI", "http://localhost:5000"),
username=os.environ.get("MLFLOW_TRACKING_USERNAME") or "admin",
password=os.environ.get("MLFLOW_TRACKING_PASSWORD") or "password",
host_header=os.environ.get("MLFLOW_HOST_HEADER"),
)
def _headers(self) -> dict[str, str]:
return {"Host": self.host_header or "localhost"}
def _post(self, path: str, body: dict) -> dict:
with httpx.Client(trust_env=False, timeout=self.timeout) as c:
r = c.post(f"{self._api}{path}", json=body, headers=self._headers(), auth=self._auth)
r.raise_for_status()
return r.json()
def _get(self, path: str, params: dict | None = None) -> dict:
with httpx.Client(trust_env=False, timeout=self.timeout) as c:
r = c.get(f"{self._api}{path}", params=params or {}, headers=self._headers(), auth=self._auth)
r.raise_for_status()
return r.json()
# ── Experiments ────────────────────────────────────────────────────
def get_or_create_experiment(self, name: str) -> str:
try:
r = self._get("/experiments/get-by-name", {"experiment_name": name})
return r["experiment"]["experiment_id"]
except httpx.HTTPStatusError as e:
if e.response.status_code not in (404, 400):
raise
r = self._post("/experiments/create", {"name": name})
return r["experiment_id"]
# ── Runs ───────────────────────────────────────────────────────────
def create_run(
self,
experiment_id: str,
run_name: str,
tags: dict[str, str] | None = None,
) -> str:
body: dict[str, Any] = {
"experiment_id": experiment_id,
"start_time": int(time.time() * 1000),
"run_name": run_name,
"tags": [
{"key": k, "value": str(v)}
for k, v in (tags or {}).items()
],
}
r = self._post("/runs/create", body)
return r["run"]["info"]["run_id"]
def log_param(self, run_id: str, key: str, value: Any) -> None:
self._post("/runs/log-parameter", {"run_id": run_id, "key": key, "value": str(value)})
def log_params(self, run_id: str, params: dict[str, Any]) -> None:
for k, v in params.items():
self.log_param(run_id, k, v)
def log_metric(self, run_id: str, key: str, value: float, step: int = 0) -> None:
self._post("/runs/log-metric", {
"run_id": run_id,
"key": key,
"value": float(value),
"timestamp": int(time.time() * 1000),
"step": step,
})
def log_metrics(self, run_id: str, metrics: dict[str, float]) -> None:
for k, v in metrics.items():
self.log_metric(run_id, k, v)
def set_tag(self, run_id: str, key: str, value: str) -> None:
self._post("/runs/set-tag", {"run_id": run_id, "key": key, "value": str(value)})
def set_tags(self, run_id: str, tags: dict[str, str]) -> None:
for k, v in tags.items():
self.set_tag(run_id, k, v)
# MLflow tag values are capped at 5000 chars by the server (RESOURCE_DOES_NOT_EXIST
# below that, INVALID_PARAMETER_VALUE above). 4500 leaves headroom for
# internal metadata MLflow may append on its own.
_TAG_VALUE_LIMIT = 4500
def log_text(self, run_id: str, text: str, artifact_path: str) -> None:
"""Persist short text alongside the run.
The MLflow server in this deployment uses a ``file://`` artifact
backend, which is only reachable from inside the container — not
via the REST proxy. We instead stash short payloads as tags
keyed ``artifact:<path>``. Anything longer than 4500 chars is
chunked into ``artifact:<path>:0``, ``:1`` …; ``get_artifact_text``
re-stitches them in order.
"""
key_base = f"artifact:{artifact_path}"
if len(text) <= self._TAG_VALUE_LIMIT:
self.set_tag(run_id, key_base, text)
return
# chunk
for i in range(0, len(text), self._TAG_VALUE_LIMIT):
self.set_tag(run_id, f"{key_base}:{i // self._TAG_VALUE_LIMIT}",
text[i:i + self._TAG_VALUE_LIMIT])
def get_artifact_text(self, run_id: str, artifact_path: str) -> str:
run = self._get("/runs/get", {"run_id": run_id})["run"]
tags = {t["key"]: t["value"] for t in run["data"].get("tags", [])}
key_base = f"artifact:{artifact_path}"
if key_base in tags:
return tags[key_base]
# chunked form
chunks = sorted(
(k for k in tags if k.startswith(f"{key_base}:")),
key=lambda k: int(k.rsplit(":", 1)[1]),
)
return "".join(tags[k] for k in chunks)
def end_run(self, run_id: str, status: str = "FINISHED") -> None:
self._post("/runs/update", {
"run_id": run_id,
"status": status,
"end_time": int(time.time() * 1000),
})
def search_runs(
self,
experiment_id: str,
filter_string: str = "",
max_results: int = 1000,
) -> list[dict]:
body = {
"experiment_ids": [experiment_id],
"filter": filter_string,
"max_results": max_results,
}
r = self._post("/runs/search", body)
return r.get("runs", [])
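
Reviewer note: `log_text` and `get_artifact_text` store long payloads as numbered tag chunks and stitch them back together. The sketch below reproduces that round trip against a plain dict instead of a live MLflow server, which makes the chunk ordering easy to verify. `to_tags` and `from_tags` are hypothetical helper names; the 4500-character limit is taken from the diff.

```python
TAG_VALUE_LIMIT = 4500  # same headroom value as _TAG_VALUE_LIMIT in the diff


def to_tags(artifact_path: str, text: str, limit: int = TAG_VALUE_LIMIT) -> dict[str, str]:
    """Encode text as one tag, or as numbered chunk tags when it exceeds the limit."""
    key_base = f"artifact:{artifact_path}"
    if len(text) <= limit:
        return {key_base: text}
    return {
        f"{key_base}:{i // limit}": text[i:i + limit]
        for i in range(0, len(text), limit)
    }


def from_tags(artifact_path: str, tags: dict[str, str]) -> str:
    """Decode text written by to_tags, sorting chunk keys by numeric suffix."""
    key_base = f"artifact:{artifact_path}"
    if key_base in tags:
        return tags[key_base]
    chunks = sorted(
        (k for k in tags if k.startswith(f"{key_base}:")),
        key=lambda k: int(k.rsplit(":", 1)[1]),
    )
    return "".join(tags[k] for k in chunks)


short_tags = to_tags("note.txt", "hello")
long_text = "".join(str(i % 7) for i in range(50_000))
long_tags = to_tags("prompt.txt", long_text)
restored = from_tags("prompt.txt", long_tags)
```

The numeric sort key matters once there are more than ten chunks: a plain string sort would place `:10` before `:2` and scramble the text. The prefix match also means two artifact paths where one is a colon-delimited prefix of the other could collide, so paths containing `:` are best avoided.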


@@ -108,6 +108,93 @@ PROMPTS: dict[str, Prompt] = {
}
# ── v4-orchestrator ────────────────────────────────────────────────────────
# Not a Prompt entry — takes pre-computed agent snippets, not a _Ctx.
_SYS_V4_ORCHESTRATOR = (
"You are a personal advisor generating a single, perfectly-timed tip. "
"Multiple specialized agents have analyzed the user's current context and provided "
"their insights below. Synthesize their combined perspective to generate exactly ONE "
"tip that is specific, actionable, and relevant right now. "
"Always respond in English regardless of the language of task content. "
"Respond ONLY with a JSON object with keys: "
'"id" (short slug), "content" (the tip, ≤2 sentences), '
'"rationale" (why now, ≤1 sentence). '
"No markdown, no prose outside the JSON object."
)
def _science_destiny_instruction(science_destiny: int) -> str:
"""Translate 0-100 slider into a prompt instruction.
0 = pure science: prioritise patterns, data, measurable progress.
100 = pure destiny: prioritise meaning, intuition, deeper purpose.
50 = balanced (no extra instruction injected).
"""
if science_destiny <= 20:
return (
"The user strongly prefers data-driven advice. "
"Ground every tip in observable patterns, streaks, or measurable progress. "
"Avoid abstract or motivational language."
)
if science_destiny <= 40:
return (
"The user leans toward evidence-based guidance. "
"Anchor tips in patterns and metrics where possible."
)
if science_destiny >= 80:
return (
"The user strongly believes in intuition and meaning. "
"Frame tips around purpose, values, and deeper intention rather than metrics."
)
if science_destiny >= 60:
return (
"The user leans toward intuitive, meaning-driven advice. "
"Weave in purpose and intention alongside practicality."
)
return "" # balanced — no extra instruction
def build_orchestrator_messages(
agent_outputs: list[dict],
tasks: list[dict],
hour_of_day: int,
day_of_week: int,
science_destiny: int = 50,
recent_tip: str | None = None,
) -> list[dict]:
"""Build the [system, user] message list for the orchestrator LLM call.
agent_outputs: list of {agent_id, prompt_text} dicts.
Falls back to raw task summary when agent_outputs is empty.
recent_tip: content of a tip the user just snoozed — generate something different.
"""
style_hint = _science_destiny_instruction(science_destiny)
system = _SYS_V4_ORCHESTRATOR + (f"\n\n{style_hint}" if style_hint else "")
lines = [f"Current time: {hour_of_day:02d}:00, day_of_week={day_of_week}", ""]
if recent_tip:
lines.append(f"The user snoozed this tip (do NOT repeat it or anything similar): \"{recent_tip}\"")
lines.append("")
if agent_outputs:
lines.append("Context from analysis agents:")
for s in agent_outputs:
lines.append(f"[{s['agent_id']}] {s['prompt_text']}")
else:
overdue = [t for t in tasks if t.get("is_overdue")]
lines.append(
f"No pre-computed agent context available. "
f"Tasks: {len(tasks)} total, {len(overdue)} overdue."
)
for t in tasks[:3]:
lines.append(f" - {t.get('content', '?')}")
lines.append("\nGenerate one tip as a JSON object. Write the tip content in English only.")
return [
{"role": "system", "content": system},
{"role": "user", "content": "\n".join(lines)},
]
def default_version() -> str:
return os.getenv("DEFAULT_PROMPT_VERSION", "v1")
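
Reviewer note: `_science_destiny_instruction` maps the 0-100 slider onto five bands, with the middle band injecting no extra instruction. The sketch below captures only the band boundaries so they can be checked at the edges. The function name `slider_band` and the band labels are hypothetical; the thresholds (20, 40, 60, 80) are copied from the diff.

```python
def slider_band(science_destiny: int) -> str:
    """Classify the slider value using the same comparison order as the diff."""
    if science_destiny <= 20:
        return "strong-science"
    if science_destiny <= 40:
        return "lean-science"
    if science_destiny >= 80:
        return "strong-destiny"
    if science_destiny >= 60:
        return "lean-destiny"
    return "balanced"  # 41 through 59: no style hint is appended


# Boundary values worth pinning down in a unit test.
bands = {v: slider_band(v) for v in (0, 20, 21, 40, 41, 50, 59, 60, 79, 80, 100)}
```

The balanced band is 41 through 59 inclusive, so the default of 50 leaves the system prompt unchanged.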


@@ -7,3 +7,5 @@ anthropic>=0.40.0
nats-py>=2.9.0
structlog>=24.1.0
sentry-sdk>=2.0.0
mlflow-skinny>=3.1.0
pyswisseph>=2.10.3.2


@@ -0,0 +1,52 @@
"""POST /agents/{agent_id}/infer — inference framework endpoint."""
import pytest
from httpx import AsyncClient, ASGITransport
from main import app
@pytest.mark.anyio
async def test_infer_time_of_day_cold_start():
"""Fewer than min_history events → cold_start_default for preferred_hour."""
transport = ASGITransport(app=app)
async with AsyncClient(transport=transport, base_url="http://test") as client:
resp = await client.post("/agents/time-of-day/infer", json={
"user_id": "u1",
"feedback_history": [
{"action": "done", "dwell_ms": 60000, "created_at": "2026-05-01T09:00:00+00:00"},
] * 5, # 5 < min_history=10
})
assert resp.status_code == 200
body = resp.json()
assert body["agent_id"] == "time-of-day"
assert body["inferred_prefs"]["preferred_hour"] is None
@pytest.mark.anyio
async def test_infer_time_of_day_enough_history():
"""10+ events → preferred_hour is inferred as the mode done-hour."""
events = [{"action": "done", "dwell_ms": 60000, "created_at": "2026-05-01T09:00:00+00:00"}] * 10
transport = ASGITransport(app=app)
async with AsyncClient(transport=transport, base_url="http://test") as client:
resp = await client.post("/agents/time-of-day/infer", json={"user_id": "u1", "feedback_history": events})
assert resp.status_code == 200
body = resp.json()
assert body["inferred_prefs"]["preferred_hour"] == 9
@pytest.mark.anyio
async def test_infer_agent_with_no_inferred_params():
"""Agents with no inferred_params return an empty dict (focus-area has none)."""
transport = ASGITransport(app=app)
async with AsyncClient(transport=transport, base_url="http://test") as client:
resp = await client.post("/agents/focus-area/infer", json={"user_id": "u1", "feedback_history": []})
assert resp.status_code == 200
assert resp.json()["inferred_prefs"] == {}
@pytest.mark.anyio
async def test_infer_unknown_agent_404():
transport = ASGITransport(app=app)
async with AsyncClient(transport=transport, base_url="http://test") as client:
resp = await client.post("/agents/ghost/infer", json={"user_id": "u1", "feedback_history": []})
assert resp.status_code == 404
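
Reviewer note: the first two tests above imply that a parameter falls back to `cold_start_default` until the history holds at least `min_history` events. The sketch below shows one way that gate could work; it is an assumption inferred from the tests, not the framework's actual code. `Param`, `resolve`, and `mode_done_hour` are hypothetical names, and events are simplified to dicts with `action` and `hour` keys.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class Param:
    """Simplified stand-in for InferredParam (hypothetical)."""
    key: str
    min_history: int
    cold_start_default: Any
    infer: Optional[Callable[[list[dict]], Any]]


def mode_done_hour(events: list[dict]) -> int:
    """Most common hour among 'done' events; 9 when there are none."""
    hours = [e["hour"] for e in events if e["action"] == "done"]
    return Counter(hours).most_common(1)[0][0] if hours else 9


def resolve(param: Param, events: list[dict]) -> Any:
    """Return the cold-start default until enough history exists to infer."""
    if param.infer is None or len(events) < param.min_history:
        return param.cold_start_default
    return param.infer(events)


preferred = Param("preferred_hour", min_history=10, cold_start_default=None, infer=mode_done_hour)
event = {"action": "done", "hour": 9}
cold = resolve(preferred, [event] * 5)
warm = resolve(preferred, [event] * 10)
```

Treating `infer=None` as "always cold start" would also cover the `tz` entry in the time-of-day manifest without relying on a very large `min_history` value.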


@@ -0,0 +1,21 @@
"""GET /agents/registry — manifests are exposed in JSON-serialisable form."""
import pytest
from httpx import AsyncClient, ASGITransport
from main import app
@pytest.mark.anyio
async def test_registry_returns_all_agents():
transport = ASGITransport(app=app)
async with AsyncClient(transport=transport, base_url="http://test") as client:
resp = await client.get("/agents/registry")
assert resp.status_code == 200
payload = resp.json()
ids = {a["id"] for a in payload["agents"]}
assert ids == {"overdue-task", "momentum", "time-of-day", "recent-patterns", "focus-area"}
sample = payload["agents"][0]
for key in ("id", "version", "description", "pref_schema", "required_consents", "ttl_sec"):
assert key in sample


@@ -1,439 +0,0 @@
"""
Unit tests for ml/serving — feature building and scoring contract.
Run with: pytest ml/serving/tests/
"""
import math
import pytest
from httpx import AsyncClient, ASGITransport
from main import (
app,
build_feature_vector,
build_feature_vector_12,
_norm_dwell,
_norm_preferred_hour,
_norm_rate,
_norm_volume,
)
class TestFeatureVector:
def test_shape(self):
v = build_feature_vector({"hour_of_day": 8, "is_overdue": True, "task_age_days": 3, "priority": 3})
assert v.shape == (5,)
def test_hour_encoding_noon(self):
v = build_feature_vector({"hour_of_day": 12})
# sin(2π * 12/24) = sin(π) ≈ 0
assert abs(v[0]) < 1e-10
# cos(2π * 12/24) = cos(π) = -1
assert abs(v[1] - (-1.0)) < 1e-10
def test_hour_encoding_midnight(self):
v = build_feature_vector({"hour_of_day": 0})
# sin(0) = 0
assert abs(v[0]) < 1e-10
# cos(0) = 1
assert abs(v[1] - 1.0) < 1e-10
def test_hour_encoding_6am(self):
v = build_feature_vector({"hour_of_day": 6})
# sin(2π * 6/24) = sin(π/2) = 1
assert abs(v[0] - 1.0) < 1e-10
# cos(π/2) = 0
assert abs(v[1]) < 1e-10
def test_age_clipped_at_30(self):
v_long = build_feature_vector({"task_age_days": 100})
v_cap = build_feature_vector({"task_age_days": 30})
assert v_long[3] == v_cap[3] == 1.0
def test_age_zero(self):
v = build_feature_vector({"task_age_days": 0})
assert v[3] == pytest.approx(0.0)
def test_age_15_days_normalised(self):
v = build_feature_vector({"task_age_days": 15})
assert v[3] == pytest.approx(0.5)
def test_priority_normalised(self):
v1 = build_feature_vector({"priority": 1})
v4 = build_feature_vector({"priority": 4})
assert v1[4] == pytest.approx(0.0)
assert v4[4] == pytest.approx(1.0)
def test_priority_2_and_3(self):
v2 = build_feature_vector({"priority": 2})
v3 = build_feature_vector({"priority": 3})
assert v2[4] == pytest.approx(1 / 3)
assert v3[4] == pytest.approx(2 / 3)
def test_is_overdue_true(self):
v = build_feature_vector({"is_overdue": True})
assert v[2] == 1.0
def test_is_overdue_false(self):
v = build_feature_vector({"is_overdue": False})
assert v[2] == 0.0
def test_defaults_when_no_keys(self):
v = build_feature_vector({})
# hour=12 → sin(π)≈0, cos(π)=-1
assert abs(v[0]) < 1e-10
assert abs(v[1] - (-1.0)) < 1e-10
assert v[2] == 0.0 # is_overdue=False
assert v[3] == 0.0 # task_age_days=0
assert v[4] == 0.0 # priority=1 → (1-1)/3=0
@pytest.mark.asyncio
async def test_health():
async with AsyncClient(transport=ASGITransport(app=app), base_url="http://test") as client:
r = await client.get("/health")
assert r.status_code == 200
assert r.json()["ok"] is True
@pytest.mark.asyncio
async def test_score_returns_a_candidate():
payload = {
"user_id": "test-user",
"candidates": [
{"id": "t:1", "content": "Task A", "source": "todoist", "source_id": "1",
"features": {"is_overdue": True, "task_age_days": 2, "priority": 3}},
{"id": "t:2", "content": "Task B", "source": "todoist", "source_id": "2",
"features": {"is_overdue": False, "task_age_days": 0, "priority": 1}},
],
"context": {"hour_of_day": 9, "day_of_week": 1},
}
async with AsyncClient(transport=ASGITransport(app=app), base_url="http://test") as client:
r = await client.post("/score", json=payload)
assert r.status_code == 200
body = r.json()
assert body["tip_id"] in {"t:1", "t:2"}
assert "policy" in body
assert body["policy"] == "linucb-v1"
assert isinstance(body["score"], float)
@pytest.mark.asyncio
async def test_score_single_candidate_always_selected():
"""With a single candidate there is no choice — it must be returned."""
payload = {
"user_id": "solo-user",
"candidates": [
{"id": "only:1", "content": "Only task", "source": "todoist",
"features": {"is_overdue": False, "task_age_days": 0, "priority": 1}},
],
"context": {"hour_of_day": 10, "day_of_week": 0},
}
async with AsyncClient(transport=ASGITransport(app=app), base_url="http://test") as client:
r = await client.post("/score", json=payload)
assert r.status_code == 200
assert r.json()["tip_id"] == "only:1"
@pytest.mark.asyncio
async def test_score_empty_candidates_returns_422():
payload = {"user_id": "u", "candidates": [], "context": {"hour_of_day": 9, "day_of_week": 1}}
async with AsyncClient(transport=ASGITransport(app=app), base_url="http://test") as client:
r = await client.post("/score", json=payload)
assert r.status_code == 422
@pytest.mark.asyncio
async def test_reward_accepted():
payload = {
"user_id": "reward-user",
"tip_id": "t:1",
"reward": 1.0,
"features": {"hour_of_day": 9, "is_overdue": True, "task_age_days": 2, "priority": 3},
}
async with AsyncClient(transport=ASGITransport(app=app), base_url="http://test") as client:
r = await client.post("/reward", json=payload)
assert r.status_code == 200
assert r.json()["ok"] is True
@pytest.mark.asyncio
async def test_reward_updates_stats():
"""Posting a reward should increase cumulative_reward in /stats."""
user_id = "reward-stats-user"
async with AsyncClient(transport=ASGITransport(app=app), base_url="http://test") as client:
r0 = await client.get(f"/stats/{user_id}")
before = r0.json()["cumulative_reward"]
await client.post("/reward", json={
"user_id": user_id,
"tip_id": "tip:x",
"reward": 1.0,
"features": {"hour_of_day": 8, "is_overdue": False, "task_age_days": 0, "priority": 2},
})
r1 = await client.get(f"/stats/{user_id}")
assert r1.json()["cumulative_reward"] == pytest.approx(before + 1.0)
@pytest.mark.asyncio
async def test_score_increments_pulls():
user_id = "pull-counter-user"
payload = {
"user_id": user_id,
"candidates": [
{"id": "t:p1", "content": "Pull task", "source": "todoist",
"features": {"is_overdue": False, "task_age_days": 1, "priority": 2}},
],
"context": {"hour_of_day": 10, "day_of_week": 2},
}
async with AsyncClient(transport=ASGITransport(app=app), base_url="http://test") as client:
r0 = await client.get(f"/stats/{user_id}")
pulls_before = r0.json()["pulls"]
await client.post("/score", json=payload)
await client.post("/score", json=payload)
r1 = await client.get(f"/stats/{user_id}")
assert r1.json()["pulls"] == pulls_before + 2
@pytest.mark.asyncio
async def test_reset_clears_state():
user_id = "reset-user"
async with AsyncClient(transport=ASGITransport(app=app), base_url="http://test") as client:
# Score once to build state
await client.post("/score", json={
"user_id": user_id,
"candidates": [
{"id": "t:r", "content": "Reset task", "source": "todoist",
"features": {"is_overdue": True, "task_age_days": 5, "priority": 4}},
],
"context": {"hour_of_day": 14, "day_of_week": 3},
})
r_reset = await client.post(f"/reset/{user_id}")
assert r_reset.json()["ok"] is True
r_stats = await client.get(f"/stats/{user_id}")
assert r_stats.json()["pulls"] == 0
@pytest.mark.asyncio
async def test_features_endpoint_returns_history():
user_id = "features-user"
payload = {
"user_id": user_id,
"candidates": [
{"id": "t:f1", "content": "Feature task", "source": "todoist",
"features": {"is_overdue": False, "task_age_days": 0, "priority": 1}},
],
"context": {"hour_of_day": 7, "day_of_week": 0},
}
async with AsyncClient(transport=ASGITransport(app=app), base_url="http://test") as client:
await client.post("/score", json=payload)
r = await client.get(f"/features/{user_id}")
body = r.json()
assert r.status_code == 200
assert "history" in body
assert len(body["history"]) >= 1
entry = body["history"][-1]
assert "ts" in entry
assert "score" in entry
assert "tip_id" in entry
@pytest.mark.asyncio
async def test_stats_for_fresh_user():
"""A user with no history should return zero/default stats without error."""
async with AsyncClient(transport=ASGITransport(app=app), base_url="http://test") as client:
r = await client.get("/stats/brand-new-user-xyz-abc")
body = r.json()
assert r.status_code == 200
assert body["pulls"] == 0
assert body["cumulative_reward"] == 0.0
assert body["estimated_mean_reward"] == 0.0
class TestV2Normalization:
def test_rate_passthrough(self):
assert _norm_rate(0.0) == 0.0
assert _norm_rate(0.42) == 0.42
assert _norm_rate(1.0) == 1.0
def test_rate_none_zero(self):
assert _norm_rate(None) == 0.0
def test_rate_clipped(self):
assert _norm_rate(1.5) == 1.0
assert _norm_rate(-0.1) == 0.0
def test_dwell_none_zero(self):
assert _norm_dwell(None) == 0.0
def test_dwell_scales_to_0_1(self):
assert _norm_dwell(0) == 0.0
# 600_000 ms (10 min) is the clip ceiling
assert _norm_dwell(600_000) == 1.0
assert _norm_dwell(1_200_000) == 1.0
assert _norm_dwell(60_000) == pytest.approx(0.1)
def test_volume_monotonic_and_clipped(self):
assert _norm_volume(None) == 0.0
assert _norm_volume(0) == 0.0
assert _norm_volume(10) < _norm_volume(100)
# 100 tips ≈ full saturation
assert _norm_volume(100) == pytest.approx(1.0)
assert _norm_volume(10_000) == 1.0
def test_preferred_hour_alignment(self):
# Exact match → 1.0
assert _norm_preferred_hour(9, 9) == pytest.approx(1.0)
# 12h opposite → 0.0
assert _norm_preferred_hour(21, 9) == pytest.approx(0.0, abs=1e-10)
# 6h off → 0.5 (cos(π/2) = 0, scaled to 0.5)
assert _norm_preferred_hour(15, 9) == pytest.approx(0.5, abs=1e-10)
def test_preferred_hour_null_neutral(self):
# Null preference → neutral 0.5 rather than misleading "alignment at 0"
assert _norm_preferred_hour(None, 9) == 0.5
class TestFeatureVector12:
def test_shape(self):
v = build_feature_vector_12(
{"hour_of_day": 9, "is_overdue": True, "task_age_days": 2, "priority": 3},
day_of_week=2,
profile={
"completion_rate_30d": 0.5,
"dismiss_rate_30d": 0.1,
"mean_dwell_ms_30d": 60_000,
"preferred_hour": 9,
"tip_volume_30d": 20,
},
)
assert v.shape == (12,)
def test_first_seven_match_v1(self):
"""v2 must reduce to v1-style features on the first 7 dims so rollout
behaviour is predictable when profile is absent."""
from main import build_feature_vector_7
feat = {"hour_of_day": 14, "is_overdue": True, "task_age_days": 5, "priority": 2}
v1 = build_feature_vector_7(feat, day_of_week=3)
v2 = build_feature_vector_12(feat, day_of_week=3, profile=None)
assert (v1 == v2[:7]).all()
def test_missing_profile_defaults(self):
v = build_feature_vector_12({"hour_of_day": 9}, day_of_week=0, profile=None)
# completion, dismiss, dwell, volume → 0; preferred_hour → 0.5 neutral
assert v[7] == 0.0
assert v[8] == 0.0
assert v[9] == 0.0
assert v[10] == pytest.approx(0.5)
assert v[11] == 0.0
@pytest.mark.asyncio
async def test_score_egreedy_v2_returns_candidate():
payload = {
"user_id": "v2-user",
"candidates": [
{"id": "t:a", "content": "A", "source": "todoist",
"features": {"is_overdue": True, "task_age_days": 2, "priority": 3}},
{"id": "t:b", "content": "B", "source": "todoist",
"features": {"is_overdue": False, "task_age_days": 0, "priority": 1}},
],
"context": {"hour_of_day": 9, "day_of_week": 1},
"profile_features": {
"completion_rate_30d": 0.4,
"dismiss_rate_30d": 0.1,
"mean_dwell_ms_30d": 45_000,
"preferred_hour": 9,
"tip_volume_30d": 8,
},
}
async with AsyncClient(transport=ASGITransport(app=app), base_url="http://test") as client:
r = await client.post("/score/egreedy/v2", json=payload)
assert r.status_code == 200
body = r.json()
assert body["tip_id"] in {"t:a", "t:b"}
assert body["policy"] == "egreedy-v2"
@pytest.mark.asyncio
async def test_score_egreedy_v2_accepts_missing_profile():
payload = {
"user_id": "v2-no-profile",
"candidates": [
{"id": "t:solo", "content": "Solo", "source": "todoist",
"features": {"is_overdue": False, "task_age_days": 0, "priority": 1}},
],
"context": {"hour_of_day": 10, "day_of_week": 0},
}
async with AsyncClient(transport=ASGITransport(app=app), base_url="http://test") as client:
r = await client.post("/score/egreedy/v2", json=payload)
assert r.status_code == 200
assert r.json()["tip_id"] == "t:solo"
@pytest.mark.asyncio
async def test_reward_egreedy_v2_updates_stats():
user_id = "v2-reward-stats"
async with AsyncClient(transport=ASGITransport(app=app), base_url="http://test") as client:
r0 = await client.get(f"/stats/egreedy/v2/{user_id}")
before = r0.json()["cumulative_reward"]
await client.post("/reward/egreedy/v2", json={
"user_id": user_id,
"tip_id": "t:r",
"reward": 1.0,
"features": {"hour_of_day": 9, "is_overdue": True, "task_age_days": 2, "priority": 3},
"day_of_week": 1,
"profile_features": {
"completion_rate_30d": 0.3,
"dismiss_rate_30d": 0.2,
"mean_dwell_ms_30d": 30_000,
"preferred_hour": 9,
"tip_volume_30d": 5,
},
})
r1 = await client.get(f"/stats/egreedy/v2/{user_id}")
body = r1.json()
assert body["cumulative_reward"] == pytest.approx(before + 1.0)
assert body["policy"] == "egreedy-v2"
assert len(body["theta"]) == 12
assert len(body["feature_labels"]) == 12
@pytest.mark.asyncio
async def test_reset_clears_v2_state():
user_id = "v2-reset"
async with AsyncClient(transport=ASGITransport(app=app), base_url="http://test") as client:
await client.post("/score/egreedy/v2", json={
"user_id": user_id,
"candidates": [
{"id": "t:v2r", "content": "x", "source": "todoist",
"features": {"is_overdue": False, "task_age_days": 0, "priority": 1}},
],
"context": {"hour_of_day": 10, "day_of_week": 0},
})
r0 = await client.get(f"/stats/egreedy/v2/{user_id}")
assert r0.json()["pulls"] >= 1
await client.post(f"/reset/{user_id}")
r1 = await client.get(f"/stats/egreedy/v2/{user_id}")
assert r1.json()["pulls"] == 0
@pytest.mark.asyncio
async def test_reward_negative_value():
"""Dismissing a tip should decrease cumulative_reward."""
user_id = "dismiss-user-neg"
async with AsyncClient(transport=ASGITransport(app=app), base_url="http://test") as client:
r0 = await client.get(f"/stats/{user_id}")
before = r0.json()["cumulative_reward"]
await client.post("/reward", json={
"user_id": user_id,
"tip_id": "t:neg",
"reward": -1.0,
"features": {"hour_of_day": 20, "is_overdue": False, "task_age_days": 0, "priority": 1},
})
r1 = await client.get(f"/stats/{user_id}")
assert r1.json()["cumulative_reward"] == pytest.approx(before - 1.0)


@@ -1,4 +1,4 @@
-export type IntegrationProvider = 'todoist';
+export type IntegrationProvider = 'todoist' | 'google-health';
 export type IntegrationStatus = 'connected' | 'disconnected' | 'error';
 export interface Integration {


@@ -2,7 +2,7 @@
 export interface Signal {
 id: string;
 source: string; // e.g. 'todoist', 'google-calendar', 'manual'
-kind: 'task' | 'event' | 'habit' | 'insight';
+kind: 'task' | 'event' | 'habit' | 'insight' | 'health';
 content: string;
 metadata: Record<string, unknown>; // source-specific raw fields
 features: Record<string, number | boolean>; // bandit-ready numeric/boolean features


@@ -2,7 +2,7 @@
 export type TipKind = 'task' | 'advice' | 'insight' | 'reminder';
 /** Where the tip content originated */
-export type TipSource = 'todoist' | 'llm' | 'advice';
+export type TipSource = 'todoist' | 'llm' | 'advice' | 'fallback';
 /** A single recommendation surfaced to the user */
 export interface Tip {


@@ -0,0 +1,129 @@
/**
* Migration tests — apply runMigrations() to a fresh in-memory SQLite handle
* and verify schema shape and idempotency.
*/
import { describe, it, expect } from 'vitest';
import Database from 'better-sqlite3';
import { runMigrations } from '../migrations.js';
function freshDb() {
const sqlite = new Database(':memory:');
sqlite.pragma('foreign_keys = ON');
return sqlite;
}
describe('runMigrations — fresh DB', () => {
it('creates the ADR-0014 tables, adds tone/tip_kinds_json, and drops legacy consent columns', () => {
const sqlite = freshDb();
runMigrations(sqlite);
const tables = (sqlite
.prepare(`SELECT name FROM sqlite_master WHERE type='table'`)
.all() as { name: string }[]).map((r) => r.name);
expect(tables).toEqual(expect.arrayContaining(['user_preferences', 'user_consents', 'user_contexts']));
const userCols = sqlite.prepare(`PRAGMA table_info(users)`).all() as { name: string }[];
const colNames = userCols.map((c) => c.name);
expect(colNames).toContain('tone');
expect(colNames).toContain('tip_kinds_json');
// ADR-0014 step 8: legacy columns must be absent on a fresh DB
expect(colNames).not.toContain('consent_given');
expect(colNames).not.toContain('consent_at');
});
it('drops consent columns from an existing DB that still had them', () => {
const sqlite = freshDb();
sqlite.pragma('foreign_keys = ON');
// Simulate a pre-step-8 DB: create table with legacy columns and seed a user
sqlite.exec(`
CREATE TABLE users (
id TEXT PRIMARY KEY,
email TEXT NOT NULL UNIQUE,
role TEXT NOT NULL DEFAULT 'user',
consent_given INTEGER NOT NULL DEFAULT 0,
consent_at TEXT,
created_at TEXT NOT NULL
);
INSERT INTO users (id, email, role, consent_given, consent_at, created_at)
VALUES ('u1', 'u@test.com', 'user', 1, '2026-04-01T00:00:00Z', '2026-03-01T00:00:00Z');
`);
runMigrations(sqlite);
const colNames = (sqlite.prepare(`PRAGMA table_info(users)`).all() as { name: string }[]).map((c) => c.name);
expect(colNames).not.toContain('consent_given');
expect(colNames).not.toContain('consent_at');
// Backfill should have migrated the consent row before dropping
const consent = sqlite
.prepare(`SELECT consent_key FROM user_consents WHERE user_id = 'u1'`)
.get() as { consent_key: string } | undefined;
expect(consent?.consent_key).toBe('data:core');
});
it('declares the expected composite primary keys', () => {
const sqlite = freshDb();
runMigrations(sqlite);
type ColInfo = { name: string; pk: number };
const pkCols = (table: string): string[] =>
(sqlite.prepare(`PRAGMA table_info(${table})`).all() as ColInfo[])
.filter((c) => c.pk > 0)
.sort((a, b) => a.pk - b.pk)
.map((c) => c.name);
expect(pkCols('user_preferences')).toEqual(['user_id', 'scope', 'key']);
expect(pkCols('user_consents')).toEqual(['user_id', 'consent_key']);
expect(pkCols('user_contexts')).toEqual(['user_id', 'name']);
});
});
describe('runMigrations — idempotency', () => {
it('is safe to re-run on an already-migrated DB', () => {
const sqlite = freshDb();
runMigrations(sqlite);
expect(() => runMigrations(sqlite)).not.toThrow();
});
});
describe('runMigrations — issue #127 backfill', () => {
it('grants data:<provider> consent for existing active integration tokens', () => {
const sqlite = freshDb();
runMigrations(sqlite);
// Seed a user + active Todoist token (simulates pre-#127 state)
sqlite.exec(`
INSERT INTO users (id, email, role, created_at) VALUES ('u2', 'u2@test.com', 'user', '2026-01-01T00:00:00Z');
INSERT INTO user_consents (user_id, consent_key, granted_at) VALUES ('u2', 'data:core', '2026-01-01T00:00:00Z');
INSERT INTO integration_tokens (id, user_id, provider, access_token, token_status, connected_at)
VALUES ('tok1', 'u2', 'todoist', 'secret', 'active', '2026-01-02T00:00:00Z');
`);
// Re-run migrations — the backfill should insert data:todoist
runMigrations(sqlite);
const rows = sqlite
.prepare(`SELECT consent_key FROM user_consents WHERE user_id = 'u2' ORDER BY consent_key`)
.all() as { consent_key: string }[];
expect(rows.map((r) => r.consent_key)).toEqual(['data:core', 'data:todoist']);
});
it('is idempotent — running twice does not duplicate consent rows', () => {
const sqlite = freshDb();
runMigrations(sqlite);
sqlite.exec(`
INSERT INTO users (id, email, role, created_at) VALUES ('u3', 'u3@test.com', 'user', '2026-01-01T00:00:00Z');
INSERT INTO integration_tokens (id, user_id, provider, access_token, token_status, connected_at)
VALUES ('tok2', 'u3', 'todoist', 'secret', 'active', '2026-01-02T00:00:00Z');
`);
runMigrations(sqlite);
runMigrations(sqlite);
const count = (sqlite
.prepare(`SELECT COUNT(*) as n FROM user_consents WHERE user_id = 'u3' AND consent_key = 'data:todoist'`)
.get() as { n: number }).n;
expect(count).toBe(1);
});
});
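
The two backfill tests above depend on `INSERT OR IGNORE` hitting the composite primary key of `user_consents`. A minimal sketch of that behaviour, using Python's standard `sqlite3` module rather than better-sqlite3 and a trimmed-down schema (an assumption: only the columns the statement touches are kept). The `BACKFILL` statement is the one from the issue #127 step in the migration; the helper name `run_backfill` and the seeded rows are illustrative.

```python
import sqlite3

# Trimmed-down tables (assumption: the real schema has more columns).
SCHEMA = """
CREATE TABLE integration_tokens (
    id TEXT PRIMARY KEY,
    user_id TEXT NOT NULL,
    provider TEXT NOT NULL,
    token_status TEXT NOT NULL DEFAULT 'active',
    connected_at TEXT NOT NULL
);
CREATE TABLE user_consents (
    user_id TEXT NOT NULL,
    consent_key TEXT NOT NULL,
    granted_at TEXT NOT NULL,
    revoked_at TEXT,
    PRIMARY KEY (user_id, consent_key)
);
"""

# Same statement as the issue #127 backfill in the migration.
BACKFILL = """
INSERT OR IGNORE INTO user_consents (user_id, consent_key, granted_at)
SELECT user_id, 'data:' || provider, connected_at
FROM integration_tokens
WHERE token_status = 'active'
"""


def run_backfill(conn: sqlite3.Connection) -> None:
    """Grant data:<provider> consent for every active token; safe to repeat."""
    conn.execute(BACKFILL)


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute(
    "INSERT INTO integration_tokens VALUES "
    "('tok2', 'u3', 'todoist', 'active', '2026-01-02T00:00:00Z')"
)
# A second user who already revoked the same consent before the backfill runs.
conn.execute(
    "INSERT INTO integration_tokens VALUES "
    "('tok3', 'u4', 'todoist', 'active', '2026-01-02T00:00:00Z')"
)
conn.execute(
    "INSERT INTO user_consents VALUES "
    "('u4', 'data:todoist', '2026-01-01T00:00:00Z', '2026-02-01T00:00:00Z')"
)

run_backfill(conn)
run_backfill(conn)  # second run conflicts on the primary key and inserts nothing

granted = conn.execute(
    "SELECT COUNT(*) FROM user_consents "
    "WHERE user_id = 'u3' AND consent_key = 'data:todoist'"
).fetchone()[0]
revoked_at = conn.execute(
    "SELECT revoked_at FROM user_consents WHERE user_id = 'u4'"
).fetchone()[0]
```

In this sketch the conflict is on `(user_id, consent_key)`, so a row that already exists with `revoked_at` set is left untouched: re-running the backfill does not re-grant a consent the user revoked.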


@@ -2,6 +2,7 @@ import Database from 'better-sqlite3';
 import { drizzle } from 'drizzle-orm/better-sqlite3';
 import * as schema from './schema.js';
 import { config } from '../config.js';
+import { runMigrations as runMigrationsImpl } from './migrations.js';
 const sqlite = new Database(config.DATABASE_PATH);
 sqlite.pragma('journal_mode = WAL');
@@ -13,172 +14,5 @@ export const db = drizzle(sqlite, { schema });
 export const rawSqlite: any = sqlite;
 export function runMigrations() {
-sqlite.exec(`
-CREATE TABLE IF NOT EXISTS users (
-id TEXT PRIMARY KEY,
-email TEXT NOT NULL UNIQUE,
-name TEXT,
-image TEXT,
-google_id TEXT UNIQUE,
-role TEXT NOT NULL DEFAULT 'user',
-consent_given INTEGER NOT NULL DEFAULT 0,
-consent_at TEXT,
-created_at TEXT NOT NULL,
-deleted_at TEXT
-);
-CREATE TABLE IF NOT EXISTS integration_tokens (
-id TEXT PRIMARY KEY,
-user_id TEXT NOT NULL REFERENCES users(id),
-provider TEXT NOT NULL,
-access_token TEXT NOT NULL,
-refresh_token TEXT,
-expires_at TEXT,
-connected_at TEXT NOT NULL,
-UNIQUE(user_id, provider)
-);
-CREATE TABLE IF NOT EXISTS tip_feedback (
-id TEXT PRIMARY KEY,
-user_id TEXT NOT NULL REFERENCES users(id),
-tip_id TEXT NOT NULL,
-action TEXT NOT NULL,
-source_id TEXT,
-created_at TEXT NOT NULL
-);
-CREATE TABLE IF NOT EXISTS tip_views (
-id TEXT PRIMARY KEY,
-user_id TEXT NOT NULL REFERENCES users(id),
-tip_id TEXT NOT NULL,
-served_at TEXT NOT NULL
-);
-CREATE TABLE IF NOT EXISTS push_subscriptions (
-id TEXT PRIMARY KEY,
-user_id TEXT NOT NULL REFERENCES users(id),
-endpoint TEXT NOT NULL UNIQUE,
-p256dh TEXT NOT NULL,
-auth TEXT NOT NULL,
-created_at TEXT NOT NULL
-);
-CREATE TABLE IF NOT EXISTS sessions (
-id TEXT PRIMARY KEY,
-user_id TEXT NOT NULL REFERENCES users(id),
-expires_at TEXT NOT NULL,
-created_at TEXT NOT NULL
-);
-CREATE TABLE IF NOT EXISTS admin_actions (
-id TEXT PRIMARY KEY,
-admin_id TEXT NOT NULL REFERENCES users(id),
-action TEXT NOT NULL,
-target_type TEXT,
-target_id TEXT,
-detail TEXT,
-created_at TEXT NOT NULL
-);
-CREATE TABLE IF NOT EXISTS tip_scores (
-id TEXT PRIMARY KEY,
-user_id TEXT NOT NULL REFERENCES users(id),
-tip_id TEXT NOT NULL,
-policy TEXT NOT NULL,
-ml_score INTEGER,
-features_json TEXT,
-candidate_count INTEGER,
-latency_ms INTEGER,
-served_at TEXT NOT NULL
-);
-CREATE TABLE IF NOT EXISTS saved_queries (
-id TEXT PRIMARY KEY,
-admin_id TEXT NOT NULL REFERENCES users(id),
-name TEXT NOT NULL,
-sql TEXT NOT NULL,
-created_at TEXT NOT NULL
-);
-CREATE TABLE IF NOT EXISTS user_profile_features (
-user_id TEXT NOT NULL REFERENCES users(id),
-name TEXT NOT NULL,
-value REAL,
-value_text TEXT,
-updated_at TEXT NOT NULL,
-ttl_sec INTEGER NOT NULL,
-PRIMARY KEY (user_id, name)
-);
-CREATE TABLE IF NOT EXISTS sim_runs (
-id TEXT PRIMARY KEY,
-policy_a TEXT NOT NULL,
-policy_b TEXT NOT NULL,
-n_users INTEGER NOT NULL,
-n_rounds INTEGER NOT NULL,
-tasks_per_round INTEGER NOT NULL DEFAULT 8,
-use_llm INTEGER NOT NULL DEFAULT 0,
-status TEXT NOT NULL DEFAULT 'pending',
-summary_json TEXT,
-winner TEXT,
-persona_breakdown_json TEXT,
-created_at TEXT NOT NULL,
-finished_at TEXT
-);
-CREATE TABLE IF NOT EXISTS sim_events (
-id TEXT PRIMARY KEY,
-run_id TEXT NOT NULL REFERENCES sim_runs(id),
-round INTEGER NOT NULL,
-user_id TEXT NOT NULL,
-persona TEXT NOT NULL,
-policy TEXT NOT NULL,
-tip_content TEXT NOT NULL,
-priority INTEGER NOT NULL,
-is_overdue INTEGER NOT NULL,
-action TEXT NOT NULL,
-dwell_ms INTEGER,
-reward_milli INTEGER NOT NULL,
-hour INTEGER NOT NULL,
-day_of_week INTEGER NOT NULL,
-created_at TEXT NOT NULL
-);
-CREATE TABLE IF NOT EXISTS agent_outputs (
-id TEXT PRIMARY KEY,
-user_id TEXT NOT NULL REFERENCES users(id),
-agent_id TEXT NOT NULL,
-prompt_text TEXT NOT NULL,
-signals_snapshot TEXT,
-computed_at TEXT NOT NULL,
-expires_at TEXT NOT NULL,
-agent_version TEXT NOT NULL
-);
-CREATE INDEX IF NOT EXISTS idx_agent_outputs_user_agent_exp
-ON agent_outputs(user_id, agent_id, expires_at DESC);
-`);
-// Additive column migrations — safe to run on existing DBs.
-// SQLite doesn't support IF NOT EXISTS on ALTER TABLE; we ignore the error if already present.
-for (const stmt of [
-`ALTER TABLE users ADD COLUMN role TEXT NOT NULL DEFAULT 'user'`,
-`ALTER TABLE push_subscriptions ADD COLUMN created_at TEXT NOT NULL DEFAULT ''`,
-`ALTER TABLE tip_feedback ADD COLUMN dwell_ms INTEGER`,
-`ALTER TABLE tip_feedback ADD COLUMN reward_milli INTEGER`,
-`ALTER TABLE integration_tokens ADD COLUMN token_status TEXT NOT NULL DEFAULT 'active'`,
-`ALTER TABLE tip_scores ADD COLUMN prompt_version TEXT`,
-`ALTER TABLE tip_scores ADD COLUMN llm_model TEXT`,
-`ALTER TABLE tip_scores ADD COLUMN tip_kind TEXT`,
-`ALTER TABLE sim_runs ADD COLUMN mlflow_run_id TEXT`,
-`ALTER TABLE sim_runs ADD COLUMN judge_mode TEXT NOT NULL DEFAULT 'rule'`,
-`ALTER TABLE sim_runs ADD COLUMN n_policies INTEGER NOT NULL DEFAULT 2`,
-]) {
-try { sqlite.exec(stmt); } catch { /* column already exists */ }
-}
-// Seed first admin from env (ADMIN_SEED_EMAIL).
-const seedEmail = process.env.ADMIN_SEED_EMAIL;
-if (seedEmail) {
-sqlite.prepare(`UPDATE users SET role = 'admin' WHERE email = ? AND role = 'user'`).run(seedEmail);
-}
+runMigrationsImpl(sqlite);
 }


@@ -0,0 +1,241 @@
/**
* Schema migrations and one-shot backfills for the API DB.
*
* Kept separate from db/index.ts so tests can apply migrations to an in-memory
* SQLite handle without triggering the singleton DB connection at import time.
*/
import type { Database as BetterSqlite3Database } from 'better-sqlite3';
export function runMigrations(handle: BetterSqlite3Database) {
handle.exec(`
CREATE TABLE IF NOT EXISTS users (
id TEXT PRIMARY KEY,
email TEXT NOT NULL UNIQUE,
name TEXT,
image TEXT,
google_id TEXT UNIQUE,
role TEXT NOT NULL DEFAULT 'user',
created_at TEXT NOT NULL,
deleted_at TEXT
);
CREATE TABLE IF NOT EXISTS integration_tokens (
id TEXT PRIMARY KEY,
user_id TEXT NOT NULL REFERENCES users(id),
provider TEXT NOT NULL,
access_token TEXT NOT NULL,
refresh_token TEXT,
expires_at TEXT,
connected_at TEXT NOT NULL,
UNIQUE(user_id, provider)
);
CREATE TABLE IF NOT EXISTS tip_feedback (
id TEXT PRIMARY KEY,
user_id TEXT NOT NULL REFERENCES users(id),
tip_id TEXT NOT NULL,
action TEXT NOT NULL,
source_id TEXT,
created_at TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS tip_views (
id TEXT PRIMARY KEY,
user_id TEXT NOT NULL REFERENCES users(id),
tip_id TEXT NOT NULL,
served_at TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS push_subscriptions (
id TEXT PRIMARY KEY,
user_id TEXT NOT NULL REFERENCES users(id),
endpoint TEXT NOT NULL UNIQUE,
p256dh TEXT NOT NULL,
auth TEXT NOT NULL,
created_at TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS sessions (
id TEXT PRIMARY KEY,
user_id TEXT NOT NULL REFERENCES users(id),
expires_at TEXT NOT NULL,
created_at TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS admin_actions (
id TEXT PRIMARY KEY,
admin_id TEXT NOT NULL REFERENCES users(id),
action TEXT NOT NULL,
target_type TEXT,
target_id TEXT,
detail TEXT,
created_at TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS tip_scores (
id TEXT PRIMARY KEY,
user_id TEXT NOT NULL REFERENCES users(id),
tip_id TEXT NOT NULL,
policy TEXT NOT NULL,
ml_score INTEGER,
features_json TEXT,
candidate_count INTEGER,
latency_ms INTEGER,
served_at TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS saved_queries (
id TEXT PRIMARY KEY,
admin_id TEXT NOT NULL REFERENCES users(id),
name TEXT NOT NULL,
sql TEXT NOT NULL,
created_at TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS user_profile_features (
user_id TEXT NOT NULL REFERENCES users(id),
name TEXT NOT NULL,
value REAL,
value_text TEXT,
updated_at TEXT NOT NULL,
ttl_sec INTEGER NOT NULL,
PRIMARY KEY (user_id, name)
);
CREATE TABLE IF NOT EXISTS sim_runs (
id TEXT PRIMARY KEY,
policy_a TEXT NOT NULL,
policy_b TEXT NOT NULL,
n_users INTEGER NOT NULL,
n_rounds INTEGER NOT NULL,
tasks_per_round INTEGER NOT NULL DEFAULT 8,
use_llm INTEGER NOT NULL DEFAULT 0,
status TEXT NOT NULL DEFAULT 'pending',
summary_json TEXT,
winner TEXT,
persona_breakdown_json TEXT,
created_at TEXT NOT NULL,
finished_at TEXT
);
CREATE TABLE IF NOT EXISTS sim_events (
id TEXT PRIMARY KEY,
run_id TEXT NOT NULL REFERENCES sim_runs(id),
round INTEGER NOT NULL,
user_id TEXT NOT NULL,
persona TEXT NOT NULL,
policy TEXT NOT NULL,
tip_content TEXT NOT NULL,
priority INTEGER NOT NULL,
is_overdue INTEGER NOT NULL,
action TEXT NOT NULL,
dwell_ms INTEGER,
reward_milli INTEGER NOT NULL,
hour INTEGER NOT NULL,
day_of_week INTEGER NOT NULL,
created_at TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS agent_outputs (
id TEXT PRIMARY KEY,
user_id TEXT NOT NULL REFERENCES users(id),
agent_id TEXT NOT NULL,
prompt_text TEXT NOT NULL,
signals_snapshot TEXT,
computed_at TEXT NOT NULL,
expires_at TEXT NOT NULL,
agent_version TEXT NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_agent_outputs_user_agent_exp
ON agent_outputs(user_id, agent_id, expires_at DESC);
CREATE TABLE IF NOT EXISTS task_enrichments (
content_hash TEXT PRIMARY KEY,
description TEXT NOT NULL,
model TEXT NOT NULL DEFAULT 'tip-generator',
created_at TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS user_preferences (
user_id TEXT NOT NULL REFERENCES users(id),
scope TEXT NOT NULL,
key TEXT NOT NULL,
value_json TEXT NOT NULL,
source TEXT NOT NULL DEFAULT 'user',
updated_at TEXT NOT NULL,
PRIMARY KEY (user_id, scope, key)
);
CREATE TABLE IF NOT EXISTS user_consents (
user_id TEXT NOT NULL REFERENCES users(id),
consent_key TEXT NOT NULL,
granted_at TEXT NOT NULL,
revoked_at TEXT,
PRIMARY KEY (user_id, consent_key)
);
CREATE TABLE IF NOT EXISTS user_contexts (
user_id TEXT NOT NULL REFERENCES users(id),
name TEXT NOT NULL,
active INTEGER NOT NULL DEFAULT 0,
schedule_json TEXT,
created_at TEXT NOT NULL,
PRIMARY KEY (user_id, name)
);
`);
// Additive column migrations — safe to run on existing DBs.
// SQLite doesn't support IF NOT EXISTS on ALTER TABLE; we ignore the error if already present.
for (const stmt of [
`ALTER TABLE users ADD COLUMN role TEXT NOT NULL DEFAULT 'user'`,
`ALTER TABLE push_subscriptions ADD COLUMN created_at TEXT NOT NULL DEFAULT ''`,
`ALTER TABLE tip_feedback ADD COLUMN dwell_ms INTEGER`,
`ALTER TABLE tip_feedback ADD COLUMN reward_milli INTEGER`,
`ALTER TABLE integration_tokens ADD COLUMN token_status TEXT NOT NULL DEFAULT 'active'`,
`ALTER TABLE tip_scores ADD COLUMN prompt_version TEXT`,
`ALTER TABLE tip_scores ADD COLUMN llm_model TEXT`,
`ALTER TABLE tip_scores ADD COLUMN tip_kind TEXT`,
`ALTER TABLE sim_runs ADD COLUMN mlflow_run_id TEXT`,
`ALTER TABLE sim_runs ADD COLUMN judge_mode TEXT NOT NULL DEFAULT 'rule'`,
`ALTER TABLE sim_runs ADD COLUMN n_policies INTEGER NOT NULL DEFAULT 2`,
`ALTER TABLE users ADD COLUMN tone TEXT`,
`ALTER TABLE users ADD COLUMN tip_kinds_json TEXT`,
]) {
try { handle.exec(stmt); } catch { /* column already exists */ }
}
// Backfill (ADR-0014 step 2): migrate consent_given=1 rows into user_consents.
// Wrapped in try/catch — silently skips on new DBs where consent_given never existed.
try {
handle.exec(`
INSERT OR IGNORE INTO user_consents (user_id, consent_key, granted_at)
SELECT id, 'data:core', COALESCE(consent_at, created_at)
FROM users
WHERE consent_given = 1
`);
} catch { /* column already dropped — nothing to backfill */ }
// Backfill (issue #127): grant data:<provider> consent for every active integration token.
// Idempotent — INSERT OR IGNORE skips rows that already exist.
handle.exec(`
INSERT OR IGNORE INTO user_consents (user_id, consent_key, granted_at)
SELECT user_id, 'data:' || provider, connected_at
FROM integration_tokens
WHERE token_status = 'active'
`);
// Drop legacy consent columns (ADR-0014 step 8). Runs after the backfill above.
// Silently skips if already dropped (column not found error) or never existed (new DB).
for (const stmt of [
`ALTER TABLE users DROP COLUMN consent_given`,
`ALTER TABLE users DROP COLUMN consent_at`,
]) {
try { handle.exec(stmt); } catch { /* already dropped or never existed */ }
}
// Seed first admin from env (ADMIN_SEED_EMAIL).
const seedEmail = process.env.ADMIN_SEED_EMAIL;
if (seedEmail) {
handle.prepare(`UPDATE users SET role = 'admin' WHERE email = ? AND role = 'user'`).run(seedEmail);
}
}
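
The additive-column loop above swallows every error from `ALTER TABLE`, which keeps re-runs quiet but would also hide an unrelated failure such as a typo in a statement. One possible alternative, sketched here in Python with the standard `sqlite3` module, is to check `PRAGMA table_info` first and only issue the `ALTER TABLE` when the column is missing. This is not what the migration in this diff does; `add_column_if_missing` is a hypothetical helper.

```python
import sqlite3


def add_column_if_missing(
    conn: sqlite3.Connection, table: str, column: str, decl: str
) -> bool:
    """Add `column` to `table` unless it already exists.

    Returns True when the column was added. `table`, `column` and `decl`
    are interpolated into SQL, so they must be trusted, hard-coded values.
    """
    # PRAGMA table_info yields one row per column; index 1 is the name.
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column in existing:
        return False
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {decl}")
    return True


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id TEXT PRIMARY KEY, email TEXT NOT NULL UNIQUE)")

first = add_column_if_missing(conn, "users", "tone", "TEXT")   # adds the column
second = add_column_if_missing(conn, "users", "tone", "TEXT")  # no-op on re-run
columns = [row[1] for row in conn.execute("PRAGMA table_info(users)")]
```

With this shape, any exception that does escape is a genuine failure rather than the expected "duplicate column" case.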


@@ -7,12 +7,46 @@ export const users = sqliteTable('users', {
 image: text('image'),
 googleId: text('google_id').unique(),
 role: text('role').notNull().default('user'), // 'user' | 'admin'
-consentGiven: integer('consent_given', { mode: 'boolean' }).notNull().default(false),
-consentAt: text('consent_at'),
+// Stable globals (ADR-0014). Per-agent prefs land in user_preferences instead.
+tone: text('tone'), // 'direct' | 'gentle' | 'motivational'
+tipKindsJson: text('tip_kinds_json'), // JSON array of allowed tip kinds; null = all
 createdAt: text('created_at').notNull(),
 deletedAt: text('deleted_at'),
 });
+// ── Unified Profile model (ADR-0014) ────────────────────────────────────────
+// Open-ended per-scope preferences. `scope` is 'orchestrator' or 'agent:<id>';
+// the agent's pref_schema (from its manifest) validates value_json on read.
+// `source='inferred'` is written by the inference framework (#111); never
+// overwrites a `source='user'` row.
+export const userPreferences = sqliteTable('user_preferences', {
+userId: text('user_id').notNull().references(() => users.id),
+scope: text('scope').notNull(), // 'orchestrator' | 'agent:<id>'
+key: text('key').notNull(),
+valueJson: text('value_json').notNull(),
+source: text('source').notNull().default('user'), // 'user' | 'inferred'
+updatedAt: text('updated_at').notNull(),
+});
+// Per-key consent. Revocation writes `revoked_at`; rows are never deleted
+// so audits stay clean. `revoked_at IS NULL` = currently active.
+export const userConsents = sqliteTable('user_consents', {
+userId: text('user_id').notNull().references(() => users.id),
+consentKey: text('consent_key').notNull(), // 'data:core' | 'data:todoist' | 'agent:<id>' | …
+grantedAt: text('granted_at').notNull(),
+revokedAt: text('revoked_at'),
+});
+// User-named contexts (work / home / vacation). M2 ships manual toggle only;
+// auto-inference is per-agent (#112#116).
+export const userContexts = sqliteTable('user_contexts', {
+userId: text('user_id').notNull().references(() => users.id),
+name: text('name').notNull(),
+active: integer('active', { mode: 'boolean' }).notNull().default(false),
+scheduleJson: text('schedule_json'), // optional: when active
+createdAt: text('created_at').notNull(),
+});
 export const integrationTokens = sqliteTable('integration_tokens', {
 id: text('id').primaryKey(),
 userId: text('user_id').notNull().references(() => users.id),
@@ -155,6 +189,15 @@ export const agentOutputs = sqliteTable('agent_outputs', {
 agentVersion: text('agent_version').notNull(), // bump to invalidate on logic changes
 });
+// Persistent cache for LLM-enriched task descriptions used by clustering.
+// Keyed by MD5 of raw task content; avoids re-calling LiteLLM on every agent compute cycle.
+export const taskEnrichments = sqliteTable('task_enrichments', {
+contentHash: text('content_hash').primaryKey(),
+description: text('description').notNull(),
+model: text('model').notNull().default('tip-generator'),
+createdAt: text('created_at').notNull(),
+});
 // Admin saved SQL queries.
 export const savedQueries = sqliteTable('saved_queries', {
 id: text('id').primaryKey(),
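
The `user_consents` and `user_contexts` tables in this schema feed the eligibility check (`getEligibleAgentIds`) that is tested further down in this diff. As those tests describe it, an agent is eligible when every key in its `required_consents` is granted and not revoked, and none of its `silenced_in_contexts` is currently active. A minimal Python sketch of that rule follows; the `Consent` and `Manifest` dataclasses and the function name `eligible_agent_ids` are hypothetical stand-ins for the DB rows and registry manifest, not the project's types.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set, Tuple


@dataclass(frozen=True)
class Consent:
    key: str
    revoked_at: Optional[str] = None  # None means currently active


@dataclass(frozen=True)
class Manifest:
    id: str
    required_consents: Tuple[str, ...] = ()
    silenced_in_contexts: Tuple[str, ...] = ()


def eligible_agent_ids(
    manifests: Iterable[Manifest],
    consents: Iterable[Consent],
    active_contexts: Iterable[str],
) -> Set[str]:
    """Return ids of agents whose consents are satisfied and that are not silenced."""
    active_keys = {c.key for c in consents if c.revoked_at is None}
    contexts = set(active_contexts)
    return {
        m.id
        for m in manifests
        # every required consent must be active ...
        if set(m.required_consents) <= active_keys
        # ... and no silencing context may be active
        and not (set(m.silenced_in_contexts) & contexts)
    }


# Manifests mirroring AGENT_A / AGENT_B / AGENT_C from the eligibility tests.
AGENTS = [
    Manifest("agent-a", ("data:core",)),
    Manifest("agent-b", ("data:core", "data:todoist")),
    Manifest("agent-c", ("data:core",), ("vacation",)),
]
```

For example, with only `data:core` granted and no active context, `eligible_agent_ids(AGENTS, [Consent("data:core")], [])` yields `agent-a` and `agent-c`; activating `vacation` removes `agent-c`.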


@@ -17,6 +17,9 @@ import { userRouter } from './routes/user.js';
 import { pushRouter } from './routes/push.js';
 import { adminRouter, adminInternalRouter } from './routes/admin.js';
 import benchRouter from './routes/bench.js';
+import agentOutputsRouter from './routes/agent-outputs.js';
+import agentRegistryRouter from './routes/agent-registry.js';
+import profileRouter from './routes/profile.js';
 import { mkdir } from 'fs/promises';
 import { dirname } from 'path';
 import { requireAuth } from './middleware/session.js';
@@ -24,6 +27,7 @@ import { requireAdmin } from './middleware/admin.js';
 import type { Request, Response } from 'express';
 import { connectNats } from './events/nats.js';
 import { startTodoistSyncScheduler } from './signals/scheduler.js';
+import { startAgentPrecomputeScheduler } from './signals/agent-scheduler.js';
 import { bus } from './events/bus.js';
 import { registerProfileSubscriptions } from './profile/subscriber.js';
@@ -68,6 +72,10 @@ app.use('/api/push', pushRouter);
 app.use('/api/admin', adminRouter);
 app.use('/api/admin', adminInternalRouter);
 app.use('/api/bench', requireAuth as any, requireAdmin as any, benchRouter);
+// agent-registry mounts first so /registry beats agent-outputs' /:userId pattern.
+app.use('/api/agents', agentRegistryRouter);
+app.use('/api/agents', agentOutputsRouter);
+app.use('/api/profile', profileRouter);
 app.use('/api/ml', requireAuth as any, requireAdmin as any, async (req: Request, res: Response) => {
 const mlUrl = config.ML_SERVING_URL;
@@ -108,6 +116,7 @@ if (config.NATS_URL) {
 }
 startTodoistSyncScheduler(config.TODOIST_SYNC_INTERVAL_MS);
+void startAgentPrecomputeScheduler();
 // Profile features are invalidated on relevant signals (#81 phase B.2);
 // TTL stays as a safety net for clock drift / dropped events.
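
The mount-order comment on the two `/api/agents` routers in this file comes down to first-match dispatch: a parameterised pattern such as `/:userId` also matches the literal path `/registry`, so whichever route is registered first wins. The toy dispatcher below illustrates that in Python. It is not Express and makes no claim about Express internals; `compile_pattern` and `dispatch` are hypothetical names.

```python
import re
from typing import Dict, List, Optional, Tuple


def compile_pattern(pattern: str) -> "re.Pattern[str]":
    # Turn ':name' segments into named groups that match one path segment.
    # Literal parts are used as-is, which is fine for the plain paths here.
    regex = re.sub(r":(\w+)", r"(?P<\1>[^/]+)", pattern)
    return re.compile(f"^{regex}$")


def dispatch(
    routes: List[Tuple[str, str]], path: str
) -> Optional[Tuple[str, Dict[str, str]]]:
    """Return (handler name, params) for the first route that matches."""
    for pattern, handler in routes:
        match = compile_pattern(pattern).match(path)
        if match:
            return handler, match.groupdict()
    return None


registry_first = [("/registry", "registry"), ("/:userId", "outputs")]
outputs_first = [("/:userId", "outputs"), ("/registry", "registry")]

good = dispatch(registry_first, "/registry")
bad = dispatch(outputs_first, "/registry")
```

With the registry route first, `/registry` reaches the registry handler; with the order reversed, the same path is captured as `userId = "registry"`, which is the mix-up the comment guards against.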


@@ -24,8 +24,8 @@ const SHORT_AGO = new Date(Date.now() - 30_000).toISOString();
 beforeAll(async () => {
 await testDb.insert(users).values([
-{ id: 'pf-user-1', email: 'pf1@test.com', role: 'user', consentGiven: true, consentAt: NOW, createdAt: NOW },
-{ id: 'pf-user-empty', email: 'pfempty@test.com', role: 'user', consentGiven: true, consentAt: NOW, createdAt: NOW },
+{ id: 'pf-user-1', email: 'pf1@test.com', role: 'user', createdAt: NOW },
+{ id: 'pf-user-empty', email: 'pfempty@test.com', role: 'user', createdAt: NOW },
 ]);
 });


@@ -0,0 +1,119 @@
/**
* Unit tests for getEligibleAgentIds (ADR-0014 step 5).
* DB is mocked via in-memory SQLite; fetchRegistry is mocked per scenario.
*/
import { describe, it, expect, vi, beforeAll, beforeEach } from 'vitest';
import { makeTestDb } from '../../test/db.js';
import { users, userConsents, userPreferences, userContexts } from '../../db/schema.js';
const testDb = makeTestDb();
vi.mock('../../db/index.js', () => ({ db: testDb, rawSqlite: testDb.rawSqlite }));
// Registry mock — overridden per test.
const mockFetchRegistry = vi.fn();
vi.mock('../../routes/agent-registry.js', () => ({
fetchRegistry: (...args: unknown[]) => mockFetchRegistry(...args),
_resetRegistryCache: vi.fn(),
}));
const { getEligibleAgentIds } = await import('../eligibility.js');
const NOW = new Date().toISOString();
const MANIFEST_DEFAULTS = {
version: '1.0.0',
description: '',
pref_schema: {},
context_schema: [],
output_contract: {},
ttl_sec: 300,
};
const AGENT_A = { ...MANIFEST_DEFAULTS, id: 'agent-a', required_consents: ['data:core'], silenced_in_contexts: [] };
const AGENT_B = { ...MANIFEST_DEFAULTS, id: 'agent-b', required_consents: ['data:core', 'data:todoist'], silenced_in_contexts: [] };
const AGENT_C = { ...MANIFEST_DEFAULTS, id: 'agent-c', required_consents: ['data:core'], silenced_in_contexts: ['vacation'] };
beforeAll(async () => {
await testDb.insert(users).values({
id: 'u1', email: 'u@test.com', name: null, image: null, role: 'user',
createdAt: NOW,
});
});
beforeEach(() => {
mockFetchRegistry.mockReset();
});
describe('getEligibleAgentIds', () => {
it('returns empty set when registry is unavailable', async () => {
mockFetchRegistry.mockRejectedValue(new Error('network'));
const ids = await getEligibleAgentIds('u1');
expect(ids.size).toBe(0);
});
it('excludes agents whose required consents are not granted', async () => {
mockFetchRegistry.mockResolvedValue({ agents: [AGENT_A, AGENT_B] });
// only data:core granted
await testDb.insert(userConsents).values({ userId: 'u1', consentKey: 'data:core', grantedAt: NOW, revokedAt: null });
const ids = await getEligibleAgentIds('u1');
expect(ids.has('agent-a')).toBe(true);
expect(ids.has('agent-b')).toBe(false);
});
it('excludes agents when a required consent is revoked', async () => {
mockFetchRegistry.mockResolvedValue({ agents: [AGENT_B] });
// grant then revoke data:todoist
await testDb.insert(userConsents).values([
{ userId: 'u1', consentKey: 'data:todoist', grantedAt: NOW, revokedAt: NOW },
]).onConflictDoUpdate({
target: [userConsents.userId, userConsents.consentKey],
set: { revokedAt: NOW },
});
const ids = await getEligibleAgentIds('u1');
expect(ids.has('agent-b')).toBe(false);
});
it('silences agents whose silenced_in_contexts intersects active contexts', async () => {
mockFetchRegistry.mockResolvedValue({ agents: [AGENT_A, AGENT_C] });
// ensure data:core granted
await testDb.insert(userConsents).values({ userId: 'u1', consentKey: 'data:core', grantedAt: NOW, revokedAt: null })
.onConflictDoUpdate({ target: [userConsents.userId, userConsents.consentKey], set: { revokedAt: null } });
// activate vacation context
await testDb.insert(userContexts).values({ userId: 'u1', name: 'vacation', active: true, scheduleJson: null, createdAt: NOW });
const ids = await getEligibleAgentIds('u1');
expect(ids.has('agent-a')).toBe(true);
expect(ids.has('agent-c')).toBe(false);
});
it('excludes agents explicitly disabled via user_preferences', async () => {
mockFetchRegistry.mockResolvedValue({ agents: [AGENT_A] });
await testDb.insert(userConsents).values({ userId: 'u1', consentKey: 'data:core', grantedAt: NOW, revokedAt: null })
.onConflictDoUpdate({ target: [userConsents.userId, userConsents.consentKey], set: { revokedAt: null } });
await testDb.insert(userPreferences).values({
userId: 'u1', scope: 'agent:agent-a', key: 'enabled', valueJson: 'false', source: 'user', updatedAt: NOW,
}).onConflictDoUpdate({
target: [userPreferences.userId, userPreferences.scope, userPreferences.key],
set: { valueJson: 'false' },
});
const ids = await getEligibleAgentIds('u1');
expect(ids.has('agent-a')).toBe(false);
});
it('includes agents when enabled pref is true (or absent)', async () => {
mockFetchRegistry.mockResolvedValue({ agents: [AGENT_A] });
await testDb.insert(userConsents).values({ userId: 'u1', consentKey: 'data:core', grantedAt: NOW, revokedAt: null })
.onConflictDoUpdate({ target: [userConsents.userId, userConsents.consentKey], set: { revokedAt: null } });
await testDb.insert(userPreferences).values({
userId: 'u1', scope: 'agent:agent-a', key: 'enabled', valueJson: 'true', source: 'user', updatedAt: NOW,
}).onConflictDoUpdate({
target: [userPreferences.userId, userPreferences.scope, userPreferences.key],
set: { valueJson: 'true' },
});
const ids = await getEligibleAgentIds('u1');
expect(ids.has('agent-a')).toBe(true);
});
});
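
The last two tests above pin down how the `enabled` preference is read: a stored `'false'` excludes the agent, while `'true'` or a missing row keeps it. A minimal sketch of a stricter read, assuming a hypothetical helper `parseEnabledPref` (not part of this diff) that treats any value other than a JSON boolean as absent:

```typescript
// Hypothetical helper, not part of this change set.
// Returns the stored flag only when valueJson decodes to a real boolean;
// malformed JSON and non-boolean values are treated as "no preference".
function parseEnabledPref(valueJson: string): boolean | undefined {
  try {
    const parsed: unknown = JSON.parse(valueJson);
    return typeof parsed === 'boolean' ? parsed : undefined;
  } catch {
    return undefined;
  }
}

// An agent is excluded only on an explicit `false`, matching rule 3.
function isDisabled(valueJson: string | undefined): boolean {
  return valueJson !== undefined && parseEnabledPref(valueJson) === false;
}
```

Under this sketch a row holding the string `"false"` (quoted JSON) would not disable the agent, which may or may not be the behavior the author wants.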

@@ -23,8 +23,8 @@ const STALE_BASE = {
  beforeAll(async () => {
  await testDb.insert(users).values([
- { id: 'sub-user-1', email: 'sub1@test.com', role: 'user', consentGiven: true, consentAt: NOW, createdAt: NOW },
- { id: 'sub-user-2', email: 'sub2@test.com', role: 'user', consentGiven: true, consentAt: NOW, createdAt: NOW },
+ { id: 'sub-user-1', email: 'sub1@test.com', role: 'user', createdAt: NOW },
+ { id: 'sub-user-2', email: 'sub2@test.com', role: 'user', createdAt: NOW },
  ]);
  });

@@ -0,0 +1,81 @@
/**
* Registry-driven agent eligibility filter (ADR-0014 step 5, updated by ADR-0015).
*
* Rules (all must pass for an agent to be eligible):
* 1. Every data:<source> in required_consents is granted and not revoked.
* Consent is granted automatically when the user connects that data source.
* agent:<id> consents no longer exist — per-agent control is a preference (rule 3).
* 2. No silenced_in_contexts entry matches an active context.
* 3. user_preferences[scope='agent:<id>', key='enabled'] is not false.
*
* Fail-closed: if the registry is unavailable, returns an empty set so the
* orchestrator falls back to the random policy rather than proceeding without
* consent checks.
*/
import { db } from '../db/index.js';
import { userConsents, userPreferences, userContexts } from '../db/schema.js';
import { eq, and, isNull } from 'drizzle-orm';
import { fetchRegistry } from '../routes/agent-registry.js';
export interface AgentManifestWire {
id: string;
required_consents: string[];
silenced_in_contexts: string[];
[key: string]: unknown;
}
interface RegistryPayload {
agents: AgentManifestWire[];
}
export async function getEligibleAgentIds(userId: string): Promise<Set<string>> {
let registry: RegistryPayload;
try {
registry = (await fetchRegistry()) as RegistryPayload;
} catch {
return new Set();
}
const [consentRows, prefRows, contextRows] = await Promise.all([
db
.select({ consentKey: userConsents.consentKey })
.from(userConsents)
.where(and(eq(userConsents.userId, userId), isNull(userConsents.revokedAt))),
db
.select({ scope: userPreferences.scope, key: userPreferences.key, valueJson: userPreferences.valueJson })
.from(userPreferences)
.where(eq(userPreferences.userId, userId)),
db
.select({ name: userContexts.name, active: userContexts.active })
.from(userContexts)
.where(and(eq(userContexts.userId, userId), eq(userContexts.active, true))),
]);
// Active consents (granted + not revoked)
const activeConsents = new Set(consentRows.map((r) => r.consentKey));
// Active context names
const activeContextNames = new Set(contextRows.map((r) => r.name));
// Per-agent enabled flag from user_preferences
const agentEnabled: Record<string, boolean> = {};
for (const p of prefRows) {
if (!p.scope.startsWith('agent:')) continue;
if (p.key !== 'enabled') continue;
try {
agentEnabled[p.scope] = JSON.parse(p.valueJson) as boolean;
} catch {
// ignore malformed
}
}
const eligible = new Set<string>();
for (const manifest of registry.agents) {
if (!manifest.required_consents.every((c) => activeConsents.has(c))) continue;
if (manifest.silenced_in_contexts.some((ctx) => activeContextNames.has(ctx))) continue;
const enabledPref = agentEnabled[`agent:${manifest.id}`];
if (enabledPref === false) continue;
eligible.add(manifest.id);
}
return eligible;
}
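
The three rules in the header comment can be restated without the database or the registry fetch. The sketch below is an illustration only: `ManifestLike` and `filterEligible` are hypothetical names, and the inputs are assumed to be already loaded into sets.

```typescript
// Hypothetical, DB-free restatement of the eligibility rules.
interface ManifestLike {
  id: string;
  required_consents: string[];
  silenced_in_contexts: string[];
}

function filterEligible(
  manifests: ManifestLike[],
  activeConsents: Set<string>,      // granted and not revoked
  activeContexts: Set<string>,      // names of contexts with active = true
  enabledByScope: Record<string, boolean>, // keyed by 'agent:<id>'
): Set<string> {
  const eligible = new Set<string>();
  for (const m of manifests) {
    // Rule 1: every required consent must be active.
    const hasConsents = m.required_consents.every((c) => activeConsents.has(c));
    // Rule 2: no silencing context may be active.
    const silenced = m.silenced_in_contexts.some((ctx) => activeContexts.has(ctx));
    // Rule 3: only an explicit `false` disables; absent means enabled.
    const disabled = enabledByScope[`agent:${m.id}`] === false;
    if (hasConsents && !silenced && !disabled) eligible.add(m.id);
  }
  return eligible;
}

// Mirrors the AGENT_A / AGENT_B / AGENT_C fixtures from the unit tests.
const sampleManifests: ManifestLike[] = [
  { id: 'agent-a', required_consents: ['data:core'], silenced_in_contexts: [] },
  { id: 'agent-b', required_consents: ['data:core', 'data:todoist'], silenced_in_contexts: [] },
  { id: 'agent-c', required_consents: ['data:core'], silenced_in_contexts: ['vacation'] },
];
const sampleResult = filterEligible(
  sampleManifests,
  new Set(['data:core']),
  new Set(['vacation']),
  {},
);
```

With only `data:core` granted and `vacation` active, `sampleResult` contains `agent-a` alone: `agent-b` lacks `data:todoist` and `agent-c` is silenced.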

@@ -38,9 +38,9 @@ const DAY_AGO = new Date(Date.now() - 23 * 60 * 60 * 1000).toISOString();
  beforeAll(async () => {
  await testDb.insert(users).values([
- { id: 'admin-1', email: 'admin@test.com', role: 'admin', consentGiven: true, consentAt: NOW, createdAt: NOW },
- { id: 'user-1', email: 'alice@test.com', role: 'user', consentGiven: true, consentAt: NOW, createdAt: NOW },
- { id: 'user-2', email: 'bob@test.com', role: 'user', consentGiven: false, createdAt: NOW },
+ { id: 'admin-1', email: 'admin@test.com', role: 'admin', createdAt: NOW },
+ { id: 'user-1', email: 'alice@test.com', role: 'user', createdAt: NOW },
+ { id: 'user-2', email: 'bob@test.com', role: 'user', createdAt: NOW },
  ]);
  await testDb.insert(integrationTokens).values([
  { id: 'tok-1', userId: 'user-1', provider: 'todoist', accessToken: 'secret', connectedAt: NOW },

@@ -0,0 +1,108 @@
/**
* GET /api/agents/registry — proxies ml/serving's manifest list with a short
* in-process cache. Tests stub global fetch and verify caching + 502 fallback.
*/
import { describe, it, expect, vi, beforeAll, afterEach, beforeEach } from 'vitest';
import express from 'express';
import * as http from 'http';
vi.mock('../../middleware/session.js', () => ({
sessionMiddleware: (_req: express.Request, _res: express.Response, next: express.NextFunction) => next(),
requireAuth: (req: express.Request, _res: express.Response, next: express.NextFunction) => {
(req as any).userId = 'user-1';
next();
},
}));
const REGISTRY_PAYLOAD = {
agents: [
{ id: 'overdue-task', version: '1.0.0', pref_schema: { type: 'object' } },
{ id: 'momentum', version: '1.0.0', pref_schema: { type: 'object' } },
],
};
function get(url: string): Promise<{ status: number; body: any }> {
return new Promise((resolve, reject) => {
const u = new URL(url);
http.get({ hostname: u.hostname, port: Number(u.port), path: u.pathname }, (res) => {
let data = '';
res.on('data', (c) => { data += c; });
res.on('end', () => {
try { resolve({ status: res.statusCode ?? 0, body: data ? JSON.parse(data) : null }); }
catch { resolve({ status: res.statusCode ?? 0, body: data }); }
});
}).on('error', reject);
});
}
describe('GET /api/agents/registry', () => {
let server: http.Server;
let baseUrl: string;
let savedFetch: typeof globalThis.fetch;
let resetCache: () => void;
beforeAll(async () => {
const mod = await import('../agent-registry.js');
const router = mod.default;
resetCache = mod._resetRegistryCache;
const app = express();
app.use('/api/agents', router);
server = await new Promise<http.Server>((resolve) => {
const s = app.listen(0, () => resolve(s));
});
const addr = server.address() as { port: number };
baseUrl = `http://localhost:${addr.port}`;
savedFetch = globalThis.fetch;
});
beforeEach(() => {
resetCache();
});
afterEach(() => {
globalThis.fetch = savedFetch;
});
it('proxies ml/serving manifests', async () => {
const fetchMock = vi.fn(async () =>
new Response(JSON.stringify(REGISTRY_PAYLOAD), { status: 200 }),
);
globalThis.fetch = fetchMock as unknown as typeof fetch;
const r = await get(`${baseUrl}/api/agents/registry`);
expect(r.status).toBe(200);
expect(r.body).toEqual(REGISTRY_PAYLOAD);
expect(fetchMock).toHaveBeenCalledTimes(1);
});
it('caches across calls within the TTL', async () => {
const fetchMock = vi.fn(async () =>
new Response(JSON.stringify(REGISTRY_PAYLOAD), { status: 200 }),
);
globalThis.fetch = fetchMock as unknown as typeof fetch;
await get(`${baseUrl}/api/agents/registry`);
await get(`${baseUrl}/api/agents/registry`);
expect(fetchMock).toHaveBeenCalledTimes(1);
});
it('returns 502 when ml/serving fails', async () => {
globalThis.fetch = vi.fn(async () => new Response('boom', { status: 500 })) as unknown as typeof fetch;
const r = await get(`${baseUrl}/api/agents/registry`);
expect(r.status).toBe(502);
expect(r.body.error).toBe('ml/serving unavailable');
});
it('does not cache failures', async () => {
const fetchMock = vi.fn()
.mockResolvedValueOnce(new Response('boom', { status: 500 }))
.mockResolvedValueOnce(new Response(JSON.stringify(REGISTRY_PAYLOAD), { status: 200 }));
globalThis.fetch = fetchMock as unknown as typeof fetch;
const first = await get(`${baseUrl}/api/agents/registry`);
expect(first.status).toBe(502);
const second = await get(`${baseUrl}/api/agents/registry`);
expect(second.status).toBe(200);
expect(fetchMock).toHaveBeenCalledTimes(2);
});
});
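
The caching tests above describe two behaviors: repeated calls within the TTL reuse one upstream result, and a failed upstream call is never stored. A minimal sketch of that pattern follows. `makeTtlCache` is a hypothetical helper, not the route's actual implementation, and the loader is kept synchronous for brevity where the real proxy would await a fetch.

```typescript
// Hypothetical TTL cache: stores only successful loads, never failures.
function makeTtlCache<T>(
  load: () => T,
  ttlMs: number,
  now: () => number = Date.now, // injectable clock so tests need no timers
) {
  let entry: { value: T; expiresAt: number } | null = null;
  return {
    get(): T {
      // Serve from cache while the entry is still fresh.
      if (entry !== null && now() < entry.expiresAt) return entry.value;
      // If load() throws, the assignment below never runs,
      // so the failure is not cached and the next call retries.
      const value = load();
      entry = { value, expiresAt: now() + ttlMs };
      return value;
    },
    reset(): void {
      entry = null;
    },
  };
}
```

The `reset` method plays the same role as `_resetRegistryCache` in the tests: it lets each test start from a cold cache.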

@@ -0,0 +1,193 @@
/**
* Integration tests for GET/PATCH /api/profile (ADR-0014 step 4).
* Real in-memory SQLite; auth middleware mocked so requests arrive as 'user-1'.
*/
import { describe, it, expect, vi, beforeAll, afterAll } from 'vitest';
import express from 'express';
import * as http from 'http';
import { makeTestDb } from '../../test/db.js';
import { users, userPreferences, userConsents, userContexts } from '../../db/schema.js';
const testDb = makeTestDb();
vi.mock('../../db/index.js', () => ({ db: testDb, rawSqlite: testDb.rawSqlite }));
vi.mock('../../middleware/session.js', () => ({
sessionMiddleware: (_req: express.Request, _res: express.Response, next: express.NextFunction) =>
next(),
requireAuth: (req: express.Request, _res: express.Response, next: express.NextFunction) => {
(req as any).userId = 'user-1';
next();
},
}));
function call(
server: http.Server,
method: string,
path: string,
body?: unknown,
): Promise<{ status: number; body: unknown }> {
return new Promise((resolve, reject) => {
const { port } = server.address() as { port: number };
const req = http.request(
{ method, hostname: '127.0.0.1', port, path, headers: { 'Content-Type': 'application/json' } },
(res) => {
let data = '';
res.on('data', (c) => (data += c));
res.on('end', () => {
try { resolve({ status: res.statusCode!, body: JSON.parse(data) }); }
catch { resolve({ status: res.statusCode!, body: data }); }
});
},
);
req.on('error', reject);
if (body !== undefined) req.write(JSON.stringify(body));
req.end();
});
}
function startServer(app: express.Application): Promise<{ server: http.Server; call: (method: string, path: string, body?: unknown) => ReturnType<typeof call> }> {
return new Promise((resolve) => {
const server = http.createServer(app);
server.listen(0, () =>
resolve({ server, call: (m, p, b) => call(server, m, p, b) }),
);
});
}
const profileRouter = (await import('../profile.js')).default;
const app = express();
app.use(express.json());
app.use('/api/profile', profileRouter);
const { server, call: c } = await startServer(app);
afterAll(() => server.close());
const NOW = new Date().toISOString();
beforeAll(async () => {
await testDb.insert(users).values({
id: 'user-1',
email: 'a@example.com',
name: 'Alice',
image: null,
role: 'user',
tone: 'direct',
tipKindsJson: JSON.stringify(['task', 'advice']),
createdAt: NOW,
});
});
describe('GET /api/profile', () => {
it('returns user globals with empty prefs/consents/contexts', async () => {
const res = await c('GET', '/api/profile');
expect(res.status).toBe(200);
const body = res.body as any;
expect(body.user).toMatchObject({ id: 'user-1', tone: 'direct', tipKinds: ['task', 'advice'] });
expect(body.prefs).toEqual({});
expect(body.consents).toEqual({});
expect(body.contexts).toEqual([]);
});
it('includes prefs grouped by scope', async () => {
await testDb.insert(userPreferences).values([
{ userId: 'user-1', scope: 'orchestrator', key: 'quietHours', valueJson: '"22:00-07:00"', source: 'user', updatedAt: NOW },
{ userId: 'user-1', scope: 'agent:focus-area', key: 'areas', valueJson: '["work","health"]', source: 'inferred', updatedAt: NOW },
]);
const res = await c('GET', '/api/profile');
const body = res.body as any;
expect(body.prefs['orchestrator']).toMatchObject({ quietHours: '22:00-07:00' });
expect(body.prefs['agent:focus-area']).toMatchObject({ areas: ['work', 'health'] });
});
it('includes consents', async () => {
await testDb.insert(userConsents).values([
{ userId: 'user-1', consentKey: 'data:core', grantedAt: NOW, revokedAt: null },
{ userId: 'user-1', consentKey: 'data:todoist', grantedAt: NOW, revokedAt: NOW },
]);
const body = (await c('GET', '/api/profile')).body as any;
expect(body.consents['data:core'].revokedAt).toBeNull();
expect(body.consents['data:todoist'].revokedAt).toBe(NOW);
});
it('includes contexts', async () => {
await testDb.insert(userContexts).values({
userId: 'user-1', name: 'work', active: true, scheduleJson: null, createdAt: NOW,
});
const body = (await c('GET', '/api/profile')).body as any;
expect(body.contexts).toContainEqual(expect.objectContaining({ name: 'work', active: true }));
});
});
describe('PATCH /api/profile/prefs/:scope', () => {
it('upserts preference keys with source=user', async () => {
const res = await c('PATCH', '/api/profile/prefs/orchestrator', { tone: 'gentle' });
expect(res.status).toBe(200);
expect(res.body).toEqual({ ok: true });
const body = (await c('GET', '/api/profile')).body as any;
expect(body.prefs['orchestrator']['tone']).toBe('gentle');
});
it('overwrites an inferred value with user source', async () => {
await testDb.insert(userPreferences).values({
userId: 'user-1', scope: 'agent:momentum', key: 'enabled', valueJson: 'false',
source: 'inferred', updatedAt: NOW,
}).onConflictDoUpdate({
target: [userPreferences.userId, userPreferences.scope, userPreferences.key],
set: { valueJson: 'false', source: 'inferred', updatedAt: NOW },
});
await c('PATCH', '/api/profile/prefs/agent:momentum', { enabled: true });
const body = (await c('GET', '/api/profile')).body as any;
expect(body.prefs['agent:momentum']['enabled']).toBe(true);
});
it('returns 400 for non-object body', async () => {
const res = await c('PATCH', '/api/profile/prefs/orchestrator', [1, 2]);
expect(res.status).toBe(400);
});
});
describe('PATCH /api/profile/consents', () => {
it('grants a new consent key', async () => {
const res = await c('PATCH', '/api/profile/consents', { grant: ['data:calendar'] });
expect(res.status).toBe(200);
const body = (await c('GET', '/api/profile')).body as any;
expect(body.consents['data:calendar'].revokedAt).toBeNull();
});
it('revokes an existing active consent', async () => {
await c('PATCH', '/api/profile/consents', { grant: ['agent:overdue-task'] });
await c('PATCH', '/api/profile/consents', { revoke: ['agent:overdue-task'] });
const body = (await c('GET', '/api/profile')).body as any;
expect(body.consents['agent:overdue-task'].revokedAt).not.toBeNull();
});
it('returns 400 when grant is not an array', async () => {
const res = await c('PATCH', '/api/profile/consents', { grant: 'data:core' });
expect(res.status).toBe(400);
});
});
describe('PATCH /api/profile/contexts', () => {
it('creates a new context', async () => {
const res = await c('PATCH', '/api/profile/contexts', { name: 'vacation', active: false });
expect(res.status).toBe(200);
const body = (await c('GET', '/api/profile')).body as any;
expect(body.contexts).toContainEqual(expect.objectContaining({ name: 'vacation', active: false }));
});
it('toggles active on existing context', async () => {
await c('PATCH', '/api/profile/contexts', { name: 'home', active: false });
await c('PATCH', '/api/profile/contexts', { name: 'home', active: true });
const body = (await c('GET', '/api/profile')).body as any;
const ctx = (body.contexts as any[]).find((x) => x.name === 'home');
expect(ctx?.active).toBe(true);
});
it('returns 400 when name is missing', async () => {
const res = await c('PATCH', '/api/profile/contexts', { active: true });
expect(res.status).toBe(400);
});
});
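
The `GET /api/profile` tests expect preferences to come back grouped by scope, with each `valueJson` decoded. A minimal sketch of that grouping, assuming a hypothetical `groupPrefsByScope` helper and a row shape inferred from the test fixtures (the real handler in `profile.js` is not shown in this diff):

```typescript
// Row shape inferred from the fixtures; only the fields used here.
interface PrefRow {
  scope: string;
  key: string;
  valueJson: string;
}

// Hypothetical helper: turns flat rows into { scope: { key: decodedValue } }.
function groupPrefsByScope(rows: PrefRow[]): Record<string, Record<string, unknown>> {
  const grouped: Record<string, Record<string, unknown>> = {};
  for (const row of rows) {
    let bucket = grouped[row.scope];
    if (bucket === undefined) {
      bucket = {};
      grouped[row.scope] = bucket;
    }
    // Assumes valueJson is valid JSON; a production handler may want to guard this.
    bucket[row.key] = JSON.parse(row.valueJson);
  }
  return grouped;
}

const groupedSample = groupPrefsByScope([
  { scope: 'orchestrator', key: 'quietHours', valueJson: '"22:00-07:00"' },
  { scope: 'agent:focus-area', key: 'areas', valueJson: '["work","health"]' },
]);
```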

@@ -4,12 +4,17 @@
  * inside beforeAll (same pattern as admin.test.ts) to avoid TDZ issues.
  * Uses http.request (not fetch) as the test client so that globalThis.fetch
  * mocking doesn't interfere with the test runner itself.
+ *
+ * The orchestrator path (ADR-0013): signals fetched for task context/fallback,
+ * then ml/serving /recommend called. agent_outputs table is empty in tests so
+ * the orchestrator always uses the raw-task fallback path.
  */
  import { describe, it, expect, vi, beforeAll, afterEach } from 'vitest';
  import express from 'express';
  import * as http from 'http';
  import { makeTestDb } from '../../test/db.js';
- import { users, integrationTokens, tipScores } from '../../db/schema.js';
+ import { users, integrationTokens, tipScores, agentOutputs, userConsents } from '../../db/schema.js';
+ import { nanoid } from 'nanoid';
  const testDb = makeTestDb();
@@ -48,21 +53,22 @@ describe('POST /recommend integration', () => {
  let server: http.Server;
  let baseUrl: string;
  let savedFetch: typeof globalThis.fetch;
- let clearCache: () => void;
+ let clearSignalCache: () => void;
  beforeAll(async () => {
  await testDb.insert(users).values({
  id: 'user-1', email: 'u@test.com', role: 'user',
- consentGiven: true, createdAt: new Date().toISOString(),
+ createdAt: new Date().toISOString(),
  });
  await testDb.insert(integrationTokens).values({
  id: 'tok-1', userId: 'user-1', provider: 'todoist',
  accessToken: 'fake-token', connectedAt: new Date().toISOString(),
+ tokenStatus: 'active',
  });
  const mod = await import('../recommender.js');
  const { recommenderRouter } = mod;
- clearCache = (mod as any)._clearCandidateCacheForTests;
+ clearSignalCache = (mod as any)._clearSignalCacheForTests;
  const app = express();
  app.use(express.json());
  app.use('/api', recommenderRouter);
@@ -74,19 +80,23 @@ describe('POST /recommend integration', () => {
  afterEach(() => {
  globalThis.fetch = savedFetch;
- clearCache?.();
+ clearSignalCache?.();
  });
- it('returns 204 when Todoist + LLM both return empty', async () => {
- globalThis.fetch = vi.fn().mockResolvedValue({
- ok: true, status: 200,
- json: async () => ({ results: [] }),
- } as any);
- const { status } = await post(`${baseUrl}/api/recommend`);
- expect(status).toBe(204);
+ it('returns fallback tip when orchestrator fails', async () => {
+ globalThis.fetch = vi.fn().mockImplementation((url: string) => {
+ if (String(url).includes('todoist.com')) {
+ return Promise.resolve({ ok: true, status: 200, json: async () => ({ results: [] }) } as any);
+ }
+ return Promise.resolve({ ok: false, status: 503 } as any);
+ });
+ const { status, body } = await post(`${baseUrl}/api/recommend`);
+ expect(status).toBe(200);
+ expect(body.tip.source).toBe('fallback');
+ expect(body.tip.rationale).toBe('AI service issues');
  });
- it('serves todoist tip and writes correct tip_scores columns', async () => {
+ it('serves orchestrator tip and writes correct tip_scores columns', async () => {
  globalThis.fetch = vi.fn().mockImplementation((url: string) => {
  if (String(url).includes('todoist.com')) {
  return Promise.resolve({
@@ -96,55 +106,16 @@ describe('POST /recommend integration', () => {
  }),
  } as any);
  }
- if (String(url).includes('/generate')) {
- return Promise.resolve({ ok: false, status: 503, json: async () => ({}) } as any);
- }
- if (String(url).includes('/score')) {
- return Promise.resolve({
- ok: true, status: 200,
- json: async () => ({ tip_id: 'todoist:task-1', score: 0.8 }),
- } as any);
- }
- return Promise.resolve({ ok: false, status: 500, json: async () => ({}) } as any);
- });
- const { status, body } = await post(`${baseUrl}/api/recommend`);
- expect(status).toBe(200);
- expect(body.tip.source).toBe('todoist');
- expect(body.tip.kind).toBe('task');
- const rows = await testDb.select().from(tipScores);
- const row = rows[rows.length - 1];
- expect(row.tipKind).toBe('task');
- expect(row.promptVersion).toBeNull();
- expect(row.llmModel).toBeNull();
- });
- it('writes prompt_version + llm_model when LLM tip is served', async () => {
- globalThis.fetch = vi.fn().mockImplementation((url: string) => {
- if (String(url).includes('todoist.com')) {
- return Promise.resolve({
- ok: true, status: 200,
- json: async () => ({ results: [] }),
- } as any);
- }
- if (String(url).includes('/generate')) {
+ if (String(url).includes('/recommend')) {
  return Promise.resolve({
  ok: true, status: 200,
  json: async () => ({
- candidates: [{ id: 'adv-1', content: 'Take a break.', rationale: 'You deserve it.' }],
+ tip: { id: 'adv-1', content: 'Take a break.', rationale: 'You deserve it.' },
  model: 'tip-generator',
- prompt_version: 'v1',
  }),
  } as any);
  }
- if (String(url).includes('/score')) {
- return Promise.resolve({
- ok: true, status: 200,
- json: async () => ({ tip_id: 'llm:adv-1', score: 0.9 }),
- } as any);
- }
- return Promise.resolve({ ok: false, status: 500, json: async () => ({}) } as any);
+ return Promise.resolve({ ok: false, status: 500 } as any);
  });
  const { status, body } = await post(`${baseUrl}/api/recommend`);
@@ -155,12 +126,14 @@ describe('POST /recommend integration', () => {
  const rows = await testDb.select().from(tipScores);
  const row = rows[rows.length - 1];
- expect(row.promptVersion).toBe('v1');
+ expect(row.policy).toBe('orchestrator');
+ expect(row.promptVersion).toBe('v4-orchestrator');
  expect(row.llmModel).toBe('tip-generator');
+ expect(row.mlScore).toBeNull();
  expect(row.tipKind).toBe('advice');
  });
- it('falls back to todoist tip when /generate returns non-200', async () => {
+ it('falls back to hardcoded tip when orchestrator fails', async () => {
  globalThis.fetch = vi.fn().mockImplementation((url: string) => {
  if (String(url).includes('todoist.com')) {
  return Promise.resolve({
@@ -170,22 +143,86 @@ describe('POST /recommend integration', () => {
  }),
  } as any);
  }
- if (String(url).includes('/generate')) {
- return Promise.resolve({ ok: false, status: 502, json: async () => ({}) } as any);
- }
- if (String(url).includes('/score')) {
- return Promise.resolve({
- ok: true, status: 200,
- json: async () => ({ tip_id: 'todoist:fallback-1', score: 0.5 }),
- } as any);
- }
- return Promise.resolve({ ok: false, status: 500, json: async () => ({}) } as any);
+ return Promise.resolve({ ok: false, status: 502 } as any);
  });
  const { status, body } = await post(`${baseUrl}/api/recommend`);
- expect([200, 204]).toContain(status);
- if (status === 200) {
- expect(body.tip.source).toBe('todoist');
+ expect(status).toBe(200);
+ expect(body.tip.source).toBe('fallback');
+ expect(body.tip.rationale).toBe('AI service issues');
+ expect(body.tip.kind).toBe('advice');
+ });
+ it('eligibility filter: only passes consented agent outputs to ml/serving', async () => {
+ const NOW = new Date().toISOString();
+ const FUTURE = new Date(Date.now() + 60_000).toISOString();
+ // Grant data:core only — not data:todoist
+ await testDb.insert(userConsents).values([
+ { userId: 'user-1', consentKey: 'data:core', grantedAt: NOW, revokedAt: null },
+ ]).onConflictDoUpdate({
+ target: [userConsents.userId, userConsents.consentKey],
+ set: { revokedAt: null },
+ });
+ // Two agent outputs: time-of-day (needs data:core only) and overdue-task (needs data:todoist too)
+ await testDb.insert(agentOutputs).values([
+ {
+ id: nanoid(), userId: 'user-1', agentId: 'time-of-day',
+ promptText: 'It is morning.',
+ computedAt: NOW, expiresAt: FUTURE, agentVersion: '1.0.0',
+ },
+ {
+ id: nanoid(), userId: 'user-1', agentId: 'overdue-task',
+ promptText: 'You have overdue tasks.',
+ computedAt: NOW, expiresAt: FUTURE, agentVersion: '1.0.0',
+ },
+ ]);
+ // Manifest: time-of-day requires ['data:core'], overdue-task requires ['data:core','data:todoist']
+ const registry = {
+ agents: [
+ { id: 'time-of-day', required_consents: ['data:core'], silenced_in_contexts: [], version: '1.0.0', description: '', pref_schema: {}, context_schema: [], output_contract: {}, ttl_sec: 300, inferred_params: [] },
+ { id: 'overdue-task', required_consents: ['data:core', 'data:todoist'], silenced_in_contexts: [], version: '1.0.0', description: '', pref_schema: {}, context_schema: [], output_contract: {}, ttl_sec: 300, inferred_params: [] },
+ ],
+ };
+ let capturedAgentOutputs: { agent_id: string }[] = [];
+ globalThis.fetch = vi.fn().mockImplementation((url: string) => {
+ const u = String(url);
+ if (u.includes('todoist.com')) {
+ return Promise.resolve({ ok: true, status: 200, json: async () => ({ results: [] }) } as any);
  }
+ if (u.includes('/agents/registry')) {
+ return Promise.resolve({ ok: true, status: 200, json: async () => registry } as any);
+ }
+ if (u.includes('/recommend')) {
+ return Promise.resolve({
+ ok: true, status: 200,
+ json: async (req?: Request) => {
+ // The body has already been sent; capture via the mock call args instead
+ return { tip: { id: 'tip-x', content: 'Stay focused.' }, model: 'tip-generator' };
+ },
+ } as any);
+ }
+ return Promise.resolve({ ok: false, status: 500 } as any);
+ });
+ // Intercept the /recommend body to inspect what agent_outputs were sent
+ const origFetch = globalThis.fetch as unknown as (url: string, init?: RequestInit) => Promise<Response>;
+ const wrappedFetch = vi.fn().mockImplementation(async (url: string, init?: RequestInit) => {
+ if (String(url).includes('/recommend') && init?.body) {
+ const body = JSON.parse(init.body as string);
+ capturedAgentOutputs = body.agent_outputs ?? [];
+ }
+ return origFetch(url, init);
+ });
+ globalThis.fetch = wrappedFetch;
+ const { status } = await post(`${baseUrl}/api/recommend`);
+ expect(status).toBe(200);
+ // Only time-of-day should have been passed; overdue-task is blocked (missing data:todoist)
+ expect(capturedAgentOutputs.map((a) => a.agent_id)).toEqual(['time-of-day']);
  });
  });
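
The eligibility test above stubs `fetch` once and then wraps that stub a second time to capture the `/recommend` request body. One alternative is a single stub that both routes by URL and records what was sent. The sketch below shows that idea; `makeRecordingFetch`, `StubResponse`, and `Route` are hypothetical names, and the response type is a trimmed stand-in for the real `Response`.

```typescript
// Trimmed stand-ins for the fetch types, to keep the sketch self-contained.
type StubResponse = { ok: boolean; status: number; json: () => Promise<unknown> };
type StubInit = { body?: string };

interface Route {
  match: string;              // substring to look for in the URL
  respond: () => StubResponse;
}

// Hypothetical helper: one stub that routes by URL and records every call.
function makeRecordingFetch(routes: Route[]) {
  const calls: { url: string; body: unknown }[] = [];
  const fetchStub = (url: string, init?: StubInit): Promise<StubResponse> => {
    // Record the parsed JSON body (if any) before answering.
    calls.push({ url, body: init?.body ? JSON.parse(init.body) : undefined });
    // First matching route wins, so list more specific matches first.
    const route = routes.find((r) => url.includes(r.match));
    const fallback: StubResponse = { ok: false, status: 500, json: async () => ({}) };
    return Promise.resolve(route ? route.respond() : fallback);
  };
  return { fetchStub, calls };
}
```

A test could then read `calls` after the request completes and assert on the `agent_outputs` field of the recorded `/recommend` body, without a second wrapper around the stub.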

@@ -3,8 +3,7 @@
  * These can import directly from the module without any mocking.
  */
  import { describe, it, expect, beforeEach, afterEach, vi } from 'vitest';
- import { inferReward, dueAgeDays, pickPromptVersion } from '../recommender.js';
- import { config } from '../../config.js';
+ import { inferReward, dueAgeDays } from '../recommender.js';
  describe('inferReward', () => {
  it('dismiss → -1', () => expect(inferReward('dismiss', null)).toBe(-1.0));
@@ -38,45 +37,3 @@ describe('dueAgeDays', () => {
  expect(dueAgeDays({ date: yesterday })).toBeGreaterThan(0);
  });
  });
- describe('pickPromptVersion', () => {
- // Save + restore the original env-driven config field across tests.
- let original: string;
- beforeEach(() => { original = config.TIP_PROMPT_VERSION; });
- afterEach(() => { (config as { TIP_PROMPT_VERSION: string }).TIP_PROMPT_VERSION = original; });
- it('empty config → null (let ml/serving pick its default)', () => {
- (config as { TIP_PROMPT_VERSION: string }).TIP_PROMPT_VERSION = '';
- expect(pickPromptVersion()).toBeNull();
- });
- it('whitespace-only config → null', () => {
- (config as { TIP_PROMPT_VERSION: string }).TIP_PROMPT_VERSION = ' ';
- expect(pickPromptVersion()).toBeNull();
- });
- it('single value → that value', () => {
- (config as { TIP_PROMPT_VERSION: string }).TIP_PROMPT_VERSION = 'v2-mentor';
- expect(pickPromptVersion()).toBe('v2-mentor');
- });
- it('comma-separated → uniformly samples from the set', () => {
- (config as { TIP_PROMPT_VERSION: string }).TIP_PROMPT_VERSION = 'v1,v2-mentor,v3-few-shot';
- const seen = new Set<string>();
- // With 100 trials, the chance of missing any of 3 buckets is (2/3)^100 ≈ 0 — test is reliable.
- for (let i = 0; i < 100; i++) {
- const picked = pickPromptVersion();
- expect(picked).not.toBeNull();
- seen.add(picked!);
- }
- expect(seen).toEqual(new Set(['v1', 'v2-mentor', 'v3-few-shot']));
- });
- it('trims whitespace around comma-separated entries', () => {
- (config as { TIP_PROMPT_VERSION: string }).TIP_PROMPT_VERSION = ' v1 , v2-mentor ';
- for (let i = 0; i < 20; i++) {
- const picked = pickPromptVersion()!;
- expect(['v1', 'v2-mentor']).toContain(picked);
- }
- });
- });


@@ -18,7 +18,6 @@ import { requireAdmin } from '../middleware/admin.js';
import { nanoid } from 'nanoid'; import { nanoid } from 'nanoid';
import { bus } from '../events/bus.js'; import { bus } from '../events/bus.js';
import { config } from '../config.js'; import { config } from '../config.js';
import { getShadowPolicies, setPolicyActive } from './recommender.js';
import { inspectProfile, rebuildProfile, summarizeProfileFreshness } from '../profile/builder.js'; import { inspectProfile, rebuildProfile, summarizeProfileFreshness } from '../profile/builder.js';
import { spawn } from 'child_process'; import { spawn } from 'child_process';
import { existsSync, readFileSync, unlinkSync } from 'fs'; import { existsSync, readFileSync, unlinkSync } from 'fs';
@@ -99,7 +98,6 @@ router.get('/users', async (req: AuthenticatedRequest, res: Response) => {
name: users.name, name: users.name,
image: users.image, image: users.image,
role: users.role, role: users.role,
consentGiven: users.consentGiven,
createdAt: users.createdAt, createdAt: users.createdAt,
deletedAt: users.deletedAt, deletedAt: users.deletedAt,
}) })
@@ -162,8 +160,6 @@ router.get('/users/:id', async (req: AuthenticatedRequest, res: Response) => {
name: user.name, name: user.name,
image: user.image, image: user.image,
role: user.role, role: user.role,
consentGiven: user.consentGiven,
consentAt: user.consentAt,
createdAt: user.createdAt, createdAt: user.createdAt,
deletedAt: user.deletedAt, deletedAt: user.deletedAt,
}, },
@@ -564,36 +560,6 @@ router.get('/health', async (_req: AuthenticatedRequest, res: Response) => {
res.json({ ok: allOk, services, checkedAt: new Date().toISOString() }); res.json({ ok: allOk, services, checkedAt: new Date().toISOString() });
}); });
// ---------------------------------------------------------------------------
// GET /api/admin/policies
// POST /api/admin/policies/:name/toggle
// ---------------------------------------------------------------------------
router.get('/policies', async (_req: AuthenticatedRequest, res: Response) => {
res.json({ policies: getShadowPolicies() });
});
router.post('/policies/:name/toggle', async (req: AuthenticatedRequest, res: Response) => {
const { name } = req.params as { name: string };
const { active } = req.body as { active: boolean };
const ok = setPolicyActive(name, active);
if (!ok) {
res.status(404).json({ error: 'Policy not found' });
return;
}
await db.insert(adminActions).values({
id: nanoid(),
adminId: req.userId!,
action: active ? 'enable_policy' : 'disable_policy',
targetType: 'policy',
targetId: name,
detail: null,
createdAt: new Date().toISOString(),
});
res.json({ ok: true });
});
// --------------------------------------------------------------------------- // ---------------------------------------------------------------------------
// POST /api/admin/replay-signal // POST /api/admin/replay-signal
// Re-emit a past event on the bus (for testing / backfill). // Re-emit a past event on the bus (for testing / backfill).


@@ -1,18 +1,19 @@
import { Router } from 'express'; import { Router, type Request, type Response, type IRouter } from 'express';
import { nanoid } from 'nanoid'; import { nanoid } from 'nanoid';
import { db } from '../db/index.js'; import { db } from '../db/index.js';
import { agentOutputs, tipFeedback, tipViews } from '../db/schema.js'; import { agentOutputs, tipFeedback, tipViews, userPreferences, taskEnrichments } from '../db/schema.js';
import { eq, and, gt, lt } from 'drizzle-orm'; import { eq, and, gt, lt, inArray } from 'drizzle-orm';
import crypto from 'node:crypto';
import { config } from '../config.js'; import { config } from '../config.js';
import { getProfile } from '../profile/builder.js'; import { getProfile, type Profile } from '../profile/builder.js';
import { todoistSource } from '../signals/todoist.js'; import { todoistSource } from '../signals/todoist.js';
import { googleHealthSource } from '../signals/google-health.js';
import { SignalAggregator } from '../signals/aggregator.js'; import { SignalAggregator } from '../signals/aggregator.js';
import type { Request, Response } from 'express';
const router = Router(); const router: IRouter = Router();
// Separate aggregator instance — avoids circular dep with recommender.ts. // Separate aggregator instance — avoids circular dep with recommender.ts.
const _agentAggregator = new SignalAggregator().register(todoistSource); const _agentAggregator = new SignalAggregator().register(todoistSource).register(googleHealthSource);
// ── Internal auth helper ────────────────────────────────────────────────────── // ── Internal auth helper ──────────────────────────────────────────────────────
@@ -27,6 +28,33 @@ function checkInternalToken(req: Request, res: Response): boolean {
// ── DB helpers ──────────────────────────────────────────────────────────────── // ── DB helpers ────────────────────────────────────────────────────────────────
function contentHash(text: string): string {
return crypto.createHash('md5').update(text).digest('hex');
}
async function fetchEnrichmentCache(tasks: { content?: string }[]): Promise<Record<string, string>> {
const hashes = tasks
.map((t) => t.content?.trim())
.filter((c): c is string => !!c)
.map(contentHash);
if (!hashes.length) return {};
const rows = await db
.select({ contentHash: taskEnrichments.contentHash, description: taskEnrichments.description })
.from(taskEnrichments)
.where(inArray(taskEnrichments.contentHash, hashes));
return Object.fromEntries(rows.map((r) => [r.contentHash, r.description]));
}
async function persistEnrichments(newEntries: Record<string, string>): Promise<void> {
const now = new Date().toISOString();
for (const [hash, description] of Object.entries(newEntries)) {
await db
.insert(taskEnrichments)
.values({ contentHash: hash, description, createdAt: now })
.onConflictDoNothing();
}
}
export async function getActiveAgentOutputs(userId: string) { export async function getActiveAgentOutputs(userId: string) {
const now = new Date().toISOString(); const now = new Date().toISOString();
return db return db
@@ -77,9 +105,170 @@ router.get('/active-users', async (req: Request, res: Response) => {
} }
}); });
// ── Core compute logic (used by route + scheduler) ───────────────────────────
/** Load agent prefs for a user from user_preferences, merging user+inferred.
* User source wins: if both exist, the 'user' row is returned. */
async function loadAgentPrefs(userId: string, agentId: string): Promise<Record<string, unknown>> {
const scope = `agent:${agentId}`;
const rows = await db
.select({ key: userPreferences.key, valueJson: userPreferences.valueJson, source: userPreferences.source })
.from(userPreferences)
.where(and(eq(userPreferences.userId, userId), eq(userPreferences.scope, scope)));
// Build merged dict: 'user' source takes precedence over 'inferred'
const merged: Record<string, { value: unknown; source: string }> = {};
for (const row of rows) {
try {
const value = JSON.parse(row.valueJson);
const existing = merged[row.key];
if (!existing || row.source === 'user') {
merged[row.key] = { value, source: row.source };
}
} catch {
// skip malformed
}
}
return Object.fromEntries(Object.entries(merged).map(([k, v]) => [k, v.value]));
}
/** Persist inferred prefs to user_preferences, skipping keys the user has explicitly set. */
async function persistInferredPrefs(
userId: string,
agentId: string,
inferredPrefs: Record<string, unknown>,
): Promise<void> {
if (!Object.keys(inferredPrefs).length) return;
const scope = `agent:${agentId}`;
const now = new Date().toISOString();
for (const [key, value] of Object.entries(inferredPrefs)) {
const valueJson = JSON.stringify(value);
await db
.insert(userPreferences)
.values({ userId, scope, key, valueJson, source: 'inferred', updatedAt: now })
.onConflictDoUpdate({
target: [userPreferences.userId, userPreferences.scope, userPreferences.key],
set: { valueJson, updatedAt: now },
// Only overwrite rows already marked inferred; user overrides are untouched.
setWhere: eq(userPreferences.source, 'inferred'),
});
}
}
function taskListHash(tasks: { content?: string }[]): string {
const sorted = tasks
.map((t) => t.content?.trim() ?? '')
.filter(Boolean)
.sort()
.join('\n');
return crypto.createHash('md5').update(sorted).digest('hex');
}
async function isUpToDate(userId: string, agentId: string, currentHash: string): Promise<boolean> {
const rows = await db
.select({ signalsSnapshot: agentOutputs.signalsSnapshot })
.from(agentOutputs)
.where(and(eq(agentOutputs.userId, userId), eq(agentOutputs.agentId, agentId)))
.limit(1);
if (!rows.length) return false;
try {
const snapshot = JSON.parse(rows[0].signalsSnapshot ?? '{}') as { _task_hash?: string };
return snapshot._task_hash === currentHash;
} catch { return false; }
}
export async function computeAndStore(userId: string, agentId: string): Promise<void> {
let tasks: object[] = [];
try {
const signals = await _agentAggregator.fetchAll(userId);
tasks = signals.map((s) => ({
id: s.id,
source: s.source,
kind: s.kind,
content: s.content,
// Task-specific fields (default to harmless values for non-task signals)
priority: (s.features.priority as number) ?? 1,
is_overdue: Boolean(s.features.is_overdue),
task_age_days: (s.features.task_age_days as number) ?? 0,
project_id: (s.metadata as Record<string, unknown>).project_id ?? null,
// All features spread so source-specific agents (e.g. health-vitals) can read them
...s.features,
}));
} catch {
// No integration or fetch error — agents that need tasks will report "no tasks"
}
const currentTaskHash = taskListHash(tasks as { content?: string }[]);
if (await isUpToDate(userId, agentId, currentTaskHash)) return;
let profile: Profile = {};
try {
profile = await getProfile(userId);
} catch {}
const sevenDaysAgo = new Date(Date.now() - 7 * 24 * 60 * 60 * 1000).toISOString();
const feedbackRows = await db
.select({ action: tipFeedback.action, dwellMs: tipFeedback.dwellMs, createdAt: tipFeedback.createdAt })
.from(tipFeedback)
.where(and(eq(tipFeedback.userId, userId), gt(tipFeedback.createdAt, sevenDaysAgo)));
const feedbackHistory = feedbackRows.map((f) => ({
action: f.action,
dwell_ms: f.dwellMs,
created_at: f.createdAt,
}));
// Load agent prefs (user overrides + previous inferences) to inject into the compute call.
const agentPrefs = await loadAgentPrefs(userId, agentId);
// Fetch enrichment cache for task titles present in this compute call.
const enrichmentCache = await fetchEnrichmentCache(tasks as { content?: string }[]);
const mlResp = await fetch(`${config.ML_SERVING_URL}/agents/${agentId}/compute`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ user_id: userId, tasks, profile, feedback_history: feedbackHistory, agent_prefs: agentPrefs, enrichment_cache: enrichmentCache, task_hash: currentTaskHash }),
signal: AbortSignal.timeout(60_000),
});
if (!mlResp.ok) {
const detail = await mlResp.text().catch(() => '');
throw new Error(`ml/serving /agents/${agentId}/compute returned ${mlResp.status}: ${detail}`);
}
const output = await mlResp.json() as {
user_id: string; agent_id: string; prompt_text: string;
signals_snapshot: unknown; computed_at: string; expires_at: string; agent_version: string;
new_enrichments?: Record<string, string>;
};
await storeAgentOutput(output);
// Persist any new enrichments produced during this compute cycle.
if (output.new_enrichments && Object.keys(output.new_enrichments).length > 0) {
await persistEnrichments(output.new_enrichments);
}
// Run inference framework for this agent and persist results.
// Failures are non-fatal — the compute result is already stored.
try {
const inferResp = await fetch(`${config.ML_SERVING_URL}/agents/${agentId}/infer`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ user_id: userId, feedback_history: feedbackHistory }),
signal: AbortSignal.timeout(10_000),
});
if (inferResp.ok) {
const inferResult = await inferResp.json() as { inferred_prefs: Record<string, unknown> };
await persistInferredPrefs(userId, agentId, inferResult.inferred_prefs);
}
} catch {
// inference failure is non-fatal
}
}
// ── POST /api/agents/:agentId/compute ───────────────────────────────────────── // ── POST /api/agents/:agentId/compute ─────────────────────────────────────────
// Orchestrating endpoint for per-(user, agent) compute tasks. // Orchestrating endpoint for per-(user, agent) compute tasks.
// Fetches all signals, calls ml/serving /agents/{agentId}/compute, stores result.
// Body: { user_id: string } // Body: { user_id: string }
router.post('/:agentId/compute', async (req: Request, res: Response) => { router.post('/:agentId/compute', async (req: Request, res: Response) => {
@@ -94,64 +283,11 @@ router.post('/:agentId/compute', async (req: Request, res: Response) => {
} }
try { try {
// Fetch tasks via Todoist integration (gracefully empty if not connected). await computeAndStore(user_id, agentId);
let tasks: object[] = []; res.json({ ok: true, agent_id: agentId, user_id });
try {
const signals = await _agentAggregator.fetchAll(user_id);
tasks = signals.map((s) => ({
id: s.id,
content: s.content,
priority: (s.features.priority as number) ?? 1,
is_overdue: Boolean(s.features.is_overdue),
task_age_days: (s.features.task_age_days as number) ?? 0,
project_id: (s.metadata as Record<string, unknown>).project_id ?? null,
}));
} catch {
// No integration or fetch error — agents that need tasks will report "no tasks"
}
// Fetch profile features (lazy-refreshed from DB).
let profile: Record<string, number | null> = {};
try {
profile = await getProfile(user_id);
} catch {}
// Fetch last 7 days of feedback for RecentPatternsAgent.
const sevenDaysAgo = new Date(Date.now() - 7 * 24 * 60 * 60 * 1000).toISOString();
const feedbackRows = await db
.select({ action: tipFeedback.action, dwellMs: tipFeedback.dwellMs, createdAt: tipFeedback.createdAt })
.from(tipFeedback)
.where(and(eq(tipFeedback.userId, user_id), gt(tipFeedback.createdAt, sevenDaysAgo)));
const feedbackHistory = feedbackRows.map((f) => ({
action: f.action,
dwell_ms: f.dwellMs,
created_at: f.createdAt,
}));
// Call ml/serving to run the agent.
const mlResp = await fetch(`${config.ML_SERVING_URL}/agents/${agentId}/compute`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ user_id, tasks, profile, feedback_history: feedbackHistory }),
signal: AbortSignal.timeout(15_000),
});
if (!mlResp.ok) {
const detail = await mlResp.text().catch(() => '');
res.status(502).json({ error: `ml/serving returned ${mlResp.status}`, detail });
return;
}
const output = await mlResp.json() as {
user_id: string; agent_id: string; prompt_text: string;
signals_snapshot: unknown; computed_at: string; expires_at: string; agent_version: string;
};
await storeAgentOutput(output);
res.json({ ok: true, agent_id: output.agent_id, user_id: output.user_id, expires_at: output.expires_at });
} catch (err: any) { } catch (err: any) {
res.status(500).json({ error: err.message }); const status = err.message?.includes('returned 4') ? 422 : 500;
res.status(status).json({ error: err.message });
} }
}); });
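
Review note: the early return in `computeAndStore` depends on `taskListHash` giving the same digest no matter what order the signals arrive in. The sketch below isolates that property, assuming Node's built-in `node:crypto`; the sample task titles are invented for illustration and are not taken from the codebase.

```typescript
import { createHash } from 'node:crypto';

type TaskLike = { content?: string };

// Same approach as the diff: trim, drop empties, sort, join, then MD5.
// MD5 serves only as a cheap fingerprint here, not as a security measure.
function taskListHash(tasks: TaskLike[]): string {
  const sorted = tasks
    .map((t) => t.content?.trim() ?? '')
    .filter(Boolean)
    .sort()
    .join('\n');
  return createHash('md5').update(sorted).digest('hex');
}

// Hypothetical sample data: same titles in a different order, plus an empty signal.
const a = taskListHash([{ content: 'Buy milk' }, { content: ' Call mom ' }]);
const b = taskListHash([{ content: 'Call mom' }, { content: 'Buy milk' }, {}]);
const c = taskListHash([{ content: 'Buy milk' }]);

console.log(a === b, a === c); // prints: true false
```

Because only `content` feeds the digest, a change in priority or due date alone would not trigger a recompute under this scheme; whether that is intended is worth confirming with the author.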


@@ -0,0 +1,42 @@
import { Router, type Request, type Response, type IRouter } from 'express';
import { config } from '../config.js';
import { requireAuth } from '../middleware/session.js';
const router: IRouter = Router();
// Manifests change only on ml/serving restart, so a small in-process cache
// avoids hammering the upstream on every admin pageview / profile fetch.
const CACHE_TTL_MS = 60_000;
let _cache: { fetchedAt: number; payload: unknown } | null = null;
export function _resetRegistryCache() {
_cache = null;
}
export async function fetchRegistry(): Promise<unknown> {
if (_cache && Date.now() - _cache.fetchedAt < CACHE_TTL_MS) return _cache.payload;
const upstream = await fetch(`${config.ML_SERVING_URL}/agents/registry`, {
signal: AbortSignal.timeout(5000),
});
if (!upstream.ok) {
throw new Error(`ml/serving /agents/registry returned ${upstream.status}`);
}
const payload = await upstream.json();
_cache = { fetchedAt: Date.now(), payload };
return payload;
}
// ── GET /api/agents/registry ─────────────────────────────────────────────────
// Manifest list for every registered agent (ADR-0014). Auth-gated: manifests
// drive admin UI form rendering and feed the orchestrator eligibility filter.
router.get('/registry', requireAuth as any, async (_req: Request, res: Response) => {
try {
const payload = await fetchRegistry();
res.json(payload);
} catch (err: any) {
res.status(502).json({ error: 'ml/serving unavailable', detail: err.message });
}
});
export default router;
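
Review note: the module-level `_cache` above is a single-slot TTL cache. The sketch below shows the same expiry rule in a generic form with an injectable clock, which may make the behaviour easier to unit test. `createTtlCache`, the fake clock, and the counting loader are hypothetical names introduced for this example only.

```typescript
type Clock = () => number;

// Generic single-slot TTL cache around an async loader.
// `now` is injectable so the expiry logic can be exercised without real timers.
function createTtlCache<T>(load: () => Promise<T>, ttlMs: number, now: Clock = Date.now) {
  let slot: { fetchedAt: number; payload: T } | null = null;
  return {
    async get(): Promise<T> {
      if (slot && now() - slot.fetchedAt < ttlMs) return slot.payload;
      const payload = await load(); // if the loader throws, the previous slot is left untouched
      slot = { fetchedAt: now(), payload };
      return payload;
    },
    reset(): void {
      slot = null;
    },
  };
}

// Usage with a fake clock and a loader that counts its own invocations.
let fakeNow = 0;
let calls = 0;
const registry = createTtlCache(async () => ({ version: ++calls }), 60_000, () => fakeNow);
```

As in the diff, a failed refresh does not evict the stale payload, but the stale payload is also not served: the error propagates and the route answers 502.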


@@ -2,7 +2,7 @@ import { type Router as ExpressRouter, Router, Request, Response } from 'express
import * as client from 'openid-client'; import * as client from 'openid-client';
import { nanoid } from 'nanoid'; import { nanoid } from 'nanoid';
import { db } from '../db/index.js'; import { db } from '../db/index.js';
import { users, sessions } from '../db/schema.js'; import { users, sessions, userConsents } from '../db/schema.js';
import { eq } from 'drizzle-orm'; import { eq } from 'drizzle-orm';
import { config } from '../config.js'; import { config } from '../config.js';
import { logger } from '../logger.js'; import { logger } from '../logger.js';
@@ -104,7 +104,8 @@ router.get('/callback', async (req: Request, res: Response) => {
if (!user) { if (!user) {
const id = nanoid(); const id = nanoid();
await db.insert(users).values({ id, email, name, image, googleId, createdAt: now, consentGiven: true, consentAt: now }); await db.insert(users).values({ id, email, name, image, googleId, createdAt: now });
await db.insert(userConsents).values({ userId: id, consentKey: 'data:core', grantedAt: now });
[user] = await db.select().from(users).where(eq(users.id, id)).limit(1); [user] = await db.select().from(users).where(eq(users.id, id)).limit(1);
} }


@@ -8,11 +8,10 @@
* GET /api/bench/leaderboard/:experiment — leaderboard by (model, prompt) * GET /api/bench/leaderboard/:experiment — leaderboard by (model, prompt)
*/ */
import { Router, Request, Response } from "express"; import { Router, type Request, type Response, type IRouter } from "express";
import httpx from "httpx";
import * as process from "process"; import * as process from "process";
const router = Router(); const router: IRouter = Router();
const MLFLOW_URL = process.env.MLFLOW_URL || "http://mlflow:5000"; const MLFLOW_URL = process.env.MLFLOW_URL || "http://mlflow:5000";
const MLFLOW_USER = process.env.MLFLOW_TRACKING_USERNAME || "admin"; const MLFLOW_USER = process.env.MLFLOW_TRACKING_USERNAME || "admin";


@@ -1,7 +1,7 @@
import { type Router as ExpressRouter, Router, Request, Response } from 'express'; import { type Router as ExpressRouter, Router, Request, Response } from 'express';
import { nanoid } from 'nanoid'; import { nanoid } from 'nanoid';
import { db } from '../db/index.js'; import { db } from '../db/index.js';
import { integrationTokens } from '../db/schema.js'; import { integrationTokens, userConsents } from '../db/schema.js';
import { eq, and } from 'drizzle-orm'; import { eq, and } from 'drizzle-orm';
import { config } from '../config.js'; import { config } from '../config.js';
import { requireAuth, AuthenticatedRequest } from '../middleware/session.js'; import { requireAuth, AuthenticatedRequest } from '../middleware/session.js';
@@ -12,9 +12,41 @@ const TODOIST_OAUTH_URL = 'https://todoist.com/oauth/authorize';
const TODOIST_TOKEN_URL = 'https://todoist.com/oauth/access_token'; const TODOIST_TOKEN_URL = 'https://todoist.com/oauth/access_token';
const TODOIST_SCOPES = 'data:read_write'; const TODOIST_SCOPES = 'data:read_write';
const GOOGLE_AUTH_URL = 'https://accounts.google.com/o/oauth2/v2/auth';
const GOOGLE_TOKEN_URL = 'https://oauth2.googleapis.com/token';
const GOOGLE_REVOKE_URL = 'https://oauth2.googleapis.com/revoke';
const GOOGLE_HEALTH_SCOPES = [
'https://www.googleapis.com/auth/googlehealth.activity_and_fitness.readonly',
'https://www.googleapis.com/auth/googlehealth.health_metrics_and_measurements.readonly',
'https://www.googleapis.com/auth/googlehealth.sleep.readonly',
].join(' ');
// In-memory CSRF state store // In-memory CSRF state store
const pendingStates = new Map<string, { userId: string; redirectTo: string }>(); const pendingStates = new Map<string, { userId: string; redirectTo: string }>();
async function grantDataSourceConsent(userId: string, provider: string): Promise<void> {
const consentKey = `data:${provider}`;
const now = new Date().toISOString();
await db.insert(userConsents)
.values({ userId, consentKey, grantedAt: now, revokedAt: null })
.onConflictDoUpdate({
target: [userConsents.userId, userConsents.consentKey],
set: { grantedAt: now, revokedAt: null },
});
}
async function revokeDataSourceConsent(userId: string, provider: string): Promise<void> {
const consentKey = `data:${provider}`;
const now = new Date().toISOString();
await db.insert(userConsents)
.values({ userId, consentKey, grantedAt: now, revokedAt: now })
.onConflictDoUpdate({
target: [userConsents.userId, userConsents.consentKey],
set: { revokedAt: now },
});
}
/** GET /api/integrations — list connected integrations */ /** GET /api/integrations — list connected integrations */
router.get('/', requireAuth, async (req: AuthenticatedRequest, res: Response) => { router.get('/', requireAuth, async (req: AuthenticatedRequest, res: Response) => {
const tokens = await db const tokens = await db
@@ -100,10 +132,102 @@ router.get('/todoist/callback', async (req: Request, res: Response) => {
tokenStatus: 'active', tokenStatus: 'active',
connectedAt: now, connectedAt: now,
}); });
await grantDataSourceConsent(pending.userId, 'todoist');
res.redirect(`${config.WEB_BASE_URL}${pending.redirectTo}?connected=todoist`); res.redirect(`${config.WEB_BASE_URL}${pending.redirectTo}?connected=todoist`);
}); });
/** GET /api/integrations/google-health/connect — start Google Health API OAuth */
router.get('/google-health/connect', requireAuth, (req: AuthenticatedRequest, res: Response) => {
const state = nanoid();
pendingStates.set(state, {
userId: req.userId!,
redirectTo: (req.query.redirectTo as string) ?? '/connect',
});
setTimeout(() => pendingStates.delete(state), 10 * 60 * 1000);
const url = new URL(GOOGLE_AUTH_URL);
url.searchParams.set('client_id', config.GOOGLE_CLIENT_ID);
url.searchParams.set('redirect_uri', `${config.API_BASE_URL}/api/integrations/google-health/callback`);
url.searchParams.set('response_type', 'code');
url.searchParams.set('scope', GOOGLE_HEALTH_SCOPES);
url.searchParams.set('state', state);
url.searchParams.set('access_type', 'offline');
url.searchParams.set('prompt', 'consent');
res.redirect(url.toString());
});
/** GET /api/integrations/google-health/callback — Google returns here */
router.get('/google-health/callback', async (req: Request, res: Response) => {
const state = req.query.state as string;
const code = req.query.code as string;
const error = req.query.error as string | undefined;
if (error) {
res.status(400).json({ error: `Google denied access: ${error}` });
return;
}
const pending = pendingStates.get(state);
if (!pending) {
res.status(400).json({ error: 'Invalid or expired state' });
return;
}
pendingStates.delete(state);
const body = new URLSearchParams({
client_id: config.GOOGLE_CLIENT_ID,
client_secret: config.GOOGLE_CLIENT_SECRET,
code,
grant_type: 'authorization_code',
redirect_uri: `${config.API_BASE_URL}/api/integrations/google-health/callback`,
});
const tokenRes = await fetch(GOOGLE_TOKEN_URL, {
method: 'POST',
headers: { 'Content-Type': 'application/x-www-form-urlencoded', Accept: 'application/json' },
body: body.toString(),
});
if (!tokenRes.ok) {
const detail = await tokenRes.text().catch(() => '');
res.status(502).json({ error: `Failed to exchange Google token: ${detail}` });
return;
}
const tokenData = (await tokenRes.json()) as {
access_token: string;
refresh_token?: string;
expires_in: number;
};
const now = new Date();
const expiresAt = new Date(now.getTime() + tokenData.expires_in * 1000).toISOString();
await db
.delete(integrationTokens)
.where(
and(
eq(integrationTokens.userId, pending.userId),
eq(integrationTokens.provider, 'google-health'),
),
);
await db.insert(integrationTokens).values({
id: nanoid(),
userId: pending.userId,
provider: 'google-health',
accessToken: tokenData.access_token,
refreshToken: tokenData.refresh_token ?? null,
expiresAt,
tokenStatus: 'active',
connectedAt: now.toISOString(),
});
await grantDataSourceConsent(pending.userId, 'google-health');
res.redirect(`${config.WEB_BASE_URL}${pending.redirectTo}?connected=google-health`);
});
/** DELETE /api/integrations/:provider — revoke token */ /** DELETE /api/integrations/:provider — revoke token */
router.delete('/:provider', requireAuth, async (req: AuthenticatedRequest, res: Response) => { router.delete('/:provider', requireAuth, async (req: AuthenticatedRequest, res: Response) => {
const provider = String(req.params.provider); const provider = String(req.params.provider);
@@ -120,13 +244,18 @@ router.delete('/:provider', requireAuth, async (req: AuthenticatedRequest, res:
.limit(1); .limit(1);
if (token?.provider === 'todoist') { if (token?.provider === 'todoist') {
// Best-effort revocation
await fetch('https://api.todoist.com/sync/v9/access_tokens/revoke', { await fetch('https://api.todoist.com/sync/v9/access_tokens/revoke', {
method: 'POST', method: 'POST',
headers: { Authorization: `Bearer ${token.accessToken}` }, headers: { Authorization: `Bearer ${token.accessToken}` },
}).catch(() => {}); }).catch(() => {});
} }
if (token?.provider === 'google-health') {
await fetch(`${GOOGLE_REVOKE_URL}?token=${token.accessToken}`, { method: 'POST' }).catch(() => {});
}
await revokeDataSourceConsent(req.userId!, provider);
await db await db
.delete(integrationTokens) .delete(integrationTokens)
.where( .where(
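
Review note: the authorization redirect built in `/google-health/connect` can be pulled out into a pure function, which makes the query parameters easy to check in a test. The sketch below reuses the endpoint and scopes from this diff; `buildGoogleHealthAuthUrl`, the client ID, the base URL, and the state value are placeholders invented for the example.

```typescript
const GOOGLE_AUTH_URL = 'https://accounts.google.com/o/oauth2/v2/auth';
const GOOGLE_HEALTH_SCOPES = [
  'https://www.googleapis.com/auth/googlehealth.activity_and_fitness.readonly',
  'https://www.googleapis.com/auth/googlehealth.health_metrics_and_measurements.readonly',
  'https://www.googleapis.com/auth/googlehealth.sleep.readonly',
].join(' ');

// Hypothetical helper: same parameters as the route handler, but with no Express or config dependency.
function buildGoogleHealthAuthUrl(opts: { clientId: string; apiBaseUrl: string; state: string }): string {
  const url = new URL(GOOGLE_AUTH_URL);
  url.searchParams.set('client_id', opts.clientId);
  url.searchParams.set('redirect_uri', `${opts.apiBaseUrl}/api/integrations/google-health/callback`);
  url.searchParams.set('response_type', 'code');
  url.searchParams.set('scope', GOOGLE_HEALTH_SCOPES);
  url.searchParams.set('state', opts.state);
  url.searchParams.set('access_type', 'offline'); // ask for a refresh token
  url.searchParams.set('prompt', 'consent'); // force the consent screen so a refresh token is reissued
  return url.toString();
}

// Placeholder values for illustration only.
const authUrl = buildGoogleHealthAuthUrl({
  clientId: 'example-client-id',
  apiBaseUrl: 'https://api.example.test',
  state: 'abc123',
});
console.log(authUrl);
```

The `redirect_uri` here must match the one sent in the token exchange in the callback handler, so deriving both from one helper would remove a possible source of drift.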


@@ -0,0 +1,197 @@
/**
* GET /api/profile — read-through: user globals + prefs + contexts + consents
* PATCH /api/profile/prefs/:scope — upsert user_preferences rows (source='user')
* PATCH /api/profile/consents — grant or revoke consent keys
* PATCH /api/profile/contexts — activate/deactivate or create user contexts
*
* ADR-0014 step 4.
*/
import { Router, type Response, type IRouter } from 'express';
import { db } from '../db/index.js';
import {
users,
userPreferences,
userConsents,
userContexts,
} from '../db/schema.js';
import { eq, and, isNull } from 'drizzle-orm';
import { requireAuth, type AuthenticatedRequest } from '../middleware/session.js';
const router: IRouter = Router();
// ── GET /api/profile ─────────────────────────────────────────────────────────
router.get('/', requireAuth as any, async (req: AuthenticatedRequest, res: Response) => {
const userId = req.userId!;
const [user] = await db.select().from(users).where(eq(users.id, userId)).limit(1);
if (!user || user.deletedAt) {
res.status(404).json({ error: 'User not found' });
return;
}
const [prefs, consents, contexts] = await Promise.all([
db.select().from(userPreferences).where(eq(userPreferences.userId, userId)),
db.select().from(userConsents).where(eq(userConsents.userId, userId)),
db.select().from(userContexts).where(eq(userContexts.userId, userId)),
]);
// Group prefs by scope: { 'orchestrator': { key: value_json, … }, 'agent:foo': { … } }
const prefsByScope: Record<string, Record<string, unknown>> = {};
for (const p of prefs) {
if (!prefsByScope[p.scope]) prefsByScope[p.scope] = {};
try {
prefsByScope[p.scope][p.key] = JSON.parse(p.valueJson);
} catch {
prefsByScope[p.scope][p.key] = p.valueJson;
}
}
// Consents: include both active and revoked (callers can filter on revokedAt)
const consentMap: Record<string, { grantedAt: string; revokedAt: string | null }> = {};
for (const c of consents) {
consentMap[c.consentKey] = { grantedAt: c.grantedAt, revokedAt: c.revokedAt ?? null };
}
res.json({
user: {
id: user.id,
email: user.email,
name: user.name,
image: user.image,
tone: user.tone ?? null,
tipKinds: user.tipKindsJson ? JSON.parse(user.tipKindsJson) : null,
},
prefs: prefsByScope,
consents: consentMap,
contexts: contexts.map((c) => ({
name: c.name,
active: c.active,
schedule: c.scheduleJson ? JSON.parse(c.scheduleJson) : null,
createdAt: c.createdAt,
})),
});
});
// ── PATCH /api/profile/prefs/:scope ──────────────────────────────────────────
// Body: { [key]: value } — each key is upserted as source='user'.
router.patch('/prefs/:scope', requireAuth as any, async (req: AuthenticatedRequest, res: Response) => {
const userId = req.userId!;
const { scope } = req.params;
const body = req.body as Record<string, unknown>;
if (!scope || typeof scope !== 'string') {
res.status(400).json({ error: 'scope is required' });
return;
}
if (!body || typeof body !== 'object' || Array.isArray(body)) {
res.status(400).json({ error: 'body must be a JSON object' });
return;
}
const now = new Date().toISOString();
for (const [key, value] of Object.entries(body)) {
const valueJson = JSON.stringify(value);
await db
.insert(userPreferences)
.values({ userId, scope, key, valueJson, source: 'user', updatedAt: now })
.onConflictDoUpdate({
target: [userPreferences.userId, userPreferences.scope, userPreferences.key],
set: { valueJson, source: 'user', updatedAt: now },
});
}
res.json({ ok: true });
});
// ── PATCH /api/profile/consents ───────────────────────────────────────────────
// Body: { grant?: string[], revoke?: string[] }
router.patch('/consents', requireAuth as any, async (req: AuthenticatedRequest, res: Response) => {
const userId = req.userId!;
const { grant = [], revoke = [] } = req.body as { grant?: string[]; revoke?: string[] };
if (!Array.isArray(grant) || !Array.isArray(revoke)) {
res.status(400).json({ error: 'grant and revoke must be arrays' });
return;
}
const now = new Date().toISOString();
for (const key of grant) {
await db
.insert(userConsents)
.values({ userId, consentKey: key, grantedAt: now, revokedAt: null })
.onConflictDoUpdate({
target: [userConsents.userId, userConsents.consentKey],
set: { grantedAt: now, revokedAt: null },
});
}
for (const key of revoke) {
await db
.update(userConsents)
.set({ revokedAt: now })
.where(
and(
eq(userConsents.userId, userId),
eq(userConsents.consentKey, key),
isNull(userConsents.revokedAt),
),
);
}
res.json({ ok: true });
});
// ── PATCH /api/profile/contexts ───────────────────────────────────────────────
// Body: { name: string, active?: boolean, schedule?: object|null }
// Creates the row if it doesn't exist; toggles active / updates schedule.
router.patch('/contexts', requireAuth as any, async (req: AuthenticatedRequest, res: Response) => {
const userId = req.userId!;
const { name, active, schedule } = req.body as {
name?: string;
active?: boolean;
schedule?: unknown;
};
if (!name || typeof name !== 'string') {
res.status(400).json({ error: 'name is required' });
return;
}
const now = new Date().toISOString();
const scheduleJson = schedule !== undefined ? JSON.stringify(schedule) : undefined;
const existing = await db
.select()
.from(userContexts)
.where(and(eq(userContexts.userId, userId), eq(userContexts.name, name)))
.limit(1);
if (existing.length === 0) {
await db.insert(userContexts).values({
userId,
name,
active: active ?? false,
scheduleJson: scheduleJson ?? null,
createdAt: now,
});
} else {
const set: Partial<typeof userContexts.$inferInsert> = {};
if (active !== undefined) set.active = active;
if (scheduleJson !== undefined) set.scheduleJson = scheduleJson;
if (Object.keys(set).length > 0) {
await db
.update(userContexts)
.set(set)
.where(and(eq(userContexts.userId, userId), eq(userContexts.name, name)));
}
}
res.json({ ok: true });
});
export default router;
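
Review note: the grouping step in `GET /api/profile` parses each `valueJson` and falls back to the raw string when parsing fails. The sketch below extracts that logic into a standalone function, assuming rows shaped like the `user_preferences` columns used in this diff; `groupPrefsByScope` and the sample scopes, keys, and values are hypothetical.

```typescript
type PrefRow = { scope: string; key: string; valueJson: string };

// Groups preference rows as { scope: { key: parsedValue } }.
// Malformed JSON is kept as the raw string rather than dropped, matching the route above.
function groupPrefsByScope(rows: PrefRow[]): Record<string, Record<string, unknown>> {
  const byScope: Record<string, Record<string, unknown>> = {};
  for (const row of rows) {
    const bucket = byScope[row.scope] ?? (byScope[row.scope] = {});
    try {
      bucket[row.key] = JSON.parse(row.valueJson);
    } catch {
      bucket[row.key] = row.valueJson;
    }
  }
  return byScope;
}

// Hypothetical sample rows.
const grouped = groupPrefsByScope([
  { scope: 'orchestrator', key: 'max_tips', valueJson: '3' },
  { scope: 'agent:stars', key: 'enabled', valueJson: 'true' },
  { scope: 'agent:stars', key: 'note', valueJson: 'not json' },
]);
console.log(JSON.stringify(grouped));
```

Note that `loadAgentPrefs` in the agent-outputs diff handles the same failure differently, skipping malformed rows instead of returning the raw string; the two readers may be worth aligning.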


@@ -2,261 +2,162 @@ import { type Router as ExpressRouter, Router, Response } from 'express';
import { nanoid } from 'nanoid'; import { nanoid } from 'nanoid';
import { logger } from '../logger.js'; import { logger } from '../logger.js';
import { db } from '../db/index.js'; import { db } from '../db/index.js';
import { integrationTokens, tipFeedback, tipViews, tipScores } from '../db/schema.js'; import { tipFeedback, tipViews, tipScores, userPreferences } from '../db/schema.js';
import { eq, and, desc } from 'drizzle-orm'; import { eq, and, desc } from 'drizzle-orm';
import { requireAuth, AuthenticatedRequest } from '../middleware/session.js'; import { requireAuth, AuthenticatedRequest } from '../middleware/session.js';
import { config } from '../config.js'; import { config } from '../config.js';
import { bus } from '../events/bus.js'; import { bus } from '../events/bus.js';
import type { TipCandidate, Signal } from '@oo/shared-types'; import type { Tip, Signal } from '@oo/shared-types';
import { todoistSource, dueAgeDays } from '../signals/todoist.js'; import { todoistSource, dueAgeDays } from '../signals/todoist.js';
export { dueAgeDays }; export { dueAgeDays };
import { googleHealthSource } from '../signals/google-health.js';
import { SignalAggregator } from '../signals/aggregator.js'; import { SignalAggregator } from '../signals/aggregator.js';
import { getProfile, type Profile } from '../profile/builder.js'; import { getActiveAgentOutputs } from './agent-outputs.js';
import { getEligibleAgentIds } from '../profile/eligibility.js';
const router: ExpressRouter = Router(); const router: ExpressRouter = Router();
/** // ---------------------------------------------------------------------------
* Pick a prompt version for this request. `config.TIP_PROMPT_VERSION` is either // Fallback tips — shown when the AI service is unavailable
* empty (let ml/serving pick its default), a single version, or a comma-separated // ---------------------------------------------------------------------------
* list to rotate uniformly across requests so the #92 dashboard accumulates const FALLBACK_TIPS = [
* comparable buckets per variant. Exported for testing. "Take a moment to stretch and breathe — your body and mind will thank you.",
*/ "Write down one thing you're grateful for today.",
export function pickPromptVersion(): string | null { "Drink a glass of water. Small acts of self-care add up.",
const raw = config.TIP_PROMPT_VERSION.trim(); "Reach out to someone you haven't spoken to in a while.",
if (!raw) return null; "Close a tab you've been meaning to close for days.",
const versions = raw.split(',').map((v) => v.trim()).filter(Boolean); "Step outside for five minutes, even briefly.",
if (!versions.length) return null; "Put your phone down for the next 30 minutes and see how it feels.",
return versions[Math.floor(Math.random() * versions.length)] ?? null; "Do the smallest possible version of a task you've been avoiding.",
"Tidy one small area — a clear space helps a clear mind.",
"Pause and ask: what would make today feel like a win?",
"Rest is productive. Give yourself permission to recharge.",
"You don't have to do everything today. Pick one thing and do it well.",
];
function randomFallbackTip(): import('@oo/shared-types').Tip {
const content = FALLBACK_TIPS[Math.floor(Math.random() * FALLBACK_TIPS.length)];
return {
id: `fallback:${nanoid()}`,
content,
source: 'fallback',
kind: 'advice',
rationale: 'AI service issues',
createdAt: new Date().toISOString(),
};
} }
// --------------------------------------------------------------------------- // ---------------------------------------------------------------------------
// Signal aggregator — register sources here as new integrations are added // Signal aggregator — register sources here as new integrations are added
// --------------------------------------------------------------------------- // ---------------------------------------------------------------------------
export const aggregator = new SignalAggregator().register(todoistSource); export const aggregator = new SignalAggregator().register(todoistSource).register(googleHealthSource);
export const _clearSignalCacheForTests = () => {
// ---------------------------------------------------------------------------
// Candidate cache — stores the last assembled candidate set per user so the
// feedback handler can look up features for reward delivery.
// ---------------------------------------------------------------------------
const candidateCache = new Map<string, TipCandidate[]>();
export const _clearCandidateCacheForTests = () => {
candidateCache.clear();
todoistSource.clearCache(); todoistSource.clearCache();
googleHealthSource.clearCache();
}; };
// --------------------------------------------------------------------------- // ---------------------------------------------------------------------------
// Shadow-policy registry // Orchestrator: fetch agent snippets + call ml/serving /recommend
// ---------------------------------------------------------------------------
const shadowPolicies = new Map<string, { active: boolean }>([
// egreedy-v2 promoted to active policy (ADR-0012). Shadow entry kept for
// rollback toggle; leave disabled in normal operation.
['egreedy-v2-shadow', { active: false }],
]);
export function getShadowPolicies() {
return Array.from(shadowPolicies.entries()).map(([name, s]) => ({ name, ...s }));
}
export function setPolicyActive(name: string, active: boolean): boolean {
if (!shadowPolicies.has(name)) return false;
shadowPolicies.set(name, { active });
return true;
}
// ---------------------------------------------------------------------------
// Signal → TipCandidate conversion
// ---------------------------------------------------------------------------
function signalToCandidate(signal: Signal): TipCandidate {
return {
id: signal.id,
content: signal.content,
source: signal.source as TipCandidate['source'],
kind: signal.kind as TipCandidate['kind'],
sourceId: (signal.metadata.todoistId as string | undefined) ?? undefined,
createdAt: signal.timestamp,
features: signal.features,
};
}
// ---------------------------------------------------------------------------
// Stage 2: score candidates via ml/serving bandit
// --------------------------------------------------------------------------- // ---------------------------------------------------------------------------
async function remotePolicy( interface OrchestratorResult {
userId: string, tip: Tip;
tasks: TipCandidate[],
profile: Profile,
traceparent?: string,
): Promise<{ tipId: string; score: number; policy: string } | null> {
const hour = new Date().getHours();
const dayOfWeek = new Date().getDay();
const body = {
user_id: userId,
candidates: tasks.map((t) => ({
id: t.id,
content: t.content,
source: t.source,
source_id: t.sourceId ?? null,
features: t.features,
})),
context: { hour_of_day: hour, day_of_week: dayOfWeek },
profile_features: profile,
};
try {
const res = await fetch(`${config.ML_SERVING_URL}/score/egreedy/v2`, {
method: 'POST',
headers: { 'Content-Type': 'application/json', ...(traceparent ? { traceparent } : {}) },
body: JSON.stringify(body),
signal: AbortSignal.timeout(3000),
});
if (!res.ok) return null;
const data = (await res.json()) as { tip_id: string; score: number };
return { tipId: data.tip_id, score: data.score, policy: 'egreedy-v2' };
} catch {
return null;
}
}
function randomPolicy(candidates: TipCandidate[]): TipCandidate | null {
if (!candidates.length) return null;
return candidates[Math.floor(Math.random() * candidates.length)];
}
// ---------------------------------------------------------------------------
// Stage 1b: fetch LLM candidates from ml/serving /generate
// ---------------------------------------------------------------------------
interface LlmCandidate {
id: string;
content: string;
rationale?: string;
}
interface LlmGenerateResult {
candidates: TipCandidate[];
promptVersion: string | null;
model: string | null; model: string | null;
agentIds: string[];
} }
async function fetchLlmCandidates( async function loadOrchestratorPref<T>(userId: string, key: string): Promise<T | undefined> {
const rows = await db
.select({ valueJson: userPreferences.valueJson })
.from(userPreferences)
.where(and(eq(userPreferences.userId, userId), eq(userPreferences.scope, 'orchestrator'), eq(userPreferences.key, key)))
.limit(1);
if (!rows.length) return undefined;
try { return JSON.parse(rows[0].valueJson) as T; } catch { return undefined; }
}
type OrchestratorOutcome = { ok: true; result: OrchestratorResult } | { ok: false };
async function fetchOrchestratorTip(
userId: string, userId: string,
signals: Signal[], signals: Signal[],
hour: number, hour: number,
dayOfWeek: number, dayOfWeek: number,
promptVersion: string | null,
profile: Profile,
traceparent?: string, traceparent?: string,
): Promise<LlmGenerateResult> { recentTip?: string,
try { ): Promise<OrchestratorOutcome> {
const [allAgentRows, eligibleIds, scienceDestiny] = await Promise.all([
getActiveAgentOutputs(userId),
getEligibleAgentIds(userId),
loadOrchestratorPref<number>(userId, 'science_destiny'),
]);
const agentOutputs = allAgentRows
.filter((r) => eligibleIds.has(r.agentId))
.map((r) => ({ agent_id: r.agentId, prompt_text: r.promptText }));
const tasks = signals.slice(0, 10).map((s) => ({ const tasks = signals.slice(0, 10).map((s) => ({
content: s.content, content: s.content,
priority: s.features.priority, priority: s.features.priority,
is_overdue: s.features.is_overdue, is_overdue: s.features.is_overdue,
task_age_days: s.features.task_age_days, task_age_days: s.features.task_age_days,
})); }));
const res = await fetch(`${config.ML_SERVING_URL}/generate`, {
try {
const res = await fetch(`${config.ML_SERVING_URL}/recommend`, {
method: 'POST', method: 'POST',
headers: { 'Content-Type': 'application/json', ...(traceparent ? { traceparent } : {}) }, headers: { 'Content-Type': 'application/json', ...(traceparent ? { traceparent } : {}) },
body: JSON.stringify({ body: JSON.stringify({ user_id: userId, agent_outputs: agentOutputs, tasks, hour_of_day: hour, day_of_week: dayOfWeek, science_destiny: scienceDestiny ?? 50, recent_tip: recentTip ?? null }),
user_id: userId,
context: { tasks, hour_of_day: hour, day_of_week: dayOfWeek },
n: 3,
profile_features: profile,
...(promptVersion ? { prompt_version: promptVersion } : {}),
}),
signal: AbortSignal.timeout(15_000), signal: AbortSignal.timeout(15_000),
}); });
if (!res.ok) return { candidates: [], promptVersion: null, model: null }; if (!res.ok) return { ok: false };
const data = (await res.json()) as { const data = (await res.json()) as {
candidates: LlmCandidate[]; tip: { id: string; content: string; rationale?: string };
model?: string; model?: string;
prompt_version?: string;
}; };
const now = new Date().toISOString(); const now = new Date().toISOString();
const candidates: TipCandidate[] = data.candidates.map((c) => ({ return {
id: `llm:${c.id}`, ok: true,
content: c.content, result: {
tip: {
id: `llm:${data.tip.id}`,
content: data.tip.content,
source: 'llm' as const, source: 'llm' as const,
kind: 'advice' as const, kind: 'advice' as const,
rationale: c.rationale, rationale: data.tip.rationale,
createdAt: now, createdAt: now,
features: { is_overdue: false, task_age_days: 0, priority: 1 }, },
}));
return {
candidates,
promptVersion: data.prompt_version ?? null,
model: data.model ?? null, model: data.model ?? null,
agentIds: agentOutputs.map((a) => a.agent_id),
},
}; };
} catch { } catch {
return { candidates: [], promptVersion: null, model: null }; return { ok: false };
} }
} }
// --------------------------------------------------------------------------- // ---------------------------------------------------------------------------
// POST /api/recommend // POST /api/recommend
// Pipeline: [Stage 1] assemble candidates → [Stage 2] score → [Stage 3] serve // Pipeline: fetch signals → orchestrator → serve; random fallback on failure
// --------------------------------------------------------------------------- // ---------------------------------------------------------------------------
router.post('/recommend', requireAuth, async (req: AuthenticatedRequest, res: Response) => { router.post('/recommend', requireAuth, async (req: AuthenticatedRequest, res: Response) => {
const hour = new Date().getHours(); const hour = new Date().getHours();
const dayOfWeek = new Date().getDay(); const dayOfWeek = new Date().getDay();
const { recent_tip: recentTip } = req.body as { recent_tip?: string };
// Fail fast if no source tokens are connected
const anyToken = await db
.select({ id: integrationTokens.id })
.from(integrationTokens)
.where(eq(integrationTokens.userId, req.userId!))
.limit(1);
if (!anyToken.length) {
res.status(422).json({ error: 'No integrations connected' });
return;
}
// Stage 1: assemble candidates — aggregated signals + LLM-generated advice (parallel)
const signals = await aggregator.fetchAll(req.userId!); const signals = await aggregator.fetchAll(req.userId!);
// Refresh + load the user-level profile feature dict (lazy TTL refresh).
const profile = await getProfile(req.userId!);
const signalCandidates = signals.map(signalToCandidate);
const requestedPromptVersion = pickPromptVersion();
const llmResult = await fetchLlmCandidates(
req.userId!,
signals,
hour,
dayOfWeek,
requestedPromptVersion,
profile,
req.traceparent,
);
const allCandidates: TipCandidate[] = [...signalCandidates, ...llmResult.candidates];
if (!allCandidates.length) {
res.status(204).end();
return;
}
// Cache candidates so the feedback handler can retrieve features
candidateCache.set(req.userId!, allCandidates);
const t0 = Date.now(); const t0 = Date.now();
const outcome = await fetchOrchestratorTip(req.userId!, signals, hour, dayOfWeek, req.traceparent, recentTip);
// Stage 2: score — egreedy bandit with random fallback
const scored = await remotePolicy(req.userId!, allCandidates, profile, req.traceparent);
const latencyMs = Date.now() - t0; const latencyMs = Date.now() - t0;
const tip = scored
? (allCandidates.find((t) => t.id === scored.tipId) ?? randomPolicy(allCandidates))
: randomPolicy(allCandidates);
if (!tip) { if (!outcome.ok) {
res.status(204).end(); res.json({ tip: randomFallbackTip() });
return; return;
} }
// Stage 3: serve + log const orchestrated = outcome.result;
const policy = scored ? scored.policy : 'random'; const tip = orchestrated.tip;
const isLlmTip = tip.source === 'llm'; const policy = 'orchestrator';
const servedAt = new Date().toISOString(); const servedAt = new Date().toISOString();
await db.insert(tipViews).values({ id: nanoid(), userId: req.userId!, tipId: tip.id, servedAt }); await db.insert(tipViews).values({ id: nanoid(), userId: req.userId!, tipId: tip.id, servedAt });
@@ -266,19 +167,13 @@ router.post('/recommend', requireAuth, async (req: AuthenticatedRequest, res: Re
userId: req.userId!, userId: req.userId!,
tipId: tip.id, tipId: tip.id,
policy, policy,
mlScore: scored ? Math.round(scored.score * 1000) : null, mlScore: null,
featuresJson: JSON.stringify({ featuresJson: JSON.stringify({ agent_ids: orchestrated.agentIds, hour_of_day: hour, day_of_week: dayOfWeek }),
...tip.features, candidateCount: 1,
hour_of_day: hour,
day_of_week: dayOfWeek,
}),
candidateCount: allCandidates.length,
latencyMs, latencyMs,
servedAt, servedAt,
// Trust the version/model the generator reports; falls back to whatever promptVersion: 'v4-orchestrator',
// we asked for so the bucket isn't mislabeled if /generate omits it. llmModel: orchestrated.model,
promptVersion: isLlmTip ? (llmResult.promptVersion ?? requestedPromptVersion ?? null) : null,
llmModel: isLlmTip ? (llmResult.model ?? 'tip-generator') : null,
tipKind: tip.kind ?? null, tipKind: tip.kind ?? null,
}); });
@@ -289,56 +184,6 @@ router.post('/recommend', requireAuth, async (req: AuthenticatedRequest, res: Re
servedAt, servedAt,
}); });
// Run shadow policies (fire-and-forget, no effect on user)
for (const [name, s] of shadowPolicies) {
if (!s.active) continue;
if (name.startsWith('random')) {
const shadowTip = randomPolicy(allCandidates);
bus.publish('signals.tip.served', {
userId: req.userId!,
tipId: shadowTip?.id ?? 'none',
policy: `shadow:${name}`,
servedAt,
});
} else if (name === 'egreedy-v2-shadow') {
// Call v2 endpoint with the same payload used for the active policy.
// No reward is delivered — offline sim is the reward measurement for shadow.
void (async () => {
try {
const body = {
user_id: req.userId!,
candidates: allCandidates.map((t) => ({
id: t.id,
content: t.content,
source: t.source,
source_id: t.sourceId ?? null,
features: t.features,
})),
context: { hour_of_day: hour, day_of_week: dayOfWeek },
profile_features: profile,
};
const res = await fetch(`${config.ML_SERVING_URL}/score/egreedy/v2`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(body),
signal: AbortSignal.timeout(3000),
});
if (res.ok) {
const data = (await res.json()) as { tip_id: string };
bus.publish('signals.tip.served', {
userId: req.userId!,
tipId: data.tip_id,
policy: `shadow:${name}`,
servedAt,
});
}
} catch {
// shadow is best-effort
}
})();
}
}
res.json({ tip }); res.json({ tip });
}); });
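
The new /recommend handler serves the orchestrator's tip when the call succeeds and a random canned tip otherwise. A minimal sketch of that decision, assuming simplified stand-in types (`Tip` here is a reduced version of the one in `@oo/shared-types`) and a hypothetical `resolveTip` helper with an injectable random source, neither of which is in the diff:

```typescript
// Simplified stand-in: the real Tip type comes from @oo/shared-types.
interface Tip {
  id: string;
  content: string;
  source: 'llm' | 'fallback';
}

// Same discriminated-union idea as OrchestratorOutcome in the diff, reduced to the tip alone.
type Outcome = { ok: true; tip: Tip } | { ok: false };

// Illustrative subset of the fallback messages.
const FALLBACKS: string[] = [
  'Drink a glass of water. Small acts of self-care add up.',
  'Step outside for five minutes, even briefly.',
];

// `rand` is injectable (an assumption of this sketch) so the random choice is testable.
function resolveTip(outcome: Outcome, rand: () => number = Math.random): Tip {
  if (outcome.ok) return outcome.tip;
  const index = Math.floor(rand() * FALLBACKS.length);
  const content = FALLBACKS[index] ?? FALLBACKS[0] ?? '';
  return { id: `fallback:${index}`, content, source: 'fallback' };
}
```

Modelling failure as `{ ok: false }` rather than `null` lets the compiler narrow the success branch, which is presumably why the route checks `outcome.ok` before touching `outcome.result`.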
@@ -359,60 +204,11 @@ export function inferReward(action: string, dwellMs: number | null): number {
if (action === 'snooze') return 0.1; if (action === 'snooze') return 0.1;
if (action === 'helpful') return 0.5; if (action === 'helpful') return 0.5;
if (action === 'not_helpful') return -0.5; if (action === 'not_helpful') return -0.5;
// done — use dwell time if (dwellMs === null || dwellMs < 0) return 0.5;
if (dwellMs === null || dwellMs < 0) return 0.5; // unknown dwell: neutral positive if (dwellMs < 15_000) return -0.3;
if (dwellMs < 15_000) return -0.3; // stale / reflex if (dwellMs < 120_000) return 1.0;
if (dwellMs < 120_000) return 1.0; // magic zone if (dwellMs < 600_000) return 0.6;
if (dwellMs < 600_000) return 0.6; // good return 0.3;
return 0.3; // eventually
}
// ---------------------------------------------------------------------------
// Reward delivery with retry (bug #75 — was fire-and-forget)
// ---------------------------------------------------------------------------
async function sendRewardWithRetry(
userId: string,
tipId: string,
reward: number,
features: TipCandidate['features'],
profile: Profile,
traceparent?: string,
): Promise<void> {
const body = JSON.stringify({
user_id: userId,
tip_id: tipId,
reward,
features,
day_of_week: new Date().getDay(),
profile_features: profile,
});
for (let attempt = 1; attempt <= 3; attempt++) {
try {
const res = await fetch(`${config.ML_SERVING_URL}/reward/egreedy/v2`, {
method: 'POST',
headers: { 'Content-Type': 'application/json', ...(traceparent ? { traceparent } : {}) },
body,
signal: AbortSignal.timeout(3000),
});
if (res.ok) return;
throw new Error(`HTTP ${res.status}`);
} catch (err: any) {
if (attempt === 3) {
logger.error({ tipId, err }, 'reward: failed after 3 attempts');
bus.publish('signals.tip.reward_failed', {
userId,
tipId,
reward,
attempts: 3,
error: err.message,
failedAt: new Date().toISOString(),
});
return;
}
await new Promise((r) => setTimeout(r, 250 * Math.pow(2, attempt)));
}
}
} }
// --------------------------------------------------------------------------- // ---------------------------------------------------------------------------
@@ -429,7 +225,6 @@ router.post('/tip/:id/feedback', requireAuth, async (req: AuthenticatedRequest,
return; return;
} }
// Compute dwell time from the most recent tipViews record for this user+tip
let dwellMs: number | null = null; let dwellMs: number | null = null;
const [lastView] = await db const [lastView] = await db
.select({ servedAt: tipViews.servedAt }) .select({ servedAt: tipViews.servedAt })
@@ -455,11 +250,6 @@ router.post('/tip/:id/feedback', requireAuth, async (req: AuthenticatedRequest,
createdAt: now.toISOString(), createdAt: now.toISOString(),
}); });
// Look up cached candidate for reward features; invalidate after
const cached = candidateCache.get(req.userId!);
const candidate = cached?.find((t) => t.id === tipId);
candidateCache.delete(req.userId!);
bus.publish('signals.tip.feedback', { bus.publish('signals.tip.feedback', {
userId: req.userId!, userId: req.userId!,
tipId, tipId,
@@ -469,13 +259,6 @@ router.post('/tip/:id/feedback', requireAuth, async (req: AuthenticatedRequest,
createdAt: now.toISOString(), createdAt: now.toISOString(),
}); });
if (candidate) {
// Re-fetch profile for the v2 ridge update; TTL cache makes this near-instant.
const profile = await getProfile(req.userId!);
sendRewardWithRetry(req.userId!, tipId, reward, candidate.features, profile, req.traceparent);
}
// Delegate action to the owning signal source (e.g. mark done in Todoist)
await aggregator.act(req.userId!, tipId, action); await aggregator.act(req.userId!, tipId, action);
res.json({ ok: true }); res.json({ ok: true });
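
The dwell-time thresholds in `inferReward` are easier to check in isolation. The sketch below copies only that tail of the function, using the values visible in the hunk above; `dwellReward` is a hypothetical name, and the reading that this branch applies to the remaining `done` action follows the comment removed in this diff.

```typescript
// Reward for a completed tip as a function of dwell time
// (milliseconds between serving the tip and receiving feedback).
function dwellReward(dwellMs: number | null): number {
  if (dwellMs === null || dwellMs < 0) return 0.5; // unknown dwell: neutral positive
  if (dwellMs < 15_000) return -0.3; // under 15 s: likely a reflex tap
  if (dwellMs < 120_000) return 1.0; // 15 s up to 2 min
  if (dwellMs < 600_000) return 0.6; // 2 min up to 10 min
  return 0.3; // 10 min or more
}
```

Each bound is exclusive on the upper side, so a dwell of exactly 15 000 ms already falls in the highest-reward bucket.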


@@ -1,7 +1,7 @@
 import { type Router as ExpressRouter, Router, Response } from 'express';
 import { db } from '../db/index.js';
-import { users, integrationTokens, tipFeedback, tipViews, sessions } from '../db/schema.js';
+import { users, integrationTokens, tipFeedback, tipViews, sessions, userConsents } from '../db/schema.js';
-import { eq } from 'drizzle-orm';
+import { eq, and, isNull } from 'drizzle-orm';
 import { requireAuth, AuthenticatedRequest } from '../middleware/session.js';
 const router: ExpressRouter = Router();
@@ -20,16 +20,19 @@ router.get('/me', requireAuth, async (req: AuthenticatedRequest, res: Response)
 image: user.image,
 role: user.role,
 createdAt: user.createdAt,
-consentGiven: user.consentGiven,
 });
 });
-/** POST /api/user/consent — record consent */
+/** POST /api/user/consent — record data:core consent */
 router.post('/consent', requireAuth, async (req: AuthenticatedRequest, res: Response) => {
+const now = new Date().toISOString();
 await db
-.update(users)
-.set({ consentGiven: true, consentAt: new Date().toISOString() })
-.where(eq(users.id, req.userId!));
+.insert(userConsents)
+.values({ userId: req.userId!, consentKey: 'data:core', grantedAt: now })
+.onConflictDoUpdate({
+target: [userConsents.userId, userConsents.consentKey],
+set: { grantedAt: now, revokedAt: null },
+});
 res.json({ ok: true });
 });
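
The consent writes in this change follow two rules: granting upserts the row and clears any revocation, and revoking (the guarded update earlier in this diff) only touches rows that are not already revoked. A minimal in-memory sketch of those rules, assuming a hypothetical `ConsentStore` map in place of the `user_consents` table and a hypothetical `isActive` reader that is not part of the diff:

```typescript
// Hypothetical in-memory stand-in for the user_consents table, keyed by "userId|consentKey".
interface ConsentRow {
  grantedAt: string;
  revokedAt: string | null;
}
type ConsentStore = Map<string, ConsentRow>;

const rowKey = (userId: string, consentKey: string): string => `${userId}|${consentKey}`;

// Mirrors insert + onConflictDoUpdate: always ends with a fresh grant and no revocation.
function grant(store: ConsentStore, userId: string, consentKey: string, now: string): void {
  store.set(rowKey(userId, consentKey), { grantedAt: now, revokedAt: null });
}

// Mirrors the isNull(revokedAt) guard: an already revoked row keeps its original timestamp.
function revoke(store: ConsentStore, userId: string, consentKey: string, now: string): void {
  const row = store.get(rowKey(userId, consentKey));
  if (row && row.revokedAt === null) row.revokedAt = now;
}

// Hypothetical reader: a consent is active when a row exists and is not revoked.
function isActive(store: ConsentStore, userId: string, consentKey: string): boolean {
  const row = store.get(rowKey(userId, consentKey));
  return row !== undefined && row.revokedAt === null;
}
```

Keeping `revokedAt` as a nullable timestamp, instead of deleting the row, preserves when the consent was withdrawn while still allowing a later grant to reactivate it.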


@@ -0,0 +1,166 @@
/**
* Tests for the agent pre-compute scheduler (signals/agent-scheduler.ts).
*
* Key behaviour under test: runCycle calls getEligibleAgentIds per user and
* skips computeAndStore for agents the user hasn't consented to.
*/
import { describe, it, expect, vi, beforeEach, afterEach } from 'vitest';
vi.mock('../../logger.js', () => ({
logger: { info: vi.fn(), warn: vi.fn(), error: vi.fn(), fatal: vi.fn() },
}));
import { logger } from '../../logger.js';
// ── active-user query: db.selectDistinct(...).from(...).where(...) ──────────
let activeUsers: { userId: string }[] = [];
const userWhereMock = vi.fn(async () => activeUsers);
const userFromMock = vi.fn(() => ({ where: userWhereMock }));
const selectDistinctMock = vi.fn(() => ({ from: userFromMock }));
// ── purge: db.delete(...).where(...) ────────────────────────────────────────
const deleteWhereMock = vi.fn(async () => ({}));
const deleteMock = vi.fn(() => ({ where: deleteWhereMock }));
vi.mock('../../db/index.js', () => ({
db: { selectDistinct: selectDistinctMock, delete: deleteMock },
}));
vi.mock('../../db/schema.js', () => ({
agentOutputs: { expiresAt: 'expires_at' },
tipViews: { userId: 'user_id', servedAt: 'served_at' },
}));
vi.mock('drizzle-orm', () => ({
gt: vi.fn(),
lt: vi.fn(),
and: vi.fn(),
eq: vi.fn(),
isNull: vi.fn(),
}));
vi.mock('../../config.js', () => ({ config: { ML_SERVING_URL: 'http://ml' } }));
// ── computeAndStore — tracks which (user, agent) pairs were computed ────────
const computeAndStoreMock = vi.fn(async () => {});
vi.mock('../../routes/agent-outputs.js', () => ({
computeAndStore: computeAndStoreMock,
}));
// ── eligibility — replaceable per test ─────────────────────────────────────
let eligibleIds: Set<string> = new Set();
const getEligibleAgentIdsMock = vi.fn(async (_userId: string) => eligibleIds);
vi.mock('../../profile/eligibility.js', () => ({
getEligibleAgentIds: getEligibleAgentIdsMock,
}));
// ml-serving /health — return a fixed agent list
global.fetch = vi.fn(async () => ({
ok: true,
json: async () => ({ agents: ['overdue-task', 'momentum', 'time-of-day'] }),
})) as unknown as typeof fetch;
beforeEach(() => {
activeUsers = [];
eligibleIds = new Set();
computeAndStoreMock.mockClear();
getEligibleAgentIdsMock.mockClear();
userWhereMock.mockClear();
deleteWhereMock.mockClear();
vi.clearAllMocks();
vi.useFakeTimers();
// re-apply default implementations, since individual tests override them
// (clearAllMocks only resets call history, not implementations)
userWhereMock.mockImplementation(async () => activeUsers);
getEligibleAgentIdsMock.mockImplementation(async () => eligibleIds);
computeAndStoreMock.mockResolvedValue(undefined);
deleteWhereMock.mockResolvedValue({});
global.fetch = vi.fn(async () => ({
ok: true,
json: async () => ({ agents: ['overdue-task', 'momentum', 'time-of-day'] }),
})) as unknown as typeof fetch;
});
afterEach(() => {
vi.useRealTimers();
});
describe('startAgentPrecomputeScheduler', () => {
it('skips computeAndStore for agents not in the eligibility set', async () => {
activeUsers = [{ userId: 'alice' }];
eligibleIds = new Set(['momentum']); // only momentum consented
const { startAgentPrecomputeScheduler } = await import('../agent-scheduler.js');
startAgentPrecomputeScheduler(60_000);
// advance past the scheduler's 15 s initial delay so the first cycle runs
await vi.advanceTimersByTimeAsync(16_000);
await Promise.resolve();
const computed = computeAndStoreMock.mock.calls.map((c) => c[1]);
expect(computed).toEqual(['momentum']);
expect(computed).not.toContain('overdue-task');
expect(computed).not.toContain('time-of-day');
});
it('skips all agents when eligibility set is empty', async () => {
activeUsers = [{ userId: 'bob' }];
eligibleIds = new Set(); // no consents
const { startAgentPrecomputeScheduler } = await import('../agent-scheduler.js');
startAgentPrecomputeScheduler(60_000);
await vi.advanceTimersByTimeAsync(16_000);
await Promise.resolve();
expect(computeAndStoreMock).not.toHaveBeenCalled();
expect(logger.info).toHaveBeenCalledWith(
expect.objectContaining({ skipped: 3, ok: 0 }),
'agent-scheduler: cycle complete',
);
});
it('computes all agents when all are eligible', async () => {
activeUsers = [{ userId: 'carol' }];
eligibleIds = new Set(['overdue-task', 'momentum', 'time-of-day']);
const { startAgentPrecomputeScheduler } = await import('../agent-scheduler.js');
startAgentPrecomputeScheduler(60_000);
await vi.advanceTimersByTimeAsync(16_000);
await Promise.resolve();
expect(computeAndStoreMock).toHaveBeenCalledTimes(3);
expect(logger.info).toHaveBeenCalledWith(
expect.objectContaining({ ok: 3, skipped: 0 }),
'agent-scheduler: cycle complete',
);
});
it('skips entire user when eligibility check throws', async () => {
activeUsers = [{ userId: 'dave' }];
getEligibleAgentIdsMock.mockRejectedValueOnce(new Error('db timeout'));
const { startAgentPrecomputeScheduler } = await import('../agent-scheduler.js');
startAgentPrecomputeScheduler(60_000);
await vi.advanceTimersByTimeAsync(16_000);
await Promise.resolve();
expect(computeAndStoreMock).not.toHaveBeenCalled();
expect(logger.error).toHaveBeenCalledWith(
expect.objectContaining({ err: expect.anything(), userId: 'dave' }),
'agent-scheduler: eligibility check failed, skipping user',
);
});
it('checks eligibility independently per user', async () => {
activeUsers = [{ userId: 'u1' }, { userId: 'u2' }];
getEligibleAgentIdsMock.mockImplementation(async (userId: string) =>
userId === 'u1' ? new Set(['momentum']) : new Set(['overdue-task', 'time-of-day']),
);
const { startAgentPrecomputeScheduler } = await import('../agent-scheduler.js');
startAgentPrecomputeScheduler(60_000);
await vi.advanceTimersByTimeAsync(16_000);
await Promise.resolve();
const u1Calls = computeAndStoreMock.mock.calls.filter((c) => c[0] === 'u1').map((c) => c[1]);
const u2Calls = computeAndStoreMock.mock.calls.filter((c) => c[0] === 'u2').map((c) => c[1]);
expect(u1Calls).toEqual(['momentum']);
expect(u2Calls.sort()).toEqual(['overdue-task', 'time-of-day']);
});
});


@@ -0,0 +1,119 @@
/**
* Agent pre-compute scheduler (ADR-0013, Step 5).
*
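* By default every 15 minutes (the first cycle runs 15 seconds after start):
* for each user who viewed a tip in the last 48 hours, run every sub-agent the
* user is eligible for and store its prompt snippet in agent_outputs. Agents
* the user is not eligible for are skipped and counted as such.
* Also purges rows that expired more than 24 hours ago.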
*
* Agent IDs are fetched from ml/serving /health at start, falling back to
* a hardcoded list if ml/serving is not yet reachable.
*/
import { db } from '../db/index.js';
import { agentOutputs, tipViews } from '../db/schema.js';
import { gt, lt } from 'drizzle-orm';
import { logger } from '../logger.js';
import { config } from '../config.js';
import { computeAndStore } from '../routes/agent-outputs.js';
import { getEligibleAgentIds } from '../profile/eligibility.js';
const FALLBACK_AGENT_IDS = [
'overdue-task',
'momentum',
'time-of-day',
'recent-patterns',
'focus-area',
];
const DEFAULT_INTERVAL_MS = 15 * 60 * 1000;
async function fetchAgentIds(): Promise<string[]> {
try {
const res = await fetch(`${config.ML_SERVING_URL}/health`, {
signal: AbortSignal.timeout(5_000),
});
if (!res.ok) return FALLBACK_AGENT_IDS;
const data = (await res.json()) as { agents?: string[] };
return data.agents?.length ? data.agents : FALLBACK_AGENT_IDS;
} catch {
return FALLBACK_AGENT_IDS;
}
}
async function getActiveUserIds(): Promise<string[]> {
const cutoff = new Date(Date.now() - 48 * 60 * 60 * 1000).toISOString();
const rows = await db
.selectDistinct({ userId: tipViews.userId })
.from(tipViews)
.where(gt(tipViews.servedAt, cutoff));
return rows.map((r) => r.userId);
}
async function purgeExpired(): Promise<void> {
const cutoff = new Date(Date.now() - 24 * 60 * 60 * 1000).toISOString();
await db.delete(agentOutputs).where(lt(agentOutputs.expiresAt, cutoff));
}
async function runCycle(agentIds: string[]): Promise<void> {
let userIds: string[];
try {
userIds = await getActiveUserIds();
} catch (err: any) {
logger.error({ err }, 'agent-scheduler: failed to query active users');
return;
}
if (!userIds.length) return;
let ok = 0;
let failed = 0;
let skipped = 0;
for (const userId of userIds) {
let eligible: Set<string>;
try {
eligible = await getEligibleAgentIds(userId);
} catch (err: any) {
logger.error({ err, userId }, 'agent-scheduler: eligibility check failed, skipping user');
skipped += agentIds.length;
continue;
}
for (const agentId of agentIds) {
if (!eligible.has(agentId)) {
skipped++;
continue;
}
try {
await computeAndStore(userId, agentId);
ok++;
} catch (err: any) {
failed++;
logger.error({ err, userId, agentId }, 'agent-scheduler: compute error');
}
}
}
try {
await purgeExpired();
} catch (err: any) {
logger.error({ err }, 'agent-scheduler: purge failed');
}
logger.info(
{ ok, failed, skipped, users: userIds.length, agents: agentIds.length },
'agent-scheduler: cycle complete',
);
}
export async function startAgentPrecomputeScheduler(
intervalMs = DEFAULT_INTERVAL_MS,
): Promise<void> {
const agentIds = await fetchAgentIds();
logger.info({ agentIds }, 'agent-scheduler: starting');
// Delay the first cycle by 15 s after startup, then repeat every intervalMs.
setTimeout(() => {
void runCycle(agentIds);
setInterval(() => void runCycle(agentIds), intervalMs);
}, 15_000);
}
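
The counting in `runCycle` can be illustrated without a database or ml/serving. The sketch below is a simplification under stated assumptions: `planCycle` and `CyclePlan` are hypothetical names, and a failed eligibility lookup is modelled as `null` rather than a thrown error. It only decides which (user, agent) pairs would be computed and how many would be skipped.

```typescript
interface CyclePlan {
  toCompute: Array<[string, string]>; // [userId, agentId] pairs that would be computed
  skipped: number;
}

// `eligibility` maps each user to their eligible agent IDs, or to null when the
// eligibility lookup failed (a modelling assumption of this sketch).
function planCycle(agentIds: string[], eligibility: Map<string, Set<string> | null>): CyclePlan {
  const plan: CyclePlan = { toCompute: [], skipped: 0 };
  eligibility.forEach((eligible, userId) => {
    if (eligible === null) {
      // Lookup failed: every agent for this user counts as skipped.
      plan.skipped += agentIds.length;
      return;
    }
    for (const agentId of agentIds) {
      if (eligible.has(agentId)) plan.toCompute.push([userId, agentId]);
      else plan.skipped++;
    }
  });
  return plan;
}
```

With three agents, one user eligible only for `momentum`, and one user whose lookup failed, this yields a single pair to compute and five skips, which matches how the scheduler's log line tallies `skipped`.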


@@ -0,0 +1,304 @@
import type { Signal, SignalSource } from '@oo/shared-types';
import { db } from '../db/index.js';
import { integrationTokens } from '../db/schema.js';
import { eq, and } from 'drizzle-orm';
import { bus } from '../events/bus.js';
import { config } from '../config.js';
import { logger } from '../logger.js';
const CACHE_TTL_MS = 5 * 60_000;
const HEALTH_API_BASE = 'https://health.googleapis.com/v4/users/me/dataTypes';
const GOOGLE_TOKEN_URL = 'https://oauth2.googleapis.com/token';
const STEP_DAILY_GOAL = 7_000;
const SLEEP_GOAL_HOURS = 7;
// v4 DataPoint shape is a union keyed by data type; we read defensively.
interface DataPoint {
[key: string]: unknown;
}
interface DataPointsResponse {
dataPoints?: DataPoint[];
nextPageToken?: string;
}
async function refreshGoogleToken(
userId: string,
refreshToken: string,
): Promise<string | null> {
const body = new URLSearchParams({
client_id: config.GOOGLE_CLIENT_ID,
client_secret: config.GOOGLE_CLIENT_SECRET,
refresh_token: refreshToken,
grant_type: 'refresh_token',
});
const res = await fetch(GOOGLE_TOKEN_URL, {
method: 'POST',
headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
body: body.toString(),
});
if (!res.ok) return null;
const data = (await res.json()) as { access_token: string; expires_in: number };
const expiresAt = new Date(Date.now() + data.expires_in * 1000).toISOString();
await db
.update(integrationTokens)
.set({ accessToken: data.access_token, expiresAt, tokenStatus: 'active' })
.where(and(eq(integrationTokens.userId, userId), eq(integrationTokens.provider, 'google-health')));
return data.access_token;
}
function todayMidnightIso(): string {
const d = new Date();
d.setHours(0, 0, 0, 0);
return d.toISOString();
}
function yesterdayIso(): string {
return new Date(Date.now() - 24 * 60 * 60 * 1000).toISOString();
}
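
`todayMidnightIso` uses `setHours`, which works in the server's local timezone, so "today" follows the host rather than the user. Below is a sketch of a window start computed from an explicit offset. It assumes a per-user UTC offset is available, which this diff does not show: `utcOffsetMinutes` and `dayStartIso` are hypothetical names, and daylight-saving transitions are ignored.

```typescript
// Hypothetical helper: ISO timestamp of local midnight for a user whose
// wall clock is `utcOffsetMinutes` ahead of UTC (negative for behind).
function dayStartIso(now: Date, utcOffsetMinutes: number): string {
  const offsetMs = utcOffsetMinutes * 60_000;
  // Shift so that the UTC fields of `shifted` hold the user's wall-clock time.
  const shifted = new Date(now.getTime() + offsetMs);
  shifted.setUTCHours(0, 0, 0, 0);
  // Shift back to obtain the real instant of that local midnight.
  return new Date(shifted.getTime() - offsetMs).toISOString();
}

const sampleNow = new Date('2026-05-15T05:42:05Z');
const utcStart = dayStartIso(sampleNow, 0);
const pacificStart = dayStartIso(sampleNow, -420); // UTC-7
```

Passing `now` as a parameter also makes the window deterministic in tests.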
async function fetchDataPoints(
token: string,
dataType: string,
filter: string,
): Promise<DataPoint[]> {
const url = new URL(`${HEALTH_API_BASE}/${dataType}/dataPoints`);
url.searchParams.set('filter', filter);
const res = await fetch(url.toString(), {
headers: { Authorization: `Bearer ${token}` },
});
if (!res.ok) throw new Error(`health ${dataType}: ${res.status}`);
const data = (await res.json()) as DataPointsResponse;
return data.dataPoints ?? [];
}
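
`DataPointsResponse` declares `nextPageToken`, but `fetchDataPoints` reads only the first page, so daily totals could be undercounted if the API paginates. Below is a sketch of a generic page collector, assuming the usual token-based pagination. `Page`, `collectAllPages`, `fakeFetchPage`, and the `maxPages` cap are hypothetical and not part of this change.

```typescript
interface Page<T> {
  items: T[];
  nextPageToken?: string;
}

// Hypothetical helper: calls fetchPage repeatedly until no nextPageToken
// comes back, with a hard cap so a misbehaving API cannot loop forever.
async function collectAllPages<T>(
  fetchPage: (pageToken?: string) => Promise<Page<T>>,
  maxPages = 20,
): Promise<T[]> {
  const all: T[] = [];
  let token: string | undefined;
  for (let i = 0; i < maxPages; i++) {
    const page = await fetchPage(token);
    all.push(...page.items);
    token = page.nextPageToken;
    if (!token) break;
  }
  return all;
}

// Stand-in for a paginated endpoint that serves two pages.
async function fakeFetchPage(pageToken?: string): Promise<Page<number>> {
  return pageToken === 'p2'
    ? { items: [3] }
    : { items: [1, 2], nextPageToken: 'p2' };
}

const allItems = collectAllPages(fakeFetchPage);
```

In `fetchDataPoints`, the callback would pass the token back as a query parameter. The parameter's exact name is not shown in this diff and would need checking against the API reference.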
// Defensive numeric reader — probes likely field names in a v4 DataPoint payload.
function readNumber(point: DataPoint, paths: string[][]): number {
for (const path of paths) {
let cur: unknown = point;
for (const key of path) {
if (cur && typeof cur === 'object' && key in (cur as object)) {
cur = (cur as Record<string, unknown>)[key];
} else {
cur = undefined;
break;
}
}
if (typeof cur === 'number') return cur;
}
return 0;
}
function readString(point: DataPoint, paths: string[][]): string | undefined {
for (const path of paths) {
let cur: unknown = point;
for (const key of path) {
if (cur && typeof cur === 'object' && key in (cur as object)) {
cur = (cur as Record<string, unknown>)[key];
} else {
cur = undefined;
break;
}
}
if (typeof cur === 'string') return cur;
}
return undefined;
}
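
`readNumber` and `readString` share the same path walk, and `readNumber` accepts only JSON numbers. The proto3 JSON mapping encodes 64-bit integers as strings, so if any v4 field is an int64 (an assumption, since the per-type schema is not shown here), a count such as `"1200"` would be read as 0. Below is a sketch of a shared walker plus a numeric reader that tolerates numeric strings. `readPath` and `readNumberLoose` are hypothetical names.

```typescript
// Walks one key path; returns undefined when any segment is missing.
function readPath(obj: unknown, path: string[]): unknown {
  let cur: unknown = obj;
  for (const key of path) {
    if (cur === null || typeof cur !== 'object' || !(key in cur)) return undefined;
    cur = (cur as Record<string, unknown>)[key];
  }
  return cur;
}

// Returns the first finite number found along the candidate paths,
// accepting numeric strings as well; falls back to 0 like readNumber.
function readNumberLoose(point: unknown, paths: string[][]): number {
  for (const path of paths) {
    const value = readPath(point, path);
    if (typeof value === 'number' && Number.isFinite(value)) return value;
    if (typeof value === 'string' && value.trim() !== '') {
      const parsed = Number(value);
      if (Number.isFinite(parsed)) return parsed;
    }
  }
  return 0;
}

const stepPaths = [['steps', 'count'], ['count']];
const fromString = readNumberLoose({ steps: { count: '1200' } }, stepPaths);
const fromNumber = readNumberLoose({ count: 5 }, stepPaths);
const fromMissing = readNumberLoose({}, stepPaths);
```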
export class GoogleHealthSignalSource implements SignalSource {
readonly id = 'google-health';
private cache = new Map<string, { signals: Signal[]; fetchedAt: number }>();
clearCache(userId?: string): void {
if (userId) this.cache.delete(userId);
else this.cache.clear();
}
async fetchSignals(userId: string): Promise<Signal[]> {
const entry = this.cache.get(userId);
if (entry && Date.now() - entry.fetchedAt < CACHE_TTL_MS) return entry.signals;
const [row] = await db
.select()
.from(integrationTokens)
.where(and(eq(integrationTokens.userId, userId), eq(integrationTokens.provider, 'google-health')))
.limit(1);
if (!row) return [];
let token = row.accessToken;
// Treat a token that expires within the next 5 minutes as already expired.
const isExpired = row.expiresAt && new Date(row.expiresAt).getTime() - Date.now() < 5 * 60_000;
if (isExpired && row.refreshToken) {
const refreshed = await refreshGoogleToken(userId, row.refreshToken);
if (!refreshed) {
logger.warn({ userId }, 'google-health: refresh failed');
await db
.update(integrationTokens)
.set({ tokenStatus: 'needs_reconnect' })
.where(and(eq(integrationTokens.userId, userId), eq(integrationTokens.provider, 'google-health')));
bus.publish('signals.integration.token_expired', {
userId,
provider: 'google-health',
detectedAt: new Date().toISOString(),
});
return entry?.signals ?? [];
}
token = refreshed;
}
try {
const dayStartIso = todayMidnightIso();
const dayEndIso = new Date().toISOString();
const yIso = yesterdayIso();
const stepsFilter = `steps.interval.start_time >= "${dayStartIso}" AND steps.interval.start_time < "${dayEndIso}"`;
const caloriesFilter = `total_calories.interval.start_time >= "${dayStartIso}" AND total_calories.interval.start_time < "${dayEndIso}"`;
const hrFilter = `heart_rate.sample_time.physical_time >= "${dayStartIso}" AND heart_rate.sample_time.physical_time < "${dayEndIso}"`;
const sleepFilter = `sleep.interval.start_time >= "${yIso}" AND sleep.interval.start_time < "${dayEndIso}"`;
const [stepsPts, caloriesPts, hrPts, sleepPts] = await Promise.all([
fetchDataPoints(token, 'steps', stepsFilter),
fetchDataPoints(token, 'total-calories', caloriesFilter),
fetchDataPoints(token, 'heart-rate', hrFilter),
fetchDataPoints(token, 'sleep', sleepFilter),
]);
// Debug-level peek at the first raw point of each type so we can refine field paths
// after the first real OAuth. This runs on every uncached fetch, not once.
logger.debug(
{ userId, samples: { stepsPts: stepsPts.slice(0, 1), caloriesPts: caloriesPts.slice(0, 1), hrPts: hrPts.slice(0, 1), sleepPts: sleepPts.slice(0, 1) } },
'google-health: v4 dataPoints sample',
);
const signals: Signal[] = [];
const now = new Date().toISOString();
const steps = stepsPts.reduce(
(sum, p) => sum + readNumber(p, [['steps', 'count'], ['count']]),
0,
);
const stepGoalPct = Math.round((steps / STEP_DAILY_GOAL) * 100);
signals.push({
id: `google-health:steps`,
source: 'google-health',
kind: 'health',
content: `${steps.toLocaleString()} steps today (${stepGoalPct}% of ${STEP_DAILY_GOAL.toLocaleString()} goal)`,
metadata: { dataType: 'steps' },
features: {
step_count: steps,
step_goal_pct: stepGoalPct,
step_goal: STEP_DAILY_GOAL,
below_step_goal: steps < STEP_DAILY_GOAL,
},
timestamp: now,
});
const calories = Math.round(
caloriesPts.reduce(
(sum, p) =>
sum + readNumber(p, [['totalCalories', 'kilocalories'], ['kilocalories'], ['energy', 'kilocalories']]),
0,
),
);
signals.push({
id: `google-health:activity`,
source: 'google-health',
kind: 'health',
content: `${calories} calories burned today`,
metadata: { dataType: 'activity' },
features: {
calories_burned: calories,
},
timestamp: now,
});
if (hrPts.length > 0) {
const hrValues = hrPts
.map((p) => readNumber(p, [['heartRate', 'beatsPerMinute'], ['beatsPerMinute']]))
.filter((v) => v > 0);
if (hrValues.length > 0) {
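// Mean of today's samples, reported under the existing resting_bpm key so the Signal contract stays unchanged.
const bpm = Math.round(hrValues.reduce((a, b) => a + b, 0) / hrValues.length);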
signals.push({
id: `google-health:heart_rate`,
source: 'google-health',
kind: 'health',
content: `Resting heart rate: ${bpm} bpm`,
metadata: { dataType: 'heart_rate' },
features: { resting_bpm: bpm, elevated_hr: bpm > 90 },
timestamp: now,
});
}
}
if (sleepPts.length > 0) {
const sleepSessions = sleepPts
.map((p) => ({
start: readString(p, [['sleep', 'interval', 'startTime'], ['interval', 'startTime'], ['startTime']]),
end: readString(p, [['sleep', 'interval', 'endTime'], ['interval', 'endTime'], ['endTime']]),
}))
.filter((s): s is { start: string; end: string } => !!s.start && !!s.end)
.sort((a, b) => Date.parse(b.end) - Date.parse(a.end));
const last = sleepSessions[0];
if (last) {
const durationMs = Date.parse(last.end) - Date.parse(last.start);
const sleepHours = Math.round((durationMs / 3_600_000) * 10) / 10;
const belowGoal = sleepHours < SLEEP_GOAL_HOURS;
signals.push({
id: `google-health:sleep`,
source: 'google-health',
kind: 'health',
content: `${sleepHours}h sleep last night (${belowGoal ? 'below' : 'meets'} ${SLEEP_GOAL_HOURS}h goal)`,
metadata: { dataType: 'sleep' },
features: {
sleep_hours: sleepHours,
sleep_goal_hours: SLEEP_GOAL_HOURS,
sleep_deficit_hours: Math.max(0, SLEEP_GOAL_HOURS - sleepHours),
below_sleep_goal: belowGoal,
},
timestamp: now,
});
}
}
this.cache.set(userId, { signals, fetchedAt: Date.now() });
bus.publish('signals.task.synced', {
userId,
source: 'google-health',
count: signals.length,
syncedAt: now,
});
return signals;
} catch (err: unknown) {
const message = err instanceof Error ? err.message : String(err);
// fetchDataPoints throws `health <dataType>: <status>`, so a 401 surfaces in the message.
if (message.includes('401')) {
logger.warn({ userId }, 'google-health: token expired (401)');
// Try one refresh so the next call can succeed. A failed or impossible refresh
// flags the integration for reconnect, mirroring the pre-fetch expiry path.
const refreshed = row.refreshToken
? await refreshGoogleToken(userId, row.refreshToken).catch(() => null)
: null;
if (!refreshed) {
await db
.update(integrationTokens)
.set({ tokenStatus: 'needs_reconnect' })
.where(and(eq(integrationTokens.userId, userId), eq(integrationTokens.provider, 'google-health')));
bus.publish('signals.integration.token_expired', {
userId,
provider: 'google-health',
detectedAt: new Date().toISOString(),
});
}
} else {
logger.error({ userId, err }, 'google-health: fetch failed');
}
return entry?.signals ?? [];
}
}
}
export const googleHealthSource = new GoogleHealthSignalSource();
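
The catch block in `fetchSignals` detects an expired token by searching the error message for `'401'`, which would also match any unrelated error whose text happens to contain those digits. Below is a sketch of a typed error that carries the status explicitly. `HealthApiError` and `isUnauthorized` are hypothetical names, and `fetchDataPoints` would need to throw this class for the check to apply.

```typescript
// Hypothetical error type: keeps the HTTP status as a field instead of
// encoding it only in the message string.
class HealthApiError extends Error {
  readonly dataType: string;
  readonly status: number;

  constructor(dataType: string, status: number) {
    super(`health ${dataType}: ${status}`);
    // Keeps instanceof working even when compiling to an ES5 target.
    Object.setPrototypeOf(this, HealthApiError.prototype);
    this.name = 'HealthApiError';
    this.dataType = dataType;
    this.status = status;
  }
}

// True only for a HealthApiError whose status is exactly 401.
function isUnauthorized(err: unknown): boolean {
  return err instanceof HealthApiError && err.status === 401;
}

const unauthorized = isUnauthorized(new HealthApiError('steps', 401));
const serverError = isUnauthorized(new HealthApiError('steps', 500));
// A plain error whose message merely contains "401" is not treated as auth failure.
const lookalike = isUnauthorized(new Error('timeout after 14015 ms'));
```

The message format matches the one `fetchDataPoints` already produces, so existing log output would stay the same.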


@@ -20,8 +20,8 @@ export function makeTestDb(): DrizzleDb & { rawSqlite: BetterSqlite3Database } {
image TEXT,
google_id TEXT UNIQUE,
role TEXT NOT NULL DEFAULT 'user',
- consent_given INTEGER NOT NULL DEFAULT 0,
- consent_at TEXT,
+ tone TEXT,
+ tip_kinds_json TEXT,
created_at TEXT NOT NULL,
deleted_at TEXT
);
@@ -131,6 +131,44 @@ export function makeTestDb(): DrizzleDb & { rawSqlite: BetterSqlite3Database } {
finished_at TEXT
);
CREATE TABLE IF NOT EXISTS agent_outputs (
id TEXT PRIMARY KEY,
user_id TEXT NOT NULL REFERENCES users(id),
agent_id TEXT NOT NULL,
prompt_text TEXT NOT NULL,
signals_snapshot TEXT,
computed_at TEXT NOT NULL,
expires_at TEXT NOT NULL,
agent_version TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS user_preferences (
user_id TEXT NOT NULL REFERENCES users(id),
scope TEXT NOT NULL,
key TEXT NOT NULL,
value_json TEXT NOT NULL,
source TEXT NOT NULL DEFAULT 'user',
updated_at TEXT NOT NULL,
PRIMARY KEY (user_id, scope, key)
);
CREATE TABLE IF NOT EXISTS user_consents (
user_id TEXT NOT NULL REFERENCES users(id),
consent_key TEXT NOT NULL,
granted_at TEXT NOT NULL,
revoked_at TEXT,
PRIMARY KEY (user_id, consent_key)
);
CREATE TABLE IF NOT EXISTS user_contexts (
user_id TEXT NOT NULL REFERENCES users(id),
name TEXT NOT NULL,
active INTEGER NOT NULL DEFAULT 0,
schedule_json TEXT,
created_at TEXT NOT NULL,
PRIMARY KEY (user_id, name)
);
CREATE TABLE IF NOT EXISTS sim_events (
id TEXT PRIMARY KEY,
run_id TEXT NOT NULL REFERENCES sim_runs(id),