Compare commits

33 Commits

Author SHA1 Message Date
ac1226c367 feat(integrations): migrate google-health from Fit REST to Google Health API v4
Google Fit REST API was closed to new sign-ups on 2024-05-01 and shuts down
end of 2026, surfacing as "Access blocked: this app's request is invalid"
when starting the OAuth flow.

- Swap the 10 fitness.* OAuth scopes for the 3 googlehealth.*.readonly
  scopes (activity_and_fitness, health_metrics_and_measurements, sleep).
- Replace fitness/v1 dataset:aggregate + sessions calls with
  health.googleapis.com/v4/users/me/dataTypes/{steps,total-calories,
  heart-rate,sleep}/dataPoints, filtered to today's window.
- Read the v4 DataPoint union defensively (the per-type schema is sparsely
  documented) and log the first raw sample at debug so we can refine field
  paths after the first real OAuth.
- Output Signal contract is unchanged — agents and downstream consumers
  see the same steps/activity/heart_rate/sleep signals.

Cloud Console still needs: enable Google Health API, add the 3 scopes to
the consent screen, add test user (all googlehealth scopes are Restricted).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 05:42:05 +00:00
2159d4cbd1 fix(infra): unblock docker builds for stars agent and web
- Dockerfile.ml: install build-essential so pyswisseph (stars agent) compiles
- Dockerfile.web: copy root package.json + pnpm-workspace.yaml + pnpm-lock.yaml into builder stage so pnpm --filter resolves the workspace
- CLAUDE.md: record both gotchas alongside the existing Docker rebuild notes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 04:46:20 +00:00
522454ab61 feat(agents): stars agent — astrological transits via pyswisseph (#121)
Computes natal chart (Sun/Moon/Mercury/Venus/Mars/Jupiter/Saturn) from
birth_date and finds active transits (conjunction/sextile/square/trine/
opposition) between today's sky and the user's natal positions. Top 3
most-exact transits are passed to the orchestrator as interpretive themes
to colour the tip — grounded and actionable, not predictive.

Birth date sourced from agent_prefs (populated by a connected Google
data source); requires data:google-health consent. Agent self-silences
when birth_date is absent. pyswisseph added to ml/serving/requirements.txt.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-14 10:59:10 +00:00
be8c006a4d feat(agents): tarot agent — daily three-card draw (situation/action/outcome) (#120)
Draws 3 Major Arcana cards from a daily seed (user_id + date) so the
reading is stable within a day and unique per user. Card meanings and
action hints are precomputed in the agent; the orchestrator receives a
structured prompt snippet and is instructed to weave the themes into a
grounded, practical tip without explaining the cards.

No inferred params, no external data — requires only data:core consent.
TTL 6 h (refreshes at most twice daily).
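
A minimal sketch of how a stable daily draw could be derived from the seed described above. Only the (user_id + date) seed, the three Major Arcana cards, and the situation/action/outcome roles come from this commit; the function name `draw_three`, the SHA-256 seeding scheme, and the use of card indices 0 to 21 are assumptions for illustration, not the agent's actual code.

```python
import hashlib
import random
from datetime import date

MAJOR_ARCANA_COUNT = 22  # the Major Arcana has 22 cards (indices 0..21)
ROLES = ("situation", "action", "outcome")


def draw_three(user_id: str, day: date) -> dict[str, int]:
    """Return three distinct card indices keyed by role, stable for (user_id, day)."""
    # Hypothetical seeding scheme: hash user_id + ISO date into an integer seed.
    digest = hashlib.sha256(f"{user_id}:{day.isoformat()}".encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    # sample() draws without replacement, so the three cards never repeat.
    cards = rng.sample(range(MAJOR_ARCANA_COUNT), k=len(ROLES))
    return dict(zip(ROLES, cards))
```

Seeding a private `random.Random` instance, rather than the module-level generator, keeps the draw independent of any other randomness in the process.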

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-14 10:52:55 +00:00
8474468614 feat(integrations): add Google Health card to connect page (#119)
The OAuth backend (signal source, /connect and /callback routes, token
refresh, consent grant) was already complete. This adds the missing UI:
a Google Health card in /connect with Connect/Disconnect actions, and
broadens the "See my tip →" CTA to appear when any integration is
connected (not only Todoist).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-13 10:28:14 +00:00
ad43a8f06a fix(recommender): serve fallback tips to users with no integrations (#117)
The integration-token gate returned 422 for users with no connected
sources, blocking them from any tip. Users with no integrations now go
through the full orchestrator pipeline; if it fails (or returns nothing
because agent outputs are also empty), randomFallbackTip() fires and
serves a generic advice tip instead of an error.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-13 09:54:54 +00:00
56fda0d737 chore(scheduler): skip agents whose data sources aren't granted (#128)
Check getEligibleAgentIds per user in runCycle before calling
computeAndStore — agents without consented data sources, silenced by
active context, or disabled via preference are skipped rather than
computed unconditionally. Eligibility check failure skips the whole
user (fail-closed). Skipped count added to cycle-complete log line.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 15:45:08 +00:00
b1bd3d465f docs(readme): replace inline issue checklists with Gitea milestone links
Roadmap phase sections now show shipped summaries only; open work lives
in Gitea milestones. Eliminates duplicate source-of-truth between README
and issue tracker.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 15:34:45 +00:00
8fd08379d7 chore(m2): close out remaining loose ends (#80, #86, #90)
- Add `ai` compose profile — Ollama + LiteLLM containers for local dev
  when Agap shared services are unavailable; use with LITELLM_URL /
  OLLAMA_URL env vars pointing ml-serving at localhost
- Mark #90 done (LLM schema validation + fallback shipped in 85a332b)
- Mark #80 superseded by ADR-0013 (multi-agent orchestrator is the pipeline)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 15:31:25 +00:00
85a332b22b feat(recommender): LLM schema validation + hardcoded fallback tips on AI failure (#90)
Python (ml/serving):
- Validate tip item after JSON parse: non-empty content, valid kind
- Retry on schema failure with a targeted clarification prompt, same 2× retry budget
- JSON parse failures keep the existing retry suffix

TypeScript (recommender):
- Add TipSource 'fallback' to shared-types
- FALLBACK_TIPS: 12 general-purpose life tips (hardcoded, no DB read)
- fetchOrchestratorTip returns {ok} discriminated union instead of null
- On !res.ok or fetch error: serve a random fallback tip with rationale 'AI service issues'
- Update tests: 204 path removed; both failure cases now expect source='fallback'

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 15:21:03 +00:00
772bb6e194 feat(consents): auto-grant data:<provider> on connect; remove agent: consents (ADR-0015)
- integrations.ts: grant data:<provider> on OAuth callback, revoke on disconnect
- Backfill migration: INSERT OR IGNORE data:<provider> for all active tokens
- Agent manifests: drop agent:<id> from required_consents (momentum, time-of-day,
  overdue-task, recent-patterns, health-vitals) — per-agent control is a preference
- eligibility.ts: update comment to reflect data:-only consent model
- test_manifest.py: assert no agent: consents remain in any manifest
- migrations.test.ts: backfill idempotency tests for issue #127
- Dockerfile.api: drop --offline flag (fixes ERR_PNPM_NO_OFFLINE_META)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 15:09:58 +00:00
34925310cf docs: update focus-area manifest description and CLAUDE.md
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 15:00:06 +00:00
f66f337779 feat(focus-area): use enriched descriptions in cluster output
cluster_tasks now attaches enriched_description to each task dict.
focus-area reads enriched_description (falling back to raw content) when
building the area summary, so the orchestrator sees the expanded 3-sentence
descriptions instead of terse raw titles.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 14:58:31 +00:00
f6b89fc849 refactor(focus-area): output all clusters as context; remove scoring and preferred_areas
The agent no longer picks a winner — it summarises every cluster so the
orchestrator can decide what's relevant. Scoring by overdue count overlapped
with the overdue-task agent. preferred_areas (project-ID based, broken label
matching) removed entirely.

Output format: numbered list of areas with task titles included.
Snapshot: {cluster_count, clusters: [{label, task_count, tasks}]}.
Version bumped to 3.0.0; inferred_params cleared.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 14:57:04 +00:00
12c956b588 fix(clustering): drop TTL check from isUpToDate; task hash is the only signal
If tasks haven't changed, the output is valid forever. If they changed,
always recompute regardless of age. TTL on focus-area restored to 24h —
it only controls recommender eligibility, not recompute frequency.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 14:46:43 +00:00
d12f11d29d feat(clustering): 1h TTL + skip recompute when tasks unchanged
focus-area now recomputes at most once per hour, and only if the task list
actually changed since the last compute.

- focus-area TTL: 43200s → 3600s; version bumped to 2.1.0
- computeAndStore hashes sorted task contents (MD5) and checks the stored
  _task_hash in the existing snapshot; skips the ml-serving call when the
  hash matches and the output isn't expired
- ml-serving injects _task_hash into the snapshot so the next cycle can compare
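
The gating described in this commit could be sketched as follows. This is an illustration in Python, not the TypeScript `computeAndStore` itself: the helper names, the newline separator used before hashing, and the numeric timestamps are assumptions. Only the MD5 over sorted task contents, the `_task_hash` snapshot key, and the "hash matches and output not expired" condition come from the commit message.

```python
import hashlib
from typing import Optional


def task_hash(contents: list[str]) -> str:
    """MD5 hex digest over the sorted task contents, so task order does not matter."""
    # Assumption: contents are joined with a newline before hashing.
    joined = "\n".join(sorted(contents))
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


def can_skip_compute(
    contents: list[str],
    snapshot: Optional[dict],
    expires_at: float,
    now: float,
) -> bool:
    """Skip the ml-serving call only when the stored hash matches and the output is unexpired."""
    if snapshot is None:
        return False  # nothing stored yet, so a compute is required
    return snapshot.get("_task_hash") == task_hash(contents) and now < expires_at
```

Sorting before hashing means a reordered but otherwise identical task list does not trigger a recompute.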

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 14:45:15 +00:00
9ddeea6cac feat(clustering): persistent enrichment cache in task_enrichments table
Each unique task title is now enriched by LiteLLM once and cached in the DB.
Subsequent agent compute cycles (every 12h) fetch the cache before calling
ml-serving; only new titles hit the tip-generator.

- DB: task_enrichments(content_hash PK, description, model, created_at)
- TS: fetchEnrichmentCache / persistEnrichments helpers in agent-outputs.ts;
  enrichment_cache passed in compute request, new_enrichments persisted from response
- Python: AgentComputeRequest.enrichment_cache / AgentComputeResponse.new_enrichments;
  AgentInput.enrichment_cache; _enrich_batch returns (descriptions, new_entries);
  cluster_tasks returns (clusters, new_enrichments)
- FocusAreaAgent stashes new_enrichments in signals_snapshot under _new_enrichments;
  compute_agent endpoint pops it before storing the snapshot

Closes part of #129

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 14:39:35 +00:00
08d08ad7b0 feat(clustering): LLM-enrichment before embedding (port from taskpile #129)
Ported from taskpile experiments/clustering_eval (prompt v1, qwen2.5:1.5b).
The experiment showed ARI 0.22→0.77 and AUROC 0.76→0.91 on synthetic tasks
when embedding LLM-expanded descriptions instead of raw titles.

- Expand each task title via LiteLLM tip-generator before embedding
- Prefix with "clustering: " (nomic-embed-text task instruction prefix)
- Cache expansions in-memory by content hash within a compute cycle
- Falls back to raw title if enrichment fails; no change to fallback behaviour

Fixes #129

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 14:20:48 +00:00
1ca2351488 fix(clustering): route embeddings through LiteLLM instead of Ollama directly
The old code called Ollama's /api/embeddings one task at a time, which caused
silent fallback to project-based grouping when host.docker.internal:11434 was
unreachable from the ml-serving container.

- Switch to LiteLLM /embeddings (model alias "embedder") as primary path
- Batch all task contents in one request instead of N serial calls
- Fall back to Ollama /api/embed (updated to current API) when LITELLM_URL is absent
- Update tests to mock _embed_batch instead of the removed _embed

Fixes #123

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 13:42:53 +00:00
4e9210fcef fix(web): wrap loadTip in arrow fn to satisfy MouseEventHandler type
2026-05-12 13:34:46 +00:00
59c493323f fix(recommender): remove Todoist fallback on orchestrator failure; add snooze exclusion
When fetchOrchestratorTip returned null (LiteLLM timeout, bad JSON, etc.)
the recommender silently fell back to randomPolicy, serving a raw Todoist
task with no rationale — explaining both reported symptoms.

- Remove randomPolicy/signalToCandidate; return 204 when orchestrator fails
  so the UI shows "All clear" instead of a confusing Todoist task
- Pass recent_tip through the stack (frontend → POST /recommend →
  fetchOrchestratorTip → ml/serving RecommendRequest → build_orchestrator_messages)
  so after snooze the LLM is instructed not to repeat the snoozed content

Fixes #122

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 13:28:32 +00:00
d4b40e2590 docs: document MLflow trace API, span inspection, and no-agent diagnosis
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-11 11:23:13 +00:00
a0a069c525 fix(admin): break redirect loop on /forbidden for non-admin users
The middleware was redirecting non-admins to /forbidden but /forbidden
wasn't excluded from the matcher, so the middleware ran again on that
page, saw a non-admin, and redirected again — infinite loop. Added
/forbidden to the pass-through list alongside /login.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-11 11:12:16 +00:00
d1f28666b0 feat(integrations): add Google Health (Fit) integration with full permissions
OAuth2 flow with all 11 Google Fitness scopes (activity, body, sleep,
heart rate, nutrition, location, blood glucose/pressure/temperature,
oxygen saturation, reproductive health). Stores access + refresh tokens;
auto-refreshes on expiry.

GoogleHealthSignalSource fetches steps, sleep sessions, active minutes,
calories, and heart rate from the Fit aggregate + sessions APIs. Signals
flow into both the tip orchestrator and the health-vitals pre-compute
agent, which generates prompt snippets about step progress, sleep
deficit, sedentary time, and elevated heart rate.

Signal.kind extended with 'health'; IntegrationProvider extended with
'google-health'. Agent compute signal mapping enriched to include source,
kind, and all features so health-vitals can filter its own signals.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-11 11:12:11 +00:00
161e654027 feat(serving): replace MLflow run logging with native trace spans
Convert ml-serving from isolated MLflow runs to nested traces using
mlflow.start_span_no_context(). The recommend endpoint now emits a full
span tree: recommend (CHAIN) → build_context (TOOL), agent:* (AGENT) ×N,
llm_orchestrator (LLM). Compute and infer endpoints each emit a single span.

Supporting changes:
- mlflow-skinny>=3.1.0 added to requirements
- MLflow configured with --serve-artifacts + mlflow-artifacts:/ default root
  for cross-container artifact proxy (spans now persist from ml-serving)
- --allowed-hosts extended to include mlflow:5000 (SDK includes port in Host)
- science_destiny slider wired through prompts.py and recommend endpoint
- Config page exposes science/destiny slider (0=data-driven, 100=intuitive)
- Tip page shows rationale inline on tap

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-11 08:26:05 +00:00
afacc34969 fix(agents): instruct orchestrator to output tip in English
Small models (qwen2.5:1.5b) mirror the language of task title content
in the prompt. Adding an explicit English note to snippets that embed
raw task titles (focus-area, overdue-task) prevents language bleed.
Also added the instruction to the orchestrator system prompt and user
message as belt-and-suspenders.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-06 11:53:21 +00:00
c124ff4d24 docs: update CLAUDE.md with session learnings (#118 tracing, compose gotchas)
- Clarify compose profile requirement for build/up (silent no-op without --profile)
- Add --force-recreate pattern for env-var-only changes
- Document MLflow host_header and auth gotchas for container-to-container calls
- Record MLflow tracing addition and #118 M4 tracking issue

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-06 10:41:57 +00:00
95e1b342b4 fix(serving): wire MLflow auth and Host header for container-to-container calls
- Pass MLFLOW_ADMIN_PASSWORD as fallback password credential
- Set host_header='localhost' to satisfy MLflow's --allowed-hosts check
  (MLflow rejects Host: mlflow but accepts Host: localhost)
- Default MLFLOW_TRACKING_URI to http://mlflow:5000 in compose so the
  env_file value is not silently overridden to empty

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-06 10:39:08 +00:00
c43dbaf23d feat(serving): add MLflow tracing to ml-serving for all agent calls
Logs one MLflow run per /recommend (params, token metrics, latency,
full prompt + tip as artifacts) and per /agents/{id}/compute and
/infer call (signals snapshot, inferred prefs, latency).

Tracing is a no-op when MLFLOW_TRACKING_URI is unset; ml-serving
starts and serves tips correctly without MLflow configured.

Refs #118 (M4: remove from production / move off critical path).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-06 10:30:24 +00:00
488a764519 docs: mark M2 complete in README
All M2 items shipped: ADR-0014 (unified profile + inference framework),
per-agent auto-inference, tip generator, TipCandidate schema, prompt
versioning, model benchmark, task clustering, UX refinements.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-06 08:02:44 +00:00
c67f2b14c4 docs: update CLAUDE.md with #61 completion and feature test patterns
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-06 07:45:40 +00:00
17b9516903 feat(features): mirror invalidatedBy into Python ProfileFeature (#61)
Adds invalidated_by: tuple[str, ...] to ProfileFeature, mirroring the
invalidatedBy bus subjects from registry.ts. Adds a test that parses the
TS source and asserts Python stays in sync — same drift-detection pattern
used for names and ttlSec.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-06 07:10:36 +00:00
a75be0d832 docs: update CLAUDE.md with session learnings (#97, #113)
- focus-area v2.0.0 completion in recent completions; remove from active work
- Update focus-area inferred params table row
- min_history gotcha: checked against events, not task_completions
- httpx trust_env=False rule for ml/ code
- Agent test command

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-06 06:56:17 +00:00
49 changed files with 2752 additions and 567 deletions

109
CLAUDE.md

@@ -65,7 +65,18 @@ docs/ architecture notes, ADRs, API specs
- One PR = one concern. Conventional-commit prefixes (`feat:`, `fix:`, `chore:`, `docs:`, `refactor:`).
- ADRs go in `docs/adr/NNNN-title.md` for any decision that constrains future work.
- No secrets in repo. Local dev via `.env.local` (gitignored), prod via the server's secret store (Vaultwarden now; k8s secrets later).
- Compose profiles: `core` (api + web + admin), `full` (adds ml-serving), `mlops` (adds MLflow), `ai` (adds Ollama + LiteLLM). Mix as needed.
- Compose profiles: `core` (api + web + admin), `full` (adds ml-serving + nats), `mlops` (adds MLflow), `ai` (adds Ollama + LiteLLM). Mix as needed. Always pass `--profile <name>` to `build`/`up` — without a profile, no services are selected and builds silently do nothing.
- Docker rebuild: use `--force-recreate` on `up` when only env vars changed (no image rebuild needed); new env vars in `.env.local` are not picked up by a running container until it is recreated.
- Docker rebuild gotchas:
- **Never run two `docker compose up --build` at once** — both grab the same `--mount=type=cache,id=pnpm` and deadlock on the API's `pnpm --prod deploy` step. Symptom: build sits silent for hours on `[api builder 8/8]`. Before starting any build, check `ps aux | grep "docker compose"` and kill any prior `up --build` (`kill -9 <pid>` — the wrapper bash and the docker compose binary are separate PIDs; kill the docker compose one).
- **Don't add `--offline` to `pnpm --prod deploy`** — pnpm's metadata cache (`/root/.cache/pnpm/`) is not in the `/pnpm/store` cache mount, so `--offline` fails with `ERR_PNPM_NO_OFFLINE_META` for transitive devDeps (e.g. vite via vitest). Leave the deploy step network-on; it works.
- **All TS Dockerfiles need `python3 make g++`** in the base stage — `better-sqlite3` rebuilds natively on install. Missing from `Dockerfile.admin` historically caused `gyp ERR! find Python` failures.
- **`Dockerfile.ml` needs `build-essential`** (not just `gcc`) — `pyswisseph` (stars agent) compiles C from source and fails with `fatal error: math.h: No such file or directory` if only `gcc` is installed; it needs `libc-dev` too, easiest via `build-essential`.
- **`Dockerfile.web` builder stage needs root `package.json` + `pnpm-workspace.yaml` + `pnpm-lock.yaml`** copied in. Without them, `pnpm --filter @oo/shared-types build` fails with `[ERR_PNPM_NO_PKG_MANIFEST] No package.json found in /app`. The deps stage has them but the builder is a fresh layer; selective copies must include them.
- **A clean build of `--profile core` takes ~3 min total** when the buildx cache is warm. If it's been silent for >10 min, check for the parallel-build deadlock above before assuming "still going".
- Run Python agent tests: `python3 -m pytest ml/agents/tests/ -x -q` (tests add repo root to `sys.path` themselves).
- Run Python feature tests: `python3 -m pytest ml/features/ -x -q`
- `ml/features/` files are Python mirrors of TS registries — TS is source of truth. Tests parse `registry.ts` with regex to detect drift; follow the same pattern whenever a new field is added to `ProfileFeature`.
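
A rough sketch of the regex-based drift check mentioned in the last bullet. The TS excerpt, feature names, bus subject, and helper names (`parse_ts_registry`, `find_drift`) are hypothetical stand-ins; the real `registry.ts` layout and the real test may differ. Only the `name`, `ttlSec`, and `invalidatedBy` fields are taken from the notes above.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ProfileFeature:
    name: str
    ttl_sec: int
    invalidated_by: tuple[str, ...] = ()


# Hypothetical excerpt standing in for the contents of registry.ts.
TS_SOURCE = """
export const FEATURES = [
  { name: "completion_rate", ttlSec: 3600, invalidatedBy: ["signals.tip.feedback"] },
  { name: "preferred_hour", ttlSec: 86400, invalidatedBy: [] },
];
"""

# Hypothetical Python mirror that must stay in sync with the TS source of truth.
PY_FEATURES = (
    ProfileFeature("completion_rate", 3600, ("signals.tip.feedback",)),
    ProfileFeature("preferred_hour", 86400),
)

ENTRY_RE = re.compile(
    r'name:\s*"(?P<name>[^"]+)",\s*ttlSec:\s*(?P<ttl>\d+),'
    r'\s*invalidatedBy:\s*\[(?P<subjects>[^\]]*)\]'
)


def parse_ts_registry(source: str) -> dict[str, ProfileFeature]:
    """Extract name, ttlSec and invalidatedBy for every registry entry in the TS text."""
    features = {}
    for m in ENTRY_RE.finditer(source):
        subjects = tuple(re.findall(r'"([^"]+)"', m.group("subjects")))
        features[m.group("name")] = ProfileFeature(m.group("name"), int(m.group("ttl")), subjects)
    return features


def find_drift(ts_source: str, py_features) -> list[str]:
    """Return the sorted names whose TS and Python definitions differ or are missing on one side."""
    ts = parse_ts_registry(ts_source)
    py = {f.name: f for f in py_features}
    return sorted(name for name in ts.keys() | py.keys() if ts.get(name) != py.get(name))
```

A test would then assert `find_drift(...) == []`; adding a field to `ProfileFeature` means extending both the regex and the dataclass so the comparison covers it.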
## Definition of done (per feature)
@@ -83,13 +94,93 @@ oO generates tips through a multi-agent pipeline (ADR-0013): pre-compute agents
| Alias | Model | Used by |
|-------|-------|---------|
| `tip-generator` | qwen2.5:1.5b (default) | `ml/serving` tip generation |
| `embedder` | nomic-embed-text | task clustering, dedup |
| `embedder` | nomic-embed-text | task clustering (after LLM enrichment), dedup |
| `judge` | claude-haiku-4-5 (cloud, eval only) | offline sim |
Env vars: `LITELLM_URL` (prod `https://llm.alogins.net`), `OLLAMA_URL` (Agap host, `http://host.docker.internal:11434` from containers).
Ollama and LiteLLM are **shared Agap services**, not oO services — they live in `agap_git/openai/docker-compose.yml` along with langfuse (observability). oO never starts them; ml-serving just calls the alias.
All `httpx` calls in `ml/` must use `trust_env=False` to bypass the system proxy — same rule as `bw` and curl. Pattern: `httpx.Client(trust_env=False, timeout=N)`.
MLflow container-to-container calls: always pass `host_header="localhost"` to `MLflowClient` — MLflow's `--allowed-hosts` rejects `Host: mlflow` (the container DNS name) with 403. Auth credential is `MLFLOW_ADMIN_PASSWORD`. MLflow REST API lives at the origin root, not under the `/mlflow` UI prefix.
### MLflow API versions — runs vs traces
MLflow uses **two API versions** — use the right one or you'll get 405:
| What | API prefix | Example |
|------|-----------|---------|
| Runs, experiments, metrics | `/api/2.0/mlflow/` | `runs/search`, `experiments/list` |
| Traces (LLM observability) | `/api/3.0/mlflow/traces/` | `traces/{trace_id}` |
**Experiment IDs:** `3` = oO/serving. Artifacts stored as run tags prefixed `artifact:<path>`.
### Querying from the host shell
Always strip the proxy and pass `Host: localhost` (no port — `localhost:5000` fails the DNS-rebinding check).
```bash
# Search recent runs (experiment 3)
env -u HTTPS_PROXY -u HTTP_PROXY -u ALL_PROXY -u https_proxy -u http_proxy -u all_proxy \
curl -s -H "Host: localhost" -u "admin:${MLFLOW_ADMIN_PASSWORD}" \
-X POST http://localhost:5000/api/2.0/mlflow/runs/search \
-H "Content-Type: application/json" \
-d '{"experiment_ids":["3"],"max_results":5,"order_by":["start_time DESC"]}'
# Get a trace by ID (note: /api/3.0/, not /api/2.0/)
env -u HTTPS_PROXY -u HTTP_PROXY -u ALL_PROXY -u https_proxy -u http_proxy -u all_proxy \
curl -s -H "Host: localhost" -u "admin:${MLFLOW_ADMIN_PASSWORD}" \
http://localhost:5000/api/3.0/mlflow/traces/tr-<trace_id> | python3 -m json.tool
```
The trace response includes `trace_metadata.mlflow.traceInputs/Outputs`, `trace_metadata.mlflow.trace.sizeStats` (num_spans), and `tags.mlflow.traceName`.
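
For scripted checks, the trace lookup above could be sketched with the Python standard library. The helper names and `BASE_URL` constant are assumptions for illustration; the host, `Host: localhost` header, basic-auth user, and `/api/3.0/` prefix mirror the curl example. An opener built with an empty `ProxyHandler` ignores the proxy environment variables, which plays the role of the `env -u ...` prefix.

```python
import base64
import urllib.request

BASE_URL = "http://localhost:5000"  # same origin as in the curl example


def trace_path(trace_id: str) -> str:
    """Traces live under /api/3.0/, not /api/2.0/."""
    return f"/api/3.0/mlflow/traces/{trace_id}"


def build_request(path: str, password: str, user: str = "admin") -> urllib.request.Request:
    """Build a request with basic auth and an explicit Host header (no port)."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    req = urllib.request.Request(f"{BASE_URL}{path}")
    req.add_header("Host", "localhost")
    req.add_header("Authorization", f"Basic {token}")
    return req


# An empty ProxyHandler disables proxy lookup from the environment.
NO_PROXY_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))
```

Sending is then `NO_PROXY_OPENER.open(build_request(trace_path("tr-<trace_id>"), password))`, with the password read from `MLFLOW_ADMIN_PASSWORD`.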
### Getting spans (Python client from inside the container)
The REST API has **no endpoint for spans**: `/api/3.0/mlflow/traces/{id}/spans` returns 404. Use the Python client inside `oo-ml-serving-1`:
```bash
docker exec oo-ml-serving-1 python3 -c "
import mlflow, json, os
mlflow.set_tracking_uri('http://mlflow:5000')
os.environ['MLFLOW_TRACKING_USERNAME'] = 'admin'
os.environ['MLFLOW_TRACKING_PASSWORD'] = os.environ.get('MLFLOW_ADMIN_PASSWORD', '')
client = mlflow.tracking.MlflowClient()
trace = client.get_trace('tr-<trace_id>')
for span in trace.data.spans:
    print(span.name, '| parent:', span.parent_id, '| status:', span.status)
    print(' inputs:', json.dumps(span.inputs)[:200])
    print(' outputs:', json.dumps(span.outputs)[:200])
    print(' attrs:', span.attributes)
"
```
### Span structure for a tip generation trace
A healthy `recommend` trace has 3 spans:
| Span | Type | Parent | Key attributes |
|------|------|--------|---------------|
| `recommend` | CHAIN | (root) | `agent_count`, `latency_ms`; inputs include `agent_ids` list |
| `build_context` | TOOL | recommend | `agent_count`, `task_count`, `science_destiny` |
| `llm_orchestrator` | LLM | recommend | `prompt_tokens`, `completion_tokens`, `model`, `attempts` |
### Diagnosing "no agents in trace"
If the trace shows `agent_ids: []` and `agent_count: 0` in the root span, and the orchestrator prompt says *"No pre-computed agent context available"*, it means the recommender found zero eligible snippets at request time. Causes:
1. **Agent compute hasn't run** — no `agent_outputs` rows for this user yet
2. **Snippets expired** — TTL elapsed since last compute
3. **Eligibility filter dropped all agents** — none passed the manifest-driven check
Diagnose with:
```bash
docker exec oo-api-1 psql "$DATABASE_URL" -c \
"SELECT agent_id, computed_at, expires_at FROM agent_outputs WHERE user_id='<uid>' ORDER BY computed_at DESC LIMIT 10;"
```
**Multi-agent tip generation pipeline (ADR-0013):**
1. Pre-compute agents (`ml/agents/<id>/`) run on a schedule, each emitting a snippet into `agent_outputs` with a per-agent TTL
2. On request, `recommender` (TS) loads the eligible agent set (registry-driven, ADR-0014) and pulls the freshest non-expired snippets
@@ -108,10 +199,12 @@ Recent completions:
- ADR-0012 — ε-greedy v2 (D=12) — 2026-04-26 (now superseded by ADR-0013)
- ADR-0014 complete: unified Profile schema + backfill, manifest plumbing, `/api/profile` read-through, registry-driven eligibility filter, inference framework + per-agent inference, legacy consent column drop — 2026-05-05
- Rich per-agent inference for all four active agents (#112, #114, #115, #116) — 2026-05-06: quiet/peak hours (time-of-day), z-score baseline (momentum), p50 lateness + project realness (overdue-task), adaptive lookback + weekly/daily cycles (recent-patterns)
- Semantic task clustering via nomic-embed-text + LLM enrichment (#97, #113, #129) — 2026-05-12: `ml/agents/clustering.py`; titles expanded via `tip-generator` before embedding; persistent cache in `task_enrichments` table; recompute gated on task-list hash change; focus-area v3.0.0 outputs all clusters with enriched descriptions
Active work (M2):
- Per-user feature freshness SLAs (#61, ADR-0011 phase B)
- Embedding-based task clustering for focus-area inference (#97, #113)
- Per-user feature freshness SLAs (#61) — 2026-05-06: `invalidated_by` mirrored into `ProfileFeature`; drift-detection test added
- MLflow tracing added to `ml/serving` for all agent calls — 2026-05-06: `ml/serving/mlflow_client.py`; activated by `MLFLOW_TRACKING_URI=http://mlflow:5000` (default in compose `full` profile); requires `--profile mlops` for the MLflow container. Issue #118 (M4) tracks removal from production critical path.
Active work (M2): *(all M2 items complete — see README for M3 planning)*
## ADR-0014 endpoint map (as of step 6)
@@ -132,7 +225,7 @@ Lives in `ml/agents/inference/`. `run_inference(manifest, history)` evaluates al
- `infer()` error → emit `cold_start_default` (never crashes)
- Results written to `user_preferences` with `source='inferred'`; keys with `source='user'` are never overwritten
All five agents are at v1.2.0. Per-agent inferred params (all live in `ml/agents/<name>.py`):
Per-agent inferred params (all live in `ml/agents/<name>.py`):
| Agent | Inferred params | Notes |
|-------|----------------|-------|
@@ -140,10 +233,12 @@ All five agents are at v1.2.0. Per-agent inferred params (all live in `ml/agents
| `momentum` | `engagement_trend`, `baseline_completions_per_day`, `stdev` | Baseline = 28d rolling mean done/day; snippet uses z-score language |
| `overdue-task` | `lateness_tolerance_days`, `project_realness` | Tolerance = p50 lateness from TaskCompletion history; realness = project median vs global median |
| `recent-patterns` | `lookback_days`, `weekly_cycle`, `daily_cycle` | Lookback sized to ≥30 done events; cycles use peak-to-mean ratio; snippet hints when strength > 0.5 |
| `focus-area` | *(none yet)* | Needs project-level feedback linkage (#78) |
| `focus-area` | *(none)* | No inferred params. Clusters tasks via LLM-enriched embeddings and outputs all areas with expanded descriptions. Recomputes only when task list changes (hash-gated). |
`UserHistory` carries both `events: list[FeedbackEvent]` and `task_completions: list[TaskCompletion]`. `AgentInferRequest` (ml/serving) accepts `task_completions: list[dict]` alongside `feedback_history`.
`min_history` is checked against `len(history.events)` (feedback events), **not** `task_completions`. Agents that infer from completions should set `min_history=0` and guard inside `infer()`.
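
A minimal sketch of the "set `min_history=0` and guard inside `infer()`" pattern. The `UserHistory` stand-in, the `days_late` field, the threshold, and the cold-start value are hypothetical; the real agents and the inference framework's types may look different.

```python
from dataclasses import dataclass, field
from statistics import median

MIN_COMPLETIONS = 5               # hypothetical threshold, enforced by the agent itself
COLD_START_TOLERANCE_DAYS = 1.0   # hypothetical cold-start default


@dataclass
class UserHistory:
    """Simplified stand-in for the real UserHistory (events + task_completions)."""
    events: list = field(default_factory=list)
    task_completions: list = field(default_factory=list)


def infer_lateness_tolerance(history: UserHistory) -> dict:
    """With min_history=0 the framework does not gate on events, so gate on completions here."""
    if len(history.task_completions) < MIN_COMPLETIONS:
        return {"lateness_tolerance_days": COLD_START_TOLERANCE_DAYS, "cold_start": True}
    # p50 lateness over the completion history (assumes each item has a days_late value).
    days_late = [c["days_late"] for c in history.task_completions]
    return {"lateness_tolerance_days": float(median(days_late)), "cold_start": False}
```

Keeping the guard inside the agent means a user with many completions but no feedback events still gets an inferred value.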
## What NOT to do
- Don't copy Todoist's data into our DB. Store the OAuth token + computed features/derivatives we need, fetch raw on demand.

161
README.md

@@ -121,172 +121,31 @@ All model calls route through **LiteLLM** at `llm.alogins.net` (or `LITELLM_URL`
## Roadmap
Issues and open work are tracked in [Gitea milestones](http://localhost:3000/alvis/oO/milestones). Pick an issue, check its milestone (= phase), read the service's `README.md`, ship.
### Phase 0 — Walking skeleton *(M0)* ✓ shipped
Goal: a single user signs in with Google, connects Todoist, and sees one random Todoist task on a black page. Deletion works.
- [x] Monorepo scaffold, docker-compose dev env
- [x] `auth` — Google OAuth2/PKCE via openid-client v6; session cookie; Next.js middleware guard
- [x] `integrations/todoist` — OAuth2 flow, token stored in DB, disconnect supported
- [x] `recommender` with `RandomPolicy`; stable `POST /recommend` contract; 30s task cache
- [x] `apps/web` — sign-in, connect, tip pages; PWA manifest + icons
- [x] Feedback: `done / snooze / dismiss`; reward inferred from dwell-time (`inferReward`); marks task complete in Todoist
- [x] Deploy modular monolith to Agap VM via Caddy at `o.alogins.net`
- [x] ToS + Privacy Policy pages (`/legal/terms`, `/legal/privacy`); implicit consent on sign-in
- [x] Account deletion: revokes tokens, purges data, soft-deletes profile; button on /connect
- [x] Metrics baseline: `tip_views` table (tip served) + `tip_feedback` (reactions) — activation + reaction rate queryable
Single user signs in with Google, connects Todoist, sees one random task on a black page. Deletion works. Auth, integrations, recommender stub, PWA, feedback loop, ToS/privacy, metrics baseline.
### Phase 1 — Real signal + in-the-moment delivery *(M1)* ✓ shipped
Goal: tips are picked, not drawn from a hat — and they arrive at the right moment on the web.
- [x] Event bus scaffold: typed in-process EventEmitter with 500-event ring buffer; subjects match future NATS JetStream — swap is mechanical
- [x] Todoist sync emits `signals.task.synced`; tip served/feedback emit `signals.tip.*`
- [x] Features extracted per task: `is_overdue`, `task_age_days`, `priority`; context: `hour_of_day`, `day_of_week`
- [x] **ε-greedy v1** (d=7, ε=0.10, day-of-week sin/cos features); per-user state persisted to disk
- [x] **ε-greedy v2** (d=12, profile features: completion rate, dismiss rate, dwell, preferred hour, tip volume) in shadow; promoted to active policy (ADR-0012)
- [x] `RemotePolicy` in recommender: calls ml/serving, falls back to RandomPolicy on timeout/error; logs explainability to `tip_scores`
- [x] Feedback loop: dwell-time inferred reward (`inferReward`) → online model update; `done` in 15 s–2 min = +1.0 (magic zone)
- [x] Offline simulation framework (`ml/experiments/sim`): rule/LLM/claude-code judges, two-policy comparison, results persisted to `sim_runs` + `sim_events`
- [x] **Web Push** (VAPID): SW, subscribe/unsubscribe API, "notify me" button on tip page
- [x] Shadow-policy registry: run N shadow policies per request, log picks without serving them (#56)
- [x] NATS JetStream bridge — durable `signals.>` and `feedback.>` streams; in-process bus stays the source of truth, every publish bridges out (#21, shipped)
- [x] Per-user profile features (completion rate, dismiss rate, dwell, preferred hour, tip volume) — event-driven, JIT invalidation (#81)
- [ ] Quiet-hours + dedupe for push delivery
- [ ] Delayed rewards: tasks completed directly in Todoist (requires webhook from Todoist)
- [ ] Apple OAuth (deferred to M3)
Tips are picked, not drawn from a hat. Event bus, Todoist sync, task features, ε-greedy policy (v1 + v2), web push, NATS JetStream bridge, shadow-policy registry, offline sim framework, per-user profile features, admin + ML ops console (`apps/admin`).
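
The ε-greedy selection step named above can be sketched as follows. This is a minimal illustration, not the shipped policy: it assumes each candidate already carries a model score, whereas the real v1/v2 policies compute that score from the d=7 / d=12 feature vectors. The `Scored` type, the `epsilonGreedyPick` name, and the injectable `rng` parameter are hypothetical.

```typescript
// Hypothetical shape: the roadmap names the policy and ε, not its data types.
type Scored = { taskId: string; score: number };

/**
 * ε-greedy pick: with probability `epsilon` explore (uniform random candidate),
 * otherwise exploit (highest-scoring candidate).
 * `rng` is injectable so the behaviour can be tested deterministically.
 */
function epsilonGreedyPick(
  candidates: Scored[],
  epsilon = 0.1, // ε = 0.10 as stated in the roadmap
  rng: () => number = Math.random,
): Scored | null {
  if (candidates.length === 0) return null;
  if (rng() < epsilon) {
    // Explore: uniform index; clamp guards against an rng that returns 1.
    const i = Math.min(Math.floor(rng() * candidates.length), candidates.length - 1);
    return candidates[i] ?? null;
  }
  // Exploit: keep the first candidate with the maximum score.
  return candidates.reduce((best, c) => (c.score > best.score ? c : best));
}
```

Passing a fixed `rng` makes the explore and exploit branches reproducible in unit tests and offline simulation.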
#### M1 add-on — Admin & ML Ops Console *(fully shipped)*
oO is ML-heavy. Without a cockpit, every model change ships blind. This console is the team's single pane for users, signals, features, models, experiments, and tip outcomes — with the ability to *act* on them (revoke a token, replay an event, promote a model, reset a bandit).
**Framework pick — `apps/admin` on Next.js 15 + Tremor + shadcn/ui.** Analytics-first UI for an analytics-first product, stays on our existing TS/React/Tailwind stack, reuses `packages/shared-types`, `sdk-js`, and the Auth.js session. Specialized ML tooling (MLflow) runs as a **separate external service** linked from the admin shell; Grafana panels are embedded.
| Layer | Tool | Why |
|-------|------|-----|
| App shell | **Next.js 15** (new `apps/admin`) | Same stack as `apps/web`; reuses auth, types, SDK |
| Dashboards / charts | **[Tremor](https://tremor.so)** | Analytics-first React + Tailwind — KPI cards, time-series, categorical, heatmaps |
| CRUD primitives | **[shadcn/ui](https://ui.shadcn.com)** | Copy-paste Radix components; forms, dialogs, command palette |
| Heavy grids | **[TanStack Table v8](https://tanstack.com/table)** | Sortable / paginated / virtualized tables (events, users, tips) |
| Extra charts | **[Recharts](https://recharts.org)** / **[visx](https://airbnb.io/visx)** | Fallbacks where Tremor falls short (e.g. force graphs, Sankey) |
| Model registry / experiments | **[MLflow](https://mlflow.org)** *(external — `o.alogins.net/mlflow`)* | Experiment tracking, artifact browser, model registry; own basic-auth |
| Infra metrics | **[Grafana](https://grafana.com)** *(embedded panels)* | One ops source of truth |
| Ad-hoc analysis | **[Marimo](https://marimo.io)** reactive notebooks | Python-native for the ML side; launch-out link |
| AuthZ | `profile.role='admin'` + Next.js middleware | Reuses existing session; no new auth surface |
**Rejected alternatives (so we don't re-litigate):**
- *Retool / AppSmith* — low-code speed, but admin logic leaves our repo; weak analytics affordances for an analytics product
- *Streamlit / Gradio / Dash* — Python-first; thin RBAC and routing; splits our frontend stack in two
- *React-admin / Refine.dev* — strong CRUD scaffolding, but analytics/ML views feel bolted on; we'd rebuild Tremor-style dashboards ourselves
- *Superset / Metabase as the admin surface* — excellent for BI, poor for operational **writes** (revoke, replay, promote). Plan: **adopt Superset in M4** for BI alongside batch pipelines; ship a read-only SQL widget inside admin for now
**Build sequence:**
1. [x] **ADR-0006** — record the framework choice + "embed, don't rebuild" rule for MLflow/Grafana
2. [x] **Scaffold** — `apps/admin` with Next.js 15, Tailwind, Tremor; deploy behind Caddy at `admin.o.alogins.net`
3. [x] **RBAC** — `role` column on `users`; admin-only Next.js middleware; seed first admin via `ADMIN_SEED_EMAIL` env; `admin_actions` audit-log table
4. [x] **Overview dashboard** — DAU/WAU KPI cards, tips served, reaction breakdown, activation funnel
5. [x] **User explorer** — list + detail page: identity, consents, integrations, last tip, reward history; revoke-integration + reset-bandit + rebuild-profile actions
6. [x] **Event stream viewer** — live tail of `signals.*` with filters by subject/user/time; same UI when the bus swaps to NATS
7. [x] **Features page** — features sent to `ml/serving` per scoring call; per-user profile features with freshness; diff across time
8. [x] **Tips page** — tips served, scored, feedback reactions with policy/model breakdown
9. [x] **Reward analytics** — reaction distribution over time; per-policy / per-model / per-prompt-version compare; slice by `hour_of_day`, `priority`, cohort
10. [x] **Data quality widget** — missing-feature rate, stale-token rate, daily completeness heatmap; per-feature freshness SLA status
11. [x] **Ops actions** — revoke token (Users page), rebuild profile, reset bandit, enable/disable shadow policies; every action audit-logged
12. [x] **Health rollup** — `/admin/health` surfaces api, ml/serving, SQLite, event-bus, MLflow; auto-refreshes every 15s
13. [x] **Read-only SQL runner** — SELECT-only runner against SQLite + saved queries (sunsets to Superset in M4)
14. [x] **Offline simulation runner** — launch `ml/experiments/sim` from admin UI; track sim runs, judge, policy comparison
15. [x] **Token-based admin auth** — `POST /api/auth/token` for Playwright/CI; `ADMIN_TOKEN` env var (#105)
16. [x] **Docs pages** — admin documentation and runbooks inline
### Phase 2 — AI tips + multi-source signals *(M2)* in progress
Goal: tips are AI-generated from user context, not just raw Todoist tasks. Multiple signal sources feed a generalized pipeline. Research-intensive milestone.
**Architectural shift (mid-M2):** the bandit-ranks-LLM-candidates design from earlier in M2 was replaced with a multi-agent pipeline (ADR-0013): pre-compute agents emit prompt snippets, an orchestrator LLM produces the tip directly. ADR-0014 layers a unified Profile + agent registry + auto-inference framework on top so the system generalizes cleanly to N agents.
**Multi-agent recommendation (ADR-0013, shipped):**
- [x] `agent_outputs` table + per-agent TTL caching
- [x] Five initial agents: `overdue-task`, `momentum`, `time-of-day`, `recent-patterns`, `focus-area`
- [x] Agent pre-compute scheduler
- [x] Orchestrator cutover — recommender calls `ml/serving` with snippet list, no bandit scoring
- [x] Bandit endpoints + shadow policy machinery removed
**Unified Profile + agent registry (ADR-0014, in progress):**
- [ ] Unified Profile model: prefs, contexts, consents + manifest plumbing + orchestrator cutover (#30)
- [ ] Shared context-inference framework (#111)
- [ ] Per-agent auto-inference: `time-of-day` (#112), `focus-area` (#113), `momentum` (#114), `overdue-task` (#115), `recent-patterns` (#116)
**AI infrastructure (unblock everything else):**
- [ ] `ai` compose profile — Ollama + LiteLLM for local dev; env vars `OLLAMA_URL` / `LITELLM_URL` (#86)
- [ ] AI gateway — wire `ml/serving` to LiteLLM; model aliases `tip-generator` + `embedder` (#87)
**AI tip generation pipeline:**
- [x] Context assembler — user signals + feature store → structured prompt context (`ml/features/context.py`); skeleton implemented
- [ ] Tip generator endpoint — `POST /generate` in `ml/serving`; LLM → N typed `TipCandidate` objects (#79)
- [ ] `TipCandidate` shared schema — `{content, kind, source, model, prompt_version, confidence}`; update recommender pipeline (#89)
- [ ] LLM output validation + retry — JSON schema gate, clarification retry (2×), fallback to task-based (#90)
- [ ] Prompt versioning — `prompt_version` + `model` columns in `tip_scores`; content-hash invalidation (#91)
- [x] LLM tip quality dashboard — reaction breakdown by model / prompt_version in `/admin/reward-analytics` (#92)
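
The validation gate, clarification retry (2×), and task-based fallback from the items above might fit together roughly like this. It is a sketch under assumptions: the field names come from the `TipCandidate` item and the kinds from the tip kind system item, but the field types, the hand-written structural check standing in for a JSON schema validator, and the `generate` / `fallback` callbacks are all hypothetical.

```typescript
type TipKind = 'task' | 'advice' | 'insight' | 'reminder';

// Field names from the roadmap; the types are assumptions.
interface TipCandidate {
  content: string;
  kind: TipKind;
  source: string;
  model: string;
  prompt_version: string;
  confidence: number;
}

const KINDS: readonly string[] = ['task', 'advice', 'insight', 'reminder'];

/** Structural gate: returns the parsed candidate, or null if the shape is wrong. */
function parseTipCandidate(raw: string): TipCandidate | null {
  let v: unknown;
  try {
    v = JSON.parse(raw);
  } catch {
    return null; // not JSON at all
  }
  if (typeof v !== 'object' || v === null) return null;
  const o = v as Record<string, unknown>;
  const ok =
    typeof o.content === 'string' && o.content.length > 0 &&
    typeof o.kind === 'string' && KINDS.indexOf(o.kind) !== -1 &&
    typeof o.source === 'string' &&
    typeof o.model === 'string' &&
    typeof o.prompt_version === 'string' &&
    typeof o.confidence === 'number';
  return ok ? (o as unknown as TipCandidate) : null;
}

/**
 * Calls `generate` up to 1 + maxRetries times. After each failure the next
 * call receives a clarification hint. If every attempt fails, `fallback`
 * supplies a task-based tip instead.
 */
async function generateWithRetry(
  generate: (clarification?: string) => Promise<string>,
  fallback: () => TipCandidate,
  maxRetries = 2,
): Promise<TipCandidate> {
  let clarification: string | undefined;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      const parsed = parseTipCandidate(await generate(clarification));
      if (parsed) return parsed;
    } catch {
      // Treat transport errors like invalid output and retry.
    }
    clarification = 'Reply with a single JSON object matching the TipCandidate schema.';
  }
  return fallback();
}
```

Keeping the gate as a pure function lets the same check run in the recommender and in offline simulation without an LLM in the loop.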
**Evaluation & model selection:**
- [ ] Model benchmark — compare qwen2.5:7b / llama3.2:3b / gemma3:4b via offline sim + LLM judge (#93)
- [ ] LLM prompt research — persona design, context injection strategies, few-shot examples (#84)
**Pipeline architecture:**
- [x] Signal source abstraction — `SignalSource` interface for Todoist + extensible design (#78)
- [ ] Generalized recommendation pipeline — candidate → rank → render stages (#80)
- [x] Feature registry + user profile builder — centralized features, persistent profiles, event-driven invalidation (#81)
- [ ] Tip kind system — task, advice, insight, reminder with kind-aware UI + rewards (#82)
**Policy research:**
- [ ] Next-gen policies — Thompson sampling, neural bandits, hybrid transfer learning (#83)
**Integrations & infra (carried from M1):**
- [ ] Apple OAuth (#7)
- [x] NATS JetStream replacing in-process bus (#21) — adapter ships in `services/api/src/events/nats.ts`; in-proc bus is the producer, JetStream is the durable mirror
- [x] Todoist sync via events (#22) — background scheduler in `services/api/src/signals/scheduler.ts` emits `signals.task.synced` every `TODOIST_SYNC_INTERVAL_MS`; on-demand fetch remains as freshness fallback
- [x] Event schema registry + protobuf CI gate (#54) — buf lint/breaking checks on every PR
- [x] Per-user freshness SLAs for features (#61) — context-feature (JIT) vs profile-feature (batched) spec in ADR-0011; CONTEXT_FEATURES in ml/features/context.py
- [x] Observability (#18) — structured logs via pino, W3C trace IDs, Sentry hooks, trace correlation end-to-end
- [ ] CI skeleton (#3), E2E tests (#20)
**Bugs & UX (fix before new features):**
- [x] TipFeedback type mismatch (#73)
- [x] Todoist token refresh (#74) — OAuth token auto-refresh on 401
- [x] Reward fire-and-forget (#75) — retry logic + logging
- [x] Data retention purge (#76) — daily purge of 30-day-old tip_scores/tip_feedback
- [x] Port mismatch (#77) — fixed in docker-compose + env var config
- [ ] UX refinements (#100–102) — "done/snooze/dismiss" feedback only, config page UI, settings gear button
### Phase 2 — AI tips + multi-source signals *(M2)* ✓ shipped
Tips are AI-generated from user context. Multi-agent pipeline (ADR-0013): five pre-compute agents (`overdue-task`, `momentum`, `time-of-day`, `recent-patterns`, `focus-area`) emit prompt snippets; orchestrator LLM produces one tip. Unified Profile + agent registry + auto-inference framework (ADR-0014). LLM output validation + fallback. LiteLLM gateway, model benchmarking, prompt research, MLflow tracing.
### Phase 3 — Native mobile *(M3)*
- [ ] iOS app (SwiftUI) with APNs push
- [ ] Android app (Compose) with FCM push
- [ ] `notifier` gains APNs + FCM channels, per-device rate limits
- [ ] Migrate auth from Auth.js to dedicated OIDC provider (trigger from ADR-0004)
- [ ] Consolidate MLflow behind shared OIDC (SSO for all internal services)
- [ ] Decide-and-deliver scheduler: per-user "is this tip worth interrupting now?" threshold
iOS (SwiftUI + APNs) and Android (Compose + FCM). `notifier` service gains APNs + FCM channels. Auth migrated from Auth.js to dedicated OIDC provider. Decide-and-deliver scheduler. See [M3 milestone](http://localhost:3000/alvis/oO/milestone/3).
### Phase 4 — MLOps at scale *(M4)*
- [x] MLflow deployed as external service (`mlops` compose profile); own auth; health check integrated
- [ ] Write first retraining pipeline + first MLflow experiment logging from `ml/serving` + JetStream consumers (#98)
- [ ] Feature-to-prompt pipeline — nightly batch job materializes context for LLM; cuts inline latency (#94)
- [ ] Prompt optimization loop — sim A/B → MLflow experiment → human-approved promotion (#95)
- [ ] LLM fine-tuning — tip reactions as training signal; LoRA on base model; MLflow tracks runs (#96)
- [ ] Embedding-based task clustering — `nomic-embed-text` for dedup + user pattern features (#97)
- [ ] Modular-monolith packaging + import-boundary lint (#47)
- [ ] Consolidate MLflow auth into shared OIDC provider (tracked as M3 issue #85)
- [ ] Shadow → A/B → launch pipeline as first-class in MLflow
- [ ] Online experiments framework: deterministic assignment + bandit policies alongside fixed-split A/B
- [ ] Cross-user collaborative features (opt-in only); cohort slicing; fairness checks
- [ ] Drift monitoring (feature + prediction + reward drift); model cards per LLM version
Retraining pipeline, feature-to-prompt batch jobs, prompt optimization loop, LLM fine-tuning on reaction signals, modular-monolith import-boundary lint, online experiments framework, drift monitoring. See [M4 milestone](http://localhost:3000/alvis/oO/milestone/4).
### Phase 5 — Production hardening *(M5)*
- [ ] Audit logging, rotation of provider tokens + internal signing keys
- [ ] **k3s** on existing VM, then k8s + HPA once multi-node justified (no cliff)
- [ ] Multi-region failover, Postgres PITR, event-bus mirroring
- [ ] Public integration SDK; sandbox tenancy for third-party connectors
- [ ] Billing + subscription tiers
Audit logging, key rotation, k3s → k8s, multi-region, public integration SDK, billing. See [M5 milestone](http://localhost:3000/alvis/oO/milestone/5).
---
## Contributing
This repo is split into independent modules; most tickets belong to exactly one. Pick an issue, check its milestone (= phase), read the service's `README.md`, ship.
This repo is split into independent modules; most tickets belong to exactly one. Pick an issue from [Gitea](http://localhost:3000/alvis/oO/issues), read the service's `README.md`, ship.
Conventions and per-service guidance live in [`CLAUDE.md`](CLAUDE.md).

@@ -4,8 +4,8 @@ import type { NextRequest } from 'next/server';
export async function middleware(req: NextRequest) {
const { pathname } = req.nextUrl;
// Pass through the login page and API calls
if (pathname.startsWith('/login') || pathname.startsWith('/api/')) {
// Pass through the login page, forbidden page, and API calls
if (pathname.startsWith('/login') || pathname.startsWith('/forbidden') || pathname.startsWith('/api/')) {
return NextResponse.next();
}

@@ -1,12 +1,27 @@
'use client';
import { useEffect, useState, useCallback } from 'react';
import { getVapidPublicKey, subscribePush } from '@/lib/api';
import { getVapidPublicKey, subscribePush, getOrchestatorPrefs, updateOrchestratorPref } from '@/lib/api';
type PushState = 'idle' | 'subscribed' | 'denied';
export default function ConfigPage() {
const [pushState, setPushState] = useState<PushState>('idle');
const [scienceDestiny, setScienceDestiny] = useState(50);
const [prefSaving, setPrefSaving] = useState(false);
useEffect(() => {
getOrchestatorPrefs().then((prefs) => {
if (typeof prefs.science_destiny === 'number') setScienceDestiny(prefs.science_destiny);
}).catch(() => {});
}, []);
const handleScienceDestinyChange = useCallback(async (value: number) => {
setScienceDestiny(value);
setPrefSaving(true);
try { await updateOrchestratorPref('science_destiny', value); }
finally { setPrefSaving(false); }
}, []);
useEffect(() => {
if (typeof Notification !== 'undefined') {
@@ -87,6 +102,41 @@ export default function ConfigPage() {
</div>
</section>
{/* Tip style */}
<section style={{ marginBottom: '2.5rem' }}>
<h3 style={{ fontSize: '0.75rem', letterSpacing: '0.12em', textTransform: 'uppercase', color: 'rgba(255,255,255,0.35)', marginBottom: '1rem', fontWeight: 400 }}>
Tip style
</h3>
<div style={{
border: '1px solid rgba(255,255,255,0.1)',
borderRadius: '0.75rem',
padding: '1.25rem 1.5rem',
}}>
<div style={{ display: 'flex', justifyContent: 'space-between', alignItems: 'baseline', marginBottom: '0.875rem' }}>
<span style={{ fontSize: '0.85rem', fontWeight: 500 }}>Science</span>
<span style={{ fontSize: '0.7rem', color: 'rgba(255,255,255,0.25)' }}>
{prefSaving ? 'saving…' : scienceDestiny === 50 ? 'balanced' : scienceDestiny < 50 ? 'data-driven' : 'intuitive'}
</span>
<span style={{ fontSize: '0.85rem', fontWeight: 500 }}>Destiny</span>
</div>
<input
type="range"
min={0}
max={100}
value={scienceDestiny}
onChange={(e) => handleScienceDestinyChange(Number(e.target.value))}
style={{ width: '100%', accentColor: 'var(--white)', cursor: 'pointer' }}
/>
<div style={{ color: 'rgba(255,255,255,0.3)', fontSize: '0.7rem', marginTop: '0.75rem' }}>
{scienceDestiny < 30
? 'Tips lean on patterns and data'
: scienceDestiny > 70
? 'Tips lean on intuition and meaning'
: 'Tips balance logic and intuition'}
</div>
</div>
</section>
{/* Integrations */}
<section>
<h3 style={{ fontSize: '0.75rem', letterSpacing: '0.12em', textTransform: 'uppercase', color: 'rgba(255,255,255,0.35)', marginBottom: '1rem', fontWeight: 400 }}>

@@ -51,6 +51,8 @@ function ConnectPageInner() {
}
const todoistConnected = isConnected('todoist');
const googleHealthConnected = isConnected('google-health');
const anyConnected = todoistConnected || googleHealthConnected;
return (
<main style={{ minHeight: '100vh', padding: '4rem 2rem', maxWidth: '480px', margin: '0 auto' }}>
@@ -85,7 +87,6 @@ function ConnectPageInner() {
marginBottom: '1rem',
}}>
<div style={{ display: 'flex', alignItems: 'center', gap: '0.875rem' }}>
{/* Todoist logomark */}
<svg width="28" height="28" viewBox="0 0 24 24" fill="none" aria-label="Todoist">
<rect width="24" height="24" rx="6" fill="#DB4035"/>
<path d="M6 8.5L11 13l7-7" stroke="#fff" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round"/>
@@ -130,7 +131,65 @@ function ConnectPageInner() {
)}
</div>
{todoistConnected && (
{/* Google Health card */}
<div style={{
border: '1px solid rgba(255,255,255,0.1)',
borderRadius: '0.75rem',
padding: '1.25rem 1.5rem',
display: 'flex',
alignItems: 'center',
justifyContent: 'space-between',
marginBottom: '1rem',
}}>
<div style={{ display: 'flex', alignItems: 'center', gap: '0.875rem' }}>
<svg width="28" height="28" viewBox="0 0 24 24" fill="none" aria-label="Google Health">
<rect width="24" height="24" rx="6" fill="#EA4335"/>
<path d="M12 6.5c0-1.1.9-2 2-2s2 .9 2 2-.9 2-2 2-2-.9-2-2z" fill="#fff"/>
<path d="M8 10.5c0-1.1.9-2 2-2s2 .9 2 2-.9 2-2 2-2-.9-2-2z" fill="#fff" opacity=".7"/>
<path d="M12 14.5c0 2.2-1.8 4-4 4s-4-1.8-4-4 1.8-4 4-4 4 1.8 4 4z" fill="#fff" opacity=".4"/>
<path d="M13 13.5c.5-1 1.5-1.7 2.5-1.7 1.7 0 3 1.3 3 3s-1.3 3-3 3c-1 0-1.9-.5-2.5-1.3" stroke="#fff" strokeWidth="1.5" strokeLinecap="round" fill="none"/>
</svg>
<div>
<div style={{ fontWeight: 500, fontSize: '0.9rem' }}>Google Health</div>
<div style={{ color: 'var(--gray)', fontSize: '0.75rem', marginTop: '0.1rem' }}>
{googleHealthConnected ? 'Connected' : 'Steps, sleep & activity'}
</div>
</div>
</div>
{googleHealthConnected ? (
<button
onClick={() => handleDisconnect('google-health')}
disabled={disconnecting === 'google-health'}
style={{
background: 'transparent',
border: '1px solid rgba(255,255,255,0.15)',
color: 'var(--gray)',
borderRadius: '0.375rem',
padding: '0.375rem 0.875rem',
fontSize: '0.8rem',
}}
>
{disconnecting === 'google-health' ? '…' : 'Disconnect'}
</button>
) : (
<a
href="/api/integrations/google-health/connect?redirectTo=/connect"
style={{
background: 'var(--white)',
color: 'var(--black)',
borderRadius: '0.375rem',
padding: '0.375rem 0.875rem',
fontSize: '0.8rem',
fontWeight: 500,
}}
>
Connect
</a>
)}
</div>
{anyConnected && (
<div style={{ marginTop: '3rem' }}>
<a
href="/tip"

@@ -29,6 +29,7 @@ export default function TipPage() {
const [visible, setVisible] = useState(false);
const holdTimer = useRef<ReturnType<typeof setTimeout> | null>(null);
const [pressed, setPressed] = useState(false);
const [showReasoning, setShowReasoning] = useState(false);
useEffect(() => {
if (state === 'loading' || state === 'done') {
@@ -39,16 +40,17 @@ export default function TipPage() {
}
}, [state]);
const loadTip = useCallback(async () => {
const loadTip = useCallback(async (recentTip?: string) => {
setVisible(false);
setState('loading');
try {
const rec = await getRecommendation();
const rec = await getRecommendation(recentTip);
if (!rec) {
setState('empty');
return;
}
setTip(rec.tip);
setShowReasoning(false);
setState('tip');
} catch (err: any) {
console.error('[tip] loadTip error', err?.status, err?.message);
@@ -60,10 +62,11 @@ export default function TipPage() {
const react = async (action: 'done' | 'dismiss' | 'snooze') => {
if (!tip) return;
const snoozedContent = action === 'snooze' ? tip.content : undefined;
setVisible(false);
setState('done');
await sendFeedback(tip.id, { action });
setTimeout(() => loadTip(), 700);
setTimeout(() => loadTip(snoozedContent), 700);
};
const onPointerDown = () => {
@@ -168,7 +171,7 @@ export default function TipPage() {
All clear.
</p>
<button
onClick={loadTip}
onClick={() => loadTip()}
style={{
marginTop: '2rem',
background: 'transparent',
@@ -235,6 +238,81 @@ export default function TipPage() {
</>
)}
{/* Reasoning overlay */}
{showReasoning && tip?.rationale && (
<div
onClick={(e) => { e.stopPropagation(); setShowReasoning(false); }}
style={{
position: 'fixed',
inset: 0,
display: 'flex',
alignItems: 'flex-end',
justifyContent: 'center',
zIndex: 20,
padding: '0 0 5rem',
}}
>
<div
onClick={(e) => e.stopPropagation()}
style={{
background: 'rgba(20,20,20,0.96)',
border: '1px solid rgba(255,255,255,0.08)',
borderRadius: '0.875rem',
padding: '1.25rem 1.5rem',
maxWidth: '360px',
width: 'calc(100% - 3rem)',
}}
>
<p style={{
margin: 0,
fontSize: '0.7rem',
letterSpacing: '0.1em',
textTransform: 'uppercase',
color: 'rgba(255,255,255,0.3)',
marginBottom: '0.625rem',
}}>
Why this tip
</p>
<p style={{
margin: 0,
fontSize: '0.9rem',
fontWeight: 300,
lineHeight: 1.5,
color: 'rgba(255,255,255,0.75)',
}}>
{tip.rationale}
</p>
</div>
</div>
)}
{/* ? button — bottom left, shows reasoning */}
{(state === 'tip' || state === 'actions') && tip?.rationale && (
<button
onClick={(e) => { e.stopPropagation(); setShowReasoning((v) => !v); }}
aria-label="Why this tip"
style={{
position: 'fixed',
bottom: '1.5rem',
left: '1.5rem',
background: 'transparent',
border: 'none',
color: showReasoning ? 'rgba(255,255,255,0.5)' : 'rgba(255,255,255,0.15)',
fontSize: '0.85rem',
fontWeight: 400,
lineHeight: 1,
padding: '0.5rem',
cursor: 'pointer',
pointerEvents: 'auto',
zIndex: 10,
transition: 'color 0.2s ease',
fontFamily: 'inherit',
}}
>
?
</button>
)}
{/* Settings gear — bottom right */}
<a
href="/config"

@@ -23,9 +23,12 @@ export async function getSession() {
return apiFetch<{ user: { id: string; email: string; name?: string; image?: string } | null }>('/auth/session');
}
export async function getRecommendation(): Promise<RecommendResponse | null> {
export async function getRecommendation(recentTip?: string): Promise<RecommendResponse | null> {
try {
return await apiFetch<RecommendResponse>('/recommend', { method: 'POST' });
return await apiFetch<RecommendResponse>('/recommend', {
method: 'POST',
body: JSON.stringify(recentTip ? { recent_tip: recentTip } : {}),
});
} catch (e: any) {
if (e.status === 204 || e.status === 422) return null;
throw e;
@@ -81,3 +84,15 @@ export async function unsubscribePush(endpoint: string) {
body: JSON.stringify({ endpoint }),
});
}
export async function getOrchestatorPrefs(): Promise<Record<string, unknown>> {
const data = await apiFetch<{ prefs: Record<string, Record<string, unknown>> }>('/profile');
return data.prefs?.orchestrator ?? {};
}
export async function updateOrchestratorPref(key: string, value: unknown) {
return apiFetch<{ ok: boolean }>('/profile/prefs/orchestrator', {
method: 'PATCH',
body: JSON.stringify({ [key]: value }),
});
}

@@ -0,0 +1,44 @@
# ADR-0015 — Data-source consents only; drop per-agent consent gate
**Date:** 2026-05-11
**Status:** Accepted
**Supersedes:** ADR-0014 §3 (consent model)
## Context
ADR-0014 introduced `required_consents` on agent manifests. In practice two
unrelated concepts were mixed into that field:
- `data:<source>` — which data source the agent reads.
- `agent:<id>` — whether the user opted into this specific agent.
No UI ever granted `agent:<id>` consents, so the eligibility filter at
`services/api/src/profile/eligibility.ts` dropped every agent for every real
user. The symptom was confirmed by MLflow trace
`tr-591449ea8a72af8e81b6a585234a86ab`: user `ODGp4Gkr7JWemMsqcMLMn` had five
fresh `agent_outputs` rows but the orchestrator received `agent_ids: []`.
## Decision
Collapse to a single consent dimension: **data source**.
1. `required_consents` entries must all start with `data:`. Agent manifests no
longer list `agent:<id>` entries.
2. Connecting a data source via the OAuth flow automatically grants
`data:<provider>` in `user_consents`. Disconnecting sets `revoked_at`.
3. `data:core` continues to be auto-granted on signup.
4. Per-agent control becomes a **preference** (`user_preferences[scope='agent:<id>', key='enabled']`), not a consent. The eligibility filter already honours this — the only change is removing the `agent:*` consent check that was always failing.
5. Eligibility rule (final): an agent is eligible iff every `data:*` it
declares is granted and not revoked, no active context is in
`silenced_in_contexts`, and the `enabled` preference is not `false`.
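
The final eligibility rule can be read as a single predicate. The sketch below is illustrative only and is not the code in `services/api/src/profile/eligibility.ts`; the `Consent`, `AgentManifest`, and `UserState` shapes are assumptions chosen to mirror the names used in this ADR.

```typescript
// Assumed row shape for user_consents: a key plus a nullable revoked_at.
interface Consent {
  key: string;               // e.g. "data:core", "data:todoist"
  revoked_at: string | null; // null means the consent is still active
}

interface AgentManifest {
  id: string;
  required_consents: string[];    // every entry starts with "data:"
  silenced_in_contexts: string[];
}

interface UserState {
  consents: Consent[];
  activeContexts: string[];
  // user_preferences[scope='agent:<id>', key='enabled']; undefined means "not set"
  agentEnabled: Record<string, boolean | undefined>;
}

/** True iff all three clauses of the eligibility rule hold. */
function isEligible(agent: AgentManifest, user: UserState): boolean {
  // Clause 1: every declared data:* consent is granted and not revoked.
  const granted = user.consents
    .filter((c) => c.revoked_at === null)
    .map((c) => c.key);
  const consentsOk = agent.required_consents.every((c) => granted.indexOf(c) !== -1);

  // Clause 2: no active context is listed in silenced_in_contexts.
  const silenced = user.activeContexts.some(
    (ctx) => agent.silenced_in_contexts.indexOf(ctx) !== -1,
  );

  // Clause 3: the enabled preference is not explicitly false (unset counts as enabled).
  const enabled = user.agentEnabled[agent.id] !== false;

  return consentsOk && !silenced && enabled;
}
```

Note that an unset `enabled` preference leaves the agent eligible, so per-agent control is opt-out rather than opt-in.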
## Consequences
- Agents that only require `data:core` (time-of-day, momentum, recent-patterns)
become eligible immediately after signup.
- Agents requiring `data:todoist` or `data:google-health` become eligible as
soon as the user connects the integration — no extra consent step.
- A backfill migration grants `data:<provider>` for every existing active
`integration_tokens` row, unblocking users who connected before this change.
- `ml/agents/tests/test_manifest.py` asserts all `required_consents` start
with `data:`, preventing regression.
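
The same invariant could also be checked on the TypeScript side, for example when manifests are loaded. The actual regression test is the Python one named above; the `Manifest` shape and the `invalidConsents` helper here are hypothetical.

```typescript
// Minimal manifest shape assumed for illustration.
interface Manifest {
  id: string;
  required_consents: string[];
}

/** Returns "<agent id>: <entry>" for every entry that violates the data:-only rule. */
function invalidConsents(manifests: Manifest[]): string[] {
  const bad: string[] = [];
  for (const m of manifests) {
    for (const entry of m.required_consents) {
      if (entry.indexOf('data:') !== 0) bad.push(m.id + ': ' + entry);
    }
  }
  return bad;
}
```

An empty result means every manifest complies; a non-empty result names the offending agent and entry, which makes a load-time failure easy to act on.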

@@ -1,7 +1,8 @@
# syntax=docker/dockerfile:1.7
FROM node:22-slim AS base
RUN apt-get update && apt-get install -y --no-install-recommends ca-certificates \
RUN apt-get update && apt-get install -y --no-install-recommends \
python3 make g++ ca-certificates \
&& rm -rf /var/lib/apt/lists/* \
&& npm install -g pnpm
ENV CI=true \

@@ -16,7 +16,7 @@ COPY pnpm-lock.yaml ./
RUN --mount=type=cache,id=pnpm,target=/pnpm/store pnpm fetch
COPY . .
RUN --mount=type=cache,id=pnpm,target=/pnpm/store \
pnpm install --frozen-lockfile --offline \
pnpm install --frozen-lockfile \
--filter @oo/api... --filter @oo/shared-types
RUN pnpm --filter @oo/shared-types build
RUN pnpm --filter @oo/api build

@@ -1,5 +1,8 @@
FROM python:3.12-slim
WORKDIR /app/ml/serving
RUN apt-get update \
&& apt-get install -y --no-install-recommends build-essential \
&& rm -rf /var/lib/apt/lists/*
COPY ml/serving/requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY ml/ /app/ml/

@@ -13,6 +13,7 @@ WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY --from=deps /app/packages/shared-types/node_modules ./packages/shared-types/node_modules
COPY --from=deps /app/apps/web/node_modules ./apps/web/node_modules
COPY package.json pnpm-workspace.yaml pnpm-lock.yaml ./
COPY tsconfig.base.json ./
COPY packages/shared-types ./packages/shared-types
COPY apps/web ./apps/web

@@ -71,6 +71,7 @@ services:
environment:
LITELLM_URL: ${LITELLM_URL:-http://host.docker.internal:4000}
OLLAMA_URL: ${OLLAMA_URL:-http://host.docker.internal:11434}
MLFLOW_TRACKING_URI: ${MLFLOW_TRACKING_URI:-http://mlflow:5000}
extra_hosts:
- "host.docker.internal:host-gateway"
ports:
@@ -81,6 +82,46 @@ services:
timeout: 5s
retries: 5
# ── ai profile — Ollama + LiteLLM for local dev ──────────────────────────
# Start: docker compose --profile ai up
# Use when the Agap shared Ollama/LiteLLM services are not available locally.
# Set LITELLM_URL=http://localhost:4000 and OLLAMA_URL=http://localhost:11434
# in .env.local to point ml-serving at these containers instead of Agap.
ollama:
image: ollama/ollama:latest
profiles: [ai]
volumes:
- ollama-models:/root/.ollama
ports:
- "127.0.0.1:11434:11434"
healthcheck:
test: ["CMD", "curl", "-sf", "http://localhost:11434/api/tags"]
interval: 15s
timeout: 5s
retries: 10
litellm:
image: ghcr.io/berriai/litellm:main-latest
profiles: [ai]
environment:
LITELLM_MASTER_KEY: ${LITELLM_MASTER_KEY:-sk-local-dev}
command: >
--model ollama/qwen2.5:1.5b
--model ollama/nomic-embed-text
--api_base http://ollama:11434
--port 4000
ports:
- "127.0.0.1:4000:4000"
depends_on:
ollama:
condition: service_healthy
healthcheck:
test: ["CMD", "curl", "-sf", "http://localhost:4000/health"]
interval: 10s
timeout: 5s
retries: 5
# ── mlops profile — MLflow ────────────────────────────────────────────────
# Start: docker compose --profile mlops up
# MLflow UI: http://localhost:5000 or https://o.alogins.net/mlflow
@@ -111,11 +152,13 @@ services:
command: >
mlflow server
--backend-store-uri sqlite:////mlflow/mlflow.db
--default-artifact-root /mlflow/artifacts
--artifacts-destination /mlflow/artifacts
--serve-artifacts
--default-artifact-root mlflow-artifacts:/
--host 0.0.0.0
--port 5000
--static-prefix /mlflow
--allowed-hosts o.alogins.net,localhost
--allowed-hosts o.alogins.net,localhost,localhost:5000,mlflow,mlflow:5000
--cors-allowed-origins https://o.alogins.net
volumes:
- /mnt/ssd/dbs/oo/mlflow:/mlflow
@@ -126,3 +169,6 @@ services:
interval: 10s
timeout: 5s
retries: 5
volumes:
ollama-models:

@@ -20,6 +20,9 @@ class AgentInput:
# precedence over 'inferred' source; the caller resolves priority before
# passing this dict in.
agent_prefs: dict = field(default_factory=dict)
# Pre-fetched enrichment cache: {content_hash -> description}. Populated by
# the TS caller from the task_enrichments DB table to avoid redundant LLM calls.
enrichment_cache: dict = field(default_factory=dict)
@dataclass

@@ -1,14 +1,24 @@
"""Semantic task clustering via nomic-embed-text (issue #97).
"""Semantic task clustering via nomic-embed-text (issue #97, #129).
Public API:
cluster_tasks(tasks, ollama_url) -> list[Cluster]
cluster_tasks(tasks) -> list[Cluster]
Each task dict must have a "content" key. Tasks without content are placed in a
fallback "other" bucket. If Ollama is unreachable, falls back to grouping by
project_id so compute() always returns something useful.
fallback "other" bucket. If the embedding service is unreachable, falls back to
grouping by project_id so compute() always returns something useful.
Pipeline (ported from taskpile experiments/clustering_eval, prompt v1):
1. Expand each raw title via LiteLLM `tip-generator` (qwen2.5:1.5b) into a
3-sentence description. Cached in-memory by content hash within a compute
cycle so duplicate titles cost one LLM call.
2. Prefix the expanded text with "clustering: " (nomic-embed-text task prefix).
3. Batch-embed via LiteLLM `embedder` (nomic-embed-text).
Falls back to embedding raw titles when LLM expansion fails, and to
project-based grouping when embeddings are unavailable.
"""
from __future__ import annotations
import hashlib
import logging
import math
import os
@@ -22,7 +32,17 @@ log = logging.getLogger(__name__)
_SIM_THRESHOLD = 0.72
# Never produce more than this many clusters regardless of task count.
_MAX_CLUSTERS = 6
_EMBED_TIMEOUT = 10.0
_EMBED_TIMEOUT = 15.0
_ENRICH_TIMEOUT = 30.0
_ENRICH_PROMPT_V1 = (
"You are helping categorize a personal task. "
"Write exactly 3 sentences in English describing what the task likely involves, "
"what context or skills it needs, and why it might matter. "
"Be concise and specific. Do not use bullet points or numbering.\n"
"Task: {title}\n"
"Description:"
)
@dataclass
@@ -39,20 +59,132 @@ class Cluster:
return sum(1 for t in self.tasks if t.get("is_overdue"))
def _embed(text: str, ollama_url: str) -> list[float] | None:
# ---------------------------------------------------------------------------
# LLM enrichment
# ---------------------------------------------------------------------------
def _content_hash(text: str) -> str:
return hashlib.md5(text.encode()).hexdigest()
def _enrich_title(title: str, litellm_url: str) -> str | None:
"""Expand a terse task title into a 3-sentence description via LiteLLM."""
try:
with httpx.Client(trust_env=False, timeout=_ENRICH_TIMEOUT) as c:
r = c.post(
f"{litellm_url}/chat/completions",
json={
"model": "tip-generator",
"messages": [{"role": "user", "content": _ENRICH_PROMPT_V1.format(title=title)}],
"max_tokens": 120,
"temperature": 0.3,
},
)
r.raise_for_status()
return r.json()["choices"][0]["message"]["content"].strip()
except Exception as exc:
log.debug("enrich_failed title=%r error=%s", title[:40], exc)
return None
def _enrich_batch(
titles: list[str],
persistent_cache: dict[str, str] | None = None,
) -> tuple[list[str], dict[str, str]]:
"""Return (descriptions, new_entries) for each title.
Checks persistent_cache (pre-fetched from DB) first, then falls back to
calling LiteLLM. new_entries contains only hashes generated this call —
the caller should persist these to the DB.
"""
litellm_url = os.getenv("LITELLM_URL")
if not litellm_url:
log.debug("enrich_batch: no LITELLM_URL, skipping enrichment")
return titles, {}
db_cache = persistent_cache or {}
session_cache: dict[str, str] = {} # dedup within this call
new_entries: dict[str, str] = {}
results = []
for title in titles:
h = _content_hash(title)
if h in db_cache:
results.append(db_cache[h])
elif h in session_cache:
results.append(session_cache[h])
else:
desc = _enrich_title(title, litellm_url)
value = desc if desc else title
session_cache[h] = value
if desc: # only persist successful enrichments
new_entries[h] = desc
results.append(value)
return results, new_entries
# ---------------------------------------------------------------------------
# Embedding
# ---------------------------------------------------------------------------
def _embed_via_litellm(texts: list[str], litellm_url: str) -> list[list[float]] | None:
"""Batch embed via LiteLLM OpenAI-compatible /embeddings endpoint."""
try:
with httpx.Client(trust_env=False, timeout=_EMBED_TIMEOUT) as c:
r = c.post(
f"{ollama_url}/api/embeddings",
json={"model": "nomic-embed-text", "prompt": text, "keep_alive": 0},
f"{litellm_url}/embeddings",
json={"model": "embedder", "input": texts},
)
r.raise_for_status()
return r.json().get("embedding")
data = r.json().get("data", [])
ordered = sorted(data, key=lambda x: x["index"])
return [item["embedding"] for item in ordered]
except Exception as exc:
log.debug("embed_failed text=%r error=%s", text[:40], exc)
log.debug("litellm_embed_failed error=%s", exc)
return None
def _embed_via_ollama(texts: list[str], ollama_url: str) -> list[list[float]] | None:
"""Batch embed via Ollama /api/embed endpoint."""
try:
results = []
with httpx.Client(trust_env=False, timeout=_EMBED_TIMEOUT) as c:
for text in texts:
r = c.post(
f"{ollama_url}/api/embed",
json={"model": "nomic-embed-text", "input": text},
)
r.raise_for_status()
body = r.json()
# /api/embed returns {"embeddings": [[...]]}
embeddings = body.get("embeddings")
if not embeddings:
return None
results.append(embeddings[0])
return results
except Exception as exc:
log.debug("ollama_embed_failed error=%s", exc)
return None
def _embed_batch(texts: list[str]) -> list[list[float]] | None:
"""Embed a list of texts, preferring LiteLLM over direct Ollama."""
litellm_url = os.getenv("LITELLM_URL")
if litellm_url:
vecs = _embed_via_litellm(texts, litellm_url)
if vecs is not None:
return vecs
log.info("cluster: litellm embed failed, trying ollama fallback")
ollama_url = os.getenv("OLLAMA_URL", "http://host.docker.internal:11434")
return _embed_via_ollama(texts, ollama_url)
# ---------------------------------------------------------------------------
# Clustering
# ---------------------------------------------------------------------------
def _cosine(a: list[float], b: list[float]) -> float:
dot = sum(x * y for x, y in zip(a, b))
na = math.sqrt(sum(x * x for x in a))
@@ -109,17 +241,18 @@ def _fallback_by_project(tasks: list[dict]) -> list[Cluster]:
def cluster_tasks(
tasks: list[dict],
ollama_url: str | None = None,
) -> list[Cluster]:
ollama_url: str | None = None, # kept for test compatibility; env vars take precedence
enrichment_cache: dict[str, str] | None = None,
) -> tuple[list[Cluster], dict[str, str]]:
"""Cluster tasks by semantic similarity.
Returns a non-empty list of Cluster objects. Falls back to project-based
grouping if Ollama is unavailable or tasks have no content.
Returns (clusters, new_enrichments). new_enrichments contains LLM-generated
descriptions produced this call that were not in the persistent cache — the
caller should persist these. Falls back to project-based grouping if the
embedding service is unavailable or tasks have no content.
"""
if not tasks:
return []
url = ollama_url or os.getenv("OLLAMA_URL", "http://host.docker.internal:11434")
return [], {}
# Separate tasks with usable content from those without.
with_content = [(t, t.get("content", "").strip()) for t in tasks]
@@ -127,26 +260,31 @@ def cluster_tasks(
no_content = [t for t, c in with_content if not c]
if not embeddable:
return _fallback_by_project(tasks)
return _fallback_by_project(tasks), {}
# Fetch embeddings (best-effort; None means Ollama unavailable).
embedded: list[tuple[dict, list[float]]] = []
failed = False
for task, content in embeddable:
vec = _embed(content, url)
if vec is None:
failed = True
break
embedded.append((task, vec))
task_objs = [t for t, _ in embeddable]
raw_titles = [c for _, c in embeddable]
if failed or not embedded:
log.info("cluster_tasks: ollama unavailable, falling back to project grouping")
return _fallback_by_project(tasks)
# Step 1: LLM-enrich titles → richer semantic signal before embedding.
descriptions, new_enrichments = _enrich_batch(raw_titles, persistent_cache=enrichment_cache)
# Attach enriched description to each task dict so consumers (e.g. focus-area)
# can show the expanded text instead of the terse raw title.
for task, desc in zip(task_objs, descriptions):
task["enriched_description"] = desc
# Step 2: Prefix with nomic-embed-text task prefix, then batch-embed.
prefixed = [f"clustering: {d}" for d in descriptions]
vecs = _embed_batch(prefixed)
if vecs is None or len(vecs) != len(prefixed):
log.info("cluster_tasks: embedding unavailable, falling back to project grouping")
return _fallback_by_project(tasks), new_enrichments
embedded = list(zip(task_objs, vecs))
clusters = _greedy_cluster(embedded)
# Tasks without content get their own bucket if any.
if no_content:
clusters.append(Cluster(label="Other tasks", tasks=no_content))
return clusters
return clusters, new_enrichments
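
The cache lookup order in `_enrich_batch` (persistent cache first, then per-call dedup, then the LLM) can be tried in isolation. The sketch below is a minimal stand-in rather than the module's code: it takes the enricher as a hypothetical `enrich` callable instead of calling LiteLLM, and `fake_enrich` is an invented stub that fails for one title so the raw-title fallback is visible.

```python
import hashlib


def content_hash(text: str) -> str:
    # Same hashing scheme as _content_hash in the diff above.
    return hashlib.md5(text.encode()).hexdigest()


def enrich_batch(titles, enrich, persistent_cache=None):
    """Return (descriptions, new_entries); `enrich` is a hypothetical callable."""
    db_cache = persistent_cache or {}
    session_cache = {}  # dedup within this call
    new_entries = {}    # only successful enrichments, for the caller to persist
    results = []
    for title in titles:
        h = content_hash(title)
        if h in db_cache:
            results.append(db_cache[h])
        elif h in session_cache:
            results.append(session_cache[h])
        else:
            desc = enrich(title)
            value = desc if desc else title  # fall back to the raw title
            session_cache[h] = value
            if desc:
                new_entries[h] = desc
            results.append(value)
    return results, new_entries


calls = []


def fake_enrich(title):
    # Invented stub: records each call and fails for the title "???".
    calls.append(title)
    return None if title == "???" else f"Expanded: {title}"


cache = {content_hash("Buy milk"): "Cached milk description"}
results, new_entries = enrich_batch(
    ["Buy milk", "Write report", "Write report", "???"], fake_enrich, cache
)
```

Only the first occurrence of an uncached title reaches the enricher, and a failed enrichment is remembered for the rest of the call but never offered for persistence.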

@@ -1,112 +1,70 @@
from __future__ import annotations
from collections import Counter
from typing import ClassVar
from .base import BaseAgent, AgentInput, AgentOutput
from .clustering import cluster_tasks
from .inference.history import UserHistory
from .manifest import AgentManifest, InferredParam
def _infer_preferred_areas(history: UserHistory) -> list[str]:
"""Top-2 project IDs by completed task count (last 90 days worth of data)."""
counts: Counter[str] = Counter()
for tc in history.task_completions:
if tc.project_id:
counts[tc.project_id] += 1
return [pid for pid, _ in counts.most_common(2)]
from .manifest import AgentManifest
MANIFEST = AgentManifest(
id="focus-area",
version="2.0.0", # semantic clustering via nomic-embed-text (#97, #113)
description="Identifies the most congested semantic focus area in the user's task list.",
pref_schema={
"type": "object",
"additionalProperties": False,
"properties": {
"preferred_areas": {
"type": "array",
"items": {"type": "string"},
"default": [],
"description": "Project IDs or label names to prioritise when multiple areas tie.",
},
},
},
version="3.0.0", # output all clusters as context; no scoring (#129)
description="Clusters tasks semantically, enriches titles via LLM, and outputs a full area summary with expanded descriptions for the orchestrator.",
pref_schema={"type": "object", "additionalProperties": False, "properties": {}},
context_schema=["todoist.tasks"],
required_consents=["data:core", "data:todoist", "agent:focus-area"],
required_consents=["data:core", "data:todoist"],
output_contract={"type": "snippet", "format": "free_text"},
ttl_sec=43_200,
inferred_params=[
InferredParam(
key="preferred_areas",
ttl_sec=86_400,
cold_start_default=[],
min_history=0, # use task_completions, not feedback events; handle empty inside
infer=_infer_preferred_areas,
),
],
inferred_params=[],
)
class FocusAreaAgent(BaseAgent):
"""Identifies the most congested semantic focus area in the user's task list."""
"""Clusters tasks and outputs a full area summary for the orchestrator."""
agent_id: ClassVar[str] = MANIFEST.id
ttl_seconds: ClassVar[int] = MANIFEST.ttl_sec
version: ClassVar[str] = MANIFEST.version
version: ClassVar[str] = MANIFEST.version # 3.0.0
def compute(self, inp: AgentInput) -> AgentOutput:
preferred: list[str] = inp.agent_prefs.get("preferred_areas", [])
if not inp.tasks:
return self._make_output(
inp,
"No tasks available to identify a focus area.",
{"cluster_count": 0, "strategy": "none"},
"No tasks available to identify focus areas.",
{"cluster_count": 0},
)
clusters = cluster_tasks(inp.tasks)
clusters, new_enrichments = cluster_tasks(inp.tasks, enrichment_cache=inp.enrichment_cache)
if not clusters:
return self._make_output(
inp,
"No tasks available to identify a focus area.",
{"cluster_count": 0, "strategy": "none"},
"No tasks available to identify focus areas.",
{"cluster_count": 0},
)
strategy = "semantic" if len(clusters) > 1 or len(inp.tasks) > 1 else "fallback"
def score(cluster) -> float:
base = sum(2.0 if t.get("is_overdue") else 1.0 for t in cluster.tasks)
boosted = any(p in cluster.label for p in preferred) if preferred else False
return base + (0.5 if boosted else 0.0)
top = max(clusters, key=score)
boosted = bool(preferred) and any(p in top.label for p in preferred)
parts = [
f'The user\'s most active focus area is "{top.label}" '
f"({top.task_count} task{'s' if top.task_count != 1 else ''}, "
f"{top.overdue_count} overdue)."
lines = [f"The user's tasks are grouped into {len(clusters)} area(s):"]
for i, cluster in enumerate(clusters, 1):
descs = [
t.get("enriched_description") or t.get("content", "")
for t in cluster.tasks
if t.get("content")
]
if boosted:
parts.append("This area matches the user's stated focus preferences.")
if top.overdue_count >= 3:
parts.append("Consider surfacing an action from this area.")
if len(clusters) > 1:
other_total = sum(c.task_count for c in clusters if c is not top)
parts.append(
f"{len(clusters) - 1} other area{'s' if len(clusters) > 2 else ''} "
f"contain {other_total} task{'s' if other_total != 1 else ''}."
)
descs = [d.strip() for d in descs if d.strip()]
descs_str = "; ".join(f'"{d}"' for d in descs[:8])
if len(descs) > 8:
descs_str += f" (and {len(descs) - 8} more)"
lines.append(f"{i}. {cluster.label} - {cluster.task_count} task(s): {descs_str}")
lines.append("(Task titles may be in any language — always write the tip in English.)")
snapshot = {
"top_cluster_label": top.label,
"top_task_count": top.task_count,
"top_overdue_count": top.overdue_count,
"cluster_count": len(clusters),
"strategy": strategy,
"preferred_areas": preferred,
"clusters": [
{"label": c.label, "task_count": c.task_count,
"tasks": [t.get("content", "") for t in c.tasks]}
for c in clusters
],
"_new_enrichments": new_enrichments,
}
return self._make_output(inp, " ".join(parts), snapshot)
return self._make_output(inp, "\n".join(lines), snapshot)
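
The per-area line built in `compute()` quotes up to eight task descriptions and summarises the rest as a count. A minimal sketch of that formatting step follows; `format_area_line` is a hypothetical helper name, and the " - " separator between label and count is an assumption, not something the diff confirms.

```python
def format_area_line(index, label, task_count, descriptions, limit=8):
    """Format one cluster as a numbered prompt line (hypothetical helper)."""
    # Drop empty or whitespace-only descriptions before quoting.
    cleaned = [d.strip() for d in descriptions if d and d.strip()]
    shown = "; ".join(f'"{d}"' for d in cleaned[:limit])
    if len(cleaned) > limit:
        shown += f" (and {len(cleaned) - limit} more)"
    # The " - " separator is assumed for illustration.
    return f"{index}. {label} - {task_count} task(s): {shown}"


line = format_area_line(1, "Work", 10, [f"Task {n}" for n in range(10)])
```

Capping the quoted descriptions keeps the snippet bounded for the orchestrator while the count still reflects the full cluster size.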

ml/agents/health_vitals.py (new file, 134 lines)

@@ -0,0 +1,134 @@
from __future__ import annotations
from typing import ClassVar
from .base import BaseAgent, AgentInput, AgentOutput
from .manifest import AgentManifest, InferredParam
from .inference.history import UserHistory
def _infer_step_goal(history: UserHistory) -> int:
"""Return the default daily step goal (7,000).
Placeholder: UserHistory carries no step history yet, so no median is
computed. Step history is expected to arrive via agent_prefs.step_history
when available.
"""
return 7_000
MANIFEST = AgentManifest(
id="health-vitals",
version="1.0.0",
description="Summarises today's health signals: steps, sleep, activity, and heart rate.",
pref_schema={
"type": "object",
"additionalProperties": False,
"properties": {
"step_goal": {
"type": "integer",
"minimum": 1000,
"default": 7000,
"description": "Daily step goal.",
},
"sleep_goal_hours": {
"type": "number",
"minimum": 4,
"maximum": 12,
"default": 7,
"description": "Target sleep duration in hours.",
},
},
},
context_schema=["google-health.steps", "google-health.sleep", "google-health.activity", "google-health.heart_rate"],
required_consents=["data:core", "data:google-health"],
output_contract={"type": "snippet", "format": "free_text"},
ttl_sec=1800, # refresh every 30 min — health data changes during the day
silenced_in_contexts=[],
inferred_params=[
InferredParam(
key="step_goal",
ttl_sec=7 * 86_400,
cold_start_default=7000,
min_history=0,
infer=lambda h: 7000, # static default; override via user pref
),
],
)
class HealthVitalsAgent(BaseAgent):
"""Summarises today's health signals into an orchestrator prompt snippet."""
agent_id: ClassVar[str] = MANIFEST.id
ttl_seconds: ClassVar[int] = MANIFEST.ttl_sec
version: ClassVar[str] = MANIFEST.version
def compute(self, inp: AgentInput) -> AgentOutput:
step_goal = int(inp.agent_prefs.get("step_goal", 7000))
sleep_goal = float(inp.agent_prefs.get("sleep_goal_hours", 7.0))
health = [t for t in inp.tasks if t.get("source") == "google-health"]
if not health:
prompt = "No health data available from Google Health today. (Always write the tip in English.)"
return self._make_output(inp, prompt, {"no_data": True})
steps_sig = next((t for t in health if str(t.get("id", "")).endswith(":steps")), None)
sleep_sig = next((t for t in health if str(t.get("id", "")).endswith(":sleep")), None)
activity_sig = next((t for t in health if str(t.get("id", "")).endswith(":activity")), None)
hr_sig = next((t for t in health if str(t.get("id", "")).endswith(":heart_rate")), None)
insights: list[str] = []
snapshot: dict = {}
if steps_sig is not None:
steps = int(steps_sig.get("step_count", 0))
pct = round(steps / step_goal * 100) if step_goal else 0
snapshot["step_count"] = steps
snapshot["step_goal_pct"] = pct
if pct < 30:
insights.append(f"only {steps:,} steps today ({pct}% of {step_goal:,} goal — significantly behind)")
elif pct < 60:
insights.append(f"{steps:,} steps today ({pct}% of {step_goal:,} goal)")
elif pct >= 100:
insights.append(f"{steps:,} steps today (daily goal reached!)")
else:
insights.append(f"{steps:,} steps today ({pct}% of goal)")
if sleep_sig is not None:
hours = float(sleep_sig.get("sleep_hours", 0))
deficit = max(0.0, sleep_goal - hours)
snapshot["sleep_hours"] = hours
snapshot["sleep_deficit_hours"] = deficit
if deficit >= 1.5:
insights.append(f"only {hours:.1f}h sleep last night ({deficit:.1f}h below the {sleep_goal:.0f}h goal)")
elif deficit > 0:
insights.append(f"{hours:.1f}h sleep last night (slightly below {sleep_goal:.0f}h goal)")
else:
insights.append(f"{hours:.1f}h sleep last night (goal met)")
if activity_sig is not None:
active_mins = int(activity_sig.get("active_minutes", 0))
calories = int(activity_sig.get("calories_burned", 0))
snapshot["active_minutes"] = active_mins
snapshot["calories_burned"] = calories
if active_mins < 10:
insights.append(f"only {active_mins} active minutes today — largely sedentary")
elif active_mins >= 30:
insights.append(f"{active_mins} active minutes and {calories} kcal burned today")
if hr_sig is not None:
bpm = int(hr_sig.get("resting_bpm", 0))
snapshot["resting_bpm"] = bpm
if bpm > 90:
insights.append(f"elevated resting heart rate: {bpm} bpm")
elif bpm > 0:
insights.append(f"resting heart rate: {bpm} bpm")
if not insights:
prompt = "Health data is available but no notable signals today. (Always write the tip in English.)"
else:
body = "; ".join(insights)
prompt = f"Health snapshot: {body}. (Always write the tip in English.)"
return self._make_output(inp, prompt, snapshot)
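
The step-count branch above maps a percentage of the goal onto four wordings. A small worked sketch of that bucketing, pulled out into a hypothetical `describe_steps` function (the agent inlines this logic, and the wording here is simplified):

```python
def describe_steps(steps: int, step_goal: int = 7000) -> str:
    """Bucket today's steps by percentage of goal (hypothetical helper)."""
    # Guard against a zero goal, as the agent does.
    pct = round(steps / step_goal * 100) if step_goal else 0
    if pct < 30:
        return f"only {steps:,} steps today ({pct}% of {step_goal:,} goal, significantly behind)"
    if pct < 60:
        return f"{steps:,} steps today ({pct}% of {step_goal:,} goal)"
    if pct >= 100:
        return f"{steps:,} steps today (daily goal reached!)"
    return f"{steps:,} steps today ({pct}% of goal)"


low = describe_steps(1500)
mid = describe_steps(5000)
done = describe_steps(7000)
```

With the default goal of 7,000, 1,500 steps rounds to 21% and lands in the "significantly behind" bucket, while 5,000 steps rounds to 71% and falls through to the final branch.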


@@ -121,7 +121,7 @@ MANIFEST = AgentManifest(
},
},
context_schema=["profile.features"],
required_consents=["data:core", "agent:momentum"],
required_consents=["data:core"],
output_contract={"type": "snippet", "format": "free_text"},
ttl_sec=21_600,
inferred_params=[


@@ -70,7 +70,7 @@ MANIFEST = AgentManifest(
},
},
context_schema=["todoist.tasks"],
required_consents=["data:core", "data:todoist", "agent:overdue-task"],
required_consents=["data:core", "data:todoist"],
output_contract={"type": "snippet", "format": "free_text"},
ttl_sec=3600,
silenced_in_contexts=["vacation"],
@@ -128,15 +128,15 @@ class OverdueTaskAgent(BaseAgent):
top = sorted(overdue, key=lambda t: -t.get("task_age_days", 0))[:3]
if not overdue:
prompt = "The user has no overdue tasks at this time."
prompt = "The user has no overdue tasks at this time. (Always write the tip in English.)"
elif len(overdue) == 1:
t = top[0]
r = _realness(t.get("project_id"), project_realness)
item = _format_task(t, project_realness)
if r < 0.4:
prompt = f"The user has 1 task past its target date: {item}."
prompt = f"The user has 1 task past its target date: {item}. (Task titles may be in any language — always write the tip in English.)"
else:
prompt = f"The user has 1 overdue task: {item}."
prompt = f"The user has 1 overdue task: {item}. (Task titles may be in any language — always write the tip in English.)"
else:
items = ", ".join(_format_task(t, project_realness) for t in top)
avg_realness = (
@@ -146,7 +146,7 @@ class OverdueTaskAgent(BaseAgent):
label = "tasks past their target dates" if avg_realness < 0.4 else "overdue tasks"
prompt = (
f"The user has {len(overdue)} {label}. "
f"Top {len(top)}: {items}."
f"Top {len(top)}: {items}. (Task titles may be in any language — always write the tip in English.)"
)
snapshot = {


@@ -131,7 +131,7 @@ MANIFEST = AgentManifest(
},
},
context_schema=["tip_feedback", "profile.features"],
required_consents=["data:core", "agent:recent-patterns"],
required_consents=["data:core"],
output_contract={"type": "snippet", "format": "free_text"},
ttl_sec=86_400,
inferred_params=[


@@ -16,6 +16,9 @@ from .momentum import MomentumAgent, MANIFEST as MOMENTUM_MANIFEST
from .time_of_day import TimeOfDayAgent, MANIFEST as TIME_OF_DAY_MANIFEST
from .recent_patterns import RecentPatternsAgent, MANIFEST as RECENT_PATTERNS_MANIFEST
from .focus_area import FocusAreaAgent, MANIFEST as FOCUS_AREA_MANIFEST
from .health_vitals import HealthVitalsAgent, MANIFEST as HEALTH_VITALS_MANIFEST
from .tarot import TarotAgent, MANIFEST as TAROT_MANIFEST
from .stars import StarsAgent, MANIFEST as STARS_MANIFEST
_REGISTERED: list[tuple[BaseAgent, AgentManifest]] = [
(OverdueTaskAgent(), OVERDUE_TASK_MANIFEST),
@@ -23,6 +26,9 @@ _REGISTERED: list[tuple[BaseAgent, AgentManifest]] = [
(TimeOfDayAgent(), TIME_OF_DAY_MANIFEST),
(RecentPatternsAgent(), RECENT_PATTERNS_MANIFEST),
(FocusAreaAgent(), FOCUS_AREA_MANIFEST),
(HealthVitalsAgent(), HEALTH_VITALS_MANIFEST),
(TarotAgent(), TAROT_MANIFEST),
(StarsAgent(), STARS_MANIFEST),
]
# Sanity check — agent_id and manifest.id must agree, otherwise the registry

ml/agents/stars.py (new file, 233 lines)

@@ -0,0 +1,233 @@
"""Stars agent — astrological transit predictions via pyswisseph.
Requires birth_date in agent_prefs (ISO 8601 date string, e.g. '1990-06-15').
Populated from a connected data source (Google profile / Google Health).
If birth_date is absent the agent returns a no-data snippet and the
eligibility filter will silence it once the consent / pref check catches up.
Computes today's Sun, Moon, Mercury, Venus, Mars, Jupiter, Saturn positions
and finds notable transits (conjunctions, oppositions, squares, trines, sextiles)
between today's sky and the user's natal chart. Passes a concise prediction
+ interpretation to the orchestrator.
"""
from __future__ import annotations
from datetime import date
from typing import ClassVar
from .base import BaseAgent, AgentInput, AgentOutput
from .manifest import AgentManifest, InferredParam
try:
import swisseph as swe # type: ignore
_SWE_AVAILABLE = True
except ImportError: # pragma: no cover — present in container, absent in dev
_SWE_AVAILABLE = False
# ---------------------------------------------------------------------------
# Planet catalogue
# ---------------------------------------------------------------------------
_PLANETS: list[tuple[int, str]] = []
if _SWE_AVAILABLE:
_PLANETS = [
(swe.SUN, "Sun"),
(swe.MOON, "Moon"),
(swe.MERCURY, "Mercury"),
(swe.VENUS, "Venus"),
(swe.MARS, "Mars"),
(swe.JUPITER, "Jupiter"),
(swe.SATURN, "Saturn"),
]
# Aspect definitions: (angle, orb, name, nature)
_ASPECTS: list[tuple[float, float, str, str]] = [
(0.0, 8.0, "conjunction", "intensifying"),
(60.0, 6.0, "sextile", "harmonious"),
(90.0, 7.0, "square", "challenging"),
(120.0, 8.0, "trine", "flowing"),
(180.0, 8.0, "opposition", "tension"),
]
_ZODIAC = [
"Aries", "Taurus", "Gemini", "Cancer", "Leo", "Virgo",
"Libra", "Scorpio", "Sagittarius", "Capricorn", "Aquarius", "Pisces",
]
# Interpretive keywords per planet for transit readings
_PLANET_THEMES: dict[str, str] = {
"Sun": "identity, vitality, core purpose",
"Moon": "emotions, intuition, comfort needs",
"Mercury": "communication, thinking, decisions",
"Venus": "relationships, values, pleasure",
"Mars": "energy, drive, conflict",
"Jupiter": "growth, opportunity, expansion",
"Saturn": "discipline, responsibility, long-term structure",
}
def _zodiac_sign(lon: float) -> str:
return _ZODIAC[int(lon / 30) % 12]
def _jd_from_date(d: date) -> float:
"""Julian Day Number for noon UTC on the given date."""
assert _SWE_AVAILABLE
return swe.julday(d.year, d.month, d.day, 12.0)
def _planet_positions(jd: float) -> dict[str, float]:
assert _SWE_AVAILABLE
positions: dict[str, float] = {}
for pid, name in _PLANETS:
result, _ = swe.calc_ut(jd, pid)
positions[name] = result[0] # ecliptic longitude
return positions
def _angular_diff(a: float, b: float) -> float:
"""Smallest angle between two ecliptic longitudes (0-180 degrees)."""
diff = abs(a - b) % 360
return diff if diff <= 180 else 360 - diff
def _find_transits(natal: dict[str, float], today: dict[str, float]) -> list[dict]:
"""Return list of active transits between today's sky and natal chart."""
transits: list[dict] = []
for t_name, t_lon in today.items():
for n_name, n_lon in natal.items():
diff = _angular_diff(t_lon, n_lon)
for angle, orb, aspect_name, nature in _ASPECTS:
if abs(diff - angle) <= orb:
transits.append({
"transit_planet": t_name,
"natal_planet": n_name,
"aspect": aspect_name,
"nature": nature,
"orb": round(abs(diff - angle), 2),
})
# Sort by tightness of orb
transits.sort(key=lambda x: x["orb"])
return transits
def _format_transit(t: dict) -> str:
tp, np, asp, nat = t["transit_planet"], t["natal_planet"], t["aspect"], t["nature"]
tp_theme = _PLANET_THEMES.get(tp, "")
np_theme = _PLANET_THEMES.get(np, "")
return (
f"Transiting {tp} ({tp_theme}) {asp} natal {np} ({np_theme}) "
f"— a {nat} influence"
)
# ---------------------------------------------------------------------------
# Manifest
# ---------------------------------------------------------------------------
MANIFEST = AgentManifest(
id="stars",
version="1.0.0",
description="Astrological transit predictions based on the user's birth date and today's planetary positions.",
pref_schema={
"type": "object",
"additionalProperties": False,
"properties": {
"birth_date": {
"type": "string",
"pattern": r"^\d{4}-\d{2}-\d{2}$",
"description": "ISO 8601 birth date (YYYY-MM-DD). Populated from connected data source.",
},
},
},
context_schema=["profile.birth_date"],
# Requires a connected Google source that supplies birth date.
# data:google-health is the current carrier; when Google profile is a
# separate consent key, add it here.
required_consents=["data:core", "data:google-health"],
output_contract={"type": "snippet", "format": "free_text"},
ttl_sec=3_600 * 6, # planetary positions change slowly — 6 h is fine
silenced_in_contexts=[],
inferred_params=[
InferredParam(
key="birth_date",
ttl_sec=365 * 86_400, # effectively permanent once known
cold_start_default=None,
min_history=999_999, # never inferred from events — sourced externally
infer=None,
),
],
)
class StarsAgent(BaseAgent):
"""Produces astrological transit predictions for the user's birth chart."""
agent_id: ClassVar[str] = MANIFEST.id
ttl_seconds: ClassVar[int] = MANIFEST.ttl_sec
version: ClassVar[str] = MANIFEST.version
def compute(self, inp: AgentInput) -> AgentOutput:
birth_date_str: str | None = inp.agent_prefs.get("birth_date")
if not birth_date_str:
prompt = (
"Birth date is not available — astrological reading skipped. "
"(Always write the tip in English.)"
)
return self._make_output(inp, prompt, {"no_birth_date": True})
if not _SWE_AVAILABLE:
prompt = (
"Astrological library unavailable — reading skipped. "
"(Always write the tip in English.)"
)
return self._make_output(inp, prompt, {"swe_unavailable": True})
try:
birth_date = date.fromisoformat(birth_date_str)
except ValueError:
prompt = "Birth date format invalid — astrological reading skipped. (Always write the tip in English.)"
return self._make_output(inp, prompt, {"invalid_birth_date": birth_date_str})
today_date = inp.now.date()
natal_jd = _jd_from_date(birth_date)
today_jd = _jd_from_date(today_date)
natal_pos = _planet_positions(natal_jd)
today_pos = _planet_positions(today_jd)
transits = _find_transits(natal_pos, today_pos)
top = transits[:3] # most exact transits only
today_sun_sign = _zodiac_sign(today_pos["Sun"])
natal_sun_sign = _zodiac_sign(natal_pos["Sun"])
natal_moon_sign = _zodiac_sign(natal_pos["Moon"])
snapshot = {
"birth_date": birth_date_str,
"today": today_date.isoformat(),
"natal_sun": natal_sun_sign,
"natal_moon": natal_moon_sign,
"today_sun": today_sun_sign,
"active_transits": transits[:5],
}
if not top:
prompt = (
f"Natal chart: Sun in {natal_sun_sign}, Moon in {natal_moon_sign}. "
f"Today's Sun is in {today_sun_sign}. "
"No exact transits today — a quiet, stable day energetically. "
"(Always write the tip in English.)"
)
else:
transit_lines = "; ".join(_format_transit(t) for t in top)
prompt = (
f"Natal chart: Sun in {natal_sun_sign}, Moon in {natal_moon_sign}. "
f"Today's Sun is in {today_sun_sign}. "
f"Active transits: {transit_lines}. "
"Use these planetary themes to colour the tip — "
"keep it grounded and actionable, not predictive or fatalistic. "
"(Always write the tip in English.)"
)
return self._make_output(inp, prompt, snapshot)
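
The aspect matching in `_find_transits` does not depend on pyswisseph once the longitudes are known, so it can be exercised with hand-picked values. The sketch below assumes hard-coded longitudes in degrees and a hypothetical `match_aspect` helper that returns the first matching aspect for one planet pair; the angles and orbs are copied from `_ASPECTS`.

```python
# (angle, orb, name), copied from _ASPECTS without the "nature" field.
ASPECTS = [
    (0.0, 8.0, "conjunction"),
    (60.0, 6.0, "sextile"),
    (90.0, 7.0, "square"),
    (120.0, 8.0, "trine"),
    (180.0, 8.0, "opposition"),
]


def angular_diff(a: float, b: float) -> float:
    """Smallest angle between two ecliptic longitudes, in the range 0-180."""
    diff = abs(a - b) % 360
    return diff if diff <= 180 else 360 - diff


def match_aspect(transit_lon: float, natal_lon: float):
    """Return (aspect_name, orb) for the first aspect within orb, else None."""
    diff = angular_diff(transit_lon, natal_lon)
    for angle, orb, name in ASPECTS:
        if abs(diff - angle) <= orb:
            return name, round(abs(diff - angle), 2)
    return None


wraparound = match_aspect(5.0, 358.0)   # 7 degrees apart across 0 degrees
square = match_aspect(100.0, 192.5)     # 92.5 degrees apart
no_aspect = match_aspect(350.0, 10.0)   # 20 degrees apart, outside every orb
```

The first case shows why the wraparound in `angular_diff` matters: 5 and 358 degrees are 7 degrees apart on the circle, which falls inside the 8 degree conjunction orb.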

ml/agents/tarot.py (new file, 110 lines)

@@ -0,0 +1,110 @@
"""TAROT agent — three-card draw (situation / action / outcome).
Draws cards deterministically from a daily seed so the reading stays
stable for the day (same cards whether the agent runs at 08:00 or 14:00).
Card meanings are precomputed here and passed as a structured snippet to
the orchestrator, which weaves them into a grounded, actionable tip.
"""
from __future__ import annotations
import hashlib
from typing import ClassVar
from .base import BaseAgent, AgentInput, AgentOutput
from .manifest import AgentManifest
# ---------------------------------------------------------------------------
# Card definitions — Major Arcana only (22 cards, indices 0-21)
# Each entry: (name, upright_meaning, action_hint)
# ---------------------------------------------------------------------------
_CARDS: list[tuple[str, str, str]] = [
("The Fool", "new beginnings, spontaneity, a leap of faith", "start something without overthinking"),
("The Magician", "skill, willpower, resourcefulness", "use what you already have"),
("The High Priestess","intuition, inner knowing, patience", "listen to what you already sense is true"),
("The Empress", "abundance, creativity, nurturing", "invest energy in something generative"),
("The Emperor", "structure, authority, discipline", "set a boundary or impose order"),
("The Hierophant", "tradition, guidance, shared values", "seek or offer mentorship"),
("The Lovers", "alignment, choice, commitment", "make a decision you have been avoiding"),
("The Chariot", "determination, focus, forward motion", "push through the resistance"),
("Strength", "inner courage, patience, gentle persistence", "stay the course with compassion"),
("The Hermit", "solitude, reflection, inner guidance", "step back and think before acting"),
("Wheel of Fortune", "cycles, turning points, inevitable change", "acknowledge what is shifting around you"),
("Justice", "fairness, truth, cause and effect", "audit a recent decision for its real consequences"),
("The Hanged Man", "pause, surrender, new perspective", "release your grip on the outcome"),
("Death", "endings, transformation, release", "let go of what no longer serves you"),
("Temperance", "balance, moderation, patience", "blend two competing demands"),
("The Devil", "attachment, habit, shadow patterns", "name a loop you are stuck in"),
("The Tower", "sudden disruption, revelation, necessary collapse", "accept the thing that already broke"),
("The Star", "hope, renewal, calm after the storm", "trust that recovery is already underway"),
("The Moon", "uncertainty, illusion, the unconscious", "sit with ambiguity rather than forcing clarity"),
("The Sun", "clarity, vitality, success", "act from your most energised self"),
("Judgement", "reflection, reckoning, a call to rise", "respond to a long-deferred summons"),
("The World", "completion, integration, a cycle closing", "acknowledge what you have finished"),
]
_POSITIONS = ("situation", "action", "outcome")
def _daily_draw(user_id: str, date_str: str) -> list[int]:
"""Return three distinct card indices seeded by (user_id, date)."""
seed = hashlib.sha256(f"{user_id}:{date_str}".encode()).digest()
indices: list[int] = []
offset = 0
while len(indices) < 3:
val = int.from_bytes(seed[offset:offset + 2], "big") % len(_CARDS)
if val not in indices:
indices.append(val)
offset = (offset + 2) % (len(seed) - 1)
return indices
MANIFEST = AgentManifest(
id="tarot",
version="1.0.0",
description="Daily three-card draw (situation/action/outcome) that frames the tip as a symbolic reflection.",
pref_schema={
"type": "object",
"additionalProperties": False,
"properties": {
"enabled": {
"type": "boolean",
"default": True,
"description": "Set false to disable the tarot agent for this user.",
},
},
},
context_schema=[],
required_consents=["data:core"],
output_contract={"type": "snippet", "format": "free_text"},
ttl_sec=3_600 * 6, # recomputed every 6 h; the daily seed keeps the cards identical within a day
silenced_in_contexts=[],
inferred_params=[],
)
class TarotAgent(BaseAgent):
"""Produces a three-card reading as a prompt snippet."""
agent_id: ClassVar[str] = MANIFEST.id
ttl_seconds: ClassVar[int] = MANIFEST.ttl_sec
version: ClassVar[str] = MANIFEST.version
def compute(self, inp: AgentInput) -> AgentOutput:
date_str = inp.now.strftime("%Y-%m-%d")
indices = _daily_draw(inp.user_id, date_str)
reading: list[dict] = []
parts: list[str] = [f"Today's tarot reading ({date_str}):"]
for pos, idx in zip(_POSITIONS, indices):
name, meaning, hint = _CARDS[idx]
reading.append({"position": pos, "card": name, "meaning": meaning, "hint": hint})
parts.append(f" {pos.capitalize()} - {name}: {meaning}. Hint: {hint}.")
parts.append(
"Weave these symbolic themes lightly into the tip — "
"ground them in practical, specific action. "
"Do not explain the cards; let their meaning shape the advice."
)
prompt = "\n".join(parts)
snapshot = {"date": date_str, "reading": reading}
return self._make_output(inp, prompt, snapshot)
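
The daily seed is what keeps the reading stable across recomputations within a day. A minimal sketch of the same draw logic follows, with the deck size passed in as a parameter (an assumption made so the block is self-contained) and an invented user id and dates.

```python
import hashlib


def daily_draw(user_id: str, date_str: str, deck_size: int = 22, count: int = 3):
    """Return `count` distinct card indices seeded by (user_id, date)."""
    seed = hashlib.sha256(f"{user_id}:{date_str}".encode()).digest()
    indices = []
    offset = 0
    while len(indices) < count:
        # Read two bytes of the digest and map them onto the deck.
        val = int.from_bytes(seed[offset:offset + 2], "big") % deck_size
        if val not in indices:
            indices.append(val)
        # Walk through the digest, wrapping before the final byte.
        offset = (offset + 2) % (len(seed) - 1)
    return indices


morning = daily_draw("user-1", "2026-05-01")
afternoon = daily_draw("user-1", "2026-05-01")
```

Because the seed depends only on the user id and the date string, a morning and an afternoon run produce the same three indices; a different date or user yields a different digest and usually a different draw.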


@@ -13,6 +13,8 @@ from ml.agents.momentum import MomentumAgent
from ml.agents.time_of_day import TimeOfDayAgent
from ml.agents.recent_patterns import RecentPatternsAgent
from ml.agents.focus_area import FocusAreaAgent
from ml.agents.tarot import TarotAgent, _daily_draw, _CARDS, _POSITIONS
from ml.agents.stars import StarsAgent, _SWE_AVAILABLE
from ml.agents.registry import get_agent, all_agents
_NOW = datetime(2026, 5, 1, 9, 0, 0, tzinfo=timezone.utc) # Friday 09:00 UTC
@@ -213,40 +215,130 @@ class TestFocusAreaAgent:
out = self.agent.compute(_inp())
assert "no tasks" in out.prompt_text.lower()
def test_single_project(self):
tasks = [_task(f"T{i}", project_id="Work") for i in range(3)]
out = self.agent.compute(_inp(tasks=tasks))
assert '"Work"' in out.prompt_text
assert "3 tasks" in out.prompt_text
def test_most_congested_wins(self):
def test_lists_all_clusters(self):
tasks = (
[_task(f"W{i}", project_id="Work") for i in range(5)]
[_task(f"W{i}", project_id="Work") for i in range(3)]
+ [_task(f"H{i}", project_id="Home") for i in range(2)]
)
out = self.agent.compute(_inp(tasks=tasks))
assert '"Work"' in out.prompt_text
assert "Work" in out.prompt_text
assert "Home" in out.prompt_text
def test_overdue_weighting(self):
# Home has 2 tasks (1 overdue), Work has 3 non-overdue tasks
# Home score = 2+1 = 3; Work score = 3 — Home should win due to overdue weight
tasks = (
[_task("Home1", project_id="Home", is_overdue=True),
_task("Home2", project_id="Home")]
+ [_task(f"W{i}", project_id="Work") for i in range(3)]
)
def test_includes_task_titles(self):
tasks = [_task("Buy milk", project_id="Personal"), _task("Write report", project_id="Personal")]
out = self.agent.compute(_inp(tasks=tasks))
assert '"Work"' not in out.prompt_text or '"Home"' in out.prompt_text
assert '"Buy milk"' in out.prompt_text
assert '"Write report"' in out.prompt_text
def test_task_count_in_output(self):
tasks = [_task(f"T{i}", project_id="Work") for i in range(3)]
out = self.agent.compute(_inp(tasks=tasks))
assert "3 task" in out.prompt_text
def test_default_project_fallback(self):
out = self.agent.compute(_inp(tasks=[_task("No project task")]))
# Tasks without project_id fall back to a "Tasks" bucket
assert "Tasks" in out.prompt_text
def test_snapshot_keys(self):
out = self.agent.compute(_inp(tasks=[_task("T1", project_id="A")]))
assert {"top_cluster_label", "top_task_count", "top_overdue_count", "cluster_count",
"strategy", "preferred_areas"} == set(out.signals_snapshot)
public_keys = {k for k in out.signals_snapshot if not k.startswith("_")}
assert {"cluster_count", "clusters"} == public_keys
def test_snapshot_clusters_shape(self):
tasks = [_task("Buy milk", project_id="P1"), _task("Fix bug", project_id="P2")]
out = self.agent.compute(_inp(tasks=tasks))
clusters = out.signals_snapshot["clusters"]
assert isinstance(clusters, list)
assert all("label" in c and "task_count" in c and "tasks" in c for c in clusters)
# ── TarotAgent ────────────────────────────────────────────────────────────────
class TestTarotAgent:
agent = TarotAgent()
def test_basic_output(self):
out = self.agent.compute(_inp())
_check_output(out, self.agent)
assert "situation" in out.prompt_text.lower()
assert "action" in out.prompt_text.lower()
assert "outcome" in out.prompt_text.lower()
assert out.signals_snapshot["date"] == "2026-05-01"
assert len(out.signals_snapshot["reading"]) == 3
def test_three_distinct_cards(self):
out = self.agent.compute(_inp())
cards = [r["card"] for r in out.signals_snapshot["reading"]]
assert len(set(cards)) == 3
def test_positions_labelled(self):
out = self.agent.compute(_inp())
positions = [r["position"] for r in out.signals_snapshot["reading"]]
assert positions == list(_POSITIONS)
def test_daily_stability(self):
out1 = self.agent.compute(_inp(now=datetime(2026, 5, 1, 8, 0, 0, tzinfo=timezone.utc)))
out2 = self.agent.compute(_inp(now=datetime(2026, 5, 1, 20, 0, 0, tzinfo=timezone.utc)))
assert out1.signals_snapshot["reading"] == out2.signals_snapshot["reading"]
def test_different_days_different_draw(self):
out1 = self.agent.compute(_inp(now=datetime(2026, 5, 1, 9, 0, 0, tzinfo=timezone.utc)))
out2 = self.agent.compute(_inp(now=datetime(2026, 5, 2, 9, 0, 0, tzinfo=timezone.utc)))
assert out1.signals_snapshot["reading"] != out2.signals_snapshot["reading"]
def test_different_users_different_draw(self):
out1 = self.agent.compute(_inp(user_id="user-A"))
out2 = self.agent.compute(_inp(user_id="user-B"))
assert out1.signals_snapshot["reading"] != out2.signals_snapshot["reading"]
def test_daily_draw_returns_valid_indices(self):
indices = _daily_draw("u1", "2026-05-01")
assert len(indices) == 3
assert len(set(indices)) == 3
assert all(0 <= i < len(_CARDS) for i in indices)
# ── StarsAgent ────────────────────────────────────────────────────────────────
class TestStarsAgent:
agent = StarsAgent()
def test_no_birth_date(self):
out = self.agent.compute(_inp())
_check_output(out, self.agent)
assert out.signals_snapshot.get("no_birth_date") is True
assert "birth date" in out.prompt_text.lower()
@pytest.mark.skipif(not _SWE_AVAILABLE, reason="pyswisseph not installed")
def test_invalid_birth_date(self):
out = self.agent.compute(_inp(agent_prefs={"birth_date": "not-a-date"}))
_check_output(out, self.agent)
assert out.signals_snapshot.get("invalid_birth_date") == "not-a-date"
@pytest.mark.skipif(not _SWE_AVAILABLE, reason="pyswisseph not installed")
def test_with_birth_date(self):
out = self.agent.compute(_inp(agent_prefs={"birth_date": "1990-06-15"}))
_check_output(out, self.agent)
assert "natal" in out.prompt_text.lower()
assert out.signals_snapshot["birth_date"] == "1990-06-15"
assert "natal_sun" in out.signals_snapshot
assert "natal_moon" in out.signals_snapshot
@pytest.mark.skipif(not _SWE_AVAILABLE, reason="pyswisseph not installed")
def test_transit_snapshot_structure(self):
out = self.agent.compute(_inp(agent_prefs={"birth_date": "1985-03-21"}))
snap = out.signals_snapshot
assert "active_transits" in snap
for t in snap["active_transits"]:
assert {"transit_planet", "natal_planet", "aspect", "nature", "orb"} <= t.keys()
def test_swe_unavailable_path(self, monkeypatch):
import ml.agents.stars as stars_mod
monkeypatch.setattr(stars_mod, "_SWE_AVAILABLE", False)
agent = StarsAgent()
out = agent.compute(_inp(agent_prefs={"birth_date": "1990-06-15"}))
_check_output(out, agent)
assert out.signals_snapshot.get("swe_unavailable") is True
# ── Registry ─────────────────────────────────────────────────────────────────
@@ -255,7 +347,7 @@ class TestRegistry:
def test_all_agents_present(self):
agents = all_agents()
ids = {a.agent_id for a in agents}
assert ids == {"overdue-task", "momentum", "time-of-day", "recent-patterns", "focus-area"}
assert ids == {"overdue-task", "momentum", "time-of-day", "recent-patterns", "focus-area", "health-vitals", "tarot", "stars"}
def test_get_agent(self):
a = get_agent("momentum")


@@ -1,6 +1,6 @@
"""Unit tests for ml.agents.clustering (issue #97).
"""Unit tests for ml.agents.clustering (issue #97, #129).
Embedding calls are mocked so tests run without Ollama.
LLM and embedding calls are mocked so tests run without Ollama or LiteLLM.
"""
from __future__ import annotations
@@ -9,7 +9,7 @@ sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "..", ".."))
from unittest.mock import patch
from ml.agents.clustering import cluster_tasks, Cluster, _greedy_cluster, _cosine
from ml.agents.clustering import cluster_tasks, Cluster, _greedy_cluster, _cosine, _embed_batch, _enrich_batch
# ── helpers ──────────────────────────────────────────────────────────────────
@@ -82,54 +82,128 @@ class TestGreedyClustering:
assert clusters[0].label == "Write report"
# ── enrichment ───────────────────────────────────────────────────────────────
class TestEnrichBatch:
def test_falls_back_to_raw_when_no_litellm_url(self, monkeypatch):
monkeypatch.delenv("LITELLM_URL", raising=False)
result, new = _enrich_batch(["Buy milk", "Fix bug"])
assert result == ["Buy milk", "Fix bug"] and new == {}
def test_uses_description_when_litellm_available(self, monkeypatch):
monkeypatch.setenv("LITELLM_URL", "http://fake-litellm")
with patch("ml.agents.clustering._enrich_title", return_value="Expanded description."):
result, new = _enrich_batch(["Buy milk"])
assert result == ["Expanded description."]
assert len(new) == 1
def test_falls_back_to_raw_title_on_enrich_failure(self, monkeypatch):
monkeypatch.setenv("LITELLM_URL", "http://fake-litellm")
with patch("ml.agents.clustering._enrich_title", return_value=None):
result, new = _enrich_batch(["Buy milk"])
assert result == ["Buy milk"]
assert new == {} # failed enrichments are not persisted
def test_deduplicates_identical_titles(self, monkeypatch):
monkeypatch.setenv("LITELLM_URL", "http://fake-litellm")
call_count = {"n": 0}
def fake_enrich(title, url):
call_count["n"] += 1
return f"desc:{title}"
with patch("ml.agents.clustering._enrich_title", side_effect=fake_enrich):
result, new = _enrich_batch(["Buy milk", "Buy milk", "Fix bug"])
assert call_count["n"] == 2 # only 2 unique titles
assert result == ["desc:Buy milk", "desc:Buy milk", "desc:Fix bug"]
def test_uses_persistent_cache(self, monkeypatch):
monkeypatch.setenv("LITELLM_URL", "http://fake-litellm")
from ml.agents.clustering import _content_hash
h = _content_hash("Buy milk")
call_count = {"n": 0}
def fake_enrich(title, url):
call_count["n"] += 1
return "new desc"
with patch("ml.agents.clustering._enrich_title", side_effect=fake_enrich):
result, new = _enrich_batch(["Buy milk"], persistent_cache={h: "cached desc"})
assert call_count["n"] == 0 # cache hit, no LLM call
assert result == ["cached desc"]
assert new == {}
# ── cluster_tasks integration ─────────────────────────────────────────────────
class TestClusterTasks:
def test_empty_tasks(self):
result = cluster_tasks([])
assert result == []
def _no_enrich(self, titles, persistent_cache=None):
return titles, {}
def test_fallback_when_ollama_unavailable(self):
with patch("ml.agents.clustering._embed", return_value=None):
def test_empty_tasks(self):
clusters, new = cluster_tasks([])
assert clusters == [] and new == {}
def test_fallback_when_embed_unavailable(self):
with patch("ml.agents.clustering._enrich_batch", side_effect=self._no_enrich), \
patch("ml.agents.clustering._embed_batch", return_value=None):
tasks = [_task("A", "p1"), _task("B", "p2"), _task("C", "p1")]
clusters = cluster_tasks(tasks)
clusters, _ = cluster_tasks(tasks)
assert len(clusters) == 2
labels = {c.label for c in clusters}
assert "p1" in labels and "p2" in labels
def test_fallback_groups_by_project(self):
with patch("ml.agents.clustering._embed", return_value=None):
with patch("ml.agents.clustering._enrich_batch", side_effect=self._no_enrich), \
patch("ml.agents.clustering._embed_batch", return_value=None):
tasks = [_task("A", "work")] * 3 + [_task("B", "home")] * 2
clusters = cluster_tasks(tasks)
clusters, _ = cluster_tasks(tasks)
by_label = {c.label: c.task_count for c in clusters}
assert by_label["work"] == 3
assert by_label["home"] == 2
def test_tasks_without_content_go_to_other(self):
v = [1.0, 0.0]
with patch("ml.agents.clustering._embed", return_value=v):
with patch("ml.agents.clustering._enrich_batch", side_effect=self._no_enrich), \
patch("ml.agents.clustering._embed_batch", return_value=[v]):
tasks = [_task("Has content"), {"is_overdue": False}]
clusters = cluster_tasks(tasks)
clusters, _ = cluster_tasks(tasks)
labels = {c.label for c in clusters}
assert "Other tasks" in labels
def test_semantic_clustering_groups_similar(self):
v_work = [1.0, 0.0, 0.0]
v_home = [0.0, 1.0, 0.0]
side_effects = [v_work, v_work, v_home, v_home]
with patch("ml.agents.clustering._embed", side_effect=side_effects):
batch_result = [v_work, v_work, v_home, v_home]
with patch("ml.agents.clustering._enrich_batch", side_effect=self._no_enrich), \
patch("ml.agents.clustering._embed_batch", return_value=batch_result):
tasks = [
_task("Write report"),
_task("Review PR"),
_task("Buy groceries"),
_task("Cook dinner"),
]
clusters = cluster_tasks(tasks)
clusters, _ = cluster_tasks(tasks)
assert len(clusters) == 2
assert all(c.task_count == 2 for c in clusters)
def test_all_tasks_no_content_fallback_by_project(self):
tasks = [{"project_id": "p1", "is_overdue": False},
{"project_id": "p2", "is_overdue": False}]
clusters = cluster_tasks(tasks)
assert len(clusters) == 2
clusters, new = cluster_tasks(tasks)
assert len(clusters) == 2 and new == {}
def test_enrich_called_before_embed(self):
"""Verify enrichment output (not raw title) is what gets embedded."""
v = [1.0, 0.0]
captured = {}
def fake_embed(texts):
captured["texts"] = texts
return [v] * len(texts)
with patch("ml.agents.clustering._enrich_batch", return_value=(["Expanded desc."], {})), \
patch("ml.agents.clustering._embed_batch", side_effect=fake_embed):
cluster_tasks([_task("Buy milk")])
assert captured["texts"] == ["clustering: Expanded desc."]
def test_new_enrichments_returned(self):
v = [1.0, 0.0]
with patch("ml.agents.clustering._enrich_batch", return_value=(["desc"], {"abc123": "desc"})), \
patch("ml.agents.clustering._embed_batch", return_value=[v]):
_, new = cluster_tasks([_task("Buy milk")])
assert new == {"abc123": "desc"}
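
Note on the enrichment contract: the `TestEnrichBatch` cases above pin down four behaviours (one LLM call per unique title, persistent-cache hits skip the call, failed enrichments fall back to the raw title, and only fresh successes are returned for persistence). The sketch below shows one way to satisfy that contract. It is an illustration under assumptions, not the code in `ml.agents.clustering`: `enrich_batch_sketch` and `content_hash` are hypothetical names, the hash function is a guess, and the enrich callable is injected instead of reading `LITELLM_URL`.

```python
import hashlib
from typing import Callable, Optional


def content_hash(title: str) -> str:
    # Hypothetical stand-in for _content_hash; the real hash is not shown.
    return hashlib.sha256(title.encode("utf-8")).hexdigest()


def enrich_batch_sketch(
    titles: list[str],
    enrich: Callable[[str], Optional[str]],
    persistent_cache: Optional[dict[str, str]] = None,
) -> tuple[list[str], dict[str, str]]:
    """Return (texts aligned with titles, newly generated enrichments)."""
    cache = dict(persistent_cache or {})  # copy so the caller's dict is untouched
    new: dict[str, str] = {}
    failed: set[str] = set()
    out: list[str] = []
    for title in titles:
        key = content_hash(title)
        if key not in cache and key not in failed:
            desc = enrich(title)  # at most one call per unique title
            if desc is None:
                failed.add(key)  # failures are remembered but never persisted
            else:
                cache[key] = desc
                new[key] = desc
        out.append(cache.get(key, title))  # raw title when enrichment failed
    return out, new
```

Returning the new entries separately lets the caller decide how to persist them, which matches the `new_enrichments` field added to the compute response later in this diff.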


@@ -45,6 +45,7 @@ def test_manifest_required_fields(agent_id: str):
assert isinstance(m.pref_schema, dict) and m.pref_schema.get("type") == "object"
assert isinstance(m.required_consents, list) and m.required_consents
assert "data:core" in m.required_consents, "every agent should require data:core"
assert all(c.startswith("data:") for c in m.required_consents), "only data: consents allowed; agent: consents have been removed"
assert m.ttl_sec == get_agent(agent_id).ttl_seconds, "ttl divergence"


@@ -627,86 +627,37 @@ class TestTimeOfDaySnippet:
assert {"quiet_start", "quiet_end", "peak_hours", "tz"}.issubset(keys)
# ── focus-area: preferred_areas wiring ───────────────────────────────────────
# ── focus-area: cluster summary output ───────────────────────────────────────
class TestFocusAreaPreferredAreas:
class TestFocusAreaOutput:
agent = FocusAreaAgent()
def _task(self, content: str, project_id: str, is_overdue: bool = False) -> dict:
return {"id": "t1", "content": content, "is_overdue": is_overdue,
def _task(self, content: str, project_id: str) -> dict:
return {"id": "t1", "content": content, "is_overdue": False,
"task_age_days": 2.0, "priority": 1, "project_id": project_id}
def test_preferred_area_wins_tie(self):
tasks = [
self._task("Work thing", "work"),
self._task("Home thing", "home"),
]
out = self.agent.compute(_inp(tasks=tasks, agent_prefs={"preferred_areas": ["work"]}))
assert "work" in out.prompt_text
assert "matches the user's stated focus preferences" in out.prompt_text
def test_no_preferred_areas_uses_congestion_score(self):
tasks = [
self._task("W1", "work"),
self._task("H1", "home"),
self._task("H2", "home"),
]
out = self.agent.compute(_inp(tasks=tasks))
# home has more tasks → wins without any preference
assert "home" in out.prompt_text
def test_snapshot_includes_preferred_areas(self):
tasks = [self._task("T", "work")]
out = self.agent.compute(_inp(tasks=tasks, agent_prefs={"preferred_areas": ["work"]}))
assert out.signals_snapshot["preferred_areas"] == ["work"]
def test_version_bumped(self):
def test_version(self):
from ml.agents.focus_area import MANIFEST as FA_MANIFEST
assert FA_MANIFEST.version == "2.0.0"
assert FA_MANIFEST.version == "3.0.0"
def test_snapshot_uses_cluster_keys(self):
def test_all_clusters_in_output(self):
tasks = [self._task("Work thing", "work"), self._task("Home thing", "home")]
out = self.agent.compute(_inp(tasks=tasks))
assert "work" in out.prompt_text.lower()
assert "home" in out.prompt_text.lower()
def test_task_titles_in_output(self):
tasks = [self._task("Buy milk", "personal")]
out = self.agent.compute(_inp(tasks=tasks))
assert '"Buy milk"' in out.prompt_text
def test_snapshot_shape(self):
tasks = [self._task("T", "work")]
out = self.agent.compute(_inp(tasks=tasks))
assert "top_cluster_label" in out.signals_snapshot
assert "cluster_count" in out.signals_snapshot
assert "strategy" in out.signals_snapshot
public_keys = {k for k in out.signals_snapshot if not k.startswith("_")}
assert public_keys == {"cluster_count", "clusters"}
assert isinstance(out.signals_snapshot["clusters"], list)
# ── focus-area: preferred_areas inference from task_completions (#113) ────────
class TestFocusAreaPreferredAreasInference:
from ml.agents.focus_area import MANIFEST as _FA_MANIFEST
def _completion(self, project_id: str) -> TaskCompletion:
return _completion(project_id, lateness_days=0.0)
def test_cold_start_no_completions(self):
history = _history(completions=[])
def test_no_inferred_params(self):
from ml.agents.focus_area import MANIFEST as FA_MANIFEST
result = run_inference(FA_MANIFEST, history)
assert result["preferred_areas"] == []
def test_top_two_projects_returned(self):
completions = (
[_completion("p1", 0)] * 8
+ [_completion("p2", 0)] * 5
+ [_completion("p3", 0)] * 2
)
history = _history(completions=completions)
from ml.agents.focus_area import MANIFEST as FA_MANIFEST
result = run_inference(FA_MANIFEST, history)
assert result["preferred_areas"] == ["p1", "p2"]
def test_single_project_returns_one(self):
completions = [_completion("work", 0)] * 6
history = _history(completions=completions)
from ml.agents.focus_area import MANIFEST as FA_MANIFEST
result = run_inference(FA_MANIFEST, history)
assert result["preferred_areas"] == ["work"]
def test_none_project_id_ignored(self):
completions = [_completion(None, 0)] * 5 + [_completion("real", 0)] * 3
history = _history(completions=completions)
from ml.agents.focus_area import MANIFEST as FA_MANIFEST
result = run_inference(FA_MANIFEST, history)
assert result["preferred_areas"] == ["real"]
assert FA_MANIFEST.inferred_params == []


@@ -126,7 +126,7 @@ MANIFEST = AgentManifest(
},
},
context_schema=["profile.features"],
required_consents=["data:core", "agent:time-of-day"],
required_consents=["data:core"],
output_contract={"type": "snippet", "format": "free_text"},
ttl_sec=900,
inferred_params=[


@@ -14,6 +14,8 @@ Feature-spec fields (issue #61):
ttl_sec — cache lifetime in seconds; mirrors ``ttlSec`` in registry.ts.
source — where the value originates.
fallback — raw value returned when the feature is unavailable (null stored).
invalidated_by — bus event subjects that trigger recompute for the affected user;
mirrors ``invalidatedBy`` in registry.ts. Empty = TTL-only refresh.
"""
from __future__ import annotations
@@ -37,6 +39,7 @@ class ProfileFeature:
ttl_sec: int
source: str
fallback: str
invalidated_by: tuple[str, ...] = ()
PROFILE_FEATURES: tuple[ProfileFeature, ...] = (
@@ -48,6 +51,7 @@ PROFILE_FEATURES: tuple[ProfileFeature, ...] = (
ttl_sec=6 * _HOUR,
source="profile_store",
fallback="0.0",
invalidated_by=("signals.tip.feedback",),
),
ProfileFeature(
name="dismiss_rate_30d",
@@ -57,6 +61,7 @@ PROFILE_FEATURES: tuple[ProfileFeature, ...] = (
ttl_sec=6 * _HOUR,
source="profile_store",
fallback="0.0",
invalidated_by=("signals.tip.feedback",),
),
ProfileFeature(
name="mean_dwell_ms_30d",
@@ -66,6 +71,7 @@ PROFILE_FEATURES: tuple[ProfileFeature, ...] = (
ttl_sec=6 * _HOUR,
source="profile_store",
fallback="null — serving normalises to 0.0",
invalidated_by=("signals.tip.feedback",),
),
ProfileFeature(
name="preferred_hour",
@@ -75,6 +81,7 @@ PROFILE_FEATURES: tuple[ProfileFeature, ...] = (
ttl_sec=_DAY,
source="profile_store",
fallback="null — serving normalises to 0.5 (neutral alignment)",
invalidated_by=("signals.tip.feedback",),
),
ProfileFeature(
name="tip_volume_30d",
@@ -84,6 +91,7 @@ PROFILE_FEATURES: tuple[ProfileFeature, ...] = (
ttl_sec=_HOUR,
source="profile_store",
fallback="0",
invalidated_by=("signals.tip.served",),
),
)


@@ -4,6 +4,8 @@ The TS registry in services/api/src/profile/registry.ts is the source of truth.
This test checks the names listed here match the registry by reading the TS
file and grepping for `name: '...'`. Crude but cheap, and it catches the
common rename/add-without-mirror failure mode.
Also verifies invalidated_by subjects mirror the TS invalidatedBy arrays (#61).
"""
from __future__ import annotations
import re
@@ -111,3 +113,37 @@ def test_profile_feature_source_is_profile_store():
def test_profile_feature_fallback_set():
for f in PROFILE_FEATURES:
assert f.fallback, f"{f.name}: fallback must not be empty"
def _ts_registry_invalidated_by() -> dict[str, list[str]]:
"""Parse invalidatedBy arrays from registry.ts.
Extracts subjects from blocks like:
invalidatedBy: ['signals.tip.feedback'],
Returns {feature_name: [subject, ...]}; features with no invalidatedBy get [].
"""
text = REGISTRY_PATH.read_text(encoding="utf-8")
result: dict[str, list[str]] = {}
for block in re.split(r"\{", text):
name_m = re.search(r"name:\s*'([a-zA-Z0-9_]+)'", block)
if not name_m:
continue
name = name_m.group(1)
inv_m = re.search(r"invalidatedBy:\s*\[([^\]]*)\]", block)
if inv_m:
subjects = re.findall(r"'([^']+)'", inv_m.group(1))
else:
subjects = []
result[name] = subjects
return result
def test_invalidated_by_matches_ts_registry():
ts_inv = _ts_registry_invalidated_by()
for f in PROFILE_FEATURES:
assert f.name in ts_inv, f"{f.name} not found in TS registry invalidatedBy parse"
expected = tuple(sorted(ts_inv[f.name]))
actual = tuple(sorted(f.invalidated_by))
assert actual == expected, (
f"{f.name}: Python invalidated_by={actual} != TS invalidatedBy={expected}"
)
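
Note on the parser: the split-on-brace approach in `_ts_registry_invalidated_by` can be exercised against an inline sample instead of the real `registry.ts`. The sketch below reuses the same three regular expressions. `SAMPLE_TS`, `parse_invalidated_by`, and the `example_feature` entry are made up for illustration, and the real registry layout may differ.

```python
import re

# Hypothetical excerpt shaped like the registry entries described above.
SAMPLE_TS = """
export const REGISTRY = [
  { name: 'dismiss_rate_30d', ttlSec: 21600, invalidatedBy: ['signals.tip.feedback'] },
  { name: 'example_feature', ttlSec: 3600 },
];
"""


def parse_invalidated_by(text: str) -> dict[str, list[str]]:
    """Map feature name -> invalidatedBy subjects ([] when the key is absent)."""
    result: dict[str, list[str]] = {}
    for block in re.split(r"\{", text):  # one block per opening brace
        name_m = re.search(r"name:\s*'([a-zA-Z0-9_]+)'", block)
        if not name_m:
            continue  # preamble or a block without a name field
        inv_m = re.search(r"invalidatedBy:\s*\[([^\]]*)\]", block)
        subjects = re.findall(r"'([^']+)'", inv_m.group(1)) if inv_m else []
        result[name_m.group(1)] = subjects
    return result


parsed = parse_invalidated_by(SAMPLE_TS)
```

One limitation worth keeping in mind: because every `{` starts a new block, an entry that contains a nested object literal before its `invalidatedBy` key would be cut in two and the subjects would be missed.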


@@ -14,6 +14,7 @@ from __future__ import annotations
import json
import os
import sys
import time
from contextlib import asynccontextmanager
from datetime import datetime, timezone
from pathlib import Path
@@ -27,6 +28,9 @@ from fastapi import FastAPI, HTTPException, Request
from pydantic import BaseModel
from starlette.middleware.base import BaseHTTPMiddleware
import mlflow
from mlflow.entities import SpanType
import logging_config
import nats_consumer
from prompts import get_prompt, build_orchestrator_messages
@@ -79,6 +83,71 @@ LITELLM_URL = os.getenv("LITELLM_URL", "http://localhost:4000")
LITELLM_MASTER_KEY = os.getenv("LITELLM_MASTER_KEY", "sk-oo-dev")
STATE_DIR = Path(os.getenv("STATE_DIR", "/tmp/oo-serving-state"))
# ── MLflow tracing (optional) ───────────────────────────────────────────────
# Set MLFLOW_TRACKING_URI to enable. Spans are fire-and-forget; errors are
# logged at WARNING and never propagate to the caller.
# MLflow --allowed-hosts must include "mlflow" (the container DNS name) so the
# SDK can reach the server from inside other containers.
_MLFLOW_URI = os.getenv("MLFLOW_TRACKING_URI", "")
_MLFLOW_EXP = "oO/serving"
_mlflow_exp_id: str | None = None
if _MLFLOW_URI:
try:
mlflow.set_tracking_uri(_MLFLOW_URI)
_mlflow_exp_id = mlflow.set_experiment(_MLFLOW_EXP).experiment_id
except Exception as _exc:
log.warning("mlflow_init_failed", error=str(_exc))
class _NoOpSpan:
"""Returned when MLflow is disabled or span creation fails."""
def set_inputs(self, *a, **k): pass
def set_outputs(self, *a, **k): pass
def set_attribute(self, *a, **k): pass
def set_attributes(self, *a, **k): pass
def end(self, *a, **k): pass
_NOOP = _NoOpSpan()
def _start_span(name: str, span_type: str, *, parent=_NOOP, inputs=None):
"""Start an MLflow span. Returns _NOOP on failure or when tracing is off.
experiment_id is only passed for root spans (no parent) — passing it to
child spans causes the SDK to fail with '_Span has no attribute _span'.
"""
if _mlflow_exp_id is None:
return _NOOP
try:
kw: dict = {"span_type": span_type}
if isinstance(parent, _NoOpSpan):
kw["experiment_id"] = _mlflow_exp_id # root span only
else:
kw["parent_span"] = parent
if inputs is not None:
kw["inputs"] = inputs
return mlflow.start_span_no_context(name, **kw)
except Exception as exc: # noqa: BLE001
log.warning("mlflow_span_start_failed", name=name, error=str(exc))
return _NOOP
def _end_span(span, *, status: str = "OK", outputs=None, attributes: dict | None = None) -> None:
"""End a span safely, ignoring _NoOpSpan and swallowing exceptions."""
if isinstance(span, _NoOpSpan):
return
try:
if attributes:
span.set_attributes(attributes)
span.end(status=status, outputs=outputs)
except Exception as exc: # noqa: BLE001
log.warning("mlflow_span_end_failed", error=str(exc))
STATE_DIR.mkdir(parents=True, exist_ok=True)
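
Note on the tracing helpers: `_NoOpSpan`, `_start_span`, and `_end_span` follow a null-object pattern so that callers never branch on whether tracing is enabled. The sketch below isolates that pattern without importing MLflow. All names are hypothetical, and `factory` stands in for whatever creates a real span.

```python
from typing import Any, Callable, Optional


class NoOpSpan:
    """Stand-in span whose methods accept anything and do nothing."""

    def set_attributes(self, *args: Any, **kwargs: Any) -> None:
        pass

    def end(self, *args: Any, **kwargs: Any) -> None:
        pass


NOOP = NoOpSpan()


def start_span(factory: Optional[Callable[[str], Any]], name: str) -> Any:
    """Return a span from factory, or NOOP when tracing is off or fails."""
    if factory is None:
        return NOOP
    try:
        return factory(name)
    except Exception:  # tracing must never break the request path
        return NOOP


def end_span(span: Any, status: str = "OK") -> None:
    """End a span, ignoring NOOP and swallowing backend errors."""
    if isinstance(span, NoOpSpan):
        return
    try:
        span.end(status=status)
    except Exception:
        pass
```

Because every failure path returns the same `NOOP` object, request handlers can call `start_span` and `end_span` unconditionally.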
@@ -127,6 +196,12 @@ class AgentComputeRequest(BaseModel):
now_iso: Optional[str] = None # ISO 8601; defaults to utcnow
# Per-agent prefs from user_preferences (merged: user source overrides inferred).
agent_prefs: dict = {}
# Pre-fetched enrichment cache: {content_hash -> description}. Avoids re-calling
# LiteLLM for task titles already expanded in a prior compute cycle.
enrichment_cache: dict[str, str] = {}
# MD5 of sorted task contents; stored in snapshot so the next cycle can skip
# recompute when the task list hasn't changed.
task_hash: Optional[str] = None
class AgentComputeResponse(BaseModel):
@@ -137,6 +212,8 @@ class AgentComputeResponse(BaseModel):
computed_at: str
expires_at: str
agent_version: str
# New enrichments generated during this compute cycle; caller persists to DB.
new_enrichments: dict[str, str] = {}
class AgentInferRequest(BaseModel):
@@ -163,6 +240,8 @@ class RecommendRequest(BaseModel):
tasks: list[dict] = []
hour_of_day: int = 12
day_of_week: int = 0
science_destiny: int = 50 # 0=science (data-driven), 100=destiny (intuitive)
recent_tip: Optional[str] = None # content of last snoozed tip; LLM avoids repeating it
class TipResult(BaseModel):
@@ -214,6 +293,25 @@ _RETRY_SUFFIX_OBJ = (
"Reply ONLY with the JSON object — no prose, no markdown fences."
)
_RETRY_SUFFIX_SCHEMA = (
"\n\nYour previous response parsed as JSON but was missing required fields. "
'Reply ONLY with a JSON object containing "content" (non-empty string) and "kind" '
'(one of: advice, task, insight, reminder) — no prose, no markdown fences.'
)
_VALID_KINDS = {"advice", "task", "insight", "reminder"}
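def _validate_tip_item(item: object) -> str | None:
    """Return an error string if item fails schema, else None."""
    # The model may return a bare string or number; reject it here so the
    # caller's ValueError path triggers a retry instead of an AttributeError.
    if not isinstance(item, dict):
        return "response is not a JSON object"
    content = item.get("content", "")
    if not isinstance(content, str) or not content.strip():
        return "missing or empty 'content' field"
    kind = item.get("kind", "")
    if kind and (not isinstance(kind, str) or kind not in _VALID_KINDS):
        return f"invalid kind {kind!r}, must be one of {sorted(_VALID_KINDS)}"
    return None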
@app.post("/agents/{agent_id}/compute", response_model=AgentComputeResponse)
async def compute_agent(agent_id: str, req: AgentComputeRequest) -> AgentComputeResponse:
@@ -243,6 +341,7 @@ async def compute_agent(agent_id: str, req: AgentComputeRequest) -> AgentCompute
feedback_history=req.feedback_history,
now=now,
agent_prefs=req.agent_prefs,
enrichment_cache=req.enrichment_cache,
)
try:
output = agent.compute(inp)
@@ -250,7 +349,20 @@ async def compute_agent(agent_id: str, req: AgentComputeRequest) -> AgentCompute
log.error("agent_compute_failed", agent_id=agent_id, user_id=req.user_id, error=str(exc))
raise HTTPException(status_code=500, detail=f"Agent compute failed: {exc}")
if req.task_hash:
output.signals_snapshot["_task_hash"] = req.task_hash
new_enrichments: dict[str, str] = output.signals_snapshot.pop("_new_enrichments", {})
log.info("agent_computed", agent_id=agent_id, user_id=req.user_id, expires_at=output.expires_at)
span = _start_span(
f"compute:{agent_id}",
SpanType.AGENT,
inputs={"user_id": req.user_id, "agent_id": agent_id,
"task_count": len(req.tasks), "feedback_count": len(req.feedback_history)},
)
_end_span(span,
outputs={"prompt_text": output.prompt_text, "signals_snapshot": output.signals_snapshot},
attributes={"agent_version": output.agent_version, "expires_at": output.expires_at})
return AgentComputeResponse(
user_id=output.user_id,
agent_id=output.agent_id,
@@ -259,6 +371,7 @@ async def compute_agent(agent_id: str, req: AgentComputeRequest) -> AgentCompute
computed_at=output.computed_at,
expires_at=output.expires_at,
agent_version=output.agent_version,
new_enrichments=new_enrichments,
)
@@ -307,6 +420,15 @@ async def infer_agent(agent_id: str, req: AgentInferRequest) -> AgentInferRespon
history_len=len(events),
latency_ms=latency_ms,
)
span = _start_span(
f"infer:{agent_id}",
SpanType.CHAIN,
inputs={"user_id": req.user_id, "agent_id": agent_id,
"history_len": len(events), "completion_count": len(completions)},
)
_end_span(span,
outputs={"inferred_prefs": inferred},
attributes={"latency_ms": str(latency_ms), "n_params": str(len(inferred))})
return AgentInferResponse(user_id=req.user_id, agent_id=agent_id, inferred_prefs=inferred)
@@ -318,17 +440,55 @@ async def recommend(req: RecommendRequest) -> RecommendResponse:
the fresh rows from agent_outputs table (fetched by the TypeScript recommender
before calling this endpoint). Falls back to raw task context if empty.
"""
t0 = time.monotonic()
# ── root span ──────────────────────────────────────────────────────────
root = _start_span("recommend", SpanType.CHAIN, inputs={
"user_id": req.user_id,
"agent_ids": [s.agent_id for s in req.agent_outputs],
"hour_of_day": req.hour_of_day,
"day_of_week": req.day_of_week,
"science_destiny": req.science_destiny,
})
try:
# ── build_context span ─────────────────────────────────────────────
ctx_span = _start_span("build_context", SpanType.TOOL, parent=root, inputs={
"agent_count": len(req.agent_outputs),
"task_count": len(req.tasks),
"science_destiny": req.science_destiny,
})
messages = build_orchestrator_messages(
agent_outputs=[s.model_dump() for s in req.agent_outputs],
tasks=req.tasks,
hour_of_day=req.hour_of_day,
day_of_week=req.day_of_week,
science_destiny=req.science_destiny,
recent_tip=req.recent_tip,
)
_end_span(ctx_span, outputs={"message_count": len(messages)})
# ── one span per pre-computed agent snippet ────────────────────────
for snippet in req.agent_outputs:
a_span = _start_span(
f"agent:{snippet.agent_id}", SpanType.AGENT, parent=root,
inputs={"agent_id": snippet.agent_id},
)
_end_span(a_span, outputs={"prompt_text": snippet.prompt_text})
# ── LLM orchestrator span (wraps retry loop) ───────────────────────
llm_span = _start_span("llm_orchestrator", SpanType.LLM, parent=root, inputs={
"messages": messages,
"model": "tip-generator",
"temperature": 0.7,
})
headers = {"Authorization": f"Bearer {LITELLM_MASTER_KEY}"}
last_raw = ""
last_parse_error = ""
total_usage: dict = {"prompt_tokens": 0, "completion_tokens": 0}
model_used = "tip-generator"
_attempt = 0
async with httpx.AsyncClient(timeout=30.0) as client:
for _attempt in range(1 + _MAX_GENERATE_RETRIES):
@@ -339,8 +499,12 @@ async def recommend(req: RecommendRequest) -> RecommendResponse:
)
resp.raise_for_status()
except httpx.HTTPStatusError as e:
_end_span(llm_span, status="ERROR")
_end_span(root, status="ERROR")
raise HTTPException(status_code=502, detail=f"LiteLLM error: {e.response.text}")
except httpx.RequestError as e:
_end_span(llm_span, status="ERROR")
_end_span(root, status="ERROR")
raise HTTPException(status_code=503, detail=f"LiteLLM unreachable: {e}")
data = resp.json()
@@ -359,12 +523,18 @@ async def recommend(req: RecommendRequest) -> RecommendResponse:
text = text[4:]
parsed = json.loads(text)
item: dict = parsed[0] if isinstance(parsed, list) else parsed
schema_err = _validate_tip_item(item)
if schema_err:
raise ValueError(schema_err)
break
except (json.JSONDecodeError, ValueError, IndexError) as exc:
last_parse_error = str(exc)
messages.append({"role": "assistant", "content": last_raw})
messages.append({"role": "user", "content": _RETRY_SUFFIX_OBJ})
is_schema_err = not isinstance(exc, json.JSONDecodeError)
messages.append({"role": "user", "content": _RETRY_SUFFIX_SCHEMA if is_schema_err else _RETRY_SUFFIX_OBJ})
else:
_end_span(llm_span, status="ERROR")
_end_span(root, status="ERROR")
raise HTTPException(
status_code=502,
detail=f"LLM returned invalid JSON after {_MAX_GENERATE_RETRIES} retries: "
@@ -376,12 +546,19 @@ async def recommend(req: RecommendRequest) -> RecommendResponse:
content=item.get("content", ""),
rationale=item.get("rationale"),
)
log.info(
"recommend_served",
user_id=req.user_id,
agent_count=len(req.agent_outputs),
tip_id=tip.id,
)
_end_span(llm_span, outputs={"content": tip.content, "rationale": tip.rationale or ""},
attributes={
"prompt_tokens": str(total_usage["prompt_tokens"]),
"completion_tokens": str(total_usage["completion_tokens"]),
"model": model_used,
"attempts": str(_attempt + 1),
})
latency_ms = round((time.monotonic() - t0) * 1000, 1)
log.info("recommend_served", user_id=req.user_id, agent_count=len(req.agent_outputs), tip_id=tip.id)
_end_span(root, outputs={"tip_id": tip.id, "content": tip.content, "rationale": tip.rationale or ""},
attributes={"latency_ms": str(latency_ms), "agent_count": str(len(req.agent_outputs))})
return RecommendResponse(
tip=tip,
model=model_used,
@@ -389,6 +566,12 @@ async def recommend(req: RecommendRequest) -> RecommendResponse:
completion_tokens=total_usage["completion_tokens"],
)
except HTTPException:
raise
except Exception:
_end_span(root, status="ERROR")
raise
_MAX_GENERATE_RETRIES = 2
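The retry hunk above picks a different follow-up prompt depending on whether the model's reply failed to parse or parsed but failed schema validation. A minimal standalone sketch of that for/else pattern follows. The retry wording, the `validate_tip_item` rules, and the `call_llm` callable are all hypothetical stand-ins, and running `MAX_RETRIES + 1` attempts is an assumption about how the real loop is bounded.

```python
import json

MAX_RETRIES = 2  # mirrors _MAX_GENERATE_RETRIES in the diff
RETRY_JSON = "Reply with a single valid JSON object."           # hypothetical wording
RETRY_SCHEMA = "The JSON parsed but required keys are missing."  # hypothetical wording


def validate_tip_item(item):
    """Return an error string, or None when the item looks like a tip (assumed rules)."""
    if not isinstance(item, dict):
        return "tip must be a JSON object"
    missing = [k for k in ("id", "content") if k not in item]
    return f"missing keys: {missing}" if missing else None


def generate_tip(call_llm, messages):
    """Call the model until it returns a valid tip; return (item, attempts)."""
    last_error = ""
    for attempt in range(MAX_RETRIES + 1):
        raw = call_llm(messages)
        try:
            parsed = json.loads(raw)
            item = parsed[0] if isinstance(parsed, list) else parsed
            err = validate_tip_item(item)
            if err:
                raise ValueError(err)
            break
        except (json.JSONDecodeError, ValueError, IndexError) as exc:
            last_error = str(exc)
            # JSONDecodeError subclasses ValueError, so test for it explicitly.
            is_schema_err = not isinstance(exc, json.JSONDecodeError)
            messages.append({"role": "assistant", "content": raw})
            messages.append(
                {"role": "user", "content": RETRY_SCHEMA if is_schema_err else RETRY_JSON}
            )
    else:
        # Runs only when the loop finished without hitting break.
        raise RuntimeError(f"invalid output after {MAX_RETRIES} retries: {last_error}")
    return item, attempt + 1
```

Because `json.JSONDecodeError` is a subclass of `ValueError`, the `isinstance` check is what separates "not JSON" from "JSON with the wrong shape"; an empty list reply lands in the schema branch via `IndexError`.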

ml/serving/mlflow_client.py (new file, 201 lines)

@@ -0,0 +1,201 @@
"""Thin MLflow REST wrapper.
Why not the official ``mlflow`` SDK? Two reasons specific to the oO setup:
1. The MLflow server (3.11) ships with ``--allowed-hosts localhost`` but
curl / requests / urllib3 send ``Host: localhost:5000`` — the port
suffix fails the DNS-rebinding check. We override the Host header per
request, which the SDK doesn't expose.
2. The collect/judge phases only need ~6 endpoints (create/search/log).
Pulling a 200MB SDK transitively for that is excess weight.
All calls are synchronous httpx with explicit ``Host`` so the script can
run from the host shell or from inside docker without further config.
"""
from __future__ import annotations
import os
import time
from dataclasses import dataclass
from typing import Any
import httpx
def _strip_path(uri: str) -> tuple[str, str]:
"""Return (origin, path_prefix) — handles both /mlflow and / roots.
``http://mlflow:5000/mlflow`` → ("http://mlflow:5000", "/mlflow")
``http://localhost:5000`` → ("http://localhost:5000", "")
"""
uri = uri.rstrip("/")
if "/" not in uri.split("://", 1)[1]:
return uri, ""
scheme_host, _, rest = uri.partition("://")
host, _, path = rest.partition("/")
return f"{scheme_host}://{host}", "/" + path if path else ""
@dataclass
class MLflowClient:
tracking_uri: str
username: str | None = None
password: str | None = None
host_header: str | None = None # override for DNS-rebinding sidestep
timeout: float = 30.0
def __post_init__(self) -> None:
self._origin, self._ui_prefix = _strip_path(self.tracking_uri)
# MLflow 3.x exposes the REST API at the root, *not* under the
# ``/mlflow`` UI prefix. Empirically verified against the running
# ghcr.io/mlflow/mlflow:v3.11.1 container.
self._api = f"{self._origin}/api/2.0/mlflow"
self._auth = (self.username, self.password) if self.username else None
# If user did not pass a host header, derive from origin. Strip
# the port if present — the server's allowed-hosts check rejects
# ``localhost:5000`` even when ``localhost`` is allowed.
if self.host_header is None:
host = self._origin.split("://", 1)[1]
self.host_header = host.split(":", 1)[0]
@classmethod
def from_env(cls) -> "MLflowClient":
return cls(
tracking_uri=os.environ.get("MLFLOW_TRACKING_URI", "http://localhost:5000"),
username=os.environ.get("MLFLOW_TRACKING_USERNAME") or "admin",
password=os.environ.get("MLFLOW_TRACKING_PASSWORD") or "password",
host_header=os.environ.get("MLFLOW_HOST_HEADER"),
)
def _headers(self) -> dict[str, str]:
return {"Host": self.host_header or "localhost"}
def _post(self, path: str, body: dict) -> dict:
with httpx.Client(trust_env=False, timeout=self.timeout) as c:
r = c.post(f"{self._api}{path}", json=body, headers=self._headers(), auth=self._auth)
r.raise_for_status()
return r.json()
def _get(self, path: str, params: dict | None = None) -> dict:
with httpx.Client(trust_env=False, timeout=self.timeout) as c:
r = c.get(f"{self._api}{path}", params=params or {}, headers=self._headers(), auth=self._auth)
r.raise_for_status()
return r.json()
# ── Experiments ────────────────────────────────────────────────────
def get_or_create_experiment(self, name: str) -> str:
try:
r = self._get("/experiments/get-by-name", {"experiment_name": name})
return r["experiment"]["experiment_id"]
except httpx.HTTPStatusError as e:
if e.response.status_code not in (404, 400):
raise
r = self._post("/experiments/create", {"name": name})
return r["experiment_id"]
# ── Runs ───────────────────────────────────────────────────────────
def create_run(
self,
experiment_id: str,
run_name: str,
tags: dict[str, str] | None = None,
) -> str:
body: dict[str, Any] = {
"experiment_id": experiment_id,
"start_time": int(time.time() * 1000),
"run_name": run_name,
"tags": [
{"key": k, "value": str(v)}
for k, v in (tags or {}).items()
],
}
r = self._post("/runs/create", body)
return r["run"]["info"]["run_id"]
def log_param(self, run_id: str, key: str, value: Any) -> None:
self._post("/runs/log-parameter", {"run_id": run_id, "key": key, "value": str(value)})
def log_params(self, run_id: str, params: dict[str, Any]) -> None:
for k, v in params.items():
self.log_param(run_id, k, v)
def log_metric(self, run_id: str, key: str, value: float, step: int = 0) -> None:
self._post("/runs/log-metric", {
"run_id": run_id,
"key": key,
"value": float(value),
"timestamp": int(time.time() * 1000),
"step": step,
})
def log_metrics(self, run_id: str, metrics: dict[str, float]) -> None:
for k, v in metrics.items():
self.log_metric(run_id, k, v)
def set_tag(self, run_id: str, key: str, value: str) -> None:
self._post("/runs/set-tag", {"run_id": run_id, "key": key, "value": str(value)})
def set_tags(self, run_id: str, tags: dict[str, str]) -> None:
for k, v in tags.items():
self.set_tag(run_id, k, v)
# MLflow tag values are capped at 5000 chars by the server (RESOURCE_DOES_NOT_EXIST
# below that, INVALID_PARAMETER_VALUE above). 4500 leaves headroom for
# internal metadata MLflow may append on its own.
_TAG_VALUE_LIMIT = 4500
def log_text(self, run_id: str, text: str, artifact_path: str) -> None:
"""Persist short text alongside the run.
The MLflow server in this deployment uses a ``file://`` artifact
backend, which is only reachable from inside the container — not
via the REST proxy. We instead stash short payloads as tags
keyed ``artifact:<path>``. Anything longer than 4500 chars is
chunked into ``artifact:<path>:0``, ``:1`` …; ``get_artifact_text``
re-stitches them in order.
"""
key_base = f"artifact:{artifact_path}"
if len(text) <= self._TAG_VALUE_LIMIT:
self.set_tag(run_id, key_base, text)
return
# chunk
for i in range(0, len(text), self._TAG_VALUE_LIMIT):
self.set_tag(run_id, f"{key_base}:{i // self._TAG_VALUE_LIMIT}",
text[i:i + self._TAG_VALUE_LIMIT])
def get_artifact_text(self, run_id: str, artifact_path: str) -> str:
run = self._get("/runs/get", {"run_id": run_id})["run"]
tags = {t["key"]: t["value"] for t in run["data"].get("tags", [])}
key_base = f"artifact:{artifact_path}"
if key_base in tags:
return tags[key_base]
# chunked form
chunks = sorted(
(k for k in tags if k.startswith(f"{key_base}:")),
key=lambda k: int(k.rsplit(":", 1)[1]),
)
return "".join(tags[k] for k in chunks)
def end_run(self, run_id: str, status: str = "FINISHED") -> None:
self._post("/runs/update", {
"run_id": run_id,
"status": status,
"end_time": int(time.time() * 1000),
})
def search_runs(
self,
experiment_id: str,
filter_string: str = "",
max_results: int = 1000,
) -> list[dict]:
body = {
"experiment_ids": [experiment_id],
"filter": filter_string,
"max_results": max_results,
}
r = self._post("/runs/search", body)
return r.get("runs", [])
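The `log_text` / `get_artifact_text` pair above stores long text as numbered tag chunks and stitches them back together in numeric order. The round trip can be sketched without a server by using a plain dict in place of the run's tags. The helper names `text_to_tags` and `tags_to_text` are hypothetical, and no MLflow call is made here.

```python
TAG_VALUE_LIMIT = 4500  # same headroom value as _TAG_VALUE_LIMIT in the diff


def text_to_tags(text, artifact_path, limit=TAG_VALUE_LIMIT):
    """Return the tag dict that log_text would write for this text."""
    key_base = f"artifact:{artifact_path}"
    if len(text) <= limit:
        return {key_base: text}
    return {
        f"{key_base}:{i // limit}": text[i:i + limit]
        for i in range(0, len(text), limit)
    }


def tags_to_text(tags, artifact_path):
    """Rebuild the text from either the single-tag or the chunked form."""
    key_base = f"artifact:{artifact_path}"
    if key_base in tags:
        return tags[key_base]
    prefix = f"{key_base}:"
    chunk_keys = sorted(
        (k for k in tags if k.startswith(prefix)),
        # Sort by the numeric suffix so chunk 10 follows chunk 9, not chunk 1.
        key=lambda k: int(k.rsplit(":", 1)[1]),
    )
    return "".join(tags[k] for k in chunk_keys)
```

Sorting on the integer suffix matters once a payload needs ten or more chunks; a plain string sort would place `:10` before `:2`. One caveat that applies to the original as well: an artifact path that itself contains a colon could collide with the chunk prefix of another path.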


@@ -116,6 +116,7 @@ _SYS_V4_ORCHESTRATOR = (
"Multiple specialized agents have analyzed the user's current context and provided "
"their insights below. Synthesize their combined perspective to generate exactly ONE "
"tip that is specific, actionable, and relevant right now. "
"Always respond in English regardless of the language of task content. "
"Respond ONLY with a JSON object with keys: "
'"id" (short slug), "content" (the tip, ≤2 sentences), '
'"rationale" (why now, ≤1 sentence). '
@@ -123,18 +124,58 @@ _SYS_V4_ORCHESTRATOR = (
)
def _science_destiny_instruction(science_destiny: int) -> str:
"""Translate 0-100 slider into a prompt instruction.
0 = pure science: prioritise patterns, data, measurable progress.
100 = pure destiny: prioritise meaning, intuition, deeper purpose.
50 = balanced (no extra instruction injected).
"""
if science_destiny <= 20:
return (
"The user strongly prefers data-driven advice. "
"Ground every tip in observable patterns, streaks, or measurable progress. "
"Avoid abstract or motivational language."
)
if science_destiny <= 40:
return (
"The user leans toward evidence-based guidance. "
"Anchor tips in patterns and metrics where possible."
)
if science_destiny >= 80:
return (
"The user strongly believes in intuition and meaning. "
"Frame tips around purpose, values, and deeper intention rather than metrics."
)
if science_destiny >= 60:
return (
"The user leans toward intuitive, meaning-driven advice. "
"Weave in purpose and intention alongside practicality."
)
return "" # balanced — no extra instruction
def build_orchestrator_messages(
agent_outputs: list[dict],
tasks: list[dict],
hour_of_day: int,
day_of_week: int,
science_destiny: int = 50,
recent_tip: str | None = None,
) -> list[dict]:
"""Build the [system, user] message list for the orchestrator LLM call.
agent_outputs: list of {agent_id, prompt_text} dicts.
Falls back to raw task summary when agent_outputs is empty.
recent_tip: content of a tip the user just snoozed — generate something different.
"""
style_hint = _science_destiny_instruction(science_destiny)
system = _SYS_V4_ORCHESTRATOR + (f"\n\n{style_hint}" if style_hint else "")
lines = [f"Current time: {hour_of_day:02d}:00, day_of_week={day_of_week}", ""]
if recent_tip:
lines.append(f"The user snoozed this tip (do NOT repeat it or anything similar): \"{recent_tip}\"")
lines.append("")
if agent_outputs:
lines.append("Context from analysis agents:")
for s in agent_outputs:
@@ -147,9 +188,9 @@ def build_orchestrator_messages(
)
for t in tasks[:3]:
lines.append(f" - {t.get('content', '?')}")
lines.append("\nGenerate one tip as a JSON object.")
lines.append("\nGenerate one tip as a JSON object. Write the tip content in English only.")
return [
{"role": "system", "content": _SYS_V4_ORCHESTRATOR},
{"role": "system", "content": system},
{"role": "user", "content": "\n".join(lines)},
]


@@ -7,3 +7,5 @@ anthropic>=0.40.0
nats-py>=2.9.0
structlog>=24.1.0
sentry-sdk>=2.0.0
mlflow-skinny>=3.1.0
pyswisseph>=2.10.3.2


@@ -1,4 +1,4 @@
export type IntegrationProvider = 'todoist';
export type IntegrationProvider = 'todoist' | 'google-health';
export type IntegrationStatus = 'connected' | 'disconnected' | 'error';
export interface Integration {


@@ -2,7 +2,7 @@
export interface Signal {
id: string;
source: string; // e.g. 'todoist', 'google-calendar', 'manual'
kind: 'task' | 'event' | 'habit' | 'insight';
kind: 'task' | 'event' | 'habit' | 'insight' | 'health';
content: string;
metadata: Record<string, unknown>; // source-specific raw fields
features: Record<string, number | boolean>; // bandit-ready numeric/boolean features


@@ -2,7 +2,7 @@
export type TipKind = 'task' | 'advice' | 'insight' | 'reminder';
/** Where the tip content originated */
export type TipSource = 'todoist' | 'llm' | 'advice';
export type TipSource = 'todoist' | 'llm' | 'advice' | 'fallback';
/** A single recommendation surfaced to the user */
export interface Tip {


@@ -85,3 +85,45 @@ describe('runMigrations — idempotency', () => {
});
});
describe('runMigrations — issue #127 backfill', () => {
it('grants data:<provider> consent for existing active integration tokens', () => {
const sqlite = freshDb();
runMigrations(sqlite);
// Seed a user + active Todoist token (simulates pre-#127 state)
sqlite.exec(`
INSERT INTO users (id, email, role, created_at) VALUES ('u2', 'u2@test.com', 'user', '2026-01-01T00:00:00Z');
INSERT INTO user_consents (user_id, consent_key, granted_at) VALUES ('u2', 'data:core', '2026-01-01T00:00:00Z');
INSERT INTO integration_tokens (id, user_id, provider, access_token, token_status, connected_at)
VALUES ('tok1', 'u2', 'todoist', 'secret', 'active', '2026-01-02T00:00:00Z');
`);
// Re-run migrations — the backfill should insert data:todoist
runMigrations(sqlite);
const rows = sqlite
.prepare(`SELECT consent_key FROM user_consents WHERE user_id = 'u2' ORDER BY consent_key`)
.all() as { consent_key: string }[];
expect(rows.map((r) => r.consent_key)).toEqual(['data:core', 'data:todoist']);
});
it('is idempotent — running twice does not duplicate consent rows', () => {
const sqlite = freshDb();
runMigrations(sqlite);
sqlite.exec(`
INSERT INTO users (id, email, role, created_at) VALUES ('u3', 'u3@test.com', 'user', '2026-01-01T00:00:00Z');
INSERT INTO integration_tokens (id, user_id, provider, access_token, token_status, connected_at)
VALUES ('tok2', 'u3', 'todoist', 'secret', 'active', '2026-01-02T00:00:00Z');
`);
runMigrations(sqlite);
runMigrations(sqlite);
const count = (sqlite
.prepare(`SELECT COUNT(*) as n FROM user_consents WHERE user_id = 'u3' AND consent_key = 'data:todoist'`)
.get() as { n: number }).n;
expect(count).toBe(1);
});
});
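The idempotency these tests check comes from `INSERT OR IGNORE` combined with a uniqueness constraint on the consent key. A small sketch with Python's built-in `sqlite3` shows the same behaviour. The table definitions are trimmed-down assumptions (in particular the composite primary key on `user_consents`), and the `'revoked'` status value is hypothetical.

```python
import sqlite3

# Same statement shape as the backfill in runMigrations.
BACKFILL_SQL = """
INSERT OR IGNORE INTO user_consents (user_id, consent_key, granted_at)
SELECT user_id, 'data:' || provider, connected_at
FROM integration_tokens
WHERE token_status = 'active'
"""


def make_db():
    """Create an in-memory database with simplified, assumed table shapes."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE user_consents (
            user_id     TEXT NOT NULL,
            consent_key TEXT NOT NULL,
            granted_at  TEXT NOT NULL,
            PRIMARY KEY (user_id, consent_key)
        );
        CREATE TABLE integration_tokens (
            id           TEXT PRIMARY KEY,
            user_id      TEXT NOT NULL,
            provider     TEXT NOT NULL,
            token_status TEXT NOT NULL,
            connected_at TEXT NOT NULL
        );
        """
    )
    return db


def run_backfill(db):
    db.execute(BACKFILL_SQL)
    db.commit()


def consent_keys(db, user_id):
    rows = db.execute(
        "SELECT consent_key FROM user_consents WHERE user_id = ? ORDER BY consent_key",
        (user_id,),
    )
    return [row[0] for row in rows]
```

Without the primary key (or an equivalent unique index) on `(user_id, consent_key)`, `INSERT OR IGNORE` has nothing to conflict with and each migration run would add another row.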


@@ -149,6 +149,13 @@ export function runMigrations(handle: BetterSqlite3Database) {
CREATE INDEX IF NOT EXISTS idx_agent_outputs_user_agent_exp
ON agent_outputs(user_id, agent_id, expires_at DESC);
CREATE TABLE IF NOT EXISTS task_enrichments (
content_hash TEXT PRIMARY KEY,
description TEXT NOT NULL,
model TEXT NOT NULL DEFAULT 'tip-generator',
created_at TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS user_preferences (
user_id TEXT NOT NULL REFERENCES users(id),
scope TEXT NOT NULL,
@@ -208,6 +215,15 @@ export function runMigrations(handle: BetterSqlite3Database) {
`);
} catch { /* column already dropped — nothing to backfill */ }
// Backfill (issue #127): grant data:<provider> consent for every active integration token.
// Idempotent — INSERT OR IGNORE skips rows that already exist.
handle.exec(`
INSERT OR IGNORE INTO user_consents (user_id, consent_key, granted_at)
SELECT user_id, 'data:' || provider, connected_at
FROM integration_tokens
WHERE token_status = 'active'
`);
// Drop legacy consent columns (ADR-0014 step 8). Runs after the backfill above.
// Silently skips if already dropped (column not found error) or never existed (new DB).
for (const stmt of [


@@ -189,6 +189,15 @@ export const agentOutputs = sqliteTable('agent_outputs', {
agentVersion: text('agent_version').notNull(), // bump to invalidate on logic changes
});
// Persistent cache for LLM-enriched task descriptions used by clustering.
// Keyed by MD5 of raw task content; avoids re-calling LiteLLM on every agent compute cycle.
export const taskEnrichments = sqliteTable('task_enrichments', {
contentHash: text('content_hash').primaryKey(),
description: text('description').notNull(),
model: text('model').notNull().default('tip-generator'),
createdAt: text('created_at').notNull(),
});
// Admin saved SQL queries.
export const savedQueries = sqliteTable('saved_queries', {
id: text('id').primaryKey(),


@@ -35,7 +35,7 @@ const AGENT_C = { ...MANIFEST_DEFAULTS, id: 'agent-c', required_consents: ['data
beforeAll(async () => {
await testDb.insert(users).values({
id: 'u1', email: 'u@test.com', name: null, image: null, role: 'user',
consentGiven: false, createdAt: NOW,
createdAt: NOW,
});
});


@@ -1,8 +1,10 @@
/**
* Registry-driven agent eligibility filter (ADR-0014 step 5).
* Registry-driven agent eligibility filter (ADR-0014 step 5, updated by ADR-0015).
*
* Rules (all must pass for an agent to be eligible):
* 1. All required_consents are granted and not revoked.
* 1. Every data:<source> in required_consents is granted and not revoked.
* Consent is granted automatically when the user connects that data source.
* agent:<id> consents no longer exist — per-agent control is a preference (rule 3).
* 2. No silenced_in_contexts entry matches an active context.
* 3. user_preferences[scope='agent:<id>', key='enabled'] is not false.
*


@@ -83,16 +83,17 @@ describe('POST /recommend integration', () => {
clearSignalCache?.();
});
it('returns 204 when Todoist is empty and orchestrator fails', async () => {
it('returns fallback tip when orchestrator fails', async () => {
globalThis.fetch = vi.fn().mockImplementation((url: string) => {
if (String(url).includes('todoist.com')) {
return Promise.resolve({ ok: true, status: 200, json: async () => ({ results: [] }) } as any);
}
// /recommend fails → orchestrator returns null, random fallback also empty → 204
return Promise.resolve({ ok: false, status: 503 } as any);
});
const { status } = await post(`${baseUrl}/api/recommend`);
expect(status).toBe(204);
const { status, body } = await post(`${baseUrl}/api/recommend`);
expect(status).toBe(200);
expect(body.tip.source).toBe('fallback');
expect(body.tip.rationale).toBe('AI service issues');
});
it('serves orchestrator tip and writes correct tip_scores columns', async () => {
@@ -132,7 +133,7 @@ describe('POST /recommend integration', () => {
expect(row.tipKind).toBe('advice');
});
it('falls back to random signal tip when orchestrator fails', async () => {
it('falls back to hardcoded tip when orchestrator fails', async () => {
globalThis.fetch = vi.fn().mockImplementation((url: string) => {
if (String(url).includes('todoist.com')) {
return Promise.resolve({
@@ -142,19 +143,14 @@ describe('POST /recommend integration', () => {
}),
} as any);
}
// /recommend fails → falls back to random signal candidate
return Promise.resolve({ ok: false, status: 502 } as any);
});
const { status, body } = await post(`${baseUrl}/api/recommend`);
expect(status).toBe(200);
expect(body.tip.source).toBe('todoist');
const rows = await testDb.select().from(tipScores);
const row = rows[rows.length - 1];
expect(row.policy).toBe('random');
expect(row.promptVersion).toBeNull();
expect(row.llmModel).toBeNull();
expect(body.tip.source).toBe('fallback');
expect(body.tip.rationale).toBe('AI service issues');
expect(body.tip.kind).toBe('advice');
});
it('eligibility filter: only passes consented agent outputs to ml/serving', async () => {
@@ -213,7 +209,7 @@ describe('POST /recommend integration', () => {
});
// Intercept the /recommend body to inspect what agent_outputs were sent
const origFetch = globalThis.fetch as ReturnType<typeof vi.fn>;
const origFetch = globalThis.fetch as unknown as (url: string, init?: RequestInit) => Promise<Response>;
const wrappedFetch = vi.fn().mockImplementation(async (url: string, init?: RequestInit) => {
if (String(url).includes('/recommend') && init?.body) {
const body = JSON.parse(init.body as string);


@@ -1,17 +1,19 @@
import { Router, type Request, type Response, type IRouter } from 'express';
import { nanoid } from 'nanoid';
import { db } from '../db/index.js';
import { agentOutputs, tipFeedback, tipViews, userPreferences } from '../db/schema.js';
import { eq, and, gt, lt } from 'drizzle-orm';
import { agentOutputs, tipFeedback, tipViews, userPreferences, taskEnrichments } from '../db/schema.js';
import { eq, and, gt, lt, inArray } from 'drizzle-orm';
import crypto from 'node:crypto';
import { config } from '../config.js';
import { getProfile, type Profile } from '../profile/builder.js';
import { todoistSource } from '../signals/todoist.js';
import { googleHealthSource } from '../signals/google-health.js';
import { SignalAggregator } from '../signals/aggregator.js';
const router: IRouter = Router();
// Separate aggregator instance — avoids circular dep with recommender.ts.
const _agentAggregator = new SignalAggregator().register(todoistSource);
const _agentAggregator = new SignalAggregator().register(todoistSource).register(googleHealthSource);
// ── Internal auth helper ──────────────────────────────────────────────────────
@@ -26,6 +28,33 @@ function checkInternalToken(req: Request, res: Response): boolean {
// ── DB helpers ────────────────────────────────────────────────────────────────
function contentHash(text: string): string {
return crypto.createHash('md5').update(text).digest('hex');
}
async function fetchEnrichmentCache(tasks: { content?: string }[]): Promise<Record<string, string>> {
const hashes = tasks
.map((t) => t.content?.trim())
.filter((c): c is string => !!c)
.map(contentHash);
if (!hashes.length) return {};
const rows = await db
.select({ contentHash: taskEnrichments.contentHash, description: taskEnrichments.description })
.from(taskEnrichments)
.where(inArray(taskEnrichments.contentHash, hashes));
return Object.fromEntries(rows.map((r) => [r.contentHash, r.description]));
}
async function persistEnrichments(newEntries: Record<string, string>): Promise<void> {
const now = new Date().toISOString();
for (const [hash, description] of Object.entries(newEntries)) {
await db
.insert(taskEnrichments)
.values({ contentHash: hash, description, createdAt: now })
.onConflictDoNothing();
}
}
export async function getActiveAgentOutputs(userId: string) {
const now = new Date().toISOString();
return db
@@ -126,22 +155,52 @@ async function persistInferredPrefs(
}
}
function taskListHash(tasks: { content?: string }[]): string {
const sorted = tasks
.map((t) => t.content?.trim() ?? '')
.filter(Boolean)
.sort()
.join('\n');
return crypto.createHash('md5').update(sorted).digest('hex');
}
async function isUpToDate(userId: string, agentId: string, currentHash: string): Promise<boolean> {
const rows = await db
.select({ signalsSnapshot: agentOutputs.signalsSnapshot })
.from(agentOutputs)
.where(and(eq(agentOutputs.userId, userId), eq(agentOutputs.agentId, agentId)))
.limit(1);
if (!rows.length) return false;
try {
const snapshot = JSON.parse(rows[0].signalsSnapshot ?? '{}') as { _task_hash?: string };
return snapshot._task_hash === currentHash;
} catch { return false; }
}
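`taskListHash` and `isUpToDate` together skip a compute cycle when the set of task titles has not changed. The idea is an order-independent fingerprint, sketched here in Python with `hashlib`. The function names are hypothetical ports, and the hex digests are not guaranteed to match the TypeScript ones for every input, since JavaScript's default sort compares UTF-16 code units while Python compares code points.

```python
import hashlib


def task_list_hash(tasks):
    """MD5 over trimmed, non-empty task contents, sorted so order does not matter."""
    contents = sorted(
        c for c in ((t.get("content") or "").strip() for t in tasks) if c
    )
    return hashlib.md5("\n".join(contents).encode("utf-8")).hexdigest()


def is_up_to_date(snapshot, current_hash):
    """True when the stored snapshot carries the same task hash."""
    return isinstance(snapshot, dict) and snapshot.get("_task_hash") == current_hash
```

MD5 is acceptable here because the hash is only a change detector, not a security boundary. Note that the fingerprint covers `content` alone, so a change to a feature such as priority or a health reading would not by itself trigger a recompute.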
export async function computeAndStore(userId: string, agentId: string): Promise<void> {
let tasks: object[] = [];
try {
const signals = await _agentAggregator.fetchAll(userId);
tasks = signals.map((s) => ({
id: s.id,
source: s.source,
kind: s.kind,
content: s.content,
// Task-specific fields (default to harmless values for non-task signals)
priority: (s.features.priority as number) ?? 1,
is_overdue: Boolean(s.features.is_overdue),
task_age_days: (s.features.task_age_days as number) ?? 0,
project_id: (s.metadata as Record<string, unknown>).project_id ?? null,
// All features spread so source-specific agents (e.g. health-vitals) can read them
...s.features,
}));
} catch {
// No integration or fetch error — agents that need tasks will report "no tasks"
}
const currentTaskHash = taskListHash(tasks as { content?: string }[]);
if (await isUpToDate(userId, agentId, currentTaskHash)) return;
let profile: Profile = {};
try {
profile = await getProfile(userId);
@@ -162,11 +221,14 @@ export async function computeAndStore(userId: string, agentId: string): Promise<
// Load agent prefs (user overrides + previous inferences) to inject into the compute call.
const agentPrefs = await loadAgentPrefs(userId, agentId);
// Fetch enrichment cache for task titles present in this compute call.
const enrichmentCache = await fetchEnrichmentCache(tasks as { content?: string }[]);
const mlResp = await fetch(`${config.ML_SERVING_URL}/agents/${agentId}/compute`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ user_id: userId, tasks, profile, feedback_history: feedbackHistory, agent_prefs: agentPrefs }),
signal: AbortSignal.timeout(15_000),
body: JSON.stringify({ user_id: userId, tasks, profile, feedback_history: feedbackHistory, agent_prefs: agentPrefs, enrichment_cache: enrichmentCache, task_hash: currentTaskHash }),
signal: AbortSignal.timeout(60_000),
});
if (!mlResp.ok) {
@@ -177,10 +239,16 @@ export async function computeAndStore(userId: string, agentId: string): Promise<
const output = await mlResp.json() as {
user_id: string; agent_id: string; prompt_text: string;
signals_snapshot: unknown; computed_at: string; expires_at: string; agent_version: string;
new_enrichments?: Record<string, string>;
};
await storeAgentOutput(output);
// Persist any new enrichments produced during this compute cycle.
if (output.new_enrichments && Object.keys(output.new_enrichments).length > 0) {
await persistEnrichments(output.new_enrichments);
}
// Run inference framework for this agent and persist results.
// Failures are non-fatal — the compute result is already stored.
try {


@@ -1,7 +1,7 @@
import { type Router as ExpressRouter, Router, Request, Response } from 'express';
import { nanoid } from 'nanoid';
import { db } from '../db/index.js';
import { integrationTokens } from '../db/schema.js';
import { integrationTokens, userConsents } from '../db/schema.js';
import { eq, and } from 'drizzle-orm';
import { config } from '../config.js';
import { requireAuth, AuthenticatedRequest } from '../middleware/session.js';
@@ -12,9 +12,41 @@ const TODOIST_OAUTH_URL = 'https://todoist.com/oauth/authorize';
const TODOIST_TOKEN_URL = 'https://todoist.com/oauth/access_token';
const TODOIST_SCOPES = 'data:read_write';
const GOOGLE_AUTH_URL = 'https://accounts.google.com/o/oauth2/v2/auth';
const GOOGLE_TOKEN_URL = 'https://oauth2.googleapis.com/token';
const GOOGLE_REVOKE_URL = 'https://oauth2.googleapis.com/revoke';
const GOOGLE_HEALTH_SCOPES = [
'https://www.googleapis.com/auth/googlehealth.activity_and_fitness.readonly',
'https://www.googleapis.com/auth/googlehealth.health_metrics_and_measurements.readonly',
'https://www.googleapis.com/auth/googlehealth.sleep.readonly',
].join(' ');
// In-memory CSRF state store
const pendingStates = new Map<string, { userId: string; redirectTo: string }>();
async function grantDataSourceConsent(userId: string, provider: string): Promise<void> {
const consentKey = `data:${provider}`;
const now = new Date().toISOString();
await db.insert(userConsents)
.values({ userId, consentKey, grantedAt: now, revokedAt: null })
.onConflictDoUpdate({
target: [userConsents.userId, userConsents.consentKey],
set: { grantedAt: now, revokedAt: null },
});
}
async function revokeDataSourceConsent(userId: string, provider: string): Promise<void> {
const consentKey = `data:${provider}`;
const now = new Date().toISOString();
await db.insert(userConsents)
.values({ userId, consentKey, grantedAt: now, revokedAt: now })
.onConflictDoUpdate({
target: [userConsents.userId, userConsents.consentKey],
set: { revokedAt: now },
});
}
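The two helpers above are upserts on the `(userId, consentKey)` pair: granting resets `revokedAt` to null, while revoking only stamps `revokedAt` and leaves the original grant time alone. A sketch of the same semantics in raw SQL through Python's `sqlite3` follows. The table shape and the function names are assumptions, and the `ON CONFLICT ... DO UPDATE` syntax needs SQLite 3.24 or newer.

```python
import sqlite3


def make_consent_db():
    """In-memory table with an assumed shape and a composite primary key."""
    db = sqlite3.connect(":memory:")
    db.execute(
        """
        CREATE TABLE user_consents (
            user_id     TEXT NOT NULL,
            consent_key TEXT NOT NULL,
            granted_at  TEXT NOT NULL,
            revoked_at  TEXT,
            PRIMARY KEY (user_id, consent_key)
        )
        """
    )
    return db


def grant_consent(db, user_id, provider, now):
    """Insert the consent, or refresh granted_at and clear any revocation."""
    db.execute(
        """
        INSERT INTO user_consents (user_id, consent_key, granted_at, revoked_at)
        VALUES (?, ?, ?, NULL)
        ON CONFLICT (user_id, consent_key)
        DO UPDATE SET granted_at = excluded.granted_at, revoked_at = NULL
        """,
        (user_id, f"data:{provider}", now),
    )


def revoke_consent(db, user_id, provider, now):
    """Mark the consent revoked; an existing granted_at is left untouched."""
    db.execute(
        """
        INSERT INTO user_consents (user_id, consent_key, granted_at, revoked_at)
        VALUES (?, ?, ?, ?)
        ON CONFLICT (user_id, consent_key)
        DO UPDATE SET revoked_at = excluded.revoked_at
        """,
        (user_id, f"data:{provider}", now, now),
    )
```

Keeping a revoked row instead of deleting it preserves the history of when access was granted and withdrawn, which is what lets a reconnect simply clear `revoked_at`.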
/** GET /api/integrations — list connected integrations */
router.get('/', requireAuth, async (req: AuthenticatedRequest, res: Response) => {
const tokens = await db
@@ -100,10 +132,102 @@ router.get('/todoist/callback', async (req: Request, res: Response) => {
tokenStatus: 'active',
connectedAt: now,
});
await grantDataSourceConsent(pending.userId, 'todoist');
res.redirect(`${config.WEB_BASE_URL}${pending.redirectTo}?connected=todoist`);
});
/** GET /api/integrations/google-health/connect — start Google Health OAuth */
router.get('/google-health/connect', requireAuth, (req: AuthenticatedRequest, res: Response) => {
const state = nanoid();
pendingStates.set(state, {
userId: req.userId!,
redirectTo: (req.query.redirectTo as string) ?? '/connect',
});
setTimeout(() => pendingStates.delete(state), 10 * 60 * 1000);
const url = new URL(GOOGLE_AUTH_URL);
url.searchParams.set('client_id', config.GOOGLE_CLIENT_ID);
url.searchParams.set('redirect_uri', `${config.API_BASE_URL}/api/integrations/google-health/callback`);
url.searchParams.set('response_type', 'code');
url.searchParams.set('scope', GOOGLE_HEALTH_SCOPES);
url.searchParams.set('state', state);
url.searchParams.set('access_type', 'offline');
url.searchParams.set('prompt', 'consent');
res.redirect(url.toString());
});
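The connect route above does two things: it records a single-use `state` value that expires after ten minutes, and it builds the Google authorization URL from the query parameters shown. Both can be sketched in Python with the standard library. The `PendingStates` class and `build_auth_url` are hypothetical names, `secrets.token_urlsafe` stands in for `nanoid`, and expiry is checked lazily on lookup instead of with a timer as in the route.

```python
import secrets
import time
from urllib.parse import urlencode

GOOGLE_AUTH_URL = "https://accounts.google.com/o/oauth2/v2/auth"
STATE_TTL_SECONDS = 10 * 60  # same ten-minute window as the setTimeout in the route


class PendingStates:
    """In-memory CSRF state store with lazy expiry (a sketch, not the route's code)."""

    def __init__(self, ttl=STATE_TTL_SECONDS, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock
        self._items = {}

    def issue(self, user_id, redirect_to):
        state = secrets.token_urlsafe(16)
        self._items[state] = (self._clock() + self._ttl, user_id, redirect_to)
        return state

    def consume(self, state):
        """Return (user_id, redirect_to) once, or None if unknown or expired."""
        entry = self._items.pop(state, None)
        if entry is None or entry[0] < self._clock():
            return None
        return entry[1], entry[2]


def build_auth_url(client_id, redirect_uri, scopes, state):
    """Assemble the authorization URL with the parameters used by the connect route."""
    query = urlencode(
        {
            "client_id": client_id,
            "redirect_uri": redirect_uri,
            "response_type": "code",
            "scope": " ".join(scopes),
            "state": state,
            "access_type": "offline",
            "prompt": "consent",
        }
    )
    return f"{GOOGLE_AUTH_URL}?{query}"
```

Like the route's `Map`, this store lives in process memory, so pending states are lost on restart and are not shared between multiple API instances.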
/** GET /api/integrations/google-health/callback — Google returns here */
router.get('/google-health/callback', async (req: Request, res: Response) => {
const state = req.query.state as string;
const code = req.query.code as string;
const error = req.query.error as string | undefined;
if (error) {
res.status(400).json({ error: `Google denied access: ${error}` });
return;
}
const pending = pendingStates.get(state);
if (!pending) {
res.status(400).json({ error: 'Invalid or expired state' });
return;
}
pendingStates.delete(state);
const body = new URLSearchParams({
client_id: config.GOOGLE_CLIENT_ID,
client_secret: config.GOOGLE_CLIENT_SECRET,
code,
grant_type: 'authorization_code',
redirect_uri: `${config.API_BASE_URL}/api/integrations/google-health/callback`,
});
const tokenRes = await fetch(GOOGLE_TOKEN_URL, {
method: 'POST',
headers: { 'Content-Type': 'application/x-www-form-urlencoded', Accept: 'application/json' },
body: body.toString(),
});
if (!tokenRes.ok) {
const detail = await tokenRes.text().catch(() => '');
res.status(502).json({ error: `Failed to exchange Google token: ${detail}` });
return;
}
const tokenData = (await tokenRes.json()) as {
access_token: string;
refresh_token?: string;
expires_in: number;
};
const now = new Date();
const expiresAt = new Date(now.getTime() + tokenData.expires_in * 1000).toISOString();
await db
.delete(integrationTokens)
.where(
and(
eq(integrationTokens.userId, pending.userId),
eq(integrationTokens.provider, 'google-health'),
),
);
await db.insert(integrationTokens).values({
id: nanoid(),
userId: pending.userId,
provider: 'google-health',
accessToken: tokenData.access_token,
refreshToken: tokenData.refresh_token ?? null,
expiresAt,
tokenStatus: 'active',
connectedAt: now.toISOString(),
});
await grantDataSourceConsent(pending.userId, 'google-health');
res.redirect(`${config.WEB_BASE_URL}${pending.redirectTo}?connected=google-health`);
});
/** DELETE /api/integrations/:provider — revoke token */
router.delete('/:provider', requireAuth, async (req: AuthenticatedRequest, res: Response) => {
const provider = String(req.params.provider);
@@ -120,13 +244,18 @@ router.delete('/:provider', requireAuth, async (req: AuthenticatedRequest, res:
.limit(1);
if (token?.provider === 'todoist') {
// Best-effort revocation
await fetch('https://api.todoist.com/sync/v9/access_tokens/revoke', {
method: 'POST',
headers: { Authorization: `Bearer ${token.accessToken}` },
}).catch(() => {});
}
if (token?.provider === 'google-health') {
// Best-effort revocation; encode the token so it is safe as a query parameter
await fetch(`${GOOGLE_REVOKE_URL}?token=${encodeURIComponent(token.accessToken)}`, { method: 'POST' }).catch(() => {});
}
await revokeDataSourceConsent(req.userId!, provider);
await db
.delete(integrationTokens)
.where(


@@ -2,14 +2,15 @@ import { type Router as ExpressRouter, Router, Response } from 'express';
import { nanoid } from 'nanoid';
import { logger } from '../logger.js';
import { db } from '../db/index.js';
import { integrationTokens, tipFeedback, tipViews, tipScores } from '../db/schema.js';
import { tipFeedback, tipViews, tipScores, userPreferences } from '../db/schema.js';
import { eq, and, desc } from 'drizzle-orm';
import { requireAuth, AuthenticatedRequest } from '../middleware/session.js';
import { config } from '../config.js';
import { bus } from '../events/bus.js';
import type { TipCandidate, Signal } from '@oo/shared-types';
import type { Tip, Signal } from '@oo/shared-types';
import { todoistSource, dueAgeDays } from '../signals/todoist.js';
export { dueAgeDays };
import { googleHealthSource } from '../signals/google-health.js';
import { SignalAggregator } from '../signals/aggregator.js';
import { getActiveAgentOutputs } from './agent-outputs.js';
import { getEligibleAgentIds } from '../profile/eligibility.js';
@@ -17,51 +18,78 @@ import { getEligibleAgentIds } from '../profile/eligibility.js';
const router: ExpressRouter = Router();
// ---------------------------------------------------------------------------
// Signal aggregator — register sources here as new integrations are added
// Fallback tips — shown when the AI service is unavailable
// ---------------------------------------------------------------------------
export const aggregator = new SignalAggregator().register(todoistSource);
export const _clearSignalCacheForTests = () => todoistSource.clearCache();
const FALLBACK_TIPS = [
"Take a moment to stretch and breathe — your body and mind will thank you.",
"Write down one thing you're grateful for today.",
"Drink a glass of water. Small acts of self-care add up.",
"Reach out to someone you haven't spoken to in a while.",
"Close a tab you've been meaning to close for days.",
"Step outside for five minutes, even briefly.",
"Put your phone down for the next 30 minutes and see how it feels.",
"Do the smallest possible version of a task you've been avoiding.",
"Tidy one small area — a clear space helps a clear mind.",
"Pause and ask: what would make today feel like a win?",
"Rest is productive. Give yourself permission to recharge.",
"You don't have to do everything today. Pick one thing and do it well.",
];
// ---------------------------------------------------------------------------
// Signal → TipCandidate conversion
// ---------------------------------------------------------------------------
function signalToCandidate(signal: Signal): TipCandidate {
function randomFallbackTip(): import('@oo/shared-types').Tip {
const content = FALLBACK_TIPS[Math.floor(Math.random() * FALLBACK_TIPS.length)];
return {
id: signal.id,
content: signal.content,
source: signal.source as TipCandidate['source'],
kind: signal.kind as TipCandidate['kind'],
sourceId: (signal.metadata.todoistId as string | undefined) ?? undefined,
createdAt: signal.timestamp,
features: signal.features,
id: `fallback:${nanoid()}`,
content,
source: 'fallback',
kind: 'advice',
rationale: 'AI service issues',
createdAt: new Date().toISOString(),
};
}
function randomPolicy(candidates: TipCandidate[]): TipCandidate | null {
if (!candidates.length) return null;
return candidates[Math.floor(Math.random() * candidates.length)];
}
// ---------------------------------------------------------------------------
// Signal aggregator — register sources here as new integrations are added
// ---------------------------------------------------------------------------
export const aggregator = new SignalAggregator().register(todoistSource).register(googleHealthSource);
export const _clearSignalCacheForTests = () => {
todoistSource.clearCache();
googleHealthSource.clearCache();
};
// ---------------------------------------------------------------------------
// Orchestrator: fetch agent snippets + call ml/serving /recommend
// ---------------------------------------------------------------------------
interface OrchestratorResult {
tip: TipCandidate;
tip: Tip;
model: string | null;
agentIds: string[];
}
async function loadOrchestratorPref<T>(userId: string, key: string): Promise<T | undefined> {
const rows = await db
.select({ valueJson: userPreferences.valueJson })
.from(userPreferences)
.where(and(eq(userPreferences.userId, userId), eq(userPreferences.scope, 'orchestrator'), eq(userPreferences.key, key)))
.limit(1);
if (!rows.length) return undefined;
try { return JSON.parse(rows[0].valueJson) as T; } catch { return undefined; }
}
type OrchestratorOutcome = { ok: true; result: OrchestratorResult } | { ok: false };
async function fetchOrchestratorTip(
userId: string,
signals: Signal[],
hour: number,
dayOfWeek: number,
traceparent?: string,
): Promise<OrchestratorResult | null> {
const [allAgentRows, eligibleIds] = await Promise.all([
recentTip?: string,
): Promise<OrchestratorOutcome> {
const [allAgentRows, eligibleIds, scienceDestiny] = await Promise.all([
getActiveAgentOutputs(userId),
getEligibleAgentIds(userId),
loadOrchestratorPref<number>(userId, 'science_destiny'),
]);
const agentOutputs = allAgentRows
.filter((r) => eligibleIds.has(r.agentId))
@@ -78,16 +106,18 @@ async function fetchOrchestratorTip(
const res = await fetch(`${config.ML_SERVING_URL}/recommend`, {
method: 'POST',
headers: { 'Content-Type': 'application/json', ...(traceparent ? { traceparent } : {}) },
body: JSON.stringify({ user_id: userId, agent_outputs: agentOutputs, tasks, hour_of_day: hour, day_of_week: dayOfWeek }),
body: JSON.stringify({ user_id: userId, agent_outputs: agentOutputs, tasks, hour_of_day: hour, day_of_week: dayOfWeek, science_destiny: scienceDestiny ?? 50, recent_tip: recentTip ?? null }),
signal: AbortSignal.timeout(15_000),
});
if (!res.ok) return null;
if (!res.ok) return { ok: false };
const data = (await res.json()) as {
tip: { id: string; content: string; rationale?: string };
model?: string;
};
const now = new Date().toISOString();
return {
ok: true,
result: {
tip: {
id: `llm:${data.tip.id}`,
content: data.tip.content,
@@ -95,13 +125,13 @@ async function fetchOrchestratorTip(
kind: 'advice' as const,
rationale: data.tip.rationale,
createdAt: now,
features: { is_overdue: false, task_age_days: 0, priority: 1 },
},
model: data.model ?? null,
agentIds: agentOutputs.map((a) => a.agent_id),
},
};
} catch {
return null;
return { ok: false };
}
}
@@ -112,31 +142,22 @@ async function fetchOrchestratorTip(
router.post('/recommend', requireAuth, async (req: AuthenticatedRequest, res: Response) => {
const hour = new Date().getHours();
const dayOfWeek = new Date().getDay();
const anyToken = await db
.select({ id: integrationTokens.id })
.from(integrationTokens)
.where(eq(integrationTokens.userId, req.userId!))
.limit(1);
if (!anyToken.length) {
res.status(422).json({ error: 'No integrations connected' });
return;
}
const { recent_tip: recentTip } = req.body as { recent_tip?: string };
const signals = await aggregator.fetchAll(req.userId!);
const t0 = Date.now();
const orchestrated = await fetchOrchestratorTip(req.userId!, signals, hour, dayOfWeek, req.traceparent);
const outcome = await fetchOrchestratorTip(req.userId!, signals, hour, dayOfWeek, req.traceparent, recentTip);
const latencyMs = Date.now() - t0;
const tip = orchestrated?.tip ?? randomPolicy(signals.map(signalToCandidate));
if (!tip) {
res.status(204).end();
if (!outcome.ok) {
res.json({ tip: randomFallbackTip() });
return;
}
const policy = orchestrated ? 'orchestrator' : 'random';
const orchestrated = outcome.result;
const tip = orchestrated.tip;
const policy = 'orchestrator';
const servedAt = new Date().toISOString();
await db.insert(tipViews).values({ id: nanoid(), userId: req.userId!, tipId: tip.id, servedAt });
@@ -147,16 +168,12 @@ router.post('/recommend', requireAuth, async (req: AuthenticatedRequest, res: Re
tipId: tip.id,
policy,
mlScore: null,
featuresJson: JSON.stringify(
orchestrated
? { agent_ids: orchestrated.agentIds, hour_of_day: hour, day_of_week: dayOfWeek }
: { ...tip.features, hour_of_day: hour, day_of_week: dayOfWeek },
),
candidateCount: orchestrated ? 1 : signals.length,
featuresJson: JSON.stringify({ agent_ids: orchestrated.agentIds, hour_of_day: hour, day_of_week: dayOfWeek }),
candidateCount: 1,
latencyMs,
servedAt,
promptVersion: orchestrated ? 'v4-orchestrator' : null,
llmModel: orchestrated ? orchestrated.model : null,
promptVersion: 'v4-orchestrator',
llmModel: orchestrated.model,
tipKind: tip.kind ?? null,
});
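The change from `OrchestratorResult | null` to the `OrchestratorOutcome` union in this file can be illustrated in isolation. This is a minimal sketch, not the route's actual code: `Outcome`, `pickTip`, and the tip strings are hypothetical names and values chosen for illustration.

```typescript
// Hypothetical, simplified stand-in for the OrchestratorOutcome union.
type Outcome<T> = { ok: true; result: T } | { ok: false };

// Returns the orchestrated value when the call succeeded, otherwise the fallback.
function pickTip(outcome: Outcome<string>, fallback: string): string {
  if (!outcome.ok) return fallback;
  // TypeScript narrows `outcome` to the success branch here, so `result` exists.
  return outcome.result;
}

const served = pickTip({ ok: true, result: 'Take a short walk.' }, 'fallback tip');
const degraded = pickTip({ ok: false }, 'fallback tip');
```

With the `ok` discriminant, the caller has to check the failure case before it can touch `result`. The route appears to rely on this when it returns a fallback tip early.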


@@ -0,0 +1,166 @@
/**
* Tests for the agent pre-compute scheduler (signals/agent-scheduler.ts).
*
* Key behaviour under test: runCycle calls getEligibleAgentIds per user and
* skips computeAndStore for agents the user hasn't consented to.
*/
import { describe, it, expect, vi, beforeEach, afterEach } from 'vitest';
vi.mock('../../logger.js', () => ({
logger: { info: vi.fn(), warn: vi.fn(), error: vi.fn(), fatal: vi.fn() },
}));
import { logger } from '../../logger.js';
// ── active-user query: db.selectDistinct(...).from(...).where(...) ──────────
let activeUsers: { userId: string }[] = [];
const userWhereMock = vi.fn(async () => activeUsers);
const userFromMock = vi.fn(() => ({ where: userWhereMock }));
const selectDistinctMock = vi.fn(() => ({ from: userFromMock }));
// ── purge: db.delete(...).where(...) ────────────────────────────────────────
const deleteWhereMock = vi.fn(async () => ({}));
const deleteMock = vi.fn(() => ({ where: deleteWhereMock }));
vi.mock('../../db/index.js', () => ({
db: { selectDistinct: selectDistinctMock, delete: deleteMock },
}));
vi.mock('../../db/schema.js', () => ({
agentOutputs: { expiresAt: 'expires_at' },
tipViews: { userId: 'user_id', servedAt: 'served_at' },
}));
vi.mock('drizzle-orm', () => ({
gt: vi.fn(),
lt: vi.fn(),
and: vi.fn(),
eq: vi.fn(),
isNull: vi.fn(),
}));
vi.mock('../../config.js', () => ({ config: { ML_SERVING_URL: 'http://ml' } }));
// ── computeAndStore — tracks which (user, agent) pairs were computed ────────
const computeAndStoreMock = vi.fn(async () => {});
vi.mock('../../routes/agent-outputs.js', () => ({
computeAndStore: computeAndStoreMock,
}));
// ── eligibility — replaceable per test ─────────────────────────────────────
let eligibleIds: Set<string> = new Set();
const getEligibleAgentIdsMock = vi.fn(async (_userId: string) => eligibleIds);
vi.mock('../../profile/eligibility.js', () => ({
getEligibleAgentIds: getEligibleAgentIdsMock,
}));
// ml-serving /health — return a fixed agent list
global.fetch = vi.fn(async () => ({
ok: true,
json: async () => ({ agents: ['overdue-task', 'momentum', 'time-of-day'] }),
})) as unknown as typeof fetch;
beforeEach(() => {
activeUsers = [];
eligibleIds = new Set();
computeAndStoreMock.mockClear();
getEligibleAgentIdsMock.mockClear();
userWhereMock.mockClear();
deleteWhereMock.mockClear();
vi.clearAllMocks();
vi.useFakeTimers();
// restore default mocks after clearAllMocks
userWhereMock.mockImplementation(async () => activeUsers);
getEligibleAgentIdsMock.mockImplementation(async () => eligibleIds);
computeAndStoreMock.mockResolvedValue(undefined);
deleteWhereMock.mockResolvedValue({});
global.fetch = vi.fn(async () => ({
ok: true,
json: async () => ({ agents: ['overdue-task', 'momentum', 'time-of-day'] }),
})) as unknown as typeof fetch;
});
afterEach(() => {
vi.useRealTimers();
});
describe('startAgentPrecomputeScheduler', () => {
it('skips computeAndStore for agents not in the eligibility set', async () => {
activeUsers = [{ userId: 'alice' }];
eligibleIds = new Set(['momentum']); // only momentum consented
const { startAgentPrecomputeScheduler } = await import('../agent-scheduler.js');
startAgentPrecomputeScheduler(60_000);
await vi.advanceTimersByTimeAsync(16_000);
await Promise.resolve();
const computed = computeAndStoreMock.mock.calls.map((c) => c[1]);
expect(computed).toEqual(['momentum']);
expect(computed).not.toContain('overdue-task');
expect(computed).not.toContain('time-of-day');
});
it('skips all agents when eligibility set is empty', async () => {
activeUsers = [{ userId: 'bob' }];
eligibleIds = new Set(); // no consents
const { startAgentPrecomputeScheduler } = await import('../agent-scheduler.js');
startAgentPrecomputeScheduler(60_000);
await vi.advanceTimersByTimeAsync(16_000);
await Promise.resolve();
expect(computeAndStoreMock).not.toHaveBeenCalled();
expect(logger.info).toHaveBeenCalledWith(
expect.objectContaining({ skipped: 3, ok: 0 }),
'agent-scheduler: cycle complete',
);
});
it('computes all agents when all are eligible', async () => {
activeUsers = [{ userId: 'carol' }];
eligibleIds = new Set(['overdue-task', 'momentum', 'time-of-day']);
const { startAgentPrecomputeScheduler } = await import('../agent-scheduler.js');
startAgentPrecomputeScheduler(60_000);
await vi.advanceTimersByTimeAsync(16_000);
await Promise.resolve();
expect(computeAndStoreMock).toHaveBeenCalledTimes(3);
expect(logger.info).toHaveBeenCalledWith(
expect.objectContaining({ ok: 3, skipped: 0 }),
'agent-scheduler: cycle complete',
);
});
it('skips entire user when eligibility check throws', async () => {
activeUsers = [{ userId: 'dave' }];
getEligibleAgentIdsMock.mockRejectedValueOnce(new Error('db timeout'));
const { startAgentPrecomputeScheduler } = await import('../agent-scheduler.js');
startAgentPrecomputeScheduler(60_000);
await vi.advanceTimersByTimeAsync(16_000);
await Promise.resolve();
expect(computeAndStoreMock).not.toHaveBeenCalled();
expect(logger.error).toHaveBeenCalledWith(
expect.objectContaining({ err: expect.anything(), userId: 'dave' }),
'agent-scheduler: eligibility check failed, skipping user',
);
});
it('checks eligibility independently per user', async () => {
activeUsers = [{ userId: 'u1' }, { userId: 'u2' }];
getEligibleAgentIdsMock.mockImplementation(async (userId: string) =>
userId === 'u1' ? new Set(['momentum']) : new Set(['overdue-task', 'time-of-day']),
);
const { startAgentPrecomputeScheduler } = await import('../agent-scheduler.js');
startAgentPrecomputeScheduler(60_000);
await vi.advanceTimersByTimeAsync(16_000);
await Promise.resolve();
const u1Calls = computeAndStoreMock.mock.calls.filter((c) => c[0] === 'u1').map((c) => c[1]);
const u2Calls = computeAndStoreMock.mock.calls.filter((c) => c[0] === 'u2').map((c) => c[1]);
expect(u1Calls).toEqual(['momentum']);
expect(u2Calls.sort()).toEqual(['overdue-task', 'time-of-day']);
});
});


@@ -15,6 +15,7 @@ import { gt, lt } from 'drizzle-orm';
import { logger } from '../logger.js';
import { config } from '../config.js';
import { computeAndStore } from '../routes/agent-outputs.js';
import { getEligibleAgentIds } from '../profile/eligibility.js';
const FALLBACK_AGENT_IDS = [
'overdue-task',
@@ -67,15 +68,28 @@ async function runCycle(agentIds: string[]): Promise<void> {
let ok = 0;
let failed = 0;
let skipped = 0;
for (const userId of userIds) {
const results = await Promise.allSettled(
agentIds.map((agentId) => computeAndStore(userId, agentId)),
);
for (const r of results) {
if (r.status === 'fulfilled') ok++;
else {
let eligible: Set<string>;
try {
eligible = await getEligibleAgentIds(userId);
} catch (err: any) {
logger.error({ err, userId }, 'agent-scheduler: eligibility check failed, skipping user');
skipped += agentIds.length;
continue;
}
for (const agentId of agentIds) {
if (!eligible.has(agentId)) {
skipped++;
continue;
}
try {
await computeAndStore(userId, agentId);
ok++;
} catch (err: any) {
failed++;
logger.error({ err: r.reason, userId }, 'agent-scheduler: compute error');
logger.error({ err, userId, agentId }, 'agent-scheduler: compute error');
}
}
}
@@ -87,7 +101,7 @@ async function runCycle(agentIds: string[]): Promise<void> {
}
logger.info(
{ ok, failed, users: userIds.length, agents: agentIds.length },
{ ok, failed, skipped, users: userIds.length, agents: agentIds.length },
'agent-scheduler: cycle complete',
);
}
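The per-user eligibility filter in `runCycle` can be read as a partition of the agent list. This is a rough synchronous sketch under that reading. `partitionByEligibility` is a hypothetical helper and is not part of the scheduler.

```typescript
// Hypothetical helper: splits agent IDs into those to compute and those to skip.
function partitionByEligibility(
  agentIds: string[],
  eligible: Set<string>,
): { toCompute: string[]; skipped: string[] } {
  const toCompute: string[] = [];
  const skipped: string[] = [];
  for (const agentId of agentIds) {
    // An agent is computed only when the user's eligibility set contains it.
    (eligible.has(agentId) ? toCompute : skipped).push(agentId);
  }
  return { toCompute, skipped };
}

const plan = partitionByEligibility(
  ['overdue-task', 'momentum', 'time-of-day'],
  new Set(['momentum']),
);
```

Here `plan.skipped.length` corresponds to the `skipped` counter that the cycle log reports, and `plan.toCompute` to the agents passed on to `computeAndStore`.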


@@ -0,0 +1,304 @@
import type { Signal, SignalSource } from '@oo/shared-types';
import { db } from '../db/index.js';
import { integrationTokens } from '../db/schema.js';
import { eq, and } from 'drizzle-orm';
import { bus } from '../events/bus.js';
import { config } from '../config.js';
import { logger } from '../logger.js';
const CACHE_TTL_MS = 5 * 60_000;
const HEALTH_API_BASE = 'https://health.googleapis.com/v4/users/me/dataTypes';
const GOOGLE_TOKEN_URL = 'https://oauth2.googleapis.com/token';
const STEP_DAILY_GOAL = 7_000;
const SLEEP_GOAL_HOURS = 7;
// v4 DataPoint shape is a union keyed by data type; we read defensively.
interface DataPoint {
[key: string]: unknown;
}
interface DataPointsResponse {
dataPoints?: DataPoint[];
nextPageToken?: string;
}
async function refreshGoogleToken(
userId: string,
refreshToken: string,
): Promise<string | null> {
const body = new URLSearchParams({
client_id: config.GOOGLE_CLIENT_ID,
client_secret: config.GOOGLE_CLIENT_SECRET,
refresh_token: refreshToken,
grant_type: 'refresh_token',
});
const res = await fetch(GOOGLE_TOKEN_URL, {
method: 'POST',
headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
body: body.toString(),
});
if (!res.ok) return null;
const data = (await res.json()) as { access_token: string; expires_in: number };
const expiresAt = new Date(Date.now() + data.expires_in * 1000).toISOString();
await db
.update(integrationTokens)
.set({ accessToken: data.access_token, expiresAt, tokenStatus: 'active' })
.where(and(eq(integrationTokens.userId, userId), eq(integrationTokens.provider, 'google-health')));
return data.access_token;
}
function todayMidnightIso(): string {
const d = new Date();
d.setHours(0, 0, 0, 0);
return d.toISOString();
}
function yesterdayIso(): string {
return new Date(Date.now() - 24 * 60 * 60 * 1000).toISOString();
}
async function fetchDataPoints(
token: string,
dataType: string,
filter: string,
): Promise<DataPoint[]> {
const url = new URL(`${HEALTH_API_BASE}/${dataType}/dataPoints`);
url.searchParams.set('filter', filter);
const res = await fetch(url.toString(), {
headers: { Authorization: `Bearer ${token}` },
});
if (!res.ok) throw new Error(`health ${dataType}: ${res.status}`);
const data = (await res.json()) as DataPointsResponse;
return data.dataPoints ?? [];
}
// Defensive numeric reader — probes likely field names in a v4 DataPoint payload.
function readNumber(point: DataPoint, paths: string[][]): number {
for (const path of paths) {
let cur: unknown = point;
for (const key of path) {
if (cur && typeof cur === 'object' && key in (cur as object)) {
cur = (cur as Record<string, unknown>)[key];
} else {
cur = undefined;
break;
}
}
if (typeof cur === 'number') return cur;
}
return 0;
}
function readString(point: DataPoint, paths: string[][]): string | undefined {
for (const path of paths) {
let cur: unknown = point;
for (const key of path) {
if (cur && typeof cur === 'object' && key in (cur as object)) {
cur = (cur as Record<string, unknown>)[key];
} else {
cur = undefined;
break;
}
}
if (typeof cur === 'string') return cur;
}
return undefined;
}
export class GoogleHealthSignalSource implements SignalSource {
readonly id = 'google-health';
private cache = new Map<string, { signals: Signal[]; fetchedAt: number }>();
clearCache(userId?: string): void {
if (userId) this.cache.delete(userId);
else this.cache.clear();
}
async fetchSignals(userId: string): Promise<Signal[]> {
const entry = this.cache.get(userId);
if (entry && Date.now() - entry.fetchedAt < CACHE_TTL_MS) return entry.signals;
const [row] = await db
.select()
.from(integrationTokens)
.where(and(eq(integrationTokens.userId, userId), eq(integrationTokens.provider, 'google-health')))
.limit(1);
if (!row) return [];
let token = row.accessToken;
const isExpired = row.expiresAt && new Date(row.expiresAt).getTime() - Date.now() < 5 * 60_000;
if (isExpired && row.refreshToken) {
const refreshed = await refreshGoogleToken(userId, row.refreshToken);
if (!refreshed) {
logger.warn({ userId }, 'google-health: refresh failed');
await db
.update(integrationTokens)
.set({ tokenStatus: 'needs_reconnect' })
.where(and(eq(integrationTokens.userId, userId), eq(integrationTokens.provider, 'google-health')));
bus.publish('signals.integration.token_expired', {
userId,
provider: 'google-health',
detectedAt: new Date().toISOString(),
});
return entry?.signals ?? [];
}
token = refreshed;
}
try {
const dayStartIso = todayMidnightIso();
const dayEndIso = new Date().toISOString();
const yIso = yesterdayIso();
const stepsFilter = `steps.interval.start_time >= "${dayStartIso}" AND steps.interval.start_time < "${dayEndIso}"`;
const caloriesFilter = `total_calories.interval.start_time >= "${dayStartIso}" AND total_calories.interval.start_time < "${dayEndIso}"`;
const hrFilter = `heart_rate.sample_time.physical_time >= "${dayStartIso}" AND heart_rate.sample_time.physical_time < "${dayEndIso}"`;
const sleepFilter = `sleep.interval.start_time >= "${yIso}" AND sleep.interval.start_time < "${dayEndIso}"`;
const [stepsPts, caloriesPts, hrPts, sleepPts] = await Promise.all([
fetchDataPoints(token, 'steps', stepsFilter),
fetchDataPoints(token, 'total-calories', caloriesFilter),
fetchDataPoints(token, 'heart-rate', hrFilter),
fetchDataPoints(token, 'sleep', sleepFilter),
]);
// Debug-level peek at the first raw data point of each type (logged on every uncached fetch) so we can refine field paths after the first real OAuth.
logger.debug(
{ userId, samples: { stepsPts: stepsPts.slice(0, 1), caloriesPts: caloriesPts.slice(0, 1), hrPts: hrPts.slice(0, 1), sleepPts: sleepPts.slice(0, 1) } },
'google-health: v4 dataPoints sample',
);
const signals: Signal[] = [];
const now = new Date().toISOString();
const steps = stepsPts.reduce(
(sum, p) => sum + readNumber(p, [['steps', 'count'], ['count']]),
0,
);
const stepGoalPct = Math.round((steps / STEP_DAILY_GOAL) * 100);
signals.push({
id: `google-health:steps`,
source: 'google-health',
kind: 'health',
content: `${steps.toLocaleString()} steps today (${stepGoalPct}% of ${STEP_DAILY_GOAL.toLocaleString()} goal)`,
metadata: { dataType: 'steps' },
features: {
step_count: steps,
step_goal_pct: stepGoalPct,
step_goal: STEP_DAILY_GOAL,
below_step_goal: steps < STEP_DAILY_GOAL,
},
timestamp: now,
});
const calories = Math.round(
caloriesPts.reduce(
(sum, p) =>
sum + readNumber(p, [['totalCalories', 'kilocalories'], ['kilocalories'], ['energy', 'kilocalories']]),
0,
),
);
signals.push({
id: `google-health:activity`,
source: 'google-health',
kind: 'health',
content: `${calories} calories burned today`,
metadata: { dataType: 'activity' },
features: {
calories_burned: calories,
},
timestamp: now,
});
if (hrPts.length > 0) {
const hrValues = hrPts
.map((p) => readNumber(p, [['heartRate', 'beatsPerMinute'], ['beatsPerMinute']]))
.filter((v) => v > 0);
if (hrValues.length > 0) {
const bpm = Math.round(hrValues.reduce((a, b) => a + b, 0) / hrValues.length);
signals.push({
id: `google-health:heart_rate`,
source: 'google-health',
kind: 'health',
content: `Resting heart rate: ${bpm} bpm`,
metadata: { dataType: 'heart_rate' },
features: { resting_bpm: bpm, elevated_hr: bpm > 90 },
timestamp: now,
});
}
}
if (sleepPts.length > 0) {
const sleepSessions = sleepPts
.map((p) => ({
start: readString(p, [['sleep', 'interval', 'startTime'], ['interval', 'startTime'], ['startTime']]),
end: readString(p, [['sleep', 'interval', 'endTime'], ['interval', 'endTime'], ['endTime']]),
}))
.filter((s): s is { start: string; end: string } => !!s.start && !!s.end)
.sort((a, b) => Date.parse(b.end) - Date.parse(a.end));
const last = sleepSessions[0];
if (last) {
const durationMs = Date.parse(last.end) - Date.parse(last.start);
const sleepHours = Math.round((durationMs / 3_600_000) * 10) / 10;
const belowGoal = sleepHours < SLEEP_GOAL_HOURS;
signals.push({
id: `google-health:sleep`,
source: 'google-health',
kind: 'health',
content: `${sleepHours}h sleep last night (${belowGoal ? 'below' : 'meets'} ${SLEEP_GOAL_HOURS}h goal)`,
metadata: { dataType: 'sleep' },
features: {
sleep_hours: sleepHours,
sleep_goal_hours: SLEEP_GOAL_HOURS,
sleep_deficit_hours: Math.max(0, SLEEP_GOAL_HOURS - sleepHours),
below_sleep_goal: belowGoal,
},
timestamp: now,
});
}
}
this.cache.set(userId, { signals, fetchedAt: Date.now() });
bus.publish('signals.task.synced', {
userId,
source: 'google-health',
count: signals.length,
syncedAt: now,
});
return signals;
} catch (err: unknown) {
const status = (err as { message?: string }).message;
if (status?.includes('401')) {
logger.warn({ userId }, 'google-health: token expired (401)');
if (row.refreshToken) {
await refreshGoogleToken(userId, row.refreshToken);
} else {
await db
.update(integrationTokens)
.set({ tokenStatus: 'needs_reconnect' })
.where(and(eq(integrationTokens.userId, userId), eq(integrationTokens.provider, 'google-health')));
bus.publish('signals.integration.token_expired', {
userId,
provider: 'google-health',
detectedAt: new Date().toISOString(),
});
}
} else {
logger.error({ userId, err }, 'google-health: fetch failed');
}
return entry?.signals ?? [];
}
}
}
export const googleHealthSource = new GoogleHealthSignalSource();
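`readNumber` and `readString` above share the same path-walking loop. One possible way to factor it out is sketched here. `readPath` and `firstNumber` are hypothetical names, and the sample payloads are illustrative only; they are not the documented v4 DataPoint schema.

```typescript
// Walks `path` through nested objects; returns undefined if any key is missing.
function readPath(point: unknown, path: string[]): unknown {
  let cur: unknown = point;
  for (const key of path) {
    if (cur !== null && typeof cur === 'object' && key in cur) {
      cur = (cur as Record<string, unknown>)[key];
    } else {
      return undefined;
    }
  }
  return cur;
}

// Tries each candidate path in order and returns the first numeric hit, else 0.
function firstNumber(point: unknown, paths: string[][]): number {
  for (const path of paths) {
    const value = readPath(point, path);
    if (typeof value === 'number') return value;
  }
  return 0;
}

// Illustrative payloads only; the real field names are an open question in this diff.
const candidatePaths = [['steps', 'count'], ['count']];
const nested = firstNumber({ steps: { count: 1200 } }, candidatePaths);
const flat = firstNumber({ count: 300 }, candidatePaths);
const missing = firstNumber({ other: 1 }, candidatePaths);
```

A string variant would differ only in the `typeof` check. As in the original, a missing field and a genuine zero both come back as `0`.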