Step 4 — /api/profile read-through API:
GET   /api/profile               → { user, prefs, consents, contexts }
PATCH /api/profile/prefs/:scope  upsert user_preferences (source='user')
PATCH /api/profile/consents      grant / revoke consent keys
PATCH /api/profile/contexts      create / activate / deactivate contexts
Legacy consentGiven bit folded in as data:core fallback.
Step 5 — registry-driven eligibility filter:
fetchRegistry() exported from agent-registry.ts.
profile/eligibility.ts: getEligibleAgentIds(userId) — filters by required
consents, silenced_in_contexts, and user_preferences[enabled=false].
fetchOrchestratorTip filters agent_outputs to eligible set before calling
ml/serving /recommend. Fail-closed: registry unavailable → empty set.
Step 6 — shared context-inference framework (#111) + time-of-day proof (#112):
ml/agents/inference/: UserHistory, FeedbackEvent, run_inference().
Framework: cold-start, min_history gating, error fallback, structured logs.
TimeOfDayAgent v1.1.0: inferred_params=[preferred_hour]; also reads
quiet_start/quiet_end from agent_prefs. agent_prefs injected by TS caller.
AgentInput gains agent_prefs field.
ml/serving: POST /agents/{agent_id}/infer endpoint.
agent-outputs.ts computeAndStore: loads prefs before compute, calls /infer
after, persists results (source='inferred'); user overrides never touched.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
ml/
Python. Owns models, features, training, online scoring.
| Dir | Role | Phase |
|---|---|---|
| serving/ | FastAPI online scorer (/score, /generate) + LiteLLM gateway + prompt registry (prompts.py) + JetStream consumers for signals.> / feedback.>, called by recommender | 1–2 |
| features/ | context assembler (context.py): signals → PromptContext; profile-feature schema mirror (profile_schema.py); Feast adapter later | 2 |
| pipelines/ | batch feature + training scripts | 4 |
| registry/ | MLflow-backed model registry integration | 4 |
| experiments/ | A/B assignment + multi-armed bandit policies | 4 |
| notebooks/ | research; never imported by production code | — |
Principles
- Every model has a model card in registry/ describing inputs, offline metrics, fairness checks, and rollout history.
- Online inference must be stateless and < 50ms p99.
- Training reads from the offline feature store; serving reads from the online feature store; definitions are shared (no train/serve skew).
- Shadow deploys before any policy change that affects real users.
Feature contract
Profile features (batched)
User-level features (completion rate, preferred hour, tip volume…) are computed
by the TypeScript recommender and shipped to ml/serving on every /score and
/generate call as profile_features: dict | None. The Python mirror in
features/profile_schema.py documents each feature's name, dtype, TTL, source,
and null fallback — keep it in sync with services/api/src/profile/registry.ts
(a CI-style test asserts names and ttlSec values match). See ADR-0011.
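
The sync check described above could look roughly like the sketch below. It is illustrative only: the `ProfileFeature` field names, the two sample entries, and the `schema_drift` helper are assumptions, not the actual contents of features/profile_schema.py or the real test. It also assumes the TypeScript registry can be exported as a plain `{name: ttlSec}` mapping.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ProfileFeature:
    """One entry of the Python mirror (field names are hypothetical)."""
    name: str
    dtype: str
    ttl_sec: int
    source: str
    null_fallback: Any


# Hypothetical entries; the real list lives in features/profile_schema.py.
PROFILE_FEATURES = [
    ProfileFeature("completion_rate", "float", 3600, "recommender", 0.0),
    ProfileFeature("preferred_hour", "int", 86400, "recommender", None),
]


def schema_drift(mirror, ts_registry):
    """Return readable mismatches between the mirror and a
    {name: ttlSec} mapping exported from the TypeScript registry."""
    problems = []
    mirror_ttls = {f.name: f.ttl_sec for f in mirror}
    # Names present on only one side.
    for name in sorted(mirror_ttls.keys() - ts_registry.keys()):
        problems.append(f"{name}: missing from TypeScript registry")
    for name in sorted(ts_registry.keys() - mirror_ttls.keys()):
        problems.append(f"{name}: missing from Python mirror")
    # Names present on both sides but with different TTLs.
    for name in sorted(mirror_ttls.keys() & ts_registry.keys()):
        if mirror_ttls[name] != ts_registry[name]:
            problems.append(
                f"{name}: ttl {mirror_ttls[name]} != {ts_registry[name]}"
            )
    return problems
```

A test would then assert `schema_drift(PROFILE_FEATURES, exported_registry) == []`, so that any drift fails with a list of the offending feature names.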
Context features (JIT)
Request-time signals assembled by features/context.py (hour_of_day,
day_of_week, task list). These are never cached — they are derived from the
system clock and the live Todoist feed at the moment of the score call.
CONTEXT_FEATURES in context.py declares freshness, source, and fallback for
each field (issue #61).
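
As a rough illustration of the JIT path, the sketch below derives the three fields named above at call time. The shape of `CONTEXT_FEATURES`, the `assemble_context` function name, the Monday=0 convention for day_of_week, and the empty-list fallback for an unavailable task feed are all assumptions; the real declarations live in features/context.py.

```python
from datetime import datetime

# Hypothetical declaration shape: freshness, source, and fallback per field.
CONTEXT_FEATURES = {
    "hour_of_day": {"freshness": "request", "source": "system clock", "fallback": None},
    "day_of_week": {"freshness": "request", "source": "system clock", "fallback": None},
    "tasks": {"freshness": "request", "source": "Todoist feed", "fallback": []},
}


def assemble_context(now, tasks):
    """Build request-time context from the clock and the live task feed.

    `tasks` is None when the feed could not be read; the declared
    fallback is used in that case. Nothing here is cached.
    """
    if tasks is None:
        # Copy the fallback so callers cannot mutate the shared declaration.
        tasks = list(CONTEXT_FEATURES["tasks"]["fallback"])
    return {
        "hour_of_day": now.hour,
        "day_of_week": now.weekday(),  # Monday == 0 (assumed convention)
        "tasks": tasks,
    }
```

A caller would invoke `assemble_context(datetime.now(), live_tasks)` inside the request handler, so the values always reflect the moment of the score call.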
Prompt registry
serving/prompts.py keys tip-generation prompts by stable version string. Adding a new variant means adding an entry — no caller changes. Selection precedence: POST /generate body's prompt_version field → env DEFAULT_PROMPT_VERSION → "v1". The TypeScript recommender drives selection via TIP_PROMPT_VERSION (single value or comma-separated rotation); the version actually used flows back in the response and is persisted to tip_scores.prompt_version so the admin reward-analytics dashboard can bucket reactions per variant.
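
The selection precedence can be sketched as a single fallback chain. The `resolve_prompt_version` helper and its signature are hypothetical, and treating an empty string as "unset" is an assumption; only the order (request body, then DEFAULT_PROMPT_VERSION, then "v1") comes from the description above.

```python
import os


def resolve_prompt_version(body_version, env=None):
    """Pick the prompt version: request body, then env default, then "v1".

    Empty strings count as unset (an assumption of this sketch).
    """
    env = os.environ if env is None else env
    return body_version or env.get("DEFAULT_PROMPT_VERSION") or "v1"
```

For example, `resolve_prompt_version(None, {"DEFAULT_PROMPT_VERSION": "v2"})` returns `"v2"`, while an explicit `prompt_version` in the request body always wins over the environment.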