When fetchOrchestratorTip returned null (LiteLLM timeout, bad JSON, etc.)
the recommender silently fell back to randomPolicy, serving a raw Todoist
task with no rationale — explaining both reported symptoms.
- Remove randomPolicy/signalToCandidate; return 204 when orchestrator fails
so the UI shows "All clear" instead of a confusing Todoist task
- Pass recent_tip through the stack (frontend → POST /recommend →
fetchOrchestratorTip → ml/serving RecommendRequest → build_orchestrator_messages)
so after snooze the LLM is instructed not to repeat the snoozed content
Fixes#122
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
OAuth2 flow with all 11 Google Fitness scopes (activity, body, sleep,
heart rate, nutrition, location, blood glucose/pressure/temperature,
oxygen saturation, reproductive health). Stores access + refresh tokens;
auto-refreshes on expiry.
GoogleHealthSignalSource fetches steps, sleep sessions, active minutes,
calories, and heart rate from the Fit aggregate + sessions APIs. Signals
flow into both the tip orchestrator and the health-vitals pre-compute
agent, which generates prompt snippets about step progress, sleep
deficit, sedentary time, and elevated heart rate.
Signal.kind extended with 'health'; IntegrationProvider extended with
'google-health'. Agent compute signal mapping enriched to include source,
kind, and all features so health-vitals can filter its own signals.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Convert ml-serving from isolated MLflow runs to nested traces using
mlflow.start_span_no_context(). The recommend endpoint now emits a full
span tree: recommend (CHAIN) → build_context (TOOL), agent:* (AGENT) ×N,
llm_orchestrator (LLM). Compute and infer endpoints each emit a single span.
Supporting changes:
- mlflow-skinny>=3.1.0 added to requirements
- MLflow configured with --serve-artifacts + mlflow-artifacts:/ default root
for cross-container artifact proxy (spans now persist from ml-serving)
- --allowed-hosts extended to include mlflow:5000 (SDK includes port in Host)
- science_destiny slider wired through prompts.py and recommend endpoint
- Config page exposes science/destiny slider (0=data-driven, 100=intuitive)
- Tip page shows rationale inline on tap
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Deletes shadowPolicies map, getShadowPolicies, setPolicyActive from
recommender.ts; removes /api/admin/policies routes from admin.ts; removes
getPolicies, togglePolicy, PolicyInfo from admin api.ts; removes the
policy toggle section from the ops page.
168 API tests pass.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
POST /recommend now calls ml/serving /recommend with pre-computed agent
snippets + task context instead of /generate + /score/egreedy/v2. Falls
back to a random signal candidate when ml/serving is unavailable.
Removes: remotePolicy, fetchLlmCandidates, sendRewardWithRetry,
candidateCache, pickPromptVersion. Feedback handler keeps inferReward +
tipFeedback writes for observability; reward delivery to the bandit is gone.
tipScores.policy is now 'orchestrator'; promptVersion is 'v4-orchestrator'.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- TS: pino + pino-http; every HTTP request log includes traceId from
W3C traceparent header (generated if absent); forwarded to ml/serving
on all /score, /generate, /reward, and /api/ml proxy calls
- Python: structlog JSON; FastAPI middleware binds trace_id via
contextvars so every log line within a request carries it
- Sentry: optional SENTRY_DSN init in both runtimes (no-op if unset)
- Replace all console.* calls across services/api with pino logger
- Update tests to spy on logger instead of console
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Ship the scaffolding for #99 (phase B.3 of #81):
- ml/serving: add /score/egreedy/v2, /reward/egreedy/v2, /stats/egreedy/v2
endpoints (D=12). New feature dims: completion/dismiss rates, mean dwell
(clipped 10min), preferred-hour alignment (cosine, 1-dim), tip volume (log).
Separate state file per user (_egreedy_v2.json). /reset clears v2 state too.
- ADR-0012: documents D=7→12 dimension change, normalization choices, shadow
rollout protocol, and promotion gate (offline sim win per ADR-0002).
- recommender.ts: register egreedy-v2-shadow in shadow-policy map (disabled by
default). When enabled, calls /score/egreedy/v2 fire-and-forget and publishes
shadow:egreedy-v2-shadow serve signal. No reward to shadow — sim is the gate.
- sim runner/personas: personas carry synthetic profile_features per persona;
_call_score/_call_reward thread profile_features through (None-safe for v1/linucb).
- 18 new Python tests; all 56 Python + 170 TS tests pass.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Centralizes user-level features (completion_rate_30d, dismiss_rate_30d,
mean_dwell_ms_30d, preferred_hour, tip_volume_30d) in a TS registry that
owns both definition and SQL aggregation, since the data lives in the
TS-owned SQLite tables (tip_views/tip_feedback). Lazy TTL refresh keeps
recommend latency bounded; values persist in user_profile_features (KV).
ml/serving accepts profile_features on /score + /generate but does not
yet consume them — extending the bandit feature vector changes D and
resets every user's learned state, so that's a deliberate phase-B step.
Includes ml/features/profile_schema.py as a contract mirror with a sync
test that diffs name sets against registry.ts.
ADR-0011 records the data-locality reasoning (registry in TS, not Python
as the issue originally suggested).
Phase B (deferred): event-driven incremental updates, bandit consumption
with state migration, admin per-user profile page, staleness alerts.
Refs #81.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Replaces the hardcoded "v1" label with a real prompt registry:
ml/serving/prompts.py — keyed by version: v1 (baseline),
v2-mentor (calm/specific persona),
v3-few-shot (v1 persona + curated examples)
ml/serving/main.py — POST /generate accepts optional prompt_version,
422 on unknown, echoes the version actually used
back in the response
services/api/src/config.ts — TIP_PROMPT_VERSION: empty / single / comma-list
(uniform random per request)
services/api/src/routes/recommender.ts
— pickPromptVersion() drives selection; the
response's prompt_version (not a stale TS
constant) is what lands in tip_scores so the
#92 reward-analytics dashboard shows real
per-variant reaction rates
Closes#84.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Add Signal + SignalSource interfaces to packages/shared-types
- TipCandidate.features widened to Record<string,number|boolean> to match Signal
- TodoistSignalSource: encapsulates fetch, cache, 401 handling, bus events, and act()
- SignalAggregator: parallel fan-out across sources with per-source failure isolation
- Recommender refactored to consume Signal[] via aggregator; source action dispatch via aggregator.act()
- ADR-0009: signal normalization strategy
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
ML serving:
- LinUCB contextual bandit (disjoint, d=5 features: hour_sin/cos, is_overdue, task_age, priority)
- /score endpoint replaces stub random; /reward endpoint for online learning
- Per-user model state persisted to disk as JSON (survives restarts)
- venv at ml/serving/.venv; start with pnpm dev from ml/serving
Recommender:
- Todoist fetch now extracts features (is_overdue, task_age_days, priority)
- RemotePolicy calls ml/serving with 3s timeout; falls back to RandomPolicy
- Reward sent to /reward on feedback (done=+1, snooze=0, dismiss=-1)
Web Push:
- VAPID keys in config; push_subscriptions table in DB
- POST/DELETE /api/push/subscribe; GET /api/push/vapid-public-key
- Service worker (public/sw.js): push → showNotification, notificationclick → focus/open
- "notify me" button on tip page; registers SW + subscribes on permission grant
Event bus:
- services/api/src/events/bus.ts: typed EventEmitter wrapper
- Subjects: signals.tip.served, signals.tip.feedback, signals.task.synced
- Same publish/subscribe API NATS JetStream will implement — swap is mechanical
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- /legal/terms and /legal/privacy pages (linked from sign-in)
- Consent (consentGiven=true) recorded on first Google sign-in
- tip_views table: one row per tip served — enables activation + reaction rate queries
- tip_views purged on account deletion
- Delete account button on /connect (confirm → revoke tokens → purge data → sign out)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>