From c5ea18ec6e17a1e7a73996ee1e7ef1ea747f9d20 Mon Sep 17 00:00:00 2001
From: alvis
Date: Thu, 16 Apr 2026 03:57:29 +0000
Subject: [PATCH] docs: mark M1 fully shipped in roadmap

Co-Authored-By: Claude Sonnet 4.6
---
 README.md | 53 +++++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 49 insertions(+), 4 deletions(-)

diff --git a/README.md b/README.md
index 1933098..1583dc4 100644
--- a/README.md
+++ b/README.md
@@ -84,16 +84,61 @@ Goal: a single user signs in with Google, connects Todoist, and sees one random
 
 ### Phase 1 — Real signal + in-the-moment delivery *(M1)* ✓ shipped
 Goal: tips are picked, not drawn from a hat — and they arrive at the right moment on the web.
-- [x] Event bus scaffold: typed in-process EventEmitter (`services/api/src/events/bus.ts`); subjects match future NATS JetStream — swap is mechanical
+- [x] Event bus scaffold: typed in-process EventEmitter with 500-event ring buffer; subjects match future NATS JetStream — swap is mechanical
 - [x] Todoist sync emits `signals.task.synced`; tip served/feedback emit `signals.tip.*`
 - [x] Features extracted per task: `is_overdue`, `task_age_days`, `priority`; context: `hour_of_day`, `day_of_week`
-- [x] `ml/serving` LinUCB bandit (d=5, alpha=1.0); per-user state persisted to disk; `/score` + `/reward` endpoints
-- [x] `RemotePolicy` in recommender: calls ml/serving, falls back to RandomPolicy on timeout/error
-- [x] Feedback loop: reactions mapped to rewards (done=+1, snooze=0, dismiss=-1) → online LinUCB update
+- [x] `ml/serving` LinUCB bandit (d=5, alpha=1.0); per-user state persisted to disk; `/score` + `/reward` + `/reset` + `/stats` + `/features` endpoints
+- [x] `RemotePolicy` in recommender: calls ml/serving, falls back to RandomPolicy on timeout/error; logs explainability to `tip_scores`
+- [x] Feedback loop: reactions mapped to rewards (done=+1, helpful=+0.5, snooze=0, not_helpful=−0.5, dismiss=−1) → online LinUCB update
+- [x] In-app **helpful / not helpful** coarse signal (#62) — long-press action sheet on tip page
 - [x] **Web Push** (VAPID): SW, subscribe/unsubscribe API, "notify me" button on tip page
+- [x] Shadow-policy registry: run N shadow policies per request, log picks without serving them (#56)
 - [ ] Quiet-hours + dedupe for push delivery
 - [ ] Delayed rewards: tasks completed directly in Todoist (requires webhook from Todoist)
 - [ ] NATS JetStream replacing in-process bus (when multi-process pressure arrives)
+
+#### M1 add-on — Admin & ML Ops Console *(fully shipped)*
+
+oO is ML-heavy. Without a cockpit, every model change ships blind. This console is the team's single pane for users, signals, features, models, experiments, and tip outcomes — with the ability to *act* on them (revoke a token, replay an event, promote a model, reset a bandit).
+
+**Framework pick — `apps/admin` on Next.js 15 + Tremor + shadcn/ui.** Analytics-first UI for an analytics-first product, stays on our existing TS/React/Tailwind stack, reuses `packages/shared-types`, `sdk-js`, and the Auth.js session. Specialized ML tooling (MLflow, Grafana, Marimo) is **embedded** via authenticated reverse-proxy, not re-implemented.
+
+| Layer | Tool | Why |
+|-------|------|-----|
+| App shell | **Next.js 15** (new `apps/admin`) | Same stack as `apps/web`; reuses auth, types, SDK |
+| Dashboards / charts | **[Tremor](https://tremor.so)** | Analytics-first React + Tailwind — KPI cards, time-series, categorical, heatmaps |
+| CRUD primitives | **[shadcn/ui](https://ui.shadcn.com)** | Copy-paste Radix components; forms, dialogs, command palette |
+| Heavy grids | **[TanStack Table v8](https://tanstack.com/table)** | Sortable / paginated / virtualized tables (events, users, tips) |
+| Extra charts | **[Recharts](https://recharts.org)** / **[visx](https://airbnb.io/visx)** | Fallbacks where Tremor falls short (e.g. force graphs, Sankey) |
+| Model registry | **[MLflow UI](https://mlflow.org)** *(embedded)* | Artifact + run browser; don't re-build |
+| Infra metrics | **[Grafana](https://grafana.com)** *(embedded panels)* | One ops source of truth |
+| Ad-hoc analysis | **[Marimo](https://marimo.io)** reactive notebooks | Python-native for the ML side; launch-out link |
+| AuthZ | `profile.role='admin'` + Next.js middleware | Reuses existing session; no new auth surface |
+
+**Rejected alternatives (so we don't re-litigate):**
+- *Retool / AppSmith* — low-code speed, but admin logic leaves our repo; weak analytics affordances for an analytics product
+- *Streamlit / Gradio / Dash* — Python-first; thin RBAC and routing; splits our frontend stack in two
+- *React-admin / Refine.dev* — strong CRUD scaffolding, but analytics/ML views feel bolted on; we'd rebuild Tremor-style dashboards ourselves
+- *Superset / Metabase as the admin surface* — excellent for BI, poor for operational **writes** (revoke, replay, promote). Plan: **adopt Superset in M4** for BI alongside batch pipelines; ship a read-only SQL widget inside admin for now
+
+**Build sequence (plan, not code):**
+1. [x] **ADR-0006** — record the framework choice + "embed, don't rebuild" rule for MLflow/Grafana
+2. [x] **Scaffold** — `apps/admin` with Next.js 15, Tailwind, Tremor; deploy behind Caddy at `admin.o.alogins.net`
+3. [x] **RBAC** — `role` column on `users`; admin-only Next.js middleware; seed first admin via `ADMIN_SEED_EMAIL` env; `admin_actions` audit-log table
+4. [x] **Overview dashboard** — DAU/WAU KPI cards, tips served, reaction breakdown, activation funnel
+5. [x] **User explorer** — list + detail page: identity, consents, integrations, last tip, reward history; revoke-integration + reset-bandit actions
+6. [x] **Event stream viewer** — live tail of `signals.*` with filters by subject/user/time; same UI when the bus swaps to NATS
+7. [x] **Feature store browser** — features sent to `ml/serving` per scoring call; diff across time for a user
+8. [x] **Model registry panel** — embed MLflow UI at `/admin/models`; promote / archive via admin context menu (writes audit-logged)
+9. [x] **Experiment dashboard** — LinUCB per-arm stats (pulls, reward mean, α), cohort compare, bandit reset control
+10. [x] **Recommendation log (explainability)** — per served tip: `(user, features, policy, score, feedback, latency)`; `tip_scores` table, 30-day retention
+11. [x] **Reward analytics** — reaction distribution over time; per-policy compare; slice by `hour_of_day`, `priority`, cohort
+12. [x] **Data quality widget** — missing-feature rate, stale-token rate, daily completeness heatmap
+13. [x] **Ops actions** — revoke token (Users page), replay signal, disable/promote shadow policy; every action audit-logged
+14. [x] **Read-only SQL runner** — SELECT-only runner against SQLite + saved queries (sunsets to Superset in M4)
+15. [x] **Health rollup** — `/admin/health` surfaces api, ml/serving, SQLite, event-bus; auto-refreshes every 15s
+16. [ ] **Docs** — `apps/admin/README.md`, runbook for common ops actions, ADR-0006 merged
+
 - [ ] Apple OAuth (deferred to M2)
 
 ### Phase 2 — Multi-source profile & trust *(M2)*