refactor: architecture revision — modular monolith, auth-commit, event protobuf, privacy-from-day-0
- ADR-0003: modular monolith for Phase 0 with documented extraction triggers - ADR-0004: Auth.js + OIDC-shaped boundary; dedicated provider when mobile ships - ADR-0005: protobuf for events, OpenAPI for HTTP, schema-registry CI gate - New architecture docs: data-model, metrics (magic proxies), privacy (Phase-0 feature) - Prime directives updated: privacy-as-feature, modular-by-package-deployable-by-stage - Roadmap revised: Apple OAuth deferred to M1; web push in M1; k3s intermediate; tip-kind-aware UI - PLAN updated: Phase-0 deletion endpoint, metrics baseline, compose profiles, import-boundary lint - License decision in README (ARR with OSS plan in Phase 5)
This commit is contained in:
43
docs/architecture/metrics.md
Normal file
43
docs/architecture/metrics.md
Normal file
@@ -0,0 +1,43 @@
|
||||
# Metrics: measuring "magic"
|
||||
|
||||
We cannot build a product whose core promise is "feels like magic" without proxies for it. These are the metrics every change is measured against.
|
||||
|
||||
## North star
|
||||
|
||||
**Week-2 tip-reaction rate** — of users who saw a tip in week 1, what fraction reacted to *any* tip in week 2? Captures "did this become part of your life."
|
||||
|
||||
## Activation (single-session)
|
||||
|
||||
- **Time-to-first-tip** — sign-in → tip rendered. Target: ≤ 60 s on the happy path.
|
||||
- **First-tip reaction rate** — fraction of users who interact (done/snooze/dismiss/save) with their very first tip. Target: > 50%.
|
||||
|
||||
## Engagement
|
||||
|
||||
- **Dwell-before-action** — seconds between tip render and first reaction. Too short = glance-away; too long = confused.
|
||||
- **Done rate / (Done + Snooze + Dismiss)** — the quality proxy. Rising = tips feel on-target.
|
||||
- **Snooze:Dismiss ratio** — high snooze = "good tip, wrong moment" (timing problem). High dismiss = "wrong tip entirely" (relevance problem). These point at different fixes.
|
||||
- **Return cadence** — median inter-session gap. Stable-and-short > spiky.
|
||||
|
||||
## Retention
|
||||
|
||||
- D1, D7, D28 retention. Cohort-sliced by connected integrations.
|
||||
- Churn signal: 7 days without a session.
|
||||
|
||||
## ML health (from M1)
|
||||
|
||||
- Policy latency p50/p95/p99 at the recommender boundary.
|
||||
- Feature null-rate per feature, per user.
|
||||
- Online/offline reward disagreement for shadowed policies.
|
||||
- Bandit regret proxy: observed reward vs an oracle's best-possible on the same candidates.
|
||||
|
||||
## Privacy & trust
|
||||
|
||||
- Account-deletion completion time (target: < 24 h).
|
||||
- Provider-revocation success rate on disconnect.
|
||||
- Number of active credentials per user (low = healthy).
|
||||
|
||||
## How metrics become decisions
|
||||
|
||||
- **Per-change.** Any policy or UX change declares which metric it expects to move and by how much. Missing the target triggers a review, not an automatic rollback (humans judge).
|
||||
- **Shadow > A/B > launch.** Policy changes ship in shadow first (log what it *would* have recommended); then A/B on live traffic; then launch once online reward estimate ≥ incumbent by a CI margin.
|
||||
- **Dashboards before features.** If we cannot measure a feature's impact on the north-star metric, we defer the feature.
|
||||
Reference in New Issue
Block a user