alvis b8113d4bda docs(adr-0011): point B.3 at new issue #99

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

2026-04-25 00:41:20 +00:00

ADR-0011 — User-profile feature registry

Status: Accepted (phase A)
Date: 2026-04-25
Issue: #81

Context

The bandit and LLM tip generator only saw per-candidate features (is_overdue, task_age_days, priority) plus contextual time signals. There was no notion of a user-level profile — completion rate, dismiss rate, preferred hour, tip volume — even though all the raw data already lives in tip_views, tip_feedback, and tip_scores.

#81 originally proposed putting the feature registry in ml/features/ (Python). We chose TypeScript instead for data-locality reasons: the aggregations are SQL queries against tables owned by services/api. Computing them in Python would cost a network round-trip per recommendation for queries that run in under a millisecond in TS.

Decision

Two-sided design with one source of truth:

services/api/src/profile/registry.ts — source of truth. Each FeatureDefinition declares { name, dtype, ttlSec, description, compute }. compute(userId, sqlite) runs the aggregation SQL directly via the raw better-sqlite3 client.
services/api/src/profile/builder.ts — getProfile(userId) returns the full feature dict, lazily recomputing any entry whose stored row is past its ttlSec. rebuildProfile(userId) force-refreshes everything.
user_profile_features table — KV per (user_id, name) with value (REAL) for numeric and value_text (TEXT) for categorical. Phase A ships only numeric features.
ml/features/profile_schema.py — contract mirror. Names, dtypes, and descriptions only — no compute. A test reads the TS file and asserts the name sets match, catching drift.
POST /score and POST /generate in ml/serving accept an optional profile_features: dict | None. The value is stored on the request object but not yet consumed by the bandit: extending the feature vector changes its dimension D and resets every user's learned state, so that step is deliberately deferred to phase B.
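The registry and lazy-builder behavior described above can be sketched roughly as follows. This is not the code in registry.ts or builder.ts. The in-memory object stands in for the user_profile_features table, and the compute functions return canned numbers instead of running SQL through better-sqlite3. The StoredRow shape, the nowMs parameter, the TTL values, and the descriptions are all assumptions made for illustration.

```typescript
// Hypothetical sketch: field names follow the ADR, everything else is assumed.
type Dtype = "numeric" | "categorical";

interface FeatureDefinition {
  name: string;
  dtype: Dtype;
  ttlSec: number;
  description: string;
  // The ADR's compute(userId, sqlite) runs aggregation SQL; this stand-in
  // takes only the user id and returns a canned number.
  compute: (userId: string) => number;
}

interface StoredRow {
  value: number;
  computedAtMs: number;
}

// Two of the five initial features; TTLs and descriptions are made up.
const registry: FeatureDefinition[] = [
  {
    name: "completion_rate_30d",
    dtype: "numeric",
    ttlSec: 3600,
    description: "Completion rate over the last 30 days",
    compute: () => 0.42,
  },
  {
    name: "preferred_hour",
    dtype: "numeric",
    ttlSec: 86400,
    description: "Hour of day the user most often engages",
    compute: () => 9,
  },
];

// Stand-in for the user_profile_features table, keyed by "userId:name".
const store: Record<string, StoredRow | undefined> = {};

function getProfile(userId: string, nowMs: number): Record<string, number> {
  const profile: Record<string, number> = {};
  for (const def of registry) {
    const key = `${userId}:${def.name}`;
    const row = store[key];
    // Reuse the stored value while it is within its TTL.
    if (row !== undefined && nowMs - row.computedAtMs <= def.ttlSec * 1000) {
      profile[def.name] = row.value;
      continue;
    }
    // Missing or stale: recompute this one feature and store it.
    const value = def.compute(userId);
    store[key] = { value, computedAtMs: nowMs };
    profile[def.name] = value;
  }
  return profile;
}

function rebuildProfile(userId: string, nowMs: number): Record<string, number> {
  // Force-refresh: drop every stored row for the user, then rebuild.
  for (const def of registry) {
    delete store[`${userId}:${def.name}`];
  }
  return getProfile(userId, nowMs);
}
```

In this sketch, a second getProfile call inside the TTL window leaves the stored rows untouched, while a call after completion_rate_30d's hour has elapsed recomputes only that feature. Passing the clock in as nowMs is a choice made here to keep the example deterministic.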

Initial features: completion_rate_30d, dismiss_rate_30d, mean_dwell_ms_30d, preferred_hour, tip_volume_30d.
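The drift check between the registry and its Python mirror comes down to comparing two sets of names. The sketch below shows only that comparison, with both name lists supplied as plain arrays. How the real test reads the TS file is not described here, and diffFeatureNames and NameDrift are hypothetical names.

```typescript
// Hypothetical helper: compares the feature names declared on each side.
interface NameDrift {
  missingInPython: string[];
  missingInTs: string[];
}

function diffFeatureNames(tsNames: string[], pyNames: string[]): NameDrift {
  return {
    missingInPython: tsNames.filter((n) => pyNames.indexOf(n) === -1).sort(),
    missingInTs: pyNames.filter((n) => tsNames.indexOf(n) === -1).sort(),
  };
}

// The five initial features as declared in the registry.
const tsNames = [
  "completion_rate_30d",
  "dismiss_rate_30d",
  "mean_dwell_ms_30d",
  "preferred_hour",
  "tip_volume_30d",
];

// Simulated drift: the mirror was not updated with the last feature.
const pyNames = tsNames.filter((n) => n !== "tip_volume_30d");

const drift = diffFeatureNames(tsNames, pyNames);
// A drift test would fail unless both lists in `drift` are empty.
```

Reporting both directions separately makes the failure message say which side needs the missing line.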

Consequences

Good:

Adding a feature = one entry in registry.ts + one mirror line in profile_schema.py. No DB migration required (KV table).
TTL keeps recommendation latency bounded: every recommend call refreshes at most 5 features, each a single indexed query against an already-warm DB.
Profile data is now visible to ml/serving via the request payload — eval harnesses and the LLM tip generator can use it without a DB round-trip.

Trade-offs:

TS owns compute → ml-side changes that need new features still require a TS PR. Acceptable while the modular monolith holds; if ml/serving becomes the system of record for any feature, it should own its own table.
TTL-based refresh means the profile can lag a change in user behavior by up to ttlSec. Phase B replaces this with event-driven incremental updates that subscribe to signals.tip.feedback.

Phase B

✅ B.1 — Per-user profile view + rebuild action in /admin/users/:id.
✅ B.2 — Event-driven invalidation: features declare invalidatedBy subjects in the registry; profile/subscriber.ts deletes the affected stored rows on publish so the next getProfile call recomputes immediately rather than waiting up to ttlSec. TTL stays as a safety net for clock drift / dropped events.
✅ B.4 — Staleness panel in /admin/data-quality (counts missing + stale per feature across eligible users).
⏳ B.3 — Extend the bandit feature vector to include profile features (deliberate D change with state-migration plan + shadow rollout per ADR-0002). Tracked separately as #99 since it's a multi-step initiative, not an incremental phase.
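The invalidation flow in B.2 can be illustrated with a small in-memory sketch. It is not the code in profile/subscriber.ts. The mapping of features to subjects, the assumption that an event carries a userId, and the onPublish name are invented for the example. Only the invalidatedBy field and the signals.tip.feedback subject come from this ADR.

```typescript
// Hypothetical sketch of B.2: a publish on a subject deletes the stored rows
// of every feature that lists that subject in invalidatedBy.
interface InvalidatableFeature {
  name: string;
  invalidatedBy: string[];
}

// Which feature reacts to which subject is assumed here.
const features: InvalidatableFeature[] = [
  { name: "completion_rate_30d", invalidatedBy: ["signals.tip.feedback"] },
  { name: "dismiss_rate_30d", invalidatedBy: ["signals.tip.feedback"] },
  { name: "preferred_hour", invalidatedBy: [] }, // refreshed by TTL only
];

// Stand-in for stored rows, keyed by "userId:name".
const rows: Record<string, number | undefined> = {
  "u1:completion_rate_30d": 0.42,
  "u1:dismiss_rate_30d": 0.1,
  "u1:preferred_hour": 9,
  "u2:dismiss_rate_30d": 0.3,
};

// Returns the names of the features whose rows were deleted for this user.
function onPublish(subject: string, userId: string): string[] {
  const deleted: string[] = [];
  for (const feature of features) {
    if (feature.invalidatedBy.indexOf(subject) === -1) continue;
    const key = `${userId}:${feature.name}`;
    if (rows[key] !== undefined) {
      delete rows[key];
      deleted.push(feature.name);
    }
  }
  return deleted;
}

const deletedForU1 = onPublish("signals.tip.feedback", "u1");
```

After the publish, u1's two feedback-driven rows are gone, so the next profile read would recompute them, while preferred_hour and every row for u2 stay in place.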

Alternatives considered

Registry in Python (per the original issue text) — rejected: the aggregations live in TS-owned tables; round-tripping per recommend adds latency for no architectural gain.

Compute in the recommender route inline — rejected: features would be recomputed on every recommendation with no cache or staleness semantics.

Use tip_scores.featuresJson as the profile store — rejected: that column is per-tip explainability, not per-user state. Mixing them complicates both reads.
