Files
oO/ml
alvis 7d4c29e137 feat(profile): user-profile feature registry + builder (phase A)
Centralizes user-level features (completion_rate_30d, dismiss_rate_30d,
mean_dwell_ms_30d, preferred_hour, tip_volume_30d) in a TS registry that
owns both definition and SQL aggregation, since the data lives in the
TS-owned SQLite tables (tip_views/tip_feedback). Lazy TTL refresh keeps
recommend latency bounded; values persist in user_profile_features (KV).

ml/serving accepts profile_features on /score + /generate but does not
yet consume them — extending the bandit feature vector changes D and
resets every user's learned state, so that's a deliberate phase-B step.

Includes ml/features/profile_schema.py as a contract mirror with a sync
test that diffs name sets against registry.ts.

ADR-0011 records the data-locality reasoning (registry in TS, not Python
as the issue originally suggested).

Phase B (deferred): event-driven incremental updates, bandit consumption
with state migration, admin per-user profile page, staleness alerts.

Refs #81.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-25 00:22:22 +00:00
..

ml/

Python. Owns models, features, training, online scoring.

Dir Role Phase
serving/ FastAPI online scorer (/score, /generate) + LiteLLM gateway + prompt registry (prompts.py), called by recommender 12
features/ context assembler (context.py): signals → PromptContext; Feast adapter later 2
pipelines/ batch feature + training DAGs (Prefect/Airflow) 4
registry/ MLflow-backed model registry integration 4
experiments/ A/B assignment + multi-armed bandit policies 4
notebooks/ research; never imported by production code

Principles

  • Every model has a model card in registry/ describing inputs, offline metrics, fairness checks, and rollout history.
  • Online inference must be stateless and < 50ms p99.
  • Training reads from the offline feature store; serving reads from the online feature store; definitions are shared (no train/serve skew).
  • Shadow deploys before any policy change that affects real users.

Profile-feature contract

User-level features (completion rate, preferred hour, tip volume…) are computed by the TypeScript recommender and shipped to ml/serving on every /score and /generate call as profile_features: dict | None. The Python mirror in features/profile_schema.py documents the available names + dtypes — keep it in sync with services/api/src/profile/registry.ts (a CI-style test asserts the name sets match). See ADR-0011.

Prompt registry

serving/prompts.py keys tip-generation prompts by stable version string. Adding a new variant means adding an entry — no caller changes. Selection precedence: POST /generate body's prompt_version field → env DEFAULT_PROMPT_VERSION"v1". The TypeScript recommender drives selection via TIP_PROMPT_VERSION (single value or comma-separated rotation); the version actually used flows back in the response and is persisted to tip_scores.prompt_version so the admin reward-analytics dashboard can bucket reactions per variant.