alvis/oO

alvis 430804e9a5 feat(ml): prompt registry + per-request variant selection

Replaces the hardcoded "v1" label with a real prompt registry:

  ml/serving/prompts.py       — keyed by version: v1 (baseline),
                                v2-mentor (calm/specific persona),
                                v3-few-shot (v1 persona + curated examples)
  ml/serving/main.py          — POST /generate accepts optional prompt_version,
                                422 on unknown, echoes the version actually used
                                back in the response
  services/api/src/config.ts  — TIP_PROMPT_VERSION: empty / single / comma-list
                                (uniform random per request)
  services/api/src/routes/recommender.ts
                              — pickPromptVersion() drives selection; the
                                response's prompt_version (not a stale TS
                                constant) is what lands in tip_scores so the
                                #92 reward-analytics dashboard shows real
                                per-variant reaction rates

Closes #84.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

2026-04-24 15:44:04 +00:00

experiments

feat: ε-greedy v1 as active policy; dwell-time reward inference; offline sim framework

2026-04-16 07:44:37 +00:00

features

feat: M2 AI tips — LiteLLM gateway, context assembler, end-to-end generation pipeline

2026-04-17 14:09:02 +00:00

notebooks

chore: scaffold oO monorepo with architecture, roadmap, and module stubs

2026-04-13 14:19:56 +00:00

pipelines

chore: scaffold oO monorepo with architecture, roadmap, and module stubs

2026-04-13 14:19:56 +00:00

registry

chore: scaffold oO monorepo with architecture, roadmap, and module stubs

2026-04-13 14:19:56 +00:00

serving

feat(ml): prompt registry + per-request variant selection

2026-04-24 15:44:04 +00:00

README.md

feat(ml): prompt registry + per-request variant selection

2026-04-24 15:44:04 +00:00

README.md

ml/

Python. Owns models, features, training, online scoring.

| Dir | Role | Phase |
|---|---|---|
| `serving/` | FastAPI online scorer (`/score`, `/generate`) + LiteLLM gateway + prompt registry (`prompts.py`), called by `recommender` | 1–2 |
| `features/` | context assembler (`context.py`): signals → `PromptContext`; Feast adapter later | 2 |
| `pipelines/` | batch feature + training DAGs (Prefect/Airflow) | 4 |
| `registry/` | MLflow-backed model registry integration | 4 |
| `experiments/` | A/B assignment + multi-armed bandit policies | 4 |
| `notebooks/` | research; never imported by production code | — |
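
The `experiments/` row mentions multi-armed bandit policies, and the commit on that directory names ε-greedy as the active policy. The following is a minimal, generic sketch of ε-greedy, not the code in `experiments/`: the class name, the arm labels, and the default `epsilon` are assumptions made for illustration.

```python
import random
from typing import Optional, Sequence


class EpsilonGreedy:
    """Generic epsilon-greedy bandit (illustrative; not the repo's policy code)."""

    def __init__(self, arms: Sequence[str], epsilon: float = 0.1,
                 rng: Optional[random.Random] = None) -> None:
        if not arms:
            raise ValueError("at least one arm is required")
        self.arms = list(arms)
        self.epsilon = epsilon
        self.rng = rng or random.Random()
        self.counts = {arm: 0 for arm in self.arms}
        self.means = {arm: 0.0 for arm in self.arms}

    def select(self) -> str:
        # Explore with probability epsilon, otherwise exploit the best mean.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.arms)
        return max(self.arms, key=lambda arm: self.means[arm])

    def update(self, arm: str, reward: float) -> None:
        # Incremental running mean of the observed reward for this arm.
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]
```

Keeping the random source injectable (`rng`) makes the policy reproducible in an offline simulation.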

Principles

Every model has a model card in registry/ describing its inputs, offline metrics, fairness checks, and rollout history.
Online inference must be stateless, with p99 latency under 50 ms.
Training reads from the offline feature store and serving reads from the online feature store; both use the same feature definitions, so there is no train/serve skew.
Shadow-deploy before any policy change that affects real users.
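
One way to read the shared-definitions principle above is that a feature's computation is written once and reused by both the training and the serving path. The sketch below illustrates that idea only; `FeatureDef`, the `dwell_ratio` feature, and the raw field names are hypothetical and do not come from `features/`.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Mapping


@dataclass(frozen=True)
class FeatureDef:
    """A named feature and the single function that computes it."""
    name: str
    compute: Callable[[Mapping[str, float]], float]


# Hypothetical feature, defined once and shared by both paths.
FEATURES = (
    FeatureDef(
        "dwell_ratio",
        lambda raw: raw["dwell_seconds"] / max(raw["session_seconds"], 1.0),
    ),
)


def featurize(raw: Mapping[str, float]) -> Dict[str, float]:
    return {feature.name: feature.compute(raw) for feature in FEATURES}


def build_training_rows(offline_rows: Iterable[Mapping[str, float]]) -> List[Dict[str, float]]:
    # Training path: rows read from the offline store.
    return [featurize(row) for row in offline_rows]


def build_serving_vector(online_row: Mapping[str, float]) -> Dict[str, float]:
    # Serving path: one row read from the online store, same definitions.
    return featurize(online_row)
```

Because both paths call `featurize`, identical raw values produce identical feature values in training and in serving.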

Prompt registry

`serving/prompts.py` keys tip-generation prompts by a stable version string. Adding a variant means adding one registry entry; callers do not change. Selection precedence, highest first: the `prompt_version` field in the `POST /generate` body, then the `DEFAULT_PROMPT_VERSION` environment variable, then `"v1"`. An unknown version is rejected with a 422. The TypeScript recommender drives selection through `TIP_PROMPT_VERSION` (a single value, or a comma-separated list sampled uniformly at random per request). The response echoes the version actually used, and that value is persisted to `tip_scores.prompt_version` so the admin reward-analytics dashboard can bucket reactions per variant.
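
The precedence described above can be sketched roughly as follows. This is an illustration under assumptions, not the contents of `serving/prompts.py`: the version keys are taken from the commit message, but the template strings, `resolve_prompt_version`, `UnknownPromptVersion`, and `pick_from_rotation` are hypothetical names. `pick_from_rotation` is a Python rendering of the caller-side choice that the TypeScript recommender makes, and rejecting an unknown environment default is also an assumption.

```python
import os
import random
from typing import Mapping, Optional

# Version keys from the commit message; template text is placeholder only.
PROMPTS: dict[str, str] = {
    "v1": "<baseline prompt>",
    "v2-mentor": "<calm/specific persona prompt>",
    "v3-few-shot": "<v1 persona + curated examples>",
}

FALLBACK_VERSION = "v1"


class UnknownPromptVersion(ValueError):
    """Version not in the registry; an HTTP layer could map this to 422."""


def resolve_prompt_version(requested: Optional[str],
                           env: Mapping[str, str] = os.environ) -> str:
    # Precedence: request body -> DEFAULT_PROMPT_VERSION -> "v1".
    version = requested or env.get("DEFAULT_PROMPT_VERSION") or FALLBACK_VERSION
    if version not in PROMPTS:
        raise UnknownPromptVersion(version)
    return version


def pick_from_rotation(raw: str, rng: random.Random) -> Optional[str]:
    # Empty setting -> None (send no prompt_version); otherwise a uniform
    # random pick from the comma-separated candidates.
    candidates = [item.strip() for item in raw.split(",") if item.strip()]
    if not candidates:
        return None
    return rng.choice(candidates)
```

A caller would pass the result of `pick_from_rotation` as the request's `prompt_version`, then record whatever version the response reports rather than the one it asked for.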
