feat(features): per-feature freshness spec — JIT vs batched (#61)

Each ml/features/*.py now declares freshness, source, and fallback per feature.
ProfileFeature gains ttl_sec (mirrored from registry.ts), freshness="batched", source, and fallback.
context.py adds ContextFeatureSpec + CONTEXT_FEATURES for the three JIT features (hour_of_day, day_of_week, tasks).
CI test parses ttlSec from registry.ts to catch drift.
ml/README updated with split JIT/batched feature contract.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
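
The ml/features/*.py changes are not shown on this page, so here is a minimal sketch of what the per-feature spec described above might look like. The names ProfileFeature, ContextFeatureSpec, CONTEXT_FEATURES, ttl_sec, freshness, source, and fallback come from the commit message. The dtype field, the source strings, and the fallback values are assumptions made for illustration and may not match the real modules.

```python
from dataclasses import dataclass
from typing import Any, Literal

Freshness = Literal["jit", "batched"]


@dataclass(frozen=True)
class ProfileFeature:
    """Batched, user-level feature computed upstream and shipped per request."""
    name: str
    dtype: str               # assumed field; the README mentions dtypes
    ttl_sec: int             # mirrored from ttlSec in registry.ts
    source: str
    fallback: Any            # value used when the feature is null
    freshness: Freshness = "batched"


@dataclass(frozen=True)
class ContextFeatureSpec:
    """Request-time (JIT) feature; never cached, so it carries no TTL."""
    name: str
    source: str
    fallback: Any
    freshness: Freshness = "jit"


# Hypothetical declarations for the three JIT features named in the commit.
CONTEXT_FEATURES = (
    ContextFeatureSpec("hour_of_day", source="system clock", fallback=None),
    ContextFeatureSpec("day_of_week", source="system clock", fallback=None),
    ContextFeatureSpec("tasks", source="live Todoist feed", fallback=()),
)
```

Keeping the two spec types separate makes the contract visible in the type: a batched feature must state a TTL, while a JIT feature has no TTL field to get wrong.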
22
ml/README.md
@@ -5,7 +5,7 @@ Python. Owns models, features, training, online scoring.
 | Dir | Role | Phase |
 |---|---|---|
 | `serving/` | FastAPI online scorer (`/score`, `/generate`) + LiteLLM gateway + prompt registry (`prompts.py`) + JetStream consumers for `signals.>` / `feedback.>`, called by `recommender` | 1–2 |
-| `features/` | context assembler (`context.py`): signals → `PromptContext`; Feast adapter later | 2 |
+| `features/` | context assembler (`context.py`): signals → `PromptContext`; profile-feature schema mirror (`profile_schema.py`); Feast adapter later | 2 |
 | `pipelines/` | batch feature + training DAGs (Prefect/Airflow) | 4 |
 | `registry/` | MLflow-backed model registry integration | 4 |
 | `experiments/` | A/B assignment + multi-armed bandit policies | 4 |
@@ -18,14 +18,24 @@ Python. Owns models, features, training, online scoring.
 - Training reads from the offline feature store; serving reads from the online feature store; definitions are shared (no train/serve skew).
 - Shadow deploys before any policy change that affects real users.

-## Profile-feature contract
+## Feature contract
+
+### Profile features (batched)

 User-level features (completion rate, preferred hour, tip volume…) are computed
-by the TypeScript recommender and shipped to ml/serving on every `/score` and
+by the TypeScript recommender and shipped to `ml/serving` on every `/score` and
 `/generate` call as `profile_features: dict | None`. The Python mirror in
-`features/profile_schema.py` documents the available names + dtypes — keep it
-in sync with `services/api/src/profile/registry.ts` (a CI-style test asserts
-the name sets match). See ADR-0011.
+`features/profile_schema.py` documents each feature's name, dtype, TTL, source,
+and null fallback — keep it in sync with `services/api/src/profile/registry.ts`
+(a CI-style test asserts names and `ttlSec` values match). See ADR-0011.
+
+### Context features (JIT)
+
+Request-time signals assembled by `features/context.py` (`hour_of_day`,
+`day_of_week`, task list). These are never cached — they are derived from the
+system clock and the live Todoist feed at the moment of the score call.
+`CONTEXT_FEATURES` in `context.py` declares freshness, source, and fallback for
+each field (issue #61).

 ## Prompt registry

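
The README text above says a CI-style test asserts that names and `ttlSec` values match between `registry.ts` and the Python mirror. The test itself is not part of this diff, so the following is only a sketch of how such a drift check could work. It assumes each registry entry is an object literal containing `name: "..."` followed by `ttlSec: <int>`; the helper names `parse_ttls` and `find_drift`, the sample feature names, and the TTL values are hypothetical.

```python
import re

# Assumed entry shape: `name: "x"` and `ttlSec: 123` inside one object literal.
_ENTRY_RE = re.compile(r'name:\s*["\'](\w+)["\'][^}]*?ttlSec:\s*(\d[\d_]*)')


def parse_ttls(ts_source: str) -> dict[str, int]:
    """Extract {feature name: ttlSec} from TypeScript registry source text."""
    return {
        name: int(ttl.replace("_", ""))  # TS allows 86_400-style separators
        for name, ttl in _ENTRY_RE.findall(ts_source)
    }


def find_drift(ts_ttls: dict[str, int], py_ttls: dict[str, int]) -> list[str]:
    """Return one human-readable message per mismatch; empty list means in sync."""
    problems = []
    for name in sorted(ts_ttls.keys() | py_ttls.keys()):
        if name not in py_ttls:
            problems.append(f"{name}: missing from Python mirror")
        elif name not in ts_ttls:
            problems.append(f"{name}: missing from registry.ts")
        elif ts_ttls[name] != py_ttls[name]:
            problems.append(
                f"{name}: ttlSec={ts_ttls[name]} but ttl_sec={py_ttls[name]}"
            )
    return problems


# Stand-in for the contents of registry.ts (hypothetical entries).
SAMPLE_TS = """
export const REGISTRY = [
  { name: "completion_rate", dtype: "float", ttlSec: 3600 },
  { name: "preferred_hour", dtype: "int", ttlSec: 86_400 },
];
"""
```

In a real test the TypeScript text would be read from `services/api/src/profile/registry.ts` and the Python side built from the schema in `features/profile_schema.py`; the test would then assert that `find_drift(...)` returns an empty list. A regex is brittle against formatting changes in the registry, which is arguably acceptable here because a parse failure shows up as missing names rather than passing silently.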