oO/ml/README.md

# ml/

Python. Owns models, features, training, online scoring.

| Dir | Role | Phase |
|---|---|---|
| `serving/` | FastAPI online scorer (`/score`, `/generate`) + LiteLLM gateway, called by `recommender` | 1–2 |
| `features/` | context assembler (`context.py`): signals → `PromptContext`; Feast adapter later | 2 |
| `pipelines/` | batch feature + training DAGs (Prefect/Airflow) | 4 |
| `registry/` | MLflow-backed model registry integration | 4 |
| `experiments/` | A/B assignment + multi-armed bandit policies | 4 |
| `notebooks/` | research; never imported by production code | — |

## Principles

- Every model has a **model card** in `registry/` describing inputs, offline metrics, fairness checks, and rollout history.
- Online inference must be stateless and < 50ms p99.
- Training reads from the offline feature store; serving reads from the online feature store; definitions are shared (no train/serve skew).
- Shadow deploys before any policy change that affects real users.