ml/serving JetStream consumer for signals.> + feedback.> #98

Closed
opened 2026-04-18 07:54:51 +00:00 by alvis · 0 comments
Owner

Context. PR #21 wired the API to JetStream as a producer: every in-process publish on signals.> and feedback.> is mirrored to durable streams. PR #22 added a 15-minute scheduler so signals.task.synced lands without a user request. Together these make a durable cross-process signal feed possible — but nothing reads it yet. ml/serving still gets context only when the API hands it features over HTTP.

Goal. A ml/serving JetStream consumer that drives the feature pipeline and policy training without needing the API to be the originator of every read.

Scope.

  • Async NATS client + durable consumers (feature-pipeline-signals, feature-pipeline-feedback).
  • Map JSON envelopes → existing pydantic feature payloads (later: protobuf once #54 lands).
  • Background task in the FastAPI app lifespan; survives broker reconnects.
  • Per-message ack semantics: ack on successful feature update, redeliver on transient failure, dead-letter after N retries.
  • Health surface in /admin/health (already exists for the API): consumer lag + last-msg time per stream.
  • Config: NATS_URL (already in .env.local), NATS_DURABLE_PREFIX (default feature-pipeline), NATS_MAX_DELIVER (default 5).

Non-goals.

  • Replacing the on-demand recommend path. The HTTP scoring contract stays.
  • Protobuf migration — tracked separately in #54.

Acceptance.

  • Compose full profile: bring up api + ml-serving + nats; emit signals.task.synced from the API; verify it surfaces in ml-serving logs and bumps the feature store row for that user.
  • ml-serving survives a NATS restart with no message loss (durable consumer replays from the last ack).
  • /admin/health shows non-zero last-msg time once the consumer has caught up.

References. ADR-0010 (bridge model), ADR-0005 (event schemas), services/api/src/events/nats.ts, services/api/src/signals/scheduler.ts.

**Context.** PR #21 wired the API to JetStream as a producer: every in-process publish on `signals.>` and `feedback.>` is mirrored to durable streams. PR #22 added a 15-minute scheduler so `signals.task.synced` lands without a user request. Together these make a durable cross-process signal feed possible — but nothing reads it yet. `ml/serving` still gets context only when the API hands it features over HTTP. **Goal.** A `ml/serving` JetStream consumer that drives the feature pipeline and policy training without needing the API to be the originator of every read. **Scope.** - Async NATS client + durable consumers (`feature-pipeline-signals`, `feature-pipeline-feedback`). - Map JSON envelopes → existing pydantic feature payloads (later: protobuf once #54 lands). - Background task in the FastAPI app lifespan; survives broker reconnects. - Per-message ack semantics: ack on successful feature update, redeliver on transient failure, dead-letter after N retries. - Health surface in `/admin/health` (already exists for the API): consumer lag + last-msg time per stream. - Config: `NATS_URL` (already in `.env.local`), `NATS_DURABLE_PREFIX` (default `feature-pipeline`), `NATS_MAX_DELIVER` (default 5). **Non-goals.** - Replacing the on-demand recommend path. The HTTP scoring contract stays. - Protobuf migration — tracked separately in #54. **Acceptance.** - Compose `full` profile: bring up api + ml-serving + nats; emit `signals.task.synced` from the API; verify it surfaces in `ml-serving` logs and bumps the feature store row for that user. - `ml-serving` survives a NATS restart with no message loss (durable consumer replays from the last ack). - `/admin/health` shows non-zero last-msg time once the consumer has caught up. **References.** ADR-0010 (bridge model), ADR-0005 (event schemas), `services/api/src/events/nats.ts`, `services/api/src/signals/scheduler.ts`.
alvis closed this issue 2026-04-25 17:09:11 +00:00
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: alvis/oO#98