feat(ml): JetStream durable consumers in ml/serving (#98)

Adds a NATS JetStream consumer to ml/serving so the feature pipeline
can react to events without the API triggering every read.

- nats_consumer.py: durable push consumers for the signals.> and feedback.>
  streams; acks on success, naks on failure so JetStream redelivers, up to
  NATS_MAX_DELIVER attempts; per-consumer health state (last_msg_ts,
  processed, errors)
- main.py: FastAPI lifespan wires start/stop; /health exposes nats state
- requirements.txt: adds nats-py>=2.9.0
- Dockerfile.ml: copy all *.py from ml/serving (was missing prompts.py)
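
The ack/nak flow and per-consumer health state listed above might look roughly like the sketch below. It is not the contents of nats_consumer.py: `ConsumerHealth`, `make_callback`, and the handler signature are hypothetical names, and the message object is only assumed to expose async `ack()` and `nak()` methods, as nats-py JetStream messages do. The NATS_MAX_DELIVER cap is enforced by the server through the consumer configuration, so it does not appear in the callback.

```python
import asyncio
import time
from dataclasses import dataclass
from typing import Awaitable, Callable, Dict, Optional


@dataclass
class ConsumerHealth:
    # Field names mirror the health state named in the commit message.
    last_msg_ts: Optional[float] = None
    processed: int = 0
    errors: int = 0


# Hypothetical registry keyed by consumer name.
consumer_health: Dict[str, ConsumerHealth] = {}


def make_callback(name: str, handler: Callable[[object], Awaitable[None]]):
    """Wrap a handler so that success acks and failure naks the message."""
    health = consumer_health.setdefault(name, ConsumerHealth())

    async def callback(msg) -> None:
        health.last_msg_ts = time.time()
        try:
            await handler(msg)
        except Exception:
            health.errors += 1
            await msg.nak()  # ask JetStream to redeliver
        else:
            health.processed += 1
            await msg.ack()

    return callback
```

Keeping the counters in a plain dataclass per consumer makes it cheap to expose them from a health endpoint.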

Handled subjects:
  signals.task.synced   → writes per-user sync metadata to STATE_DIR
  signals.tip.feedback  → logged for observability (reward via HTTP path)
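
As an illustration of the signals.task.synced handler, the sketch below writes one JSON file per user under a state directory. The payload schema is not given in this commit, so the `user_id` field, the `<user_id>.sync.json` file name, and the `write_sync_metadata` helper are all assumptions made for the example.

```python
import json
import os
import tempfile
import time
from pathlib import Path


def write_sync_metadata(state_dir: Path, payload: bytes) -> Path:
    """Persist the latest sync event for one user as JSON (hypothetical layout)."""
    event = json.loads(payload)
    user_id = str(event["user_id"])  # assumed field name
    # Reject ids that could escape state_dir.
    if not user_id or "/" in user_id or os.sep in user_id or user_id in {".", ".."}:
        raise ValueError(f"invalid user_id: {user_id!r}")
    state_dir.mkdir(parents=True, exist_ok=True)
    target = state_dir / f"{user_id}.sync.json"
    record = {"user_id": user_id, "synced_at": time.time(), "event": event}
    # Write to a temp file in the same directory, then rename over the target
    # so readers never observe a half-written file.
    fd, tmp = tempfile.mkstemp(dir=state_dir, suffix=".tmp")
    with os.fdopen(fd, "w") as fh:
        json.dump(record, fh)
    os.replace(tmp, target)
    return target
```

A handler that raises on a malformed payload would be nak'd and redelivered, so a real implementation may prefer to log and ack messages that can never succeed.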

Config: NATS_URL (empty = disabled), NATS_DURABLE_PREFIX, NATS_MAX_DELIVER
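
One way the three variables could be read is sketched below. `NatsConfig` and `load_config` are hypothetical names, and the defaults for NATS_DURABLE_PREFIX and NATS_MAX_DELIVER are assumptions, since the commit message does not state them. Only the "empty NATS_URL means disabled" rule comes from the line above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class NatsConfig:
    url: str
    durable_prefix: str
    max_deliver: int


def load_config(env: Mapping[str, str] = os.environ) -> Optional[NatsConfig]:
    """Return None when NATS_URL is empty or unset, i.e. the consumer is disabled."""
    url = env.get("NATS_URL", "").strip()
    if not url:
        return None
    return NatsConfig(
        url=url,
        durable_prefix=env.get("NATS_DURABLE_PREFIX", "ml-serving"),  # assumed default
        max_deliver=int(env.get("NATS_MAX_DELIVER", "5")),  # assumed default
    )
```

Returning None for the disabled case lets the start hook return early without opening a connection.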

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-25 10:19:47 +00:00
parent 2d7cf217a9
commit 4652e4b582
4 changed files with 168 additions and 3 deletions

@@ -28,6 +28,7 @@ import math
 import os
 import time
 from collections import deque
+from contextlib import asynccontextmanager
 from pathlib import Path
 from typing import Optional, Deque
@@ -36,9 +37,18 @@ import numpy as np
 from fastapi import FastAPI, HTTPException
 from pydantic import BaseModel
+import nats_consumer
 from prompts import get_prompt
-app = FastAPI(title="oO ML Serving", version="1.0.0")
+@asynccontextmanager
+async def lifespan(app: FastAPI):
+    await nats_consumer.start(STATE_DIR)
+    yield
+    await nats_consumer.stop()
+app = FastAPI(title="oO ML Serving", version="1.0.0", lifespan=lifespan)
 LITELLM_URL = os.getenv("LITELLM_URL", "http://localhost:4000")
 LITELLM_MASTER_KEY = os.getenv("LITELLM_MASTER_KEY", "sk-oo-dev")
@@ -315,7 +325,13 @@ class GenerateResponse(BaseModel):
 @app.get("/health")
 def health():
-    return {"ok": True}
+    return {
+        "ok": True,
+        "nats": {
+            "enabled": bool(nats_consumer.NATS_URL),
+            "consumers": nats_consumer.consumer_health,
+        },
+    }
 _RETRY_SUFFIX = (