feat(serving): replace MLflow run logging with native trace spans

Convert ml-serving from isolated MLflow runs to nested traces using
mlflow.start_span_no_context(). The recommend endpoint now emits a full
span tree: recommend (CHAIN) → build_context (TOOL), agent:* (AGENT) ×N,
llm_orchestrator (LLM). Compute and infer endpoints each emit a single
span.

Supporting changes:
- mlflow-skinny>=3.1.0 added to requirements
- MLflow configured with --serve-artifacts + mlflow-artifacts:/ default
  root for cross-container artifact proxy (spans now persist from
  ml-serving)
- --allowed-hosts extended to include mlflow:5000 (SDK includes port in
  Host)
- science_destiny slider wired through prompts.py and recommend endpoint
- Config page exposes science/destiny slider (0=data-driven,
  100=intuitive)
- Tip page shows rationale inline on tap

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-11 08:26:05 +00:00
parent afacc34969
commit 161e654027
14 changed files with 419 additions and 141 deletions
--- a/services/api/src/signals/agent-scheduler.ts
+++ b/services/api/src/signals/agent-scheduler.ts
@@ -68,14 +68,13 @@ async function runCycle(agentIds: string[]): Promise<void> {
  let failed = 0;

  for (const userId of userIds) {
-    const results = await Promise.allSettled(
-      agentIds.map((agentId) => computeAndStore(userId, agentId)),
-    );
-    for (const r of results) {
-      if (r.status === 'fulfilled') ok++;
-      else {
+    for (const agentId of agentIds) {
+      try {
+        await computeAndStore(userId, agentId);
+        ok++;
+      } catch (err: any) {
        failed++;
-        logger.error({ err: r.reason, userId }, 'agent-scheduler: compute error');
+        logger.error({ err, userId, agentId }, 'agent-scheduler: compute error');
      }
    }
  }
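
Note on the hunk above: it replaces the concurrent Promise.allSettled fan-out with a sequential loop, so at most one compute call is in flight at a time and each failure can be logged with its agentId. A minimal standalone sketch of that pattern follows. The names runSequentially, Compute, and onError are hypothetical and not part of this commit; the injected compute function stands in for computeAndStore, whose real signature is assumed here.

```typescript
// Hypothetical stand-in type for computeAndStore (assumed signature).
type Compute = (userId: string, agentId: string) => Promise<void>;

// Hypothetical error hook; the patch itself calls logger.error instead.
type OnError = (err: unknown, userId: string, agentId: string) => void;

// Runs every (userId, agentId) pair one after another.
// A failing pair is counted and reported, then the loop continues.
async function runSequentially(
  userIds: string[],
  agentIds: string[],
  compute: Compute,
  onError: OnError = () => {},
): Promise<{ ok: number; failed: number }> {
  let ok = 0;
  let failed = 0;

  for (const userId of userIds) {
    for (const agentId of agentIds) {
      try {
        // Awaiting inside the loop means the next call starts only after this one settles.
        await compute(userId, agentId);
        ok++;
      } catch (err: unknown) {
        failed++;
        // agentId is in scope here, unlike in the allSettled result loop.
        onError(err, userId, agentId);
      }
    }
  }

  return { ok, failed };
}
```

The trade-off in this sketch is throughput for predictability: a cycle takes roughly the sum of the individual call durations rather than the slowest call per user, but downstream services never see more than one request from the scheduler at once.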