# MLOps Overview

## Design principles

Taskpile's ML subsystem follows three core MLOps practices applied to a small-scale Ollama setup:

### 1. Decouple inference from serving

The write path (`POST /tasks`, `PATCH /tasks/:id`) never calls Ollama. It only writes a pending row to `task_features` and wakes a `tokio::sync::Notify`. The read path (`GET /graph`) is a pure SQL query: no model calls, no blocking.

Result: sub-millisecond graph reads regardless of Ollama availability.
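A minimal sketch of this pattern, in Python standing in for the Rust/Tokio implementation (`threading.Event` plays the role of `tokio::sync::Notify`; all names here are illustrative, not the actual backend API):

```python
import threading, time

class FeatureStore:
    """Toy stand-in for the task_features table."""
    def __init__(self):
        self.lock = threading.Lock()
        self.rows = {}                  # task_id -> status
        self.wake = threading.Event()   # stand-in for tokio::sync::Notify

    def mark_pending(self, task_id):
        # Write path: record a pending row and wake the worker.
        # No model call here, so the HTTP handler returns immediately.
        with self.lock:
            self.rows[task_id] = "pending"
        self.wake.set()

    def drain(self):
        # Worker: pick up pending rows and run slow inference
        # outside the request path (Ollama in the real system).
        with self.lock:
            pending = [t for t, s in self.rows.items() if s == "pending"]
        for task_id in pending:
            with self.lock:
                self.rows[task_id] = "ready"

def worker(store, stop):
    while not stop.is_set():
        store.wake.wait(timeout=0.1)
        store.wake.clear()
        store.drain()

store = FeatureStore()
stop = threading.Event()
t = threading.Thread(target=worker, args=(store, stop))
t.start()
store.mark_pending(1)   # returns immediately; inference happens in background
time.sleep(0.3)
stop.set(); t.join()
print(store.rows[1])    # -> ready
```

Reads against `rows` (the stand-in for `GET /graph`) never wait on the worker, which is what keeps them fast even when the model is slow or down.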

### 2. Versioned feature store

Every feature row records which model produced it:

```
desc_model     = "qwen2.5:1.5b"
embed_model    = "nomic-embed-text"
prompt_version = "v1"
content_hash   = sha256("v1" + title)
```

Changing any of these in `MLConfig` causes `next_stale()` to pick up the affected rows on the next worker tick: automatic backfill, no migration scripts.
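The hash and staleness check can be sketched as follows (Python; `content_hash` and `is_stale` are illustrative names, not the actual Rust API):

```python
import hashlib

def content_hash(prompt_version, title):
    # The hash covers everything the description model sees, so a change
    # to either the prompt template version or the title yields a new hash.
    return hashlib.sha256((prompt_version + title).encode()).hexdigest()

# A stored feature row records the models/prompt that produced it.
row = {
    "desc_model": "qwen2.5:1.5b",
    "embed_model": "nomic-embed-text",
    "prompt_version": "v1",
    "content_hash": content_hash("v1", "Buy milk"),
}

def is_stale(row, config):
    # next_stale()-style check: any mismatch with the current MLConfig
    # means the row must be recomputed (automatic backfill).
    return (row["desc_model"] != config["desc_model"]
            or row["embed_model"] != config["embed_model"]
            or row["prompt_version"] != config["prompt_version"])

config = {"desc_model": "qwen2.5:1.5b",
          "embed_model": "nomic-embed-text",
          "prompt_version": "v2"}   # prompt bumped in config
print(is_stale(row, config))       # -> True: row will be re-processed
```

Because staleness is computed from data already stored on each row, no separate migration bookkeeping is needed when a model is swapped.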

### 3. Idempotent pipelines

`mark_pending` uses `INSERT … ON CONFLICT DO UPDATE` with a content-hash guard: re-editing a task title without changing its content does not re-queue it. The hash is derived from `prompt_version + title`, so it changes when either changes.
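The guard can be sketched with a SQLite upsert (schema simplified to three columns; the real `task_features` table has more):

```python
import sqlite3, hashlib

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE task_features (
    task_id      INTEGER PRIMARY KEY,
    content_hash TEXT NOT NULL,
    status       TEXT NOT NULL DEFAULT 'pending')""")

def mark_pending(task_id, prompt_version, title):
    h = hashlib.sha256((prompt_version + title).encode()).hexdigest()
    # Upsert: only flip back to 'pending' when the content hash changed,
    # so a no-op edit does not re-queue the task.
    db.execute("""
        INSERT INTO task_features (task_id, content_hash, status)
        VALUES (?, ?, 'pending')
        ON CONFLICT(task_id) DO UPDATE
           SET content_hash = excluded.content_hash, status = 'pending'
         WHERE task_features.content_hash != excluded.content_hash""",
        (task_id, h))

mark_pending(1, "v1", "Buy milk")
db.execute("UPDATE task_features SET status='ready' WHERE task_id=1")  # worker done
mark_pending(1, "v1", "Buy milk")       # same content: guard fires, no-op
status_after_noop = db.execute(
    "SELECT status FROM task_features WHERE task_id=1").fetchone()[0]
print(status_after_noop)                # -> ready (not re-queued)
mark_pending(1, "v1", "Buy oat milk")   # content changed: re-queued
print(db.execute(
    "SELECT status FROM task_features WHERE task_id=1").fetchone()[0])  # -> pending
```

The `WHERE` clause on the `DO UPDATE` arm is what makes the call idempotent: identical content leaves the row untouched, so the worker is not woken for work it has already done.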

## Observability

```sh
curl -u admin:VQ7q1CzFe3Y --noproxy '*' \
  http://localhost:3001/api/ml/status | jq
```

```json
{
  "desc_model": "qwen2.5:1.5b",
  "embed_model": "nomic-embed-text",
  "prompt_version": "v1",
  "min_similarity": 0.8,
  "pending": 3,
  "ready": 14,
  "failed": 0,
  "edges": 22,
  "last_error": null
}
```

The graph endpoint also returns `pending_count`, so the frontend can display an "analyzing N tasks…" indicator in the legend without a second API call.

## Failure modes

| Scenario | Behavior |
| --- | --- |
| Ollama unreachable | Worker marks rows `failed` with current model IDs, sleeps 5 s, backs off 30 s before retry. Graph returns nodes + 0 edges + `pending_count > 0`. |
| Model changed in config | `next_stale()` picks up all `ready` rows where stored model IDs differ. They're re-processed in the background. Old edges remain until recomputed. |
| Task deleted | `ON DELETE CASCADE` on `task_features` and `task_edges` cleans up immediately. No orphaned embeddings. |
| Title unchanged on PATCH | `mark_pending` detects matching content hash + `ready` status → no-op. Worker not woken. |
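The retry behavior above can be sketched as a row-selection policy for each worker tick (Python; the ordering and `next_row` name are illustrative, one plausible reading of `next_stale()` plus the 30 s retry rule, not the actual backend code):

```python
RETRY_AFTER = 30.0  # seconds before a failed row becomes eligible again

def next_row(rows, config, now):
    """Pick the next feature row to process: pending rows first,
    then failed rows past the retry window, then ready rows whose
    stored versions no longer match the current config (backfill)."""
    for r in rows:
        if r["status"] == "pending":
            return r
    for r in rows:
        if r["status"] == "failed" and now - r["failed_at"] >= RETRY_AFTER:
            return r
    for r in rows:
        if r["status"] == "ready" and r["prompt_version"] != config["prompt_version"]:
            return r
    return None  # nothing to do; worker sleeps until notified

rows = [
    {"status": "ready",  "prompt_version": "v1", "failed_at": 0.0},
    {"status": "failed", "prompt_version": "v1", "failed_at": 100.0},
]
config = {"prompt_version": "v2"}
print(next_row(rows, config, now=140.0)["status"])  # -> failed (past retry window)
print(next_row(rows, config, now=110.0)["status"])  # -> ready  (stale, backfilled)
```

Whatever the exact ordering, the key property is that every selection rule reads only persisted row state, so a worker restart loses no work.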