# MLOps Overview

## Design principles

Taskpile's ML subsystem follows three core MLOps practices, applied to a small-scale Ollama setup:
### 1. Decouple inference from serving

The write path (`POST /tasks`, `PATCH /tasks/:id`) never calls Ollama. It only writes a `pending` row to `task_features` and wakes a `tokio::sync::Notify`. The read path (`GET /graph`) is a pure SQL query — no model calls, no blocking.

**Result:** sub-millisecond graph reads regardless of Ollama availability.
### 2. Versioned feature store

Every feature row records which model produced it:
```
desc_model     = "qwen2.5:1.5b"
embed_model    = "nomic-embed-text"
prompt_version = "v1"
content_hash   = sha256("v1" + title)
```
Changing **any** of these in `MLConfig` causes `next_stale()` to pick up those rows on the next worker tick — automatic backfill, no migration scripts.
### 3. Idempotent pipelines

`mark_pending` uses `INSERT … ON CONFLICT DO UPDATE` with a content-hash guard: re-saving a task with an unchanged title does **not** re-queue it. The hash is derived from `prompt_version + title`, so it changes when either changes.
## Observability
```bash
curl -u admin:VQ7q1CzFe3Y --noproxy '*' \
  http://localhost:3001/api/ml/status | jq
```

```json
{
  "desc_model": "qwen2.5:1.5b",
  "embed_model": "nomic-embed-text",
  "prompt_version": "v1",
  "min_similarity": 0.8,
  "pending": 3,
  "ready": 14,
  "failed": 0,
  "edges": 22,
  "last_error": null
}
```
The graph endpoint also returns `pending_count`, so the frontend can display an "analyzing N tasks…" indicator in the legend without a second API call.

## Failure modes
| Scenario | Behavior |
|----------|----------|
| Ollama unreachable | Worker marks rows `failed` with current model IDs, sleeps 5 s, backs off 30 s before retry. Graph returns nodes + 0 edges + `pending_count > 0`. |
| Model changed in config | `next_stale()` picks up all `ready` rows where stored model IDs differ. They are re-processed in the background. Old edges remain until recomputed. |
| Task deleted | `ON DELETE CASCADE` on `task_features` and `task_edges` cleans up immediately. No orphaned embeddings. |
| Title unchanged on PATCH | `mark_pending` detects matching content hash + `ready` status → no-op. Worker not woken. |
|