taskpile/docs/mlops/pipeline.md


Feature Pipeline

Worker lifecycle

startup
  │
  ▼
notify.notify_one()   ← wake immediately to drain any pending rows
  │
  ▼
loop:
  drain loop:
    next_stale() ──► None → break
         │
         ▼
    generate_description(title)
         │ error → set_failed(current model IDs), sleep 5s, break
         ▼
    get_embedding(description)
         │ error → set_failed(current model IDs), sleep 5s, break
         ▼
    UPDATE task_features SET status='ready', embedding=blob, …
         │
         ▼
    recompute_for_task(task_id)
      DELETE task_edges WHERE source=id OR target=id
      load all other 'ready' embeddings
      INSERT pairs with cosine_sim ≥ min_similarity
  
  tokio::select!
    notified()         ← new task created/updated
    sleep(60s)         ← retry failed rows
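The recompute_for_task step above keeps only pairs whose cosine similarity clears min_similarity. A minimal sketch of that comparison (function names and signatures are assumptions for illustration, not the actual ml module API):

```rust
/// Cosine similarity between two embedding vectors.
/// Returns 0.0 for mismatched lengths or zero-norm inputs.
fn cosine_sim(a: &[f32], b: &[f32]) -> f32 {
    if a.len() != b.len() {
        return 0.0;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 0.0;
    }
    dot / (norm_a * norm_b)
}

/// Pair the updated task's embedding against every other 'ready'
/// embedding, keeping only pairs at or above the threshold — the rows
/// that would be inserted into task_edges.
fn candidate_edges(
    task_id: i64,
    embedding: &[f32],
    others: &[(i64, Vec<f32>)],
    min_similarity: f32,
) -> Vec<(i64, i64, f32)> {
    others
        .iter()
        .map(|(id, emb)| (task_id, *id, cosine_sim(embedding, emb)))
        .filter(|(_, _, sim)| *sim >= min_similarity)
        .collect()
}
```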

Content hash and cache invalidation

content_hash = sha256( prompt_version || "\0" || title )
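A sketch of the keying scheme (the real implementation hashes these bytes with SHA-256; std's DefaultHasher stands in here so the example compiles without an external crate — only the key construction matters):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Key the cached description by prompt version and title, separated by
/// a NUL byte so ("v1", "2a") and ("v12", "a") cannot collide.
/// Stand-in hasher: production code would use SHA-256 over the same bytes.
fn content_hash(prompt_version: &str, title: &str) -> u64 {
    let mut h = DefaultHasher::new();
    prompt_version.as_bytes().hash(&mut h);
    0u8.hash(&mut h); // the "\0" separator
    title.as_bytes().hash(&mut h);
    h.finish()
}
```

Because the prompt version is part of the key, editing the prompt template invalidates every cached description without any explicit cache flush.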

A task_features row is considered stale when:

  • status = 'pending' — explicitly queued
  • status = 'failed' and updated_at < now − 30s — retry after backoff
  • status = 'ready' and desc_model ≠ config.desc_model — model changed
  • status = 'ready' and embed_model ≠ config.embed_model
  • status = 'ready' and prompt_version ≠ config.prompt_version

A stale-but-ready row serves its existing data until the worker overwrites it, so the graph never shows a "hole" during recomputation.
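The staleness rules above can be sketched as a single predicate (struct and field names are assumptions for illustration, not the actual schema):

```rust
#[derive(PartialEq)]
enum Status { Pending, Failed, Ready }

struct FeatureRow {
    status: Status,
    updated_at: u64, // Unix seconds
    desc_model: String,
    embed_model: String,
    prompt_version: String,
}

struct MlConfig {
    desc_model: String,
    embed_model: String,
    prompt_version: String,
}

/// Mirrors the bullet list: pending rows, failed rows past the 30 s
/// backoff, and ready rows whose stored config drifted from the
/// current MLConfig.
fn is_stale(row: &FeatureRow, cfg: &MlConfig, now: u64) -> bool {
    match row.status {
        Status::Pending => true,
        Status::Failed => now.saturating_sub(row.updated_at) >= 30,
        Status::Ready => {
            row.desc_model != cfg.desc_model
                || row.embed_model != cfg.embed_model
                || row.prompt_version != cfg.prompt_version
        }
    }
}
```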

Changing models

Edit backend/src/ml/config.rs:

pub fn default() -> Self {
    Self {
        desc_model: "qwen2.5:7b".to_string(),      // upgraded
        embed_model: "nomic-embed-text".to_string(),
        prompt_version: "v2".to_string(),            // bump when prompt changes
        min_similarity: 0.75,                        // wider edges
        ..
    }
}

On the next startup (or notify_one()), next_stale() returns every row whose stored config no longer matches. The worker re-runs them in oldest-first order.

Prompt versioning

The prompt template is selected by matching on prompt_version in ml/ollama.rs::render_prompt. Old versions are never deleted: bumping the version adds a new match arm rather than overwriting the old one, so descriptions produced under v1 can always be reproduced.

fn render_prompt(prompt_version: &str, task_title: &str) -> String {
    match prompt_version {
        "v2" => format!("…new prompt…{task_title}…"),
        _ => format!("…v1 prompt…{task_title}…"),  // "v1" and legacy
    }
}

Embedding storage

Embeddings are stored as raw little-endian f32 bytes in a BLOB column:

[f32 LE] [f32 LE] [f32 LE] … (768 floats for nomic-embed-text = 3072 bytes)

encode_embedding / decode_embedding in ml/features.rs handle the conversion. The embed_dim column records the dimension so readers don't have to hard-code the model's output size.
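A sketch of that round-trip using only std byte helpers (the signatures are assumed; the actual helpers live in ml/features.rs):

```rust
/// Pack f32 values as contiguous little-endian bytes for the BLOB column.
fn encode_embedding(v: &[f32]) -> Vec<u8> {
    v.iter().flat_map(|f| f.to_le_bytes()).collect()
}

/// Decode the BLOB back into f32 values; returns None if the byte
/// length is not a multiple of 4 (a corrupt or truncated blob).
fn decode_embedding(bytes: &[u8]) -> Option<Vec<f32>> {
    if bytes.len() % 4 != 0 {
        return None;
    }
    Some(
        bytes
            .chunks_exact(4)
            .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect(),
    )
}
```

Because the encoding is raw f32 bytes, a 768-dimension vector always occupies exactly 768 × 4 = 3072 bytes, which makes blob sizes easy to sanity-check against embed_dim.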