d7a2423940
fix(infra): mlflow image tag + python-based healthchecks for ml-serving/mlflow
...
- Corrects mlflow image tag (2.14.3 → v2.14.3); the former tag does not exist
on ghcr.io/mlflow/mlflow and caused a manifest-unknown error on pull.
- Replaces wget/curl healthchecks with inline python urllib calls — the
python:3.12-slim (ml-serving) and ghcr.io/mlflow/mlflow images ship
neither wget nor curl, so both containers reported unhealthy despite
/health returning 200.
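
A minimal sketch of a urllib-based check like the one described above. The function name, URL, port, and timeout are illustrative assumptions, not taken from the compose file:

```python
import urllib.error
import urllib.request


def is_healthy(url: str, timeout: float = 3.0) -> bool:
    """Return True only if a GET to `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError, ValueError):
        # URLError covers HTTPError (4xx/5xx) and connection failures;
        # OSError covers socket timeouts; ValueError covers malformed URLs.
        return False
```

In a compose healthcheck the same idea usually collapses to a one-liner such as `python -c "import urllib.request; urllib.request.urlopen('http://localhost:8000/health', timeout=3)"` (port 8000 is hypothetical): urlopen raises on connection failures and on 4xx/5xx responses, so the process exits non-zero when the service is down.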
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-18 15:04:18 +00:00
2a7380933c
feat: NATS JetStream + Todoist background sync (#21, #22)
...
Issue 21 — event infrastructure:
- NormalizedEvent<T> + payload types in packages/shared-types/src/events/
- Bus.onPublish() hook for side-effect bridges
- NATS JetStream adapter (services/api/src/events/nats.ts): connects when
NATS_URL is set, creates signals.> and feedback.> streams, bridges all
in-process bus publishes to JetStream — no-ops gracefully when NATS is absent
- NATS service added to docker-compose (profile: events|full, ports 4222/8222)
Issue 22 — Todoist background sync:
- services/api/src/signals/scheduler.ts: queries all active-token users every
15 min (TODOIST_SYNC_INTERVAL_MS) and fans out via todoistSource.fetchSignals(),
which emits signals.task.synced; on-demand fetch remains as a freshness fallback
- NATS_URL + TODOIST_SYNC_INTERVAL_MS added to config
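
The publish hook and the "no-op when NATS is absent" behaviour from Issue 21 can be illustrated roughly as follows. This is a language-neutral sketch in Python, not the TypeScript adapter itself; `Bus`, `attach_bridge`, and the `send` callback are hypothetical stand-ins:

```python
from typing import Any, Callable, List, Optional

Hook = Callable[[str, Any], None]


class Bus:
    """Tiny in-process bus with a publish hook for side-effect bridges."""

    def __init__(self) -> None:
        self._hooks: List[Hook] = []

    def on_publish(self, hook: Hook) -> None:
        self._hooks.append(hook)

    def publish(self, subject: str, payload: Any) -> None:
        # Every registered hook sees every publish.
        for hook in self._hooks:
            hook(subject, payload)


def attach_bridge(bus: Bus, nats_url: Optional[str], send: Hook) -> bool:
    """Forward bus publishes to `send` only when a NATS URL is configured."""
    if not nats_url:
        return False  # no NATS configured: leave the bus untouched
    bus.on_publish(send)
    return True
```

With no URL the bus keeps working in-process and nothing is forwarded; with a URL every publish is mirrored to the external sender.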
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-18 01:18:51 +00:00
46dee7377e
fix: api healthcheck + port mapping corrected to 3078
...
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 14:17:52 +00:00
ffdf70733f
feat: M2 AI tips — LiteLLM gateway, context assembler, end-to-end generation pipeline
...
Issues closed: #86, #87, #88, #89, #90, #91, #79, #80, #82
infra:
- docker-compose `ai` profile: Ollama + LiteLLM services
- infra/litellm/litellm_config.yaml: tip-generator / embedder / judge aliases
- .env.example: LITELLM_URL, LITELLM_MASTER_KEY, OLLAMA_URL
ml/serving:
- POST /generate: calls LiteLLM tip-generator alias, returns TipCandidate[]
- JSON retry loop (2 retries with correction prompt on malformed response)
- _parse_llm_json strips markdown fences
ml/features:
- context.py: build_context() assembles user signals → PromptContext
(sorts overdue/high-priority tasks first for LLM prompt quality)
shared-types:
- TipKind, TipSource, TipCandidate types
- Tip gains kind + rationale fields
services/api:
- recommender: 3-stage pipeline (assemble → score → serve)
Stage 1: Todoist tasks + LLM candidates fetched in parallel
Stage 2: egreedy bandit scores merged candidate pool
Stage 3: serve + log with prompt_version, llm_model, tip_kind
- tip_scores: prompt_version, llm_model, tip_kind columns + migrations
- config: LITELLM_URL added
- integrations: surface token_status in /integrations response
tests:
- ml/serving/tests/test_generate.py: 13 tests (retry, 502/503, fence variants)
- ml/features/test_context.py: 9 tests (sorting, edge cases)
- services/api recommender.unit.test.ts: 16 pure-function tests (inferReward, dueAgeDays)
- services/api recommender.test.ts: 4 integration tests (tip_scores columns, LLM fallback)
- shared-types: TipCandidate, rationale, full TipFeedback action set
docs:
- ADR-0008: LiteLLM AI gateway decision
- overview.md: M2 pipeline description updated
- ml/README.md: serving + features roles updated
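
The fence stripping and JSON retry loop listed under ml/serving above might look roughly like this. It is a sketch under assumptions: `parse_llm_json`, `generate_with_retry`, the `call_llm` callback, and the correction wording are hypothetical, and only the "2 retries with a correction prompt" behaviour comes from the commit message:

```python
import json
import re
from typing import Any, Callable

FENCE = "`" * 3  # markdown code-fence marker, built indirectly to keep this block readable
_FENCE_RE = re.compile(
    r"^\s*" + re.escape(FENCE) + r"[A-Za-z0-9_-]*\s*\n(.*?)\n?\s*" + re.escape(FENCE) + r"\s*$",
    re.DOTALL,
)


def parse_llm_json(raw: str) -> Any:
    """Strip one optional surrounding markdown fence, then parse JSON."""
    match = _FENCE_RE.match(raw)
    body = match.group(1) if match else raw
    return json.loads(body.strip())


def generate_with_retry(call_llm: Callable[[str], str], prompt: str, max_retries: int = 2) -> Any:
    """Call the model; on malformed JSON, re-prompt with a correction note."""
    current = prompt
    for attempt in range(max_retries + 1):
        raw = call_llm(current)
        try:
            return parse_llm_json(raw)
        except json.JSONDecodeError as exc:
            if attempt == max_retries:
                raise  # retries exhausted: surface the parse error
            current = (
                f"{prompt}\n\nYour previous reply was not valid JSON "
                f"({exc.msg}). Reply with JSON only."
            )
```

With `max_retries=2` the model is called at most three times; how the final failure maps to an HTTP status is left to the caller.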
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 14:09:02 +00:00
85367aeaa0
feat: MLOps external services, AI stack planning, admin MLOps hub
...
Infrastructure:
- Add `mlops` compose profile: MLflow (basic-auth, /mlflow path) + Airflow (LocalExecutor, /airflow path) + airflow-db
- infra/mlflow/basic_auth.ini for MLflow auth config
- Caddy routes /mlflow* and /airflow* inside existing o.alogins.net block (see agap_git)
- Dockerfile.admin: NEXT_PUBLIC_MLFLOW_URL / NEXT_PUBLIC_AIRFLOW_URL build args (default /mlflow, /airflow)
Admin panel:
- /admin/models: replace MLflow iframe with external link cards
- /admin/experiments: replace LinUCB stats with MLOps hub (links to MLflow experiments/models + Airflow DAGs/datasets)
- AdminShell: external nav links for MLflow ↗ and Airflow ↗ under MLOps section
Docs & planning:
- README: new AI stack section (Ollama/LiteLLM/OpenWebUI three-tier, tip generation pipeline, model aliases)
- README: Phase 2 expanded with AI infra issues (#86-#93) and granular pipeline breakdown
- README: Phase 4 expanded with LLM MLOps items (#94-#97)
- CLAUDE.md: AI stack section, updated current phase (M1 shipped / M2 in progress), compose profiles, updated What NOT to do
- docs/architecture/overview.md: AI stack section, updated decision flow diagram for Phase 2 LLM pipeline
- ADR-0006: updated to reflect external services (path-based, not embedded)
- Gitea issues #86-#97 created (M2: AI infra + pipeline; M4: LLM MLOps)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 08:20:44 +00:00
faf44c18fc
feat: ε-greedy v1 as active policy; dwell-time reward inference; offline sim framework
...
- Promote egreedy-v1 to active serving policy (ADR-0007): /score/egreedy + /reward/egreedy
replace the linucb-v1 endpoints after the offline sim showed +10.7% mean reward (−0.548 vs −0.606)
- Replace explicit helpful/not_helpful feedback with dwell-time inferred reward (inferReward):
dismiss=−1.0, snooze=+0.1, done<15s=−0.3, done 15s–2min=+1.0, done 2–10min=+0.6, done>10min=+0.3
- Add ml/serving ε-greedy endpoints: /score/egreedy, /reward/egreedy, /stats/egreedy/{user_id}
with d=7 feature vector (base 5 + sin/cos day-of-week encoding)
- Add offline simulation framework (ml/experiments/sim): rule/LLM/claude-code judges,
two-phase score+reward, synthetic personas, task generator; results stored in sim_runs/sim_events
- Add /admin/simulations page: start runs, live-poll status, reward curve SVG, action/persona tables
- Fix egreedy day_of_week training skew: reward endpoint now uses actual dow instead of hardcoded 0
- Fix runner.py proxy bypass: httpx.Client(trust_env=False) for localhost ML calls
- Add dwellMs to TipFeedbackEvent contract and bus.test.ts fixture
- Schema: sim_runs, sim_events tables; tip_feedback gains dwell_ms, reward_milli columns
- ADR-0006: admin console framework; ADR-0007: egreedy-v1 policy selection rationale
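
The dwell-time reward table and the sin/cos day-of-week encoding listed above can be sketched as follows. This is a Python illustration, not the inferReward implementation in services/api; the function names are hypothetical, and the handling of the exact 2 min and 10 min boundaries is an assumption, since the commit message does not say which bucket they fall in:

```python
import math
from typing import Optional, Tuple


def infer_reward(action: str, dwell_ms: Optional[int] = None) -> float:
    """Map a feedback action (plus dwell time for 'done') to a reward."""
    if action == "dismiss":
        return -1.0
    if action == "snooze":
        return 0.1
    if action != "done":
        raise ValueError(f"unknown action: {action!r}")
    if dwell_ms is None:
        raise ValueError("'done' needs a dwell time")
    seconds = dwell_ms / 1000
    if seconds < 15:
        return -0.3  # done almost instantly
    if seconds <= 120:  # assumption: exactly 2 min counts as the 15s-2min bucket
        return 1.0
    if seconds <= 600:  # assumption: exactly 10 min counts as the 2-10min bucket
        return 0.6
    return 0.3


def encode_day_of_week(dow: int) -> Tuple[float, float]:
    """Cyclical (sin, cos) encoding so day 6 and day 0 end up adjacent."""
    angle = 2 * math.pi * (dow % 7) / 7
    return math.sin(angle), math.cos(angle)
```

Appending the two encoded values to the five base features gives the d=7 vector mentioned above; which day maps to 0 is not stated in the commit.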
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 07:44:37 +00:00
65218762be
feat: Phase 0 walking skeleton — monorepo, API, web, ML stub
...
Sets up the full Phase 0 foundation:
- pnpm workspaces + turbo build graph; native module build approval
- packages/shared-types: HTTP contracts (Tip, Auth, Integrations, User)
- services/api: Express modular monolith with better-sqlite3/drizzle
- auth: Google OAuth2 + PKCE via openid-client v6, cookie sessions
- integrations: Todoist OAuth2 connect/disconnect, token vault
- recommender: RandomPolicy over Todoist tasks, feedback sink
- user: profile, consent capture, full account deletion (GDPR)
- apps/web: Next.js 15, three pages (sign-in → connect → tip)
- tip page: black canvas, hold-to-act gesture, action sheet
- PWA manifest + theme
- ml/serving: FastAPI stub implementing the POST /score contract
- infra: docker-compose (core/full profiles), Dockerfiles, CI skeleton
- .env.example with all required vars documented
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 12:41:24 +00:00
cf4c7a0eb4
chore: scaffold oO monorepo with architecture, roadmap, and module stubs
2026-04-13 14:19:56 +00:00