alvis/oO - oO - AgapGit

Author	SHA1	Message	Date
alvis	8fd08379d7	chore(m2): close out remaining loose ends (#80, #86, #90) - Add `ai` compose profile — Ollama + LiteLLM containers for local dev when Agap shared services are unavailable; use with LITELLM_URL / OLLAMA_URL env vars pointing ml-serving at localhost - Mark #90 done (LLM schema validation + fallback shipped in `85a332b`) - Mark #80 superseded by ADR-0013 (multi-agent orchestrator is the pipeline) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-12 15:31:25 +00:00
alvis	161e654027	feat(serving): replace MLflow run logging with native trace spans Convert ml-serving from isolated MLflow runs to nested traces using mlflow.start_span_no_context(). The recommend endpoint now emits a full span tree: recommend (CHAIN) → build_context (TOOL), agent:* (AGENT) ×N, llm_orchestrator (LLM). Compute and infer endpoints each emit a single span. Supporting changes: - mlflow-skinny>=3.1.0 added to requirements - MLflow configured with --serve-artifacts + mlflow-artifacts:/ default root for cross-container artifact proxy (spans now persist from ml-serving) - --allowed-hosts extended to include mlflow:5000 (SDK includes port in Host) - science_destiny slider wired through prompts.py and recommend endpoint - Config page exposes science/destiny slider (0=data-driven, 100=intuitive) - Tip page shows rationale inline on tap Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-11 08:26:05 +00:00
alvis	95e1b342b4	fix(serving): wire MLflow auth and Host header for container-to-container calls - Pass MLFLOW_ADMIN_PASSWORD as fallback password credential - Set host_header='localhost' to satisfy MLflow's --allowed-hosts check (MLflow rejects Host: mlflow but accepts Host: localhost) - Default MLFLOW_TRACKING_URI to http://mlflow:5000 in compose so the env_file value is not silently overridden to empty Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-06 10:39:08 +00:00
alvis	c43dbaf23d	feat(serving): add MLflow tracing to ml-serving for all agent calls Logs one MLflow run per /recommend (params, token metrics, latency, full prompt + tip as artifacts) and per /agents/{id}/compute and /infer call (signals snapshot, inferred prefs, latency). Tracing is a no-op when MLFLOW_TRACKING_URI is unset; ml-serving starts and serves tips correctly without MLflow configured. Refs #118 (M4: remove from production / move off critical path). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-06 10:30:24 +00:00
alvis	f8d66aa01f	chore: remove Airflow completely from the stack Drop all four Airflow containers (db, init, webserver, scheduler) from the mlops compose profile, leaving MLflow as the sole mlops service. Remove AIRFLOW_* env vars, config fields, health-check entries, DAG trigger code in admin/bench routes, the airflow_dag_run_id schema column, Airflow nav links and DAG-run links in the admin UI, the two Airflow DAG files (bench_dag.py, sim_dag.py), and all related docs/ADR references. Simulations now run exclusively via the subprocess path. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-03 16:38:46 +00:00
alvis	e40dfdcbb0	chore(infra): wire MLflow/Airflow env vars, fix healthcheck, add .dockerignore - docker-compose: pass ML_SERVING_URL, MLFLOW_URL, AIRFLOW_URL + creds to api service - docker-compose: pass NEXT_PUBLIC_MLFLOW_URL/AIRFLOW_URL to admin service - docker-compose: replace wget healthcheck with node fetch (wget not in node image) - docker-compose: enable Airflow basic_auth API backend; add MLflow pip dep for DAGs - Dockerfiles: tighten layer caching, add .dockerignore Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-26 12:08:43 +00:00
alvis	75d0e89906	fix(infra): ml-serving LITELLM_URL default → host.docker.internal:4000 Inside the container, llm.alogins.net times out (public-DNS route, not the loopback path Caddy listens on). host.docker.internal:4000 reaches the Agap LiteLLM directly and is equivalent for dev. Prod deploys override via env. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-20 12:20:41 +00:00
alvis	d4205a00cf	refactor(infra): drop ai profile; ollama + litellm move to Agap Ollama and LiteLLM are shared Agap services (agap_git/openai/docker-compose.yml); oO never starts them. Removes the ai profile, the litellm config, and the --profile ai runbook; points ml-serving at https://llm.alogins.net by default and adds host.docker.internal host-gateway so the container can hit Agap ollama on the host. Also updates the tip-generator model alias to qwen2.5:1.5b to match the model actually pulled on Agap ollama (7b is ~4.7 GB and would blow VRAM budget). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-20 12:16:21 +00:00
alvis	d7a2423940	fix(infra): mlflow image tag + python-based healthchecks for ml-serving/mlflow - Corrects mlflow image tag (2.14.3 → v2.14.3); the former tag does not exist on ghcr.io/mlflow/mlflow and caused a manifest-unknown error on pull. - Replaces wget/curl healthchecks with inline python urllib calls — the python:3.12-slim (ml-serving) and ghcr.io/mlflow/mlflow images ship neither wget nor curl, so both containers reported unhealthy despite /health returning 200. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-18 15:04:18 +00:00
alvis	2a7380933c	feat: NATS JetStream + Todoist background sync (#21, #22) Issue 21 — event infrastructure: - NormalizedEvent<T> + payload types in packages/shared-types/src/events/ - Bus.onPublish() hook for side-effect bridges - NATS JetStream adapter (services/api/src/events/nats.ts): connects when NATS_URL is set, creates signals.> and feedback.> streams, bridges all in-process bus publishes to JetStream — no-ops gracefully when NATS is absent - NATS service added to docker-compose (profile: events|full, port 4222/8222) Issue 22 — Todoist background sync: - services/api/src/signals/scheduler.ts: queries all active-token users every 15 min (TODOIST_SYNC_INTERVAL_MS), fan-out via todoistSource.fetchSignals() which emits signals.task.synced; on-demand fetch remains as freshness fallback - NATS_URL + TODOIST_SYNC_INTERVAL_MS added to config Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-18 01:18:51 +00:00
alvis	46dee7377e	fix: api healthcheck + port mapping corrected to 3078 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-17 14:17:52 +00:00
alvis	ffdf70733f	feat: M2 AI tips — LiteLLM gateway, context assembler, end-to-end generation pipeline Issues closed: #86, #87, #88, #89, #90, #91, #79, #80, #82 infra: - docker-compose `ai` profile: Ollama + LiteLLM services - infra/litellm/litellm_config.yaml: tip-generator / embedder / judge aliases - .env.example: LITELLM_URL, LITELLM_MASTER_KEY, OLLAMA_URL ml/serving: - POST /generate: calls LiteLLM tip-generator alias, returns TipCandidate[] - JSON retry loop (2 retries with correction prompt on malformed response) - _parse_llm_json strips markdown fences ml/features: - context.py: build_context() assembles user signals → PromptContext (sorts overdue/high-priority tasks first for LLM prompt quality) shared-types: - TipKind, TipSource, TipCandidate types - Tip gains kind + rationale fields services/api: - recommender: 3-stage pipeline (assemble → score → serve) Stage 1: Todoist tasks + LLM candidates fetched in parallel Stage 2: egreedy bandit scores merged candidate pool Stage 3: serve + log with prompt_version, llm_model, tip_kind - tip_scores: prompt_version, llm_model, tip_kind columns + migrations - config: LITELLM_URL added - integrations: surface token_status in /integrations response tests: - ml/serving/tests/test_generate.py: 13 tests (retry, 502/503, fence variants) - ml/features/test_context.py: 9 tests (sorting, edge cases) - services/api recommender.unit.test.ts: 16 pure-function tests (inferReward, dueAgeDays) - services/api recommender.test.ts: 4 integration tests (tip_scores columns, LLM fallback) - shared-types: TipCandidate, rationale, full TipFeedback action set docs: - ADR-0008: LiteLLM AI gateway decision - overview.md: M2 pipeline description updated - ml/README.md: serving + features roles updated Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-17 14:09:02 +00:00
alvis	85367aeaa0	feat: MLOps external services, AI stack planning, admin MLOps hub Infrastructure: - Add `mlops` compose profile: MLflow (basic-auth, /mlflow path) + Airflow (LocalExecutor, /airflow path) + airflow-db - infra/mlflow/basic_auth.ini for MLflow auth config - Caddy routes /mlflow* and /airflow* inside existing o.alogins.net block (see agap_git) - Dockerfile.admin: NEXT_PUBLIC_MLFLOW_URL / NEXT_PUBLIC_AIRFLOW_URL build args (default /mlflow, /airflow) Admin panel: - /admin/models: replace MLflow iframe with external link cards - /admin/experiments: replace LinUCB stats with MLOps hub (links to MLflow experiments/models + Airflow DAGs/datasets) - AdminShell: external nav links for MLflow ↗ and Airflow ↗ under MLOps section Docs & planning: - README: new AI stack section (Ollama/LiteLLM/OpenWebUI three-tier, tip generation pipeline, model aliases) - README: Phase 2 expanded with AI infra issues (#86-#93) and granular pipeline breakdown - README: Phase 4 expanded with LLM MLOps items (#94-#97) - CLAUDE.md: AI stack section, updated current phase (M1 shipped / M2 in progress), compose profiles, updated What NOT to do - docs/architecture/overview.md: AI stack section, updated decision flow diagram for Phase 2 LLM pipeline - ADR-0006: updated to reflect external services (path-based, not embedded) - Gitea issues #86-#97 created (M2: AI infra + pipeline; M4: LLM MLOps) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-17 08:20:44 +00:00
alvis	65218762be	feat: Phase 0 walking skeleton — monorepo, API, web, ML stub Sets up the full Phase 0 foundation: - pnpm workspaces + turbo build graph; native module build approval - packages/shared-types: HTTP contracts (Tip, Auth, Integrations, User) - services/api: Express modular monolith with better-sqlite3/drizzle - auth: Google OAuth2 + PKCE via openid-client v6, cookie sessions - integrations: Todoist OAuth2 connect/disconnect, token vault - recommender: RandomPolicy over Todoist tasks, feedback sink - user: profile, consent capture, full account deletion (GDPR) - apps/web: Next.js 15, three pages (sign-in → connect → tip) - tip page: black canvas, hold-to-act gesture, action sheet - PWA manifest + theme - ml/serving: FastAPI stub implementing the POST /score contract - infra: docker-compose (core/full profiles), Dockerfiles, CI skeleton - .env.example with all required vars documented Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-14 12:41:24 +00:00

14 Commits
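
Commit ffdf70733f describes a JSON retry loop (2 retries with a correction prompt on a malformed response) and a `_parse_llm_json` helper that strips markdown fences. The log does not show that code, so the following is only a minimal sketch of how such a pair could look. The names `parse_llm_json`, `generate_with_retry`, and `call_llm`, the regex, and the wording of the correction prompt are all assumptions made for illustration, not the repository's implementation.

```python
import json
import re
from typing import Any, Callable

# Hypothetical pattern: an optional language tag line after the opening fence,
# then the payload, then the closing fence.
_FENCE_RE = re.compile(r"^\s*```(?:[a-zA-Z0-9_-]*\n)?(.*?)```\s*$", re.DOTALL)


def parse_llm_json(raw: str) -> Any:
    """Strip a surrounding markdown fence, if present, then parse the rest as JSON."""
    match = _FENCE_RE.match(raw)
    text = match.group(1) if match else raw
    return json.loads(text.strip())


def generate_with_retry(
    call_llm: Callable[[str], str],  # assumed signature: prompt in, raw reply out
    prompt: str,
    max_retries: int = 2,
) -> Any:
    """Call the model and re-prompt with a correction note while the reply is not valid JSON."""
    current = prompt
    for attempt in range(max_retries + 1):
        raw = call_llm(current)
        try:
            return parse_llm_json(raw)
        except json.JSONDecodeError as exc:
            if attempt == max_retries:
                raise  # retries exhausted; let the caller decide on a fallback
            # Illustrative correction prompt; the real wording is not in the log.
            current = (
                f"{prompt}\n\nYour previous reply was not valid JSON "
                f"({exc.msg}). Reply with JSON only."
            )
```

Passing the model call in as a plain callable keeps the sketch independent of any particular LiteLLM client and makes the retry path easy to exercise with a stub.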