alvis/oO - oO - AgapGit

Author	SHA1	Message	Date
alvis	f8d66aa01f	chore: remove Airflow completely from the stack Drop all four Airflow containers (db, init, webserver, scheduler) from the mlops compose profile, leaving MLflow as the sole mlops service. Remove AIRFLOW_* env vars, config fields, health-check entries, DAG trigger code in admin/bench routes, the airflow_dag_run_id schema column, Airflow nav links and DAG-run links in the admin UI, the two Airflow DAG files (bench_dag.py, sim_dag.py), and all related docs/ADR references. Simulations now run exclusively via the subprocess path. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-03 16:38:46 +00:00
alvis	ce1c8bde57	fix(admin): simulations view-only + docs path in Docker (#109, #110) - simulate/page.tsx: remove launch form — simulations are triggered via Airflow DAG, not the admin UI. Page now shows run history + links to Airflow and MLflow only (#109) - docs.ts: use DOCS_ROOT env var (fallback: ../../docs for local dev) so the path works in Docker standalone where CWD is /app (#110) - Dockerfile.admin: copy docs/ into the runner image at /app/docs and set DOCS_ROOT=/app/docs so listAllDocs() finds the files at runtime (#110) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-27 13:55:50 +00:00
alvis	e40dfdcbb0	chore(infra): wire MLflow/Airflow env vars, fix healthcheck, add .dockerignore - docker-compose: pass ML_SERVING_URL, MLFLOW_URL, AIRFLOW_URL + creds to api service - docker-compose: pass NEXT_PUBLIC_MLFLOW_URL/AIRFLOW_URL to admin service - docker-compose: replace wget healthcheck with node fetch (wget not in node image) - docker-compose: enable Airflow basic_auth API backend; add MLflow pip dep for DAGs - Dockerfiles: tighten layer caching, add .dockerignore Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-26 12:08:43 +00:00
alvis	4652e4b582	feat(ml): JetStream durable consumers in ml/serving (#98) Adds a NATS JetStream consumer to ml/serving so the feature pipeline can react to events without the API triggering every read. - nats_consumer.py: durable push consumers for signals.> and feedback.> streams; acks on success, naks for redeliver, up to NATS_MAX_DELIVER attempts; per-consumer health state (last_msg_ts, processed, errors) - main.py: FastAPI lifespan wires start/stop; /health exposes nats state - requirements.txt: adds nats-py>=2.9.0 - Dockerfile.ml: copy all *.py from ml/serving (was missing prompts.py) Handled subjects: signals.task.synced → writes per-user sync metadata to STATE_DIR signals.tip.feedback → logged for observability (reward via HTTP path) Config: NATS_URL (empty = disabled), NATS_DURABLE_PREFIX, NATS_MAX_DELIVER Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-25 10:19:47 +00:00
alvis	75d0e89906	fix(infra): ml-serving LITELLM_URL default → host.docker.internal:4000 Inside the container, llm.alogins.net times out (public-DNS route, not the loopback path Caddy listens on). host.docker.internal:4000 reaches the Agap LiteLLM directly and is equivalent for dev. Prod deploys override via env. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-20 12:20:41 +00:00
alvis	d4205a00cf	refactor(infra): drop ai profile; ollama + litellm move to Agap Ollama and LiteLLM are shared Agap services (agap_git/openai/docker-compose.yml); oO never starts them. Removes the ai profile, the litellm config, and the --profile ai runbook; points ml-serving at https://llm.alogins.net by default and adds host.docker.internal host-gateway so the container can hit Agap ollama on the host. Also updates the tip-generator model alias to qwen2.5:1.5b to match the model actually pulled on Agap ollama (7b is ~4.7 GB and would blow VRAM budget). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-20 12:16:21 +00:00
alvis	d7a2423940	fix(infra): mlflow image tag + python-based healthchecks for ml-serving/mlflow - Corrects mlflow image tag (2.14.3 → v2.14.3); the former tag does not exist on ghcr.io/mlflow/mlflow and caused a manifest-unknown error on pull. - Replaces wget/curl healthchecks with inline python urllib calls — the python:3.12-slim (ml-serving) and ghcr.io/mlflow/mlflow images ship neither wget nor curl, so both containers reported unhealthy despite /health returning 200. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-18 15:04:18 +00:00
alvis	2a7380933c	feat: NATS JetStream + Todoist background sync (#21, #22) Issue 21 — event infrastructure: - NormalizedEvent<T> + payload types in packages/shared-types/src/events/ - Bus.onPublish() hook for side-effect bridges - NATS JetStream adapter (services/api/src/events/nats.ts): connects when NATS_URL is set, creates signals.> and feedback.> streams, bridges all in-process bus publishes to JetStream — no-ops gracefully when NATS is absent - NATS service added to docker-compose (profile: events|full, port 4222/8222) Issue 22 — Todoist background sync: - services/api/src/signals/scheduler.ts: queries all active-token users every 15 min (TODOIST_SYNC_INTERVAL_MS), fan-out via todoistSource.fetchSignals() which emits signals.task.synced; on-demand fetch remains as freshness fallback - NATS_URL + TODOIST_SYNC_INTERVAL_MS added to config Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-18 01:18:51 +00:00
alvis	46dee7377e	fix: api healthcheck + port mapping corrected to 3078 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-17 14:17:52 +00:00
alvis	ffdf70733f	feat: M2 AI tips — LiteLLM gateway, context assembler, end-to-end generation pipeline Issues closed: #86, #87, #88, #89, #90, #91, #79, #80, #82 infra: - docker-compose `ai` profile: Ollama + LiteLLM services - infra/litellm/litellm_config.yaml: tip-generator / embedder / judge aliases - .env.example: LITELLM_URL, LITELLM_MASTER_KEY, OLLAMA_URL ml/serving: - POST /generate: calls LiteLLM tip-generator alias, returns TipCandidate[] - JSON retry loop (2 retries with correction prompt on malformed response) - _parse_llm_json strips markdown fences ml/features: - context.py: build_context() assembles user signals → PromptContext (sorts overdue/high-priority tasks first for LLM prompt quality) shared-types: - TipKind, TipSource, TipCandidate types - Tip gains kind + rationale fields services/api: - recommender: 3-stage pipeline (assemble → score → serve) Stage 1: Todoist tasks + LLM candidates fetched in parallel Stage 2: egreedy bandit scores merged candidate pool Stage 3: serve + log with prompt_version, llm_model, tip_kind - tip_scores: prompt_version, llm_model, tip_kind columns + migrations - config: LITELLM_URL added - integrations: surface token_status in /integrations response tests: - ml/serving/tests/test_generate.py: 13 tests (retry, 502/503, fence variants) - ml/features/test_context.py: 9 tests (sorting, edge cases) - services/api recommender.unit.test.ts: 16 pure-function tests (inferReward, dueAgeDays) - services/api recommender.test.ts: 4 integration tests (tip_scores columns, LLM fallback) - shared-types: TipCandidate, rationale, full TipFeedback action set docs: - ADR-0008: LiteLLM AI gateway decision - overview.md: M2 pipeline description updated - ml/README.md: serving + features roles updated Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-17 14:09:02 +00:00
alvis	85367aeaa0	feat: MLOps external services, AI stack planning, admin MLOps hub Infrastructure: - Add `mlops` compose profile: MLflow (basic-auth, /mlflow path) + Airflow (LocalExecutor, /airflow path) + airflow-db - infra/mlflow/basic_auth.ini for MLflow auth config - Caddy routes /mlflow* and /airflow* inside existing o.alogins.net block (see agap_git) - Dockerfile.admin: NEXT_PUBLIC_MLFLOW_URL / NEXT_PUBLIC_AIRFLOW_URL build args (default /mlflow, /airflow) Admin panel: - /admin/models: replace MLflow iframe with external link cards - /admin/experiments: replace LinUCB stats with MLOps hub (links to MLflow experiments/models + Airflow DAGs/datasets) - AdminShell: external nav links for MLflow ↗ and Airflow ↗ under MLOps section Docs & planning: - README: new AI stack section (Ollama/LiteLLM/OpenWebUI three-tier, tip generation pipeline, model aliases) - README: Phase 2 expanded with AI infra issues (#86-#93) and granular pipeline breakdown - README: Phase 4 expanded with LLM MLOps items (#94-#97) - CLAUDE.md: AI stack section, updated current phase (M1 shipped / M2 in progress), compose profiles, updated What NOT to do - docs/architecture/overview.md: AI stack section, updated decision flow diagram for Phase 2 LLM pipeline - ADR-0006: updated to reflect external services (path-based, not embedded) - Gitea issues #86-#97 created (M2: AI infra + pipeline; M4: LLM MLOps) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-17 08:20:44 +00:00
alvis	65218762be	feat: Phase 0 walking skeleton — monorepo, API, web, ML stub Sets up the full Phase 0 foundation: - pnpm workspaces + turbo build graph; native module build approval - packages/shared-types: HTTP contracts (Tip, Auth, Integrations, User) - services/api: Express modular monolith with better-sqlite3/drizzle - auth: Google OAuth2 + PKCE via openid-client v6, cookie sessions - integrations: Todoist OAuth2 connect/disconnect, token vault - recommender: RandomPolicy over Todoist tasks, feedback sink - user: profile, consent capture, full account deletion (GDPR) - apps/web: Next.js 15, three pages (sign-in → connect → tip) - tip page: black canvas, hold-to-act gesture, action sheet - PWA manifest + theme - ml/serving: FastAPI stub implementing the POST /score contract - infra: docker-compose (core/full profiles), Dockerfiles, CI skeleton - .env.example with all required vars documented Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-14 12:41:24 +00:00
alvis	cf4c7a0eb4	chore: scaffold oO monorepo with architecture, roadmap, and module stubs	2026-04-13 14:19:56 +00:00
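
Commit d7a2423940 replaces the wget/curl healthchecks with inline python urllib calls, because the images ship neither tool. The repository's actual check is not shown in this log, so the following is only a minimal sketch of the idea. The function name `is_healthy` and the default timeout are assumptions made for illustration.

```python
import urllib.error
import urllib.request


def is_healthy(url: str, timeout: float = 3.0) -> bool:
    """Return True only when a GET to `url` answers with HTTP 200.

    Hypothetical helper: the name and the timeout default are
    illustrative and do not come from the repository.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError, ValueError):
        # URLError covers refused connections and non-2xx replies
        # (HTTPError subclasses it); OSError covers socket timeouts;
        # ValueError covers a malformed URL.
        return False
```

A container healthcheck would typically wrap such a call in `python -c` and exit non-zero when it returns False, which needs nothing beyond the standard library already present in a python:3.12-slim image.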

13 Commits
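
Commit ffdf70733f mentions a JSON retry loop (2 retries with a correction prompt on a malformed response) and a `_parse_llm_json` helper that strips markdown fences. The real code is not part of this log, so what follows is a rough sketch under stated assumptions. `parse_llm_json`, `generate_with_retry`, the `call_llm` callable and the wording of the correction prompt are all hypothetical stand-ins.

```python
import json
import re

FENCE = "`" * 3  # markdown code fence marker

# Matches a reply wrapped in one fenced block, with an optional
# language tag such as "json" after the opening fence.
_FENCED = re.compile(
    r"^\s*" + FENCE + r"[A-Za-z0-9_-]*[ \t]*\n(.*?)\n?[ \t]*" + FENCE + r"\s*$",
    re.DOTALL,
)


def parse_llm_json(text: str):
    """Strip a surrounding markdown fence, if any, then parse JSON."""
    match = _FENCED.match(text)
    body = match.group(1) if match else text.strip()
    return json.loads(body)


def generate_with_retry(call_llm, prompt: str, max_retries: int = 2):
    """Call `call_llm` until its reply parses as JSON.

    `call_llm` is an assumed callable taking a prompt string and
    returning the raw model reply as a string.
    """
    current = prompt
    for _ in range(max_retries + 1):
        raw = call_llm(current)
        try:
            return parse_llm_json(raw)
        except json.JSONDecodeError:
            # Assumed correction prompt; the repository's wording is unknown.
            current = prompt + "\n\nThe previous reply was not valid JSON. Reply with JSON only."
    raise ValueError("model did not return valid JSON")
```

With `max_retries=2` the model is called at most three times, which matches the "2 retries" wording in the commit message. Whether the original counts attempts the same way is an assumption.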