alvis/oO - oO - AgapGit

Author	SHA1	Message	Date
alvis	59c493323f	fix(recommender): remove Todoist fallback on orchestrator failure; add snooze exclusion When fetchOrchestratorTip returned null (LiteLLM timeout, bad JSON, etc.) the recommender silently fell back to randomPolicy, serving a raw Todoist task with no rationale — explaining both reported symptoms. - Remove randomPolicy/signalToCandidate; return 204 when orchestrator fails so the UI shows "All clear" instead of a confusing Todoist task - Pass recent_tip through the stack (frontend → POST /recommend → fetchOrchestratorTip → ml/serving RecommendRequest → build_orchestrator_messages) so after snooze the LLM is instructed not to repeat the snoozed content Fixes #122 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-12 13:28:32 +00:00
alvis	d1f28666b0	feat(integrations): add Google Health (Fit) integration with full permissions OAuth2 flow with all 11 Google Fitness scopes (activity, body, sleep, heart rate, nutrition, location, blood glucose/pressure/temperature, oxygen saturation, reproductive health). Stores access + refresh tokens; auto-refreshes on expiry. GoogleHealthSignalSource fetches steps, sleep sessions, active minutes, calories, and heart rate from the Fit aggregate + sessions APIs. Signals flow into both the tip orchestrator and the health-vitals pre-compute agent, which generates prompt snippets about step progress, sleep deficit, sedentary time, and elevated heart rate. Signal.kind extended with 'health'; IntegrationProvider extended with 'google-health'. Agent compute signal mapping enriched to include source, kind, and all features so health-vitals can filter its own signals. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-11 11:12:11 +00:00
alvis	161e654027	feat(serving): replace MLflow run logging with native trace spans Convert ml-serving from isolated MLflow runs to nested traces using mlflow.start_span_no_context(). The recommend endpoint now emits a full span tree: recommend (CHAIN) → build_context (TOOL), agent:* (AGENT) ×N, llm_orchestrator (LLM). Compute and infer endpoints each emit a single span. Supporting changes: - mlflow-skinny>=3.1.0 added to requirements - MLflow configured with --serve-artifacts + mlflow-artifacts:/ default root for cross-container artifact proxy (spans now persist from ml-serving) - --allowed-hosts extended to include mlflow:5000 (SDK includes port in Host) - science_destiny slider wired through prompts.py and recommend endpoint - Config page exposes science/destiny slider (0=data-driven, 100=intuitive) - Tip page shows rationale inline on tap Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-11 08:26:05 +00:00
alvis	ad6747c242	feat(profile): /api/profile + eligibility filter + inference framework (ADR-0014 steps 4-6) Step 4 — /api/profile read-through API: GET /api/profile → { user, prefs, consents, contexts } PATCH /api/profile/prefs/:scope upsert user_preferences (source='user') PATCH /api/profile/consents grant / revoke consent keys PATCH /api/profile/contexts create / activate / deactivate contexts Legacy consentGiven bit folded in as data:core fallback. Step 5 — registry-driven eligibility filter: fetchRegistry() exported from agent-registry.ts. profile/eligibility.ts: getEligibleAgentIds(userId) — filters by required consents, silenced_in_contexts, and user_preferences[enabled=false]. fetchOrchestratorTip filters agent_outputs to eligible set before calling ml/serving /recommend. Fail-closed: registry unavailable → empty set. Step 6 — shared context-inference framework (#111) + time-of-day proof (#112): ml/agents/inference/: UserHistory, FeedbackEvent, run_inference(). Framework: cold-start, min_history gating, error fallback, structured logs. TimeOfDayAgent v1.1.0: inferred_params=[preferred_hour]; also reads quiet_start/quiet_end from agent_prefs. agent_prefs injected by TS caller. AgentInput gains agent_prefs field. ml/serving: POST /agents/{agent_id}/infer endpoint. agent-outputs.ts computeAndStore: loads prefs before compute, calls /infer after, persists results (source='inferred'); user overrides never touched. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-05 11:14:25 +00:00
alvis	05f748159b	chore: remove shadow policy machinery (ADR-0013 step 10) Deletes shadowPolicies map, getShadowPolicies, setPolicyActive from recommender.ts; removes /api/admin/policies routes from admin.ts; removes getPolicies, togglePolicy, PolicyInfo from admin api.ts; removes the policy toggle section from the ops page. 168 API tests pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-04 10:45:32 +00:00
alvis	c65bedcf68	feat(api): orchestrator cutover — replace bandit with multi-agent pipeline (ADR-0013 step 6) POST /recommend now calls ml/serving /recommend with pre-computed agent snippets + task context instead of /generate + /score/egreedy/v2. Falls back to a random signal candidate when ml/serving is unavailable. Removes: remotePolicy, fetchLlmCandidates, sendRewardWithRetry, candidateCache, pickPromptVersion. Feedback handler keeps inferReward + tipFeedback writes for observability; reward delivery to the bandit is gone. tipScores.policy is now 'orchestrator'; promptVersion is 'v4-orchestrator'. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-04 10:37:15 +00:00
alvis	c4960d0601	feat(observability): structured logs, W3C trace IDs, Sentry hooks (#18) - TS: pino + pino-http; every HTTP request log includes traceId from W3C traceparent header (generated if absent); forwarded to ml/serving on all /score, /generate, /reward, and /api/ml proxy calls - Python: structlog JSON; FastAPI middleware binds trace_id via contextvars so every log line within a request carries it - Sentry: optional SENTRY_DSN init in both runtimes (no-op if unset) - Replace all console.* calls across services/api with pino logger - Update tests to spy on logger instead of console Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-26 03:37:28 +00:00
alvis	7281af83a4	feat(bandit): promote egreedy-v2 (D=12, profile features) as active policy (#99) Offline sim gate passed — egreedy-v2 mean reward −0.629 vs egreedy-v1 −0.642 (5 users × 20 rounds, rule judge, seed 42). v2 wins 3/5 personas. - recommender.ts: switch remotePolicy() to /score/egreedy/v2 - recommender.ts: switch sendRewardWithRetry() to /reward/egreedy/v2 with profile_features payload so the ridge update uses the full D=12 vector - recommender.ts: re-fetch profile at feedback time (TTL-cached, near-instant) - ADR-0012: status Accepted → Promoted, promotion record appended Shadow entry egreedy-v2-shadow kept in registry (active: false) for rollback. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-26 03:08:28 +00:00
alvis	2d7cf217a9	feat(ml): egreedy-v2 shadow policy — D=12 with profile features (#99) Ship the scaffolding for #99 (phase B.3 of #81): - ml/serving: add /score/egreedy/v2, /reward/egreedy/v2, /stats/egreedy/v2 endpoints (D=12). New feature dims: completion/dismiss rates, mean dwell (clipped 10min), preferred-hour alignment (cosine, 1-dim), tip volume (log). Separate state file per user (_egreedy_v2.json). /reset clears v2 state too. - ADR-0012: documents D=7→12 dimension change, normalization choices, shadow rollout protocol, and promotion gate (offline sim win per ADR-0002). - recommender.ts: register egreedy-v2-shadow in shadow-policy map (disabled by default). When enabled, calls /score/egreedy/v2 fire-and-forget and publishes shadow:egreedy-v2-shadow serve signal. No reward to shadow — sim is the gate. - sim runner/personas: personas carry synthetic profile_features per persona; _call_score/_call_reward thread profile_features through (None-safe for v1/linucb). - 18 new Python tests; all 56 Python + 170 TS tests pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-25 10:00:38 +00:00
alvis	7d4c29e137	feat(profile): user-profile feature registry + builder (phase A) Centralizes user-level features (completion_rate_30d, dismiss_rate_30d, mean_dwell_ms_30d, preferred_hour, tip_volume_30d) in a TS registry that owns both definition and SQL aggregation, since the data lives in the TS-owned SQLite tables (tip_views/tip_feedback). Lazy TTL refresh keeps recommend latency bounded; values persist in user_profile_features (KV). ml/serving accepts profile_features on /score + /generate but does not yet consume them — extending the bandit feature vector changes D and resets every user's learned state, so that's a deliberate phase-B step. Includes ml/features/profile_schema.py as a contract mirror with a sync test that diffs name sets against registry.ts. ADR-0011 records the data-locality reasoning (registry in TS, not Python as the issue originally suggested). Phase B (deferred): event-driven incremental updates, bandit consumption with state migration, admin per-user profile page, staleness alerts. Refs #81. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 00:22:22 +00:00
alvis	430804e9a5	feat(ml): prompt registry + per-request variant selection Replaces the hardcoded "v1" label with a real prompt registry: ml/serving/prompts.py — keyed by version: v1 (baseline), v2-mentor (calm/specific persona), v3-few-shot (v1 persona + curated examples) ml/serving/main.py — POST /generate accepts optional prompt_version, 422 on unknown, echoes the version actually used back in the response services/api/src/config.ts — TIP_PROMPT_VERSION: empty / single / comma-list (uniform random per request) services/api/src/routes/recommender.ts — pickPromptVersion() drives selection; the response's prompt_version (not a stale TS constant) is what lands in tip_scores so the #92 reward-analytics dashboard shows real per-variant reaction rates Closes #84. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-24 15:44:04 +00:00
alvis	e3ca3ba733	feat: SignalSource abstraction — generalize signal ingestion beyond Todoist (#78) - Add Signal + SignalSource interfaces to packages/shared-types - TipCandidate.features widened to Record<string,number|boolean> to match Signal - TodoistSignalSource: encapsulates fetch, cache, 401 handling, bus events, and act() - SignalAggregator: parallel fan-out across sources with per-source failure isolation - Recommender refactored to consume Signal[] via aggregator; source action dispatch via aggregator.act() - ADR-0009: signal normalization strategy Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-18 01:11:56 +00:00
alvis	ffdf70733f	feat: M2 AI tips — LiteLLM gateway, context assembler, end-to-end generation pipeline Issues closed: #86, #87, #88, #89, #90, #91, #79, #80, #82 infra: - docker-compose `ai` profile: Ollama + LiteLLM services - infra/litellm/litellm_config.yaml: tip-generator / embedder / judge aliases - .env.example: LITELLM_URL, LITELLM_MASTER_KEY, OLLAMA_URL ml/serving: - POST /generate: calls LiteLLM tip-generator alias, returns TipCandidate[] - JSON retry loop (2 retries with correction prompt on malformed response) - _parse_llm_json strips markdown fences ml/features: - context.py: build_context() assembles user signals → PromptContext (sorts overdue/high-priority tasks first for LLM prompt quality) shared-types: - TipKind, TipSource, TipCandidate types - Tip gains kind + rationale fields services/api: - recommender: 3-stage pipeline (assemble → score → serve) Stage 1: Todoist tasks + LLM candidates fetched in parallel Stage 2: egreedy bandit scores merged candidate pool Stage 3: serve + log with prompt_version, llm_model, tip_kind - tip_scores: prompt_version, llm_model, tip_kind columns + migrations - config: LITELLM_URL added - integrations: surface token_status in /integrations response tests: - ml/serving/tests/test_generate.py: 13 tests (retry, 502/503, fence variants) - ml/features/test_context.py: 9 tests (sorting, edge cases) - services/api recommender.unit.test.ts: 16 pure-function tests (inferReward, dueAgeDays) - services/api recommender.test.ts: 4 integration tests (tip_scores columns, LLM fallback) - shared-types: TipCandidate, rationale, full TipFeedback action set docs: - ADR-0008: LiteLLM AI gateway decision - overview.md: M2 pipeline description updated - ml/README.md: serving + features roles updated Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-17 14:09:02 +00:00
alvis	85367aeaa0	feat: MLOps external services, AI stack planning, admin MLOps hub Infrastructure: - Add `mlops` compose profile: MLflow (basic-auth, /mlflow path) + Airflow (LocalExecutor, /airflow path) + airflow-db - infra/mlflow/basic_auth.ini for MLflow auth config - Caddy routes /mlflow* and /airflow* inside existing o.alogins.net block (see agap_git) - Dockerfile.admin: NEXT_PUBLIC_MLFLOW_URL / NEXT_PUBLIC_AIRFLOW_URL build args (default /mlflow, /airflow) Admin panel: - /admin/models: replace MLflow iframe with external link cards - /admin/experiments: replace LinUCB stats with MLOps hub (links to MLflow experiments/models + Airflow DAGs/datasets) - AdminShell: external nav links for MLflow ↗ and Airflow ↗ under MLOps section Docs & planning: - README: new AI stack section (Ollama/LiteLLM/OpenWebUI three-tier, tip generation pipeline, model aliases) - README: Phase 2 expanded with AI infra issues (#86-#93) and granular pipeline breakdown - README: Phase 4 expanded with LLM MLOps items (#94-#97) - CLAUDE.md: AI stack section, updated current phase (M1 shipped / M2 in progress), compose profiles, updated What NOT to do - docs/architecture/overview.md: AI stack section, updated decision flow diagram for Phase 2 LLM pipeline - ADR-0006: updated to reflect external services (path-based, not embedded) - Gitea issues #86-#97 created (M2: AI infra + pipeline; M4: LLM MLOps) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-17 08:20:44 +00:00
alvis	faf44c18fc	feat: ε-greedy v1 as active policy; dwell-time reward inference; offline sim framework - Promote egreedy-v1 to active serving policy (ADR-0007): /score/egreedy + /reward/egreedy replaces linucb-v1 endpoints after offline sim shows +10.7% mean reward (−0.548 vs −0.606) - Replace explicit helpful/not_helpful feedback with dwell-time inferred reward (inferReward): dismiss=−1.0, snooze=+0.1, done<15s=−0.3, done 15s–2min=+1.0, done 2–10min=+0.6, done>10min=+0.3 - Add ml/serving ε-greedy endpoints: /score/egreedy, /reward/egreedy, /stats/egreedy/{user_id} with d=7 feature vector (base 5 + sin/cos day-of-week encoding) - Add offline simulation framework (ml/experiments/sim): rule/LLM/claude-code judges, two-phase score+reward, synthetic personas, task generator; results stored in sim_runs/sim_events - Add /admin/simulations page: start runs, live-poll status, reward curve SVG, action/persona tables - Fix egreedy day_of_week training skew: reward endpoint now uses actual dow instead of hardcoded 0 - Fix runner.py proxy bypass: httpx.Client(trust_env=False) for localhost ML calls - Add dwellMs to TipFeedbackEvent contract and bus.test.ts fixture - Schema: sim_runs, sim_events tables; tip_feedback gains dwell_ms, reward_milli columns - ADR-0006: admin console framework; ADR-0007: egreedy-v1 policy selection rationale Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-16 07:44:37 +00:00
alvis	e62c726ea4	feat: M1 admin console — all 10 remaining pages + signal/quality/ops infrastructure Admin console (issues #63–72): - Event stream viewer: live-tail ring buffer (500 events) with subject/user filters - Feature store browser: per-user feature vector history from ml/serving - Model registry panel: MLflow embed at /admin/models - Experiment dashboard: LinUCB per-user stats (pulls, reward, θ) + bandit reset - Recommendation log: per-tip explainability (policy, score, features, latency) - Reward analytics: daily reaction breakdown + per-policy compare - Data quality widget: missing-feature rate, stale-token rate, daily completeness - Ops actions: replay-signal, policy enable/disable; user actions link to Users page - SQL runner: read-only SELECT runner with saved queries - Health rollup: fan-out to api/ml/sqlite/event-bus with auto-refresh Backend: - tip_scores table: logs features+policy+score+latency at every scoring call (#67) - saved_queries table: per-admin saved SQL (#71) - Event bus: 500-event ring buffer + tail() API (#63) - Admin routes: /events, /tips, /reward-analytics, /data-quality, /health, /policies, /replay-signal, /sql, /saved-queries endpoints - /api/ml/* admin-gated proxy to ml/serving (#64, #66) - Shadow-policy registry in recommender (#56) ML serving: - /reset/{user_id}: clear bandit state + feature history (#66) - /stats/{user_id}: pulls, cumulative reward, estimated mean, θ (#66) - /features/{user_id}: last 100 feature vectors logged at scoring time (#64) - Meta (pulls, rewards) persisted alongside A/b matrices Web: - Tip action sheet adds Helpful / Not helpful buttons (#62) - TipFeedback type extended with helpful/not_helpful actions - Rewards mapped: helpful=+0.5, not_helpful=−0.5 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-16 03:56:48 +00:00
alvis	c7edd92e15	feat: M1 — LinUCB bandit, RemotePolicy, Web Push, event bus ML serving: - LinUCB contextual bandit (disjoint, d=5 features: hour_sin/cos, is_overdue, task_age, priority) - /score endpoint replaces stub random; /reward endpoint for online learning - Per-user model state persisted to disk as JSON (survives restarts) - venv at ml/serving/.venv; start with pnpm dev from ml/serving Recommender: - Todoist fetch now extracts features (is_overdue, task_age_days, priority) - RemotePolicy calls ml/serving with 3s timeout; falls back to RandomPolicy - Reward sent to /reward on feedback (done=+1, snooze=0, dismiss=-1) Web Push: - VAPID keys in config; push_subscriptions table in DB - POST/DELETE /api/push/subscribe; GET /api/push/vapid-public-key - Service worker (public/sw.js): push → showNotification, notificationclick → focus/open - "notify me" button on tip page; registers SW + subscribes on permission grant Event bus: - services/api/src/events/bus.ts: typed EventEmitter wrapper - Subjects: signals.tip.served, signals.tip.feedback, signals.task.synced - Same publish/subscribe API NATS JetStream will implement — swap is mechanical Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-15 14:08:00 +00:00
alvis	f6c890213b	feat: complete M0 — legal pages, consent, tip_views metrics, account deletion UI - /legal/terms and /legal/privacy pages (linked from sign-in) - Consent (consentGiven=true) recorded on first Google sign-in - tip_views table: one row per tip served — enables activation + reaction rate queries - tip_views purged on account deletion - Delete account button on /connect (confirm → revoke tokens → purge data → sign out) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-15 09:09:08 +00:00
alvis	3123cb73fb	feat: Phase 0 walking skeleton — auth, Todoist integration, tip page - Google OAuth2/PKCE flow via openid-client v6; session cookie (30-day) - Next.js middleware auth guard — redirects before any client render - Todoist OAuth2 connect/disconnect; REST v1 task fetch (today|overdue) - RandomPolicy recommender behind stable POST /recommend contract - Feedback endpoint (done/dismiss/snooze); marks task complete in Todoist - 30s in-memory task cache per user (~1ms recommend on cache hit) - Tip page: pure opacity fade-in (3.5s), fast fade-out (0.3s), no motion - "reading you…" loading text with breathe animation - PWA icons + manifest - Ports pinned: API=3078, web=3079; Caddy at o.alogins.net Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-15 08:53:38 +00:00
alvis	65218762be	feat: Phase 0 walking skeleton — monorepo, API, web, ML stub Sets up the full Phase 0 foundation: - pnpm workspaces + turbo build graph; native module build approval - packages/shared-types: HTTP contracts (Tip, Auth, Integrations, User) - services/api: Express modular monolith with better-sqlite3/drizzle - auth: Google OAuth2 + PKCE via openid-client v6, cookie sessions - integrations: Todoist OAuth2 connect/disconnect, token vault - recommender: RandomPolicy over Todoist tasks, feedback sink - user: profile, consent capture, full account deletion (GDPR) - apps/web: Next.js 15, three pages (sign-in → connect → tip) - tip page: black canvas, hold-to-act gesture, action sheet - PWA manifest + theme - ml/serving: FastAPI stub implementing the POST /score contract - infra: docker-compose (core/full profiles), Dockerfiles, CI skeleton - .env.example with all required vars documented Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-14 12:41:24 +00:00

20 Commits
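
Commit faf44c18fc lists the dwell-time reward mapping behind inferReward. The repository's own implementation is not shown on this page, so the following is only a minimal Python sketch of that table. The function name infer_reward, the millisecond unit, the handling of values that fall exactly on a bucket boundary, and the error raised for a missing dwell time are all assumptions made for illustration.

```python
from typing import Optional

# Bucket edges in milliseconds. The commit message gives 15s, 2min and 10min;
# which side of each edge is inclusive is an assumption of this sketch.
FIFTEEN_SECONDS_MS = 15_000
TWO_MINUTES_MS = 120_000
TEN_MINUTES_MS = 600_000


def infer_reward(action: str, dwell_ms: Optional[int] = None) -> float:
    """Map a feedback action (plus dwell time for 'done') to a scalar reward.

    Reward values come from commit faf44c18fc:
    dismiss=-1.0, snooze=+0.1, done<15s=-0.3, done 15s-2min=+1.0,
    done 2-10min=+0.6, done>10min=+0.3.
    """
    if action == "dismiss":
        return -1.0
    if action == "snooze":
        return 0.1
    if action != "done":
        raise ValueError(f"unknown action: {action!r}")
    if dwell_ms is None:
        # Assumption: the commit does not say how a missing dwell time is
        # treated, so this sketch refuses to guess.
        raise ValueError("dwell_ms is required for 'done'")
    if dwell_ms < FIFTEEN_SECONDS_MS:
        return -0.3  # completed suspiciously fast
    if dwell_ms < TWO_MINUTES_MS:
        return 1.0   # 15s up to (not including) 2min
    if dwell_ms <= TEN_MINUTES_MS:
        return 0.6   # 2min up to and including 10min
    return 0.3       # strictly more than 10min
```

For example, infer_reward("done", 45_000) falls in the 15s to 2min bucket and yields 1.0, while infer_reward("dismiss") yields -1.0 regardless of dwell time.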