Infrastructure: - Add `mlops` compose profile: MLflow (basic-auth, /mlflow path) + Airflow (LocalExecutor, /airflow path) + airflow-db - infra/mlflow/basic_auth.ini for MLflow auth config - Caddy routes /mlflow* and /airflow* inside existing o.alogins.net block (see agap_git) - Dockerfile.admin: NEXT_PUBLIC_MLFLOW_URL / NEXT_PUBLIC_AIRFLOW_URL build args (default /mlflow, /airflow) Admin panel: - /admin/models: replace MLflow iframe with external link cards - /admin/experiments: replace LinUCB stats with MLOps hub (links to MLflow experiments/models + Airflow DAGs/datasets) - AdminShell: external nav links for MLflow ↗ and Airflow ↗ under MLOps section Docs & planning: - README: new AI stack section (Ollama/LiteLLM/OpenWebUI three-tier, tip generation pipeline, model aliases) - README: Phase 2 expanded with AI infra issues (#86-#93) and granular pipeline breakdown - README: Phase 4 expanded with LLM MLOps items (#94-#97) - CLAUDE.md: AI stack section, updated current phase (M1 shipped / M2 in progress), compose profiles, updated What NOT to do - docs/architecture/overview.md: AI stack section, updated decision flow diagram for Phase 2 LLM pipeline - ADR-0006: updated to reflect external services (path-based, not embedded) - Gitea issues #86-#97 created (M2: AI infra + pipeline; M4: LLM MLOps) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
7.3 KiB
oO — Project Instructions
What this is
oO is a recommendation system for personal tips. It collects signals across a user's life (tasks, habits, calendar, mood, context) to build a rich profile and deliver one perfectly-timed tip — an advice or a todo — that feels like magic.
The magic is the product. Precision + timing + minimalism. The UI shows a single black page with one tip. The complexity lives behind it.
Prime directives
- Modular by package, deployable by stage. Contracts live at package boundaries from day one so extraction to a service is cheap. Deploy topology evolves with real pressure (team size, scaling hotspots, language boundaries), not with wishful architecture. Phase 0 = modular monolith + Python ML sidecar. See ADR-0003.
- Recommendation engine is the core. Every other module feeds it or renders its output. Design schemas, event contracts, and APIs with that in mind.
- Python owns ML. Training, features, online scoring are Python (FastAPI + PyTorch/scikit + MLflow/Feast). Application code is TypeScript (Node, Next.js) unless there's a reason.
- OAuth-first for identity and integrations. Never ask users for passwords or raw API keys when a delegated-auth flow exists. Store provider tokens encrypted, refresh transparently.
- Privacy is a feature, not a phase. Consent capture, token revocation, and account deletion exist from the first real user. Data minimization: store the token + derivatives we need, not the raw feed.
- Feel-of-magic over feature count. When in doubt, ship fewer things, polished. The tip page is a watch face.
Architecture (high level)
The tree below is logical module structure. Directory layout is stable; how many processes you deploy is a stage decision (ADR-0003).
apps/ user-facing clients
web/ Next.js PWA — the first shipped client
mobile-ios/ Swift/SwiftUI (Phase 3)
mobile-android/ Kotlin/Compose (Phase 3)
services/ backend modules — each owns a contract; may share a deployable
gateway/ BFF for clients; auth check; fan-out
auth/ OAuth (Google, Apple, ...), sessions, JWT issuance
profile/ user profile, preferences, consents
integrations/ third-party connectors + token vault (Todoist first)
recommender/ orchestration: candidates → policy → tip; feedback sink
events/ event bus ingress + durable signal store
notifier/ push/email/web delivery (web push from Phase 1)
packages/ shared libraries (importable across services + apps)
shared-types/ HTTP types via OpenAPI; event types via protobuf (ADR-0005)
sdk-js/ client SDK used by web + mobile webviews
ui/ shared React components + design tokens
ml/ Python — separate deployable from day one
serving/ online scorer (FastAPI), called by recommender
features/ feature definitions + store adapter
pipelines/ batch feature + training DAGs (Prefect/Airflow)
registry/ MLflow model registry integration
experiments/ assignment + A/B + bandit policies
notebooks/ research only; never imported by production code
infra/ docker-compose (Phase 0), k3s/k8s (later), terraform, CI
docs/ architecture notes, ADRs, API specs
Phase 0 deployables: one Node process (services/* bundled via modular monolith) + one Python process (ml/serving, stubbed until M1) + Postgres + NATS. Services extract to their own process when a real reason appears: language boundary, scaling hotspot, team ownership, or SLA divergence. See ADR-0003.
Contracts between modules
- HTTP (OpenAPI, in
packages/shared-types/http/) — synchronous request/response. In-process today; over the network once extracted. Signatures are identical. - Events (Protocol Buffers, in
packages/shared-types/events/) — durable signals + feedback. Today: in-process event emitter. Tomorrow: NATS JetStream. Schema registry enforced in CI (ADR-0005). - Do not redefine types per module. Regenerate from
shared-types.
Conventions
- Each module ships a
README.mddescribing its contract, its/healthstory, and its extraction criteria (when it should become its own process). - One PR = one concern. Conventional-commit prefixes (
feat:,fix:,chore:,docs:,refactor:). - ADRs go in
docs/adr/NNNN-title.mdfor any decision that constrains future work. - No secrets in repo. Local dev via
.env.local(gitignored), prod via the server's secret store (Vaultwarden now; k8s secrets later). - Compose profiles:
core(api + web + admin),full(adds ml-serving),mlops(adds MLflow + Airflow),ai(adds Ollama + LiteLLM). Mix as needed.
Definition of done (per feature)
- Code + tests merged.
- Module's
README.mdupdated. - If it changes a contract →
shared-typesregenerated + consumers updated. - If it changes architecture → ADR added.
- Deployable via
docker compose uplocally. - If it touches user data → a deletion path exists and is tested.
AI stack
oO generates tips with an LLM and ranks them with a bandit. All LLM calls route through LiteLLM at llm.alogins.net using model aliases — swapping models is a config change, not a code change.
| Alias | Model | Used by |
|---|---|---|
tip-generator |
qwen2.5:7b (default) | ml/serving tip generation |
embedder |
nomic-embed-text | task clustering, dedup |
judge |
claude-haiku-4-5 (cloud, eval only) | offline sim |
Env vars: LITELLM_URL (default http://localhost:4000), OLLAMA_URL (default http://localhost:11434).
Start with: docker compose --profile ai up (adds Ollama + LiteLLM locally). In prod both are shared Agap services.
LLM tip generation pipeline:
ml/features/context.pyassembles user signals → structured prompt contextPOST /generateinml/servingcalls LiteLLM → returnsTipCandidate[]- Bandit policy in
ml/servingscores + ranks candidates - Best candidate returned as tip; reaction closes the online reward loop
Current phase
M1 shipped. M2 (AI tips) in progress. See README.md for the phase roadmap and docs/architecture/ for diagrams. Work is tracked as Gitea milestones + issues on alvis/oO.
Active work: AI tip generation pipeline — issues #86–#93 in M2 milestone.
What NOT to do
- Don't copy Todoist's data into our DB. Store the OAuth token + computed features/derivatives we need, fetch raw on demand.
- Don't implement auth by hand. Auth.js behind an OIDC-shaped boundary (ADR-0004); swap to a dedicated OIDC provider only when mobile ships.
- Don't hardwire a recommender. The contract is
POST /recommend → {tip}. Swap internals (bandit, LLM, hybrid), keep contract. - Don't replace a policy in one step. New policies deploy shadow-first; promoted only after offline + online agreement with the incumbent (ADR-0002).
- Don't over-split processes. Extract a service when pressure demands it, not in anticipation (ADR-0003).
- Don't call LLMs directly from application code. All LLM calls go through
ml/serving(Python) viaLITELLM_URL. The TS recommender never holds a model name. - Don't embed MLflow/Airflow/OpenWebUI in the admin panel. They are external services; link out to them. The admin shell links to
o.alogins.net/mlflow,/airflow,ai.alogins.net.