chore(scheduler): skip agents whose data sources aren't granted #128
Problem
services/api/src/signals/agent-scheduler.ts:70-79 iterates over every (active-user × agent) pair every 15 minutes and calls computeAndStore unconditionally. There is no consent check, even though the recommender's eligibility filter (services/api/src/profile/eligibility.ts) discards every snippet at request time for users who haven't granted the corresponding data sources.
For user ODGp4Gkr7JWemMsqcMLMn (MLflow trace tr-591449ea8a72af8e81b6a585234a86ab) this manifests as 5 fresh rows in agent_outputs that no /recommend call will ever forward. Multiply by every user who connected nothing and you have continuous wasted DB writes, wasted ml/serving /agents/{id}/infer calls, and noisy logs.
Fix
In runCycle (services/api/src/signals/agent-scheduler.ts:70), look up the eligibility set once per user and skip agents not in it.
getEligibleAgentIds already considers active contexts and the per-agent enabled preference, so a user in a silenced context (e.g. vacation) also stops generating snippets, which is a good side effect.
Dependency
Blocked by #127 — until the data-source consent refactor lands, this filter would drop everything for almost every user (no one has the per-agent consents granted today). Land #127 first, then this becomes safe.
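For reference, the skip described under Fix might look roughly like the sketch below. It is only an illustration: the SchedulerDeps interface, the listActiveUserIds and listAgentIds helpers, the CycleStats shape, and the assumption that getEligibleAgentIds returns a Set of agent ids are all hypothetical, and the real signatures in agent-scheduler.ts and eligibility.ts may differ.

```typescript
// Hypothetical shapes for illustration; not the actual module signatures.
type UserId = string;
type AgentId = string;

interface SchedulerDeps {
  listActiveUserIds(): Promise<UserId[]>;
  listAgentIds(): Promise<AgentId[]>;
  // Assumed to return the agents this user is currently eligible for.
  getEligibleAgentIds(userId: UserId): Promise<Set<AgentId>>;
  computeAndStore(userId: UserId, agentId: AgentId): Promise<void>;
}

interface CycleStats {
  ok: number;
  skipped: number;
}

async function runCycle(deps: SchedulerDeps): Promise<CycleStats> {
  const stats: CycleStats = { ok: 0, skipped: 0 };
  const userIds = await deps.listActiveUserIds();
  const agentIds = await deps.listAgentIds();

  for (const userId of userIds) {
    // One eligibility lookup per user, not one per (user, agent) pair.
    const eligible = await deps.getEligibleAgentIds(userId);

    for (const agentId of agentIds) {
      if (!eligible.has(agentId)) {
        // No inference call and no agent_outputs write for this pair.
        stats.skipped += 1;
        continue;
      }
      await deps.computeAndStore(userId, agentId);
      stats.ok += 1;
    }
  }
  return stats;
}
```

Passing the dependencies in as an object is a choice made for this sketch so the loop can be exercised with in-memory fakes; the real scheduler may import these functions directly.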
Optional cleanup
After both land, prune agent_outputs rows that no longer pass eligibility for their owner. This is optional: the rows expire on their own, so skip it unless table growth is a concern.
Tests
Add a unit test in services/api/src/signals/__tests__/agent-scheduler.test.ts (or extend an existing one): seed a user with data:core only and an agent manifest that requires data:todoist; run one cycle; assert that no agent_outputs row is written for the Todoist-requiring agent and that one is written for a data:core-only agent.
Verification
After shipping, run one scheduler cycle against a fresh user with no integrations: expect zero agent_outputs rows (only data:core is granted, and no agent requires only data:core today after #127; health-vitals requires data:google-health, focus/overdue require data:todoist, etc.). Confirm that ok in the log line includes only consented-source agents.
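The data-source part of the scenario used in both the unit test and the verification step (a user with only data:core granted) comes down to a set-containment check. The sketch below models just that part, under stated assumptions: the AgentManifest shape, the requires field, the eligibleAgentIds helper, and the two agent ids are hypothetical stand-ins, and the real getEligibleAgentIds also accounts for active contexts and the per-agent enabled preference, which this ignores.

```typescript
// Hypothetical manifest shape; the real manifest format may differ.
interface AgentManifest {
  id: string;
  requires: string[]; // data sources that must all be granted
}

// An agent is eligible only if every data source it requires is granted.
function eligibleAgentIds(
  manifests: AgentManifest[],
  granted: Set<string>,
): Set<string> {
  return new Set(
    manifests
      .filter((m) => m.requires.every((source) => granted.has(source)))
      .map((m) => m.id),
  );
}

// Seed data mirroring the test scenario: the user has granted data:core only.
const manifests: AgentManifest[] = [
  { id: "core-only-agent", requires: ["data:core"] },
  { id: "todoist-agent", requires: ["data:core", "data:todoist"] },
];
const granted = new Set<string>(["data:core"]);

const eligible = eligibleAgentIds(manifests, granted);
// core-only-agent is eligible; todoist-agent is not, so a scheduler that
// honors this set would write no agent_outputs row for it.
```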