refactor(consents): data-source consents only; drop per-agent consent gate #127

Closed
opened 2026-05-11 11:35:51 +00:00 by alvis · 0 comments
Owner

Problem

Manifest required_consents lists today mix two unrelated dimensions:

  • data:<source> — what data the agent reads.
  • agent:<id> — whether the user opted into this specific agent.

In practice, nothing grants either kind of consent beyond data:core:

  • Only data:core is auto-granted on signup (services/api/src/routes/auth.ts:108, services/api/src/routes/user.ts:31).
  • The OAuth callbacks in services/api/src/routes/integrations.ts (Todoist L113, Google Health L201) write only to integration_tokens — they never insert a data:<provider> consent row.
  • There is no UI that grants agent:<id> consents.

Result: the eligibility filter at services/api/src/profile/eligibility.ts:72 drops every agent for every realistic user, even when snippets are freshly pre-computed.

Symptom: MLflow trace tr-591449ea8a72af8e81b6a585234a86ab shows that user ODGp4Gkr7JWemMsqcMLMn has 5 valid (non-expired) rows in agent_outputs from the scheduler, but recommender called /recommend with agent_ids: [] and the orchestrator fell back to its no-context prompt. The user had already connected both Todoist and Google Health, yet user_consents holds only data:core.

Decision

Collapse to a single consent dimension: data source. Consent is implicit in connecting the source.

  1. Connecting a data source via the UI grants data:<provider>. Disconnecting revokes it.
  2. data:core continues to be auto-granted on signup (no integration needed).
  3. The agent:<id> consent concept is removed. Per-agent control becomes a preference, not a consent: user_preferences[scope='agent:<id>', key='enabled']. eligibility.ts:74 already honours this pref — it just stops being load-bearing once agent:* is removed from manifests.
  4. Eligibility rule (final): an agent is eligible iff every data:* it declares is granted, no active context is in silenced_in_contexts, and its enabled preference is not false.
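
The final rule in item 4 can be sketched as a pure function. This is a minimal illustration in Python, not the actual filter in eligibility.ts (which is TypeScript); the AgentManifest shape, the is_eligible name, and the dictionary layout for preferences are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentManifest:
    # Hypothetical shape; field names mirror the terms used in this issue.
    agent_id: str
    required_consents: tuple[str, ...]
    silenced_in_contexts: frozenset[str] = frozenset()


def is_eligible(manifest, granted_consents, active_contexts, preferences):
    """Return True iff the agent passes all three checks of the final rule."""
    # 1. Every declared data:* consent must currently be granted.
    if not all(key in granted_consents for key in manifest.required_consents):
        return False
    # 2. No active context may appear in silenced_in_contexts.
    if manifest.silenced_in_contexts & set(active_contexts):
        return False
    # 3. Only an explicit False disables the agent; a missing preference
    #    counts as enabled ("not false" in the rule above).
    pref = preferences.get((f"agent:{manifest.agent_id}", "enabled"))
    return pref is not False
```

With this shape, a user who holds data:core and data:google-health and has never touched the toggle passes for an agent that declares exactly those two keys; writing enabled=False to the preference is the only per-agent way to opt out.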

Plan

1. Auto-grant data:<provider> on connect, revoke on disconnect

In services/api/src/routes/integrations.ts:

  • Todoist callback (after L120 insert(integrationTokens)): upsert user_consents row with consentKey='data:todoist', revokedAt=null, grantedAt=now. Use the same onConflict(target=[userId,consentKey]) shape as services/api/src/routes/user.ts:30-34.
  • Google Health callback (after L210): same, with consentKey='data:google-health'.
  • DELETE /:provider handler (L216): set revokedAt=now on the matching consent row. Do not hard-delete — preserve audit trail.
  • Factor a helper grantDataSourceConsent(userId, provider) / revokeDataSourceConsent(userId, provider) so future providers call one function and provider → consentKey lives in one place.
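
A rough sketch of the helper pair, written in Python against SQLite so it can be run standalone. The real helpers would live in the TypeScript API and use its own onConflict call; the snake_case column names, the provider slugs, and the CONSENT_KEY_BY_PROVIDER map are assumptions for illustration only.

```python
import sqlite3
from datetime import datetime, timezone

# Assumed provider slugs; the consent keys are the ones named in this issue.
CONSENT_KEY_BY_PROVIDER = {
    "todoist": "data:todoist",
    "google-health": "data:google-health",
}

# Assumed table layout, with one row per (user, consent key).
SCHEMA = """
CREATE TABLE user_consents (
    user_id     TEXT NOT NULL,
    consent_key TEXT NOT NULL,
    granted_at  TEXT NOT NULL,
    revoked_at  TEXT,
    PRIMARY KEY (user_id, consent_key)
)
"""


def _now():
    return datetime.now(timezone.utc).isoformat()


def grant_data_source_consent(db, user_id, provider):
    """Upsert the consent row: insert it, or re-grant a revoked one."""
    db.execute(
        """
        INSERT INTO user_consents (user_id, consent_key, granted_at, revoked_at)
        VALUES (?, ?, ?, NULL)
        ON CONFLICT (user_id, consent_key)
        DO UPDATE SET granted_at = excluded.granted_at, revoked_at = NULL
        """,
        (user_id, CONSENT_KEY_BY_PROVIDER[provider], _now()),
    )


def revoke_data_source_consent(db, user_id, provider):
    """Soft-revoke: stamp revoked_at and keep the row for the audit trail."""
    db.execute(
        """
        UPDATE user_consents SET revoked_at = ?
        WHERE user_id = ? AND consent_key = ? AND revoked_at IS NULL
        """,
        (_now(), user_id, CONSENT_KEY_BY_PROVIDER[provider]),
    )
```

Keeping the provider-to-key map next to the two functions means a new provider is one dictionary entry plus a call from its callback. An unknown provider raises KeyError here, which seems preferable to silently writing an unexpected consent key.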

2. Backfill existing tokens

Migration in services/api/src/db/migrations.ts: for every row in integration_tokens with tokenStatus='active', insert a data:<provider> consent row if missing. Idempotent. Required so users who connected before this change aren't stuck.
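
One way to get the idempotence is a single INSERT ... SELECT guarded by NOT EXISTS. The sketch below uses Python with SQLite purely for illustration; the actual migration belongs in migrations.ts. It assumes snake_case column names, at most one token row per (user, provider), and that the provider column already holds the slug used in the consent key (for example google-health).

```python
import sqlite3

# Assumed minimal layouts for the two tables involved.
DEMO_SCHEMA = """
CREATE TABLE integration_tokens (
    user_id      TEXT NOT NULL,
    provider     TEXT NOT NULL,
    token_status TEXT NOT NULL,
    PRIMARY KEY (user_id, provider)
);
CREATE TABLE user_consents (
    user_id     TEXT NOT NULL,
    consent_key TEXT NOT NULL,
    granted_at  TEXT NOT NULL,
    revoked_at  TEXT,
    PRIMARY KEY (user_id, consent_key)
);
"""


def backfill_data_source_consents(db, now):
    """Insert data:<provider> for every active token that has no consent row.

    Rows that already exist are left untouched, including revoked ones,
    so running the backfill a second time changes nothing.
    """
    db.execute(
        """
        INSERT INTO user_consents (user_id, consent_key, granted_at, revoked_at)
        SELECT t.user_id, 'data:' || t.provider, ?, NULL
        FROM integration_tokens AS t
        WHERE t.token_status = 'active'
          AND NOT EXISTS (
              SELECT 1 FROM user_consents AS c
              WHERE c.user_id = t.user_id
                AND c.consent_key = 'data:' || t.provider
          )
        """,
        (now,),
    )
```

Whether a previously revoked consent should be re-granted when the token is still active is a policy question this sketch does not settle; it follows the "if missing" wording above and skips such rows.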

3. Drop agent:<id> from manifests

Edit required_consents in every file under ml/agents/*.py:

  • overdue_task.py:73
  • focus_area.py:38
  • momentum.py:124
  • time_of_day.py:129
  • recent_patterns.py:134
  • health_vitals.py:43

Remove only the trailing agent:<id> entry from each list. data:core and data:<source> entries stay.

Update ml/agents/tests/test_manifest.py:46-47: assert every entry in required_consents starts with data: (no agent: allowed) and data:core is present.
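
The updated assertion could be factored as a small checker that the test calls once per manifest. The function name and the error-list return style are assumptions, not the existing contents of test_manifest.py.

```python
def manifest_consent_errors(required_consents):
    """Return a list of problems with a manifest's required_consents.

    An empty list means the manifest follows the new rule: every entry
    is a data:* key and data:core is declared.
    """
    errors = []
    for key in required_consents:
        if not key.startswith("data:"):
            errors.append(f"non-data consent not allowed: {key}")
    if "data:core" not in required_consents:
        errors.append("data:core must be declared")
    return errors
```

In the test this would reduce to asserting that manifest_consent_errors(manifest.required_consents) is empty for every registered agent, so a stray agent:<id> entry fails with a message naming the offending key.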

(Optional, separate follow-up): rename the field required_consents → required_data_sources on the manifest type (ml/agents/manifest.py:42) and the TS wire shape (services/api/src/profile/eligibility.ts:20) once the consent type is collapsed. Skip if it adds churn.

4. Eligibility filter

No behaviour change once manifests are clean. Update the docblock at services/api/src/profile/eligibility.ts:1-12 to describe the new model: data-source consents + active-context silencing + per-agent enabled preference.

5. Admin / Web UI

  • Existing /connect page already lists integrations and statuses; that page is the consent surface (no new UI required for the consent half).
  • New: per-agent enable/disable toggle in agent settings that writes user_preferences[scope='agent:<id>', key='enabled']. This replaces the user-facing role the old agent:* consents pretended to play.

6. Tests

  • Update services/api/src/profile/__tests__/eligibility.test.ts cases that include agent:* consents — drop them (most cases only ever included data:*).
  • New unit test in services/api/src/routes/__tests__/integrations.test.ts (or wherever this lives): Todoist callback inserts data:todoist into user_consents; DELETE /:provider sets revokedAt.
  • Migration test: a user with an active Todoist token and no data:todoist consent gets one inserted after backfill runs; idempotent on a second run.

7. ADR

Add docs/adr/0015-data-source-consents.md superseding the relevant section of ADR-0014. State the new rule: agent manifests express data dependencies as data:* only; per-agent control is a preference, not a consent. Reference this issue and trace tr-591449ea8a72af8e81b6a585234a86ab.

Out of scope

  • Google token refresh.
  • Renaming the manifest field — optional, separate PR.

Verification

After this lands, user ODGp4Gkr7JWemMsqcMLMn (Todoist + Google Health both connected) should pass eligibility for all six currently registered agents (the original five plus health-vitals) on the next /recommend request. The orchestrator should receive a non-empty agent_ids. The pending follow-up ("don't schedule non-consented agents") becomes safe to land after this.

alvis added the ml, backend labels 2026-05-11 11:35:51 +00:00
alvis closed this issue 2026-05-11 11:47:36 +00:00

Reference: alvis/oO#127