Method
Direction for the agentic conversion system. Nothing here is a committed design — it states what we want to build and what we want to figure out by building it.
What we want to build
A pipeline that takes a conversion question, pulls relevant data from the sources mapped in `semantic.md`, surfaces what looks wrong, proposes hypotheses that pass the synthesis meta-checklist (data integrity / cohort identity / confounder), and produces concrete page / content / creative ideas to test on the new web.
The first versions can be rough. The point is a pipeline we can run, inspect, criticize, and improve — not a perfect first answer.
Pattern we are exploring
analytics → suggestions → execution
- Analytics: what is happening in the funnel right now.
- Suggestions: why it might be happening; what to test.
- Execution: the concrete change to make and how to evaluate it.
This is the shape we are starting with. It may evolve as we learn what works.
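As a rough sketch of the shape above (type and field names are hypothetical, not a committed interface from architecture.md), one run moving through the three stages could be typed like this:

```typescript
// Hypothetical shapes for one pipeline run. Illustrative only.

interface Analytics {
  question: string;       // the conversion question being asked
  observations: string[]; // what is happening in the funnel right now
}

interface Suggestion {
  hypothesis: string;     // why it might be happening
  evidenceFor: string[];
  evidenceAgainst: string[];
}

interface Execution {
  change: string;         // the concrete change to make
  evaluation: string;     // how to evaluate it
}

interface Run {
  analytics: Analytics;
  suggestions: Suggestion[];
  execution: Execution[];
}

// Example run object (contents invented for illustration):
const run: Run = {
  analytics: {
    question: "Why did checkout completion drop last week?",
    observations: ["Mobile checkout drop-off up 12% week over week"],
  },
  suggestions: [{
    hypothesis: "A payment widget regression on mobile",
    evidenceFor: ["Drop isolated to mobile"],
    evidenceAgainst: ["No error spike recorded"],
  }],
  execution: [{
    change: "Roll back the payment widget on mobile",
    evaluation: "Compare mobile checkout completion for 7 days",
  }],
};
```

The value of typing it, even loosely, is that each stage's output is inspectable on its own, which is what "run, inspect, criticize, and improve" requires.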
Capabilities the system should grow into
In rough order:
- Run a conversion question across the relevant sources without re-discovering the schema each time
- Identify the biggest funnel leak and the worst segment at meaningful volume
- Surface replay / session candidates worth a human review
- Name the specific mechanism behind each finding (no canonical taxonomy — describe the thing in plain language)
- Generate hypotheses with evidence for and against
- Produce a test brief: change, segment, success metric, stop rule
- Generate copy / page-section ideas concrete enough to hand to a designer or developer
- Improve from explicit feedback after each run
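The test brief in the list above has a fixed shape (change, segment, success metric, stop rule). A minimal sketch of that shape, with illustrative field names and example values not taken from any committed schema:

```typescript
// Hypothetical test-brief shape; example values are invented.
interface TestBrief {
  change: string;        // what to alter on the page
  segment: string;       // who sees it
  successMetric: string; // what decides the test
  stopRule: string;      // when to call it off
}

const brief: TestBrief = {
  change: "Move the size guide above the fold on product pages",
  segment: "Mobile paid-social visitors",
  successMetric: "Product-to-order rate for the segment",
  stopRule: "Stop after 2 weeks or 1,000 sessions per arm, whichever comes first",
};
```

Requiring a stop rule up front keeps a rough hypothesis from quietly becoming an open-ended experiment.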
Architecture
The committed design — agent topology, models, sources per persona, runtime, run storage, build approach — lives in `architecture.md`. This file stays direction-focused.
Tooling we need to build
Capabilities the system needs but does not yet have. New gaps are added when surfaced by a real run; entries are removed when delivered.
- PostHog `replay_recordings` zero-result behavior — the first PoC run found `replay_recordings` returned 0 sessions despite candidates being present in the same query window. Verify the tool's filter shape against PostHog's API and either fix the MCP wrapper or document the param shape that returns results. Surfaced from `runs/2026-05-08-cyf-pass-through`.
- Cloudflare bot filtering — Cloudflare Bot Management API for filtering bot traffic from funnel reads. Named in week-4 as a likely contributor to direct-traffic inflation — promote when the direct share looks anomalous in a run.
- Per-creative on-site progression — given a list of active Meta creatives, return on-site progression metrics (visit → product → order) per creative so the team can manually back-trace which creative concepts perform on the current web. Precondition for the creative-to-landing match analysis. Likely a PostHog query layered with Meta ad-name attribution.
- Sentry — verify, complete, integrate (near-term). Three steps in order:
  - Verify Sentry is the right error-tracking tool, or whether we need error tracking at all on the frontend.
  - If yes: `@sentry/react` is installed in `version1` but not `version2`; install on all variants so checkout and runtime errors in the variant under test are visible.
  - Expose recent errors as a tool inside the analytics MCP so the agent can query them when diagnosing technical drop-offs.
- Microsoft 365 Connector verification — Data Master is accessible in Claude.ai chats via the Microsoft 365 Connector. Confirm a Claude Code subagent (not a claude.ai chat) can read the live SharePoint workbook through the connector; document the access path. For the Mastra runtime later this won’t work — a Microsoft Graph API integration in the analytics MCP will be needed at that point.
- Cloudflare Browser Run setup — for the Page/UX subagent's live-site navigation. Confirm Cloudflare account, configure CDP endpoint, plug `chrome-devtools-mcp` into the subagent. Same primitive will serve the Bun/Mastra runtime later via `chromium.connectOverCDP()`.
- Conversion-patterns skill — niche-specific named pattern library so the Page/UX agent can identify what's missing on a page beyond generic UX heuristics. v0 (LLM synthesis) is generated; v1 will crawl reference pages with the Page/UX tooling (CEO input pending on brand list); v2 will be team-curated from patterns that recur in real runs. See `../skills/conversion-patterns/PLAN.md`.
- Frontend `af_internal=agent` opt-out — the Page/UX agent navigates the live site for diagnosis (~30 page-views per run); without an opt-out, that traffic pollutes PostHog / GTM / Plausible at a non-trivial share of low-volume days. Add a small handler in `shared/src/utils/`: when `?af_internal=agent` is present on initial page load, call `posthog.opt_out_capturing()` and short-circuit any GTM dataLayer pushes for the session. Once shipped, the agent skill (conversion-page-inspection) gets a rule to append the param on every `navigate_page`.
- Run-output indexing — once `runs/` grows beyond ~50 entries, a flat `index.md` or lightweight DB will help. Defer.
Open questions
A few design questions remain — we expect to answer them by running the system, not by writing answers up front:
- Whether named patterns emerge across runs that would be worth promoting into a small shared vocabulary (curation-driven, not pre-committed taxonomy)
- Where execution stops being manual and starts being agent-generated
Principles for whatever we build
- Facts and hypotheses are always separated; volatile numbers stay in the run output, not in the KB.
- Agent outputs are hypotheses, not findings. Every claim is evaluated for plausibility, then either falsified by data or tested at small scale — never absorbed as truth.
- The synthesis meta-checklist in `semantic.md` (data integrity / cohort identity / confounder) is applied before any finding is finalised — “data is misleading” is always on the table.
- Stakeholder-facing language stays plain.
- Rough is fine; opaque is not — the pipeline must be inspectable.