From page tests to journey tests
CRO is shifting from optimising isolated pages to optimising full journeys. Here is the practical how: examples, constraints, and a playbook that works without big-tech infrastructure.
Why page-level optimisation stops scaling
Page tests are not wrong. They are often the fastest place to start. The problem is the assumption that local lift equals global impact. Users do not convert in a straight line. They bounce, return later, switch devices, and arrive through different channels. A single landing page is one touchpoint in a longer sequence.
Journey testing is the practical response. You still run controlled experiments, but you design them around how conversion happens in reality. That changes which unit you randomise, which metric you optimise, and how you interpret sample size and attribution.
Local lift is not enough
Improving CTR on a page changes who reaches the downstream steps. The user mix shifts, and the end business outcome can move in a different direction from the local metric.
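A toy calculation makes this concrete. The numbers below are invented for illustration, not taken from a real test:

```python
# Toy illustration with invented rates: a headline change lifts CTR,
# but the marginal clickers convert worse, so the end metric falls.

def blended_conversion(ctr: float, downstream_rate: float) -> float:
    """Share of all visitors who click and then convert downstream."""
    return ctr * downstream_rate

control = blended_conversion(0.10, 0.20)  # 10% click, 20% of clickers convert
variant = blended_conversion(0.15, 0.12)  # 15% click, but only 12% convert

print(f"control: {control:.1%}, variant: {variant:.1%}")
# control: 2.0%, variant: 1.8% -- a local CTR win, a global loss
```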
Channels interact
Email, paid search, and in-product nudges can amplify or cancel each other. Journey experiments force you to state which touchpoints are in scope.
Devices break assumptions
Session-based assignment falls apart when a user starts on mobile and completes on desktop. The journey spans identities.
A beginner-friendly way to define a journey
Start with a sentence. A journey is a start event, the critical steps you might change, and an end event that matches the decision. For many teams that looks like ad click, landing page, sign-up, onboarding touchpoint, and first value event.
Then decide three things: your unit of analysis, your exposure definition, and your analysis window. These choices determine whether results are interpretable and whether you can scale the programme.
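To keep briefs consistent, the journey definition can live as data. A minimal sketch in Python; the field names are illustrative, not a standard schema:

```python
from dataclasses import dataclass

@dataclass
class JourneySpec:
    start_event: str                   # where the journey begins
    steps: list[str]                   # touchpoints you might change
    end_event: str                     # the outcome that matches the decision
    unit: str = "user"                 # "user", "account", or "session"
    exposure: str = "first_step_view"  # when a unit counts as exposed
    window_days: int = 14              # analysis window after exposure

signup_journey = JourneySpec(
    start_event="ad_click",
    steps=["landing_page", "signup_form", "onboarding_step_1"],
    end_event="first_value_event",
)
```

Writing the journey down this way forces the three decisions before launch, not after.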
Choose a unit that matches the decision
- User: best default for most journeys, especially cross-device and return visits.
- Account: common in B2B SaaS when multiple users share the same organisation outcome.
- Session: only safe when you care about an immediate effect and repeat exposure is unlikely.
If you are not sure, default to user-level assignment. It is easier to explain and less likely to create accidental contamination.
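User-level assignment is also easy to make deterministic: hash a durable user ID with an experiment-specific salt, so the same user sees the same variant on every device and channel. A minimal sketch; the IDs and experiment name are invented:

```python
import hashlib

def assign_variant(user_id: str, experiment: str,
                   variants: tuple[str, ...] = ("control", "treatment")) -> str:
    """Deterministic bucketing: same user and experiment, same variant."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return variants[int(digest, 16) % len(variants)]

# Stable across devices, sessions, and channels:
assert assign_variant("user-42", "journey-q3") == assign_variant("user-42", "journey-q3")
```

The same function can drive email and on-site touchpoints, which matters for the coordinated tests below.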
Journey experiment types you can run today
You do not need a complex platform to start. You need a clear scope and a consistent template. The patterns below work for intermediate practitioners and small teams, as long as you are honest about constraints.
Multi-step funnel tests
Change messaging and friction across steps, not just the entry page. Evaluate impact on the end event, and use driver metrics per step to debug where lift comes from.
- Landing headline plus sign-up form friction
- Pricing message plus checkout reassurance
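Per-step driver metrics are straightforward to tabulate once events are flat. A sketch with pandas, assuming a hypothetical event log with one boolean column per step:

```python
import pandas as pd

# Hypothetical event log: one row per user, one boolean per funnel step.
events = pd.DataFrame({
    "variant":        ["control", "control", "control",
                       "treatment", "treatment", "treatment"],
    "viewed_landing": [True, True, True, True, True, True],
    "signed_up":      [True, False, True, True, True, False],
    "reached_value":  [False, False, True, True, False, False],
})

steps = ["viewed_landing", "signed_up", "reached_value"]
# Step rates per variant: diagnostics for where lift comes from,
# not success criteria in themselves.
print(events.groupby("variant")[steps].mean())
```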
Onboarding flows
Optimise activation, not just sign-up. This is often the highest leverage journey test for SaaS, because it changes long-term retention through early experience.
- Guided setup versus self-serve
- First value checklists and progressive disclosure
Email plus on-site tests
Coordinate a lifecycle touchpoint with an on-site experience. The design challenge is to keep assignment consistent across channels and avoid partial exposure.
- Onboarding email plus in-app nudge
- Cart abandonment email plus checkout banners
The practical constraints you must plan for
Journey testing makes three problems more visible: attribution, sample size, and metric selection. If you do not plan for them, you will ship faster for a month and then stall because results are hard to interpret.
Sample size grows down-funnel
The deeper the outcome, the rarer it is. That usually means a longer runtime or more traffic. If you cannot afford that, use a driver metric as your primary and treat the deeper outcome as directional.
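The effect is easy to quantify with the standard two-proportion sample size formula. A sketch using the normal approximation; the baseline rates and lift are invented:

```python
import math
from scipy.stats import norm

def n_per_arm(p_base: float, p_target: float,
              alpha: float = 0.05, power: float = 0.80) -> int:
    """Sample size per arm for comparing two proportions (normal approx.)."""
    z_alpha = norm.ppf(1 - alpha / 2)
    z_beta = norm.ppf(power)
    variance = p_base * (1 - p_base) + p_target * (1 - p_target)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p_target - p_base) ** 2)

# Same relative lift (+10%), very different traffic needs:
print(n_per_arm(0.20, 0.22))   # driver metric at 20%: ~6,500 per arm
print(n_per_arm(0.02, 0.022))  # deep outcome at 2%: ~81,000 per arm
```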
Attribution is not causality
In journey testing you will see more cross-channel paths and delayed conversions. Attribution models can help you debug, but the experiment design is what makes the result causal. Define exposure, define windows, and stick to the plan.
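Enforcing the window mechanically is the simplest way to stick to the plan. A sketch with pandas; the column names and dates are hypothetical:

```python
import pandas as pd

exposures = pd.DataFrame({
    "user_id": [1, 2],
    "exposed_at": pd.to_datetime(["2024-05-01", "2024-05-01"]),
})
conversions = pd.DataFrame({
    "user_id": [1, 2],
    "converted_at": pd.to_datetime(["2024-05-10", "2024-06-20"]),
})

# A conversion counts only if it lands within the pre-registered
# 14-day window after first exposure.
joined = conversions.merge(exposures, on="user_id")
window = pd.Timedelta(days=14)
joined["counts"] = (joined["converted_at"] >= joined["exposed_at"]) & (
    joined["converted_at"] <= joined["exposed_at"] + window
)
print(joined)  # user 1 counts, user 2 converted after the window closed
```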
Metric selection needs guardrails
Journey experiments are more likely to move intermediate metrics. Use guardrails to avoid accidental harm. If you are not already doing this, start with the guardrail metrics guide.
A practical stance for small teams
- Use one primary journey metric, and keep step metrics as diagnostics
- Plan sample size for the primary, not for every step
- If you look at many metrics, treat wins more conservatively
The winner's curse applies here too. When you celebrate only significant wins, you tend to overestimate lift. See the winner's curse article.
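One concrete way to treat wins more conservatively is a Bonferroni correction: divide the significance threshold by the number of metrics you looked at. A sketch with invented p-values:

```python
# Invented p-values for one primary metric and two diagnostics.
p_values = {"first_value_rate": 0.012, "signup_rate": 0.040, "email_ctr": 0.200}
alpha = 0.05
adjusted = alpha / len(p_values)  # 0.0167 with three metrics

for metric, p in p_values.items():
    status = "win" if p < adjusted else "not significant after correction"
    print(f"{metric}: p={p:.3f} -> {status}")
```

Bonferroni is conservative by design; the point is to pre-commit to a rule rather than celebrate every raw p below 0.05.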
A practical playbook for 1 to 3 person teams
Journey testing feels advanced because it forces clarity. The good news is that the mechanics are simple once you have a stable brief template and a checklist.
Before you launch
- Write the journey in one sentence, and list steps in scope
- Choose the unit and define exposure, including re-entry rules
- Set the primary metric, drivers, and 2 to 4 guardrails
- Plan sample size and duration for the primary outcome
After you launch
- Monitor SRM, exposure, and instrumentation before reading effects (see the SRM check sketched after this list)
- Interpret step metrics as diagnostics, not as success criteria
- Document a decision with confidence intervals, not just p-values
- Feed learnings into the backlog, and keep the cadence consistent
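A sample ratio mismatch (SRM) check is a one-line chi-square test against the planned split. A sketch; the counts are invented, and the 0.001 threshold is a common convention, not a law:

```python
from scipy.stats import chisquare

observed = [50_312, 49_123]  # invented assignment counts per variant
planned_share = [0.5, 0.5]   # the split you configured
expected = [s * sum(observed) for s in planned_share]

stat, p = chisquare(observed, f_exp=expected)
if p < 0.001:  # strict threshold: SRM alarms should be rare and serious
    print(f"possible SRM (p={p:.2g}): fix instrumentation before reading effects")
else:
    print("assignment counts match the planned split")
```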
References
- Kohavi, R., Tang, D., & Xu, Y. (2020). Trustworthy Online Controlled Experiments: A Practical Guide to A/B Testing. Cambridge University Press.
- Kohavi, R., et al. (2020). Online randomized controlled experiments at scale: Lessons and extensions to medicine. Trials, 21, 150.
- Deng, A., et al. (2018). Pitfalls of Long-Term Online Controlled Experiments. Microsoft Research.
- Tang, D., et al. (2010). Overlapping Experiment Infrastructure: More, Better, Faster Experimentation. Proceedings of KDD 2010.
Make the next journey test easier
Use a standard template, define guardrails, and plan sample size for the metric you will actually ship on.