The second edition of Speero’s Experimentation Program Benchmark Report (for 2023) is complete. Since our first report in 2021, the landscape has changed radically. The pandemic, recession, and inflation should all have decreased experimentation across companies and brands. Oh, and did we mention AI? But that isn’t what happened.

In fact, experimentation maturity has increased across the board. Even in the most competitive business environments, brands are turning toward experimentation to stay ahead, whether it’s to drive growth in times of recession, learn and adapt quickly in volatile markets, or make sound decisions and avoid risks or failures in highly competitive markets.

The audit structure is still the same, but the question set has evolved since its first incarnation as we learned from running more and more program audits. Still, this benchmark audit remains a classic program management tool: it asks questions that help find gaps in the efficiency (speed) and effectiveness (impact) of experimentation programs.

Big thanks to Kattie Kelly for help with writing, and Kameleoon for sponsoring the report.

What Speero Discovered

During the analysis, Speero discovered a number of easy-to-implement practices that most brands skip but should adopt if they want to speed up their programs. We’ve also explored what the highest-maturity businesses are doing, with advice on which areas to focus on if you want to increase the effectiveness of your testing program.

We’ll cover four key takeaways from the report. If you want the whole deal, you can download it from this page.

You should also try out Speero’s free experimentation maturity audit, so you can compare your performance against benchmarked companies.

Research Methodology

The dataset used in this report includes 119 respondents who completed the audit between October 2021 and December 2022. We solicited responses via our company newsletters, website, and social media channels. It’s a beefy survey, so we were quite happy to have this many programs benchmark themselves. The audit asked respondents to answer a range of questions across four key pillars of experimentation:

Strategy & Culture
People & Skills
Data & Tools
Process & Methodology

The questions were either statements rated on a 0-10 Likert scale anchored by “don’t agree” and “strongly agree,” or open-ended questions. To apply a narrative to the dataset, we grouped the Likert scores with a weighting similar to the one used in Net Promoter Score: 0-6, 7-8, and 9-10. We attributed statements to these bracketed scores:

0-6: Disagree with the statement.
7-8: Somewhat agree with the statement.
9-10: Strongly agree with the statement.
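
The bracketing above can be sketched in a few lines of Python. This is only an illustration of the grouping rule, not the tooling Speero used; the function names and the sample scores are hypothetical.

```python
from collections import Counter

LABELS = ("disagree", "somewhat agree", "strongly agree")

def bucket(score: int) -> str:
    """Map a 0-10 Likert score to one of the three brackets (0-6, 7-8, 9-10)."""
    if not 0 <= score <= 10:
        raise ValueError(f"score must be between 0 and 10, got {score}")
    if score <= 6:
        return "disagree"
    if score <= 8:
        return "somewhat agree"
    return "strongly agree"

def bracket_shares(scores: list[int]) -> dict[str, float]:
    """Return the percentage of responses that fall in each bracket."""
    if not scores:
        raise ValueError("at least one score is required")
    counts = Counter(bucket(s) for s in scores)
    return {label: round(100 * counts[label] / len(scores), 1) for label in LABELS}

# Hypothetical answers to a single statement (not data from the report).
sample = [3, 7, 9, 10, 6, 8, 2, 5, 9, 4]
shares = bracket_shares(sample)
print(shares)
```

Percentages such as “58% disagree” later in this post are shares of this kind, computed per statement.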

Maturity Levels (in Experimentation Programs)

Each respondent received an “overall experimentation maturity score” based on their answers. The scores correspond to different maturity levels as follows:

Beginner – 0-20% Overall Score

Businesses at the start of their experimentation journey. Few of the fundamental building blocks needed to run an effective experimentation program have been implemented yet.

Aspiring – 21-40% Overall Score

Businesses that have established some of the important elements needed in preparation for running an effective experimentation program. These businesses typically have many internal hurdles to overcome and practices to implement in order to run a successful experimentation program.

Progressive – 41-60% Overall Score

Characterized by businesses that are starting to recognize the importance of insight-driven experiments and the need to improve their processes to increase the performance of their work. They have the necessary foundational elements in place to run a basic experimentation program.

Strategic – 61-80% Overall Score

Businesses that have most of the foundational and some advanced practices in place, employing a strategic approach to experimentation. They are likely to have wider company buy-in for experimentation as a core business growth driver due to the results from their work.

Transformative – 81-100% Overall Score

These businesses are the industry elite. They are outperforming their competition through a well-oiled experimentation program that is consistently delivering results.
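
As a rough sketch, the level bands above translate into a simple lookup. The report does not say how fractional scores are handled, so treating each upper bound as inclusive (for example, 20.5% counts as Aspiring) is an assumption here, as are the names `LEVELS` and `maturity_level`.

```python
# Upper bound of each band (in percent) and the level it maps to.
LEVELS = [
    (20, "Beginner"),
    (40, "Aspiring"),
    (60, "Progressive"),
    (80, "Strategic"),
    (100, "Transformative"),
]

def maturity_level(overall_score_pct: float) -> str:
    """Return the maturity level for an overall score between 0 and 100."""
    if not 0 <= overall_score_pct <= 100:
        raise ValueError(f"score must be between 0 and 100, got {overall_score_pct}")
    for upper_bound, name in LEVELS:
        # Assumption: a score exactly on a boundary belongs to the lower band.
        if overall_score_pct <= upper_bound:
            return name
    raise AssertionError("unreachable: 100 is the final upper bound")

print(maturity_level(47))
```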

Takeaway 1 — 40% of Businesses Have no Dedicated Person Responsible for Experimentation

Graph: responses to the statement “You have a dedicated person or team that is responsible for:”

Key Findings:

Only 59% of respondents strongly or somewhat agree they have a dedicated person responsible for experimentation.
Businesses are more likely to have dedicated UX or data analyst resources than dedicated experimentation resources.
Respondents were least likely to have a dedicated person responsible for qualitative research (58% disagree).

We can see that dedicated expert resources (in the example below, experimentation resources) increase as maturity increases.

For example, at the aspiring maturity level, only 17% of businesses strongly agree that they have a person dedicated to experimentation, whereas 92% of transformative businesses strongly agree. But dedicated expert resources are somewhat of a chicken-and-egg scenario.

You have a dedicated person or team that is responsible for experimentation:

What Can You Do About This?

Without dedicated experimentation resources, it can be hard to:

Increase test velocity
Ensure you are implementing a robust process
Run advanced or better-informed tests

Without dedicated experimentation resources, it’s hard to increase maturity. However, businesses are reluctant to invest in dedicated roles until experimentation proves successful and thus worth the investment. But to reach this point, you often need dedicated specialists.

A good interim solution is to hire external consultants and pair them up with an internal resource that has the traction inside the organization to make changes happen but lacks the experimentation skills and know-how.

This combination can quickly implement and validate experimentation techniques that benefit the company, building trust in the discipline and dramatically steepening the experimentation maturity growth curve.

Resources:

RASCI Matrix: Use this template to assign team roles and responsibilities across the business and program. Get the template.

Five key traits to look for when hiring an experimenter. Read the post.

How to structure your optimization and experimentation teams. Read the post.

Takeaway 2 — 58% of Brands Don’t Have a Knowledge Base

You have a repository or knowledge base where tests and testing insights are stored by all teams

Key Findings:

58% of respondents don’t have a testing knowledge base.
No beginner or aspiring maturity stage respondents have a knowledge base.
Only 16% of respondents somewhat agreed, and 26% strongly agreed.

Again, experimentation team structure impacts the necessity of a knowledge base. However, you could argue that one is always necessary, even without a testing team in-house, to ensure that insights are shared across internal departments and a history of what has been tested is maintained.

Even so, nearly all (91%) respondents with no dedicated team have no knowledge base.

Interestingly, a high number (57%) of decentralized teams aren’t using a knowledge base. With a decentralized team, it’s especially important that experiments and their learnings are documented and shared across the dispersed teams, who often work independently.

What Can You Do About This?

It appears that maturity level correlates more with whether a knowledge base is used than experimentation team structure.

This represents a big opportunity for improvement for early maturity stage businesses as the creation and record keeping of tests and insights is a relatively easy and low-cost endeavor. You simply need:

A consistent format for recording the data.
A place to store and search the data (possible with a free Airtable setup).
A process for recording, storing, and accessing the data that all stakeholders know.
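
To make “a consistent format” concrete, here is a minimal sketch of what one record and a basic search could look like. The field names, the `ExperimentRecord` class, and the `search` helper are hypothetical illustrations, not a format prescribed by the report; the same columns would work in a spreadsheet or an Airtable base.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class ExperimentRecord:
    """One entry in a testing knowledge base (hypothetical fields)."""
    name: str
    hypothesis: str
    start: date
    end: date
    outcome: str  # e.g. "win", "loss", or "inconclusive"
    insight: str
    tags: list[str] = field(default_factory=list)

def search(records: list[ExperimentRecord], term: str) -> list[ExperimentRecord]:
    """Return records whose name, hypothesis, insight, or tags mention the term."""
    term = term.lower()
    return [
        r for r in records
        if term in r.name.lower()
        or term in r.hypothesis.lower()
        or term in r.insight.lower()
        or any(term in tag.lower() for tag in r.tags)
    ]

# Hypothetical entries for illustration only.
records = [
    ExperimentRecord(
        name="Checkout form length",
        hypothesis="Fewer form fields will raise checkout completion.",
        start=date(2023, 1, 9),
        end=date(2023, 1, 30),
        outcome="win",
        insight="Optional fields were the main source of drop-off.",
        tags=["checkout", "forms"],
    ),
    ExperimentRecord(
        name="Homepage hero copy",
        hypothesis="Benefit-led copy will increase sign-ups.",
        start=date(2023, 2, 6),
        end=date(2023, 2, 20),
        outcome="inconclusive",
        insight="No detectable difference between variants.",
        tags=["homepage", "copy"],
    ),
]

matches = search(records, "checkout")
print([r.name for r in matches])
```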

Resources:

Pitching a Data Strategy? Here's How to Ensure the C-Suite Says “Yes.” Read the post.

Data Discrepancies in Google Analytics: What Can Go Wrong, Why, & How to Fix It. Read the post.

Don't be fooled by deceptive data. Read the post.

Takeaway 3 — 62% of Businesses Don’t Have Metrics to Measure Their Experimentation Program

Key Findings:

62% of respondents disagree with the statement, “You have experimentation metrics at a team or program level that help you manage and assess the health of your experimentation efforts,” with 20% of respondents somewhat agreeing and only 18% strongly agreeing.

All transformative-level respondents had some level of experimentation program metrics in place. None of the beginner or aspiring-level businesses had such metrics in place.

It’s hard to imagine how any team could improve its processes and results without some form of measurement of the activity itself.

What Can You Do About It?

Experimentation program metrics help teams assess how effective their processes, workflows, and even resources are.

For example, metrics such as “time in workflow stage” might help identify bottlenecks and show where more resources are needed.

The learning rate from tests can indicate the quality of the hypotheses being tested. As these examples illustrate, such metrics are beneficial and should be implemented earlier in the maturity stages than our research suggests.
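
Both example metrics are easy to compute once you log when each test enters and leaves a workflow stage. The sketch below is only an illustration: the stage names and the log are made up, and “learning rate” is assumed here to mean the share of completed tests that produced a documented learning, which is one possible definition rather than the report’s.

```python
from collections import defaultdict
from datetime import date

# Hypothetical log: (test id, workflow stage, date entered, date left).
stage_log = [
    ("T1", "design", date(2023, 1, 2), date(2023, 1, 5)),
    ("T1", "development", date(2023, 1, 5), date(2023, 1, 16)),
    ("T2", "design", date(2023, 1, 3), date(2023, 1, 4)),
    ("T2", "development", date(2023, 1, 4), date(2023, 1, 17)),
]

def avg_days_per_stage(log) -> dict[str, float]:
    """Average number of days tests spend in each workflow stage."""
    durations = defaultdict(list)
    for _test_id, stage, entered, left in log:
        durations[stage].append((left - entered).days)
    return {stage: sum(days) / len(days) for stage, days in durations.items()}

def learning_rate(produced_learning: list[bool]) -> float:
    """Share of completed tests that produced a documented learning (assumed definition)."""
    if not produced_learning:
        raise ValueError("at least one completed test is required")
    return sum(produced_learning) / len(produced_learning)

stage_averages = avg_days_per_stage(stage_log)
rate = learning_rate([True, True, False, True])
print(stage_averages, rate)
```

In this made-up log, development takes far longer than design, which is the kind of bottleneck a “time in workflow stage” metric is meant to surface.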

Where experimentation maturity is concerned, program metrics are a crucial tool to help you advance.

Resources:

Program Metrics: This blueprint helps you measure important factors that impact the success of your experimentation program and therefore the results you can generate. Get the blueprint.

A/B testing workflow map: an example workflow map for an A/B test showing the different steps right before a test goes live, during, and afterward. Get the blueprint.

Test phase gate framework: use this as a communication tool to align the team on how things work and the cadence and flow of the experimentation flywheel. Get the blueprint.

Takeaway 4 — Most Early-stage Programs Lack a Prioritization Framework

You have a well-defined prioritization framework for assessing the potential impact/cost of experiments that has been customized to meet the needs of your organization.

Key Findings:

Overall, 64% of respondents disagreed with the statement “You have a well-defined prioritization framework for assessing the potential impact/cost of experiments that has been customized to meet the needs of your organization,” with only 36% somewhat or strongly agreeing.

When asked whether “Your prioritization framework is being used to assess every hypothesis and test idea,” 17% of transformative and 44% of strategic-level businesses disagreed.

In an ideal world, no business would run A/B testing without prioritization in place. Test ideas are often in abundance, and many are proffered up by those with strong opinions and little data, which is why prioritization is necessary.

Early in experimentation maturity, every test outcome counts towards the business's level of confidence in experimentation. As companies increase maturity, test velocity becomes important; thus, every test run must be of the highest potential. So both low and high-maturity programs need a good prioritization framework.

That’s why it’s rather alarming that 8% of transformative and 41% of strategic-level businesses disagree that they have a well-defined prioritization framework. However, this result could be due to the statement wording.

Not every business needs to customize its prioritization framework; for some, out-of-the-box versions like PXL, PIE, or ICE work well. Yet, most companies have more nuanced goals and needs, meaning that customization of the framework makes it much more useful.
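
One common way to customize a framework is to weight the criteria according to what matters most to your business. The sketch below borrows the three ICE criteria (impact, confidence, ease) and applies hypothetical weights; the weights, the 1-10 ratings, and the test ideas are invented for illustration, and this is not the official scoring method of PXL, PIE, or ICE.

```python
# Hypothetical weights reflecting one organization's priorities (sum to 1).
WEIGHTS = {"impact": 0.5, "confidence": 0.3, "ease": 0.2}

def priority_score(ratings: dict[str, float], weights: dict[str, float] = WEIGHTS) -> float:
    """Weighted sum of 1-10 ratings across the framework's criteria."""
    missing = weights.keys() - ratings.keys()
    if missing:
        raise ValueError(f"missing ratings for: {sorted(missing)}")
    return sum(weights[criterion] * ratings[criterion] for criterion in weights)

# Hypothetical backlog of test ideas, each rated 1-10 per criterion.
ideas = {
    "Simplify checkout form": {"impact": 8, "confidence": 6, "ease": 4},
    "Change button colour": {"impact": 2, "confidence": 3, "ease": 10},
    "Add reviews to product page": {"impact": 7, "confidence": 8, "ease": 6},
}

# Highest-scoring ideas first.
ranked = sorted(ideas, key=lambda name: priority_score(ideas[name]), reverse=True)
for name in ranked:
    print(f"{priority_score(ideas[name]):.1f}  {name}")
```

Changing the weights, or adding criteria such as strategic fit, is where the customization happens; the scoring mechanics stay the same.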

A subsequent question from our survey asks about the usage of the prioritization framework. When asked whether “Your prioritization framework is being used to assess every hypothesis and test idea,” 17% of transformative and 44% of strategic-level businesses disagreed. Based on themes seen in the qualitative responses, we can speculate that HiPPOs (highest-paid person’s opinions) remain a problem at all maturity levels.

Your Prioritization Framework is being used to assess every hypothesis and test idea.

Resources:

PXL: If you want to improve your prioritization, move on to the PXL model. It is much more detailed and tailored to A/B test ideas and hypotheses. Get the blueprint.

How to Prioritize Your A/B Test Ideas. Read the post.

Conclusion

Another benchmark report blasted into the world wide web. We hope it proves helpful, makes your program better, and gives you tactical advice and good questions to improve experimentation in your business.

Your best bet for maturing your experimentation program is to:

Focus on speed; the process will follow. Just be prepared and OK when things break. That’s part of the process.
Focus on impact with proper strategic roadmapping that comes from getting CLOSE to both your org’s growth model and the customer problem. ResearchXL ftw! Check it out.
Build a learning repo. You have shoulders to stand on; you just need to exercise ’em! Build a system to document, archive, swimlane, and watchtower your test-and-learn culture. XOS blueprint ftw! Check it out.

Get the Experimentation Program Benchmark Report 2023

If you want to see what the most experimentation-mature companies are doing, plus tips on where to focus if you want to catch up with them, all connected with blueprints of must-try experimentation practices, download this report.

Have trouble maturing your experimentation program? Reach out.

