Author: Ben Labay
Title: Speero Managing Director
Contact: Speak directly with the author to discuss these topics
Briefly Experimental is a monthly that aims to inform, educate, and elevate the industry by sharing the latest experimentation learnings and how they translate into better, faster business decision-making.
Edition 17, January
Key Take-Aways
- Experimentation program challenges and pitfalls - key attributes of successful test programs, top mistakes, top scaling blockers, and the required organizational change.
- Is it time to ditch roadmaps? On January 11th, AB Tasty published “A Conversation With Lukas Vermeer of Vista”, in which Lukas pushed for more acknowledgment of uncertainty and thus (I think) more creativity in testing programs.
Headlines Of The Month
- Build vs buy for your experimentation platform? There’s a discussion brewing on LinkedIn here and here on this; Ronny Kohavi is the latest provocateur.
- The importance of experimental design in proper A/B testing can’t be emphasized enough. Adam Stone of Netlify published Why It Matters Where You Randomize Users in A/B Experiments on January 6th.
Industry Recognition - elevating a few who are elevating our industry
Craig Sullivan
Craig is one of the OGs in our industry. He was Peep Laja’s CRO mentor more than a decade ago, and thus the inspiration for much of the procedural innovation Peep put out in CRO, with research-to-testing rituals and artifacts such as ResearchXL and PXL that are so well known in the CRO world. I’m recognizing him here because he 1) continues to contribute in online communities, and 2) I happen to be working with him, Lukas Vermeer, and Manuel Da Costa on pushing the boundaries of CRO by creating and delivering consulting services for program maturity work. In effect, he’s brilliantly auditing organizations’ CRO programs, and it’s amazing to be a part of this work.
Manuel Da Costa
I mentioned Manuel above. I want to highlight him for his deliberate focus on organisational change, and on proper governance and leadership as prerequisites for successful A/B testing programs.
Rommil Santiago
So not only is Rommil an apparently successful Lead Product Manager at Loblaw Digital, but in his spare time he also interviews and elevates CRO practitioners through his side-gig, experimentnation.com. I’ve been introduced to dozens of creative and ambitious CROs through his newsletter and social media work elevating the individuals who make up our industry.
Experimentation Pillar 1 - Strategy & Culture:
THE FOCUS
I was challenged by Luis Trindade of Farfetch to speak to the Farfetch experimentation team about experimentation pitfalls and learnings. It was a roundtable alongside Luis, Lukas Vermeer, Craig Sullivan, and Manuel da Costa.
We were asked some great questions that spoke to the challenges organizations face in running experimentation programs.
Here are the five questions with some of my thoughts bulleted below each:
1. What are the specific attributes you commonly see in companies with highly successful experimentation programmes?
- Buy-in from top leadership (tech-first companies test first; legacy companies with old business models have a hard time with it)
- Strong business ‘editorials’, meaning a super clear vision and purpose; with this, all the micro arrows point in the same direction (also connected to leadership)
- A strong focus on problem definitions, and therefore dedication to and resourcing of customer research
2. What are the top mistakes that companies are making with their experimentation programs and what harm do these cause?
- Perverse incentives - an incorrect metric strategy
- Bad data collection and mismanagement of data (‘data traps’); this is the argument that experimentation should live with data teams
- Not staying connected with customers; this is the argument that experimentation should live with marketing
3. (If not covered) What are the main reasons companies aren’t able to scale their testing effectively (or at all)?
- Capacity (dev, design, automation, etc.)
- Mis-aligned metric and product strategy
- Poor process and governance structures make things move through mud and make it hard to celebrate failure, because the failure isn’t contextualized with program metrics (governance & guardrail metrics).
4. Organisational change seems to be a vital element of building and scaling an experimentation programme. Why is this so important? What are the most common changes you need to make?
- It doesn’t have to be, actually. I got a sneak peek into the experimentation program at Uber Eats through Dan Layfield, who owns part of the digital experience there, and that company has no CoE and no ambassadors or ‘testing champions’. Testing is in the DNA and operations of that business to such a degree that everything is tested. This is an exception for sure, but it highlights what the world looks like at the end of the road perhaps, or at least further down it.
- But these exceptions aside, org change is needed primarily to create and leverage trust down the corporate ladder. Leaders turn from gatekeepers and bottleneck decision-makers into coaches and advocates of the system. It’s autonomy over hierarchy.
5. How do you make the case for the investment or organisational change required, if the current experimentation isn’t working?
- I don’t usually. I prefer to just hang out with people and companies that get it. If you’re going to test, then do it and get better at it by ‘greasing the flywheel’ where it matters: push on the opposing forces of velocity and complexity at the same time, and make sure there’s alignment of business and metric strategy first and foremost.
- That said, there are clear strategic narratives that ‘sell’ why we should experiment and what the alternative is: a situation where innovation is absent and competitors pass you by.
THE INSIGHT
My main insight on this Q&A thought exercise is that leadership is vital. It comes up again and again in the answers above. It comes in two parts:
- Buy-in to the methodology, and pairing that with
- A strong internally aligned product vision and business strategy
THE CASE STUDY
It’s an old document, but it deserves revisiting: spend some time looking over Farfetch’s Digital Experimentation whitepaper. It speaks to leadership’s embrace of experimentation, and I’m sure it allows the company to recruit and retain authentically curious talent.
Experimentation Pillar 2 - Process & Governance:
THE FOCUS
So, is it time to ditch testing roadmaps? Maybe, according to Lukas Vermeer. Process is at the heart of experimentation. It’s core to any business management. But if overly applied, it can kill innovation and creativity.
In a classic 1998 article, Teresa M. Amabile tells us:
“Creativity thrives when managers let people decide how to climb a mountain; they needn’t, however, let employees choose which one.”
I was reminded of this quote when listening to AB Tasty’s “A Conversation With Lukas Vermeer of Vista”, which came out on January 11th. Lukas argues AGAINST experimentation roadmaps, or at least against strict compliance with them without an acknowledgement of uncertainty.
I agree wholeheartedly. In the last year I’ve tested on multiple B2B service websites that are introducing product-led growth models by way of new ‘digital onboarding’ journeys. These are prime examples where one, two, or even 10 tests won’t get to a solution. This line of testing is disruptive to traffic patterns and to sales teams. A roadmap is hard to stick to strictly, beyond the general strategic direction, e.g.,
STRATEGIC GOAL example: “we need to introduce a digital onboarding sign-up flow by X date without hurting closed/won deal numbers (revenue)”
FOCAL AREAS that could pivot the roadmap include:
- Plan/package positioning and messaging
- Form field optimization
- Pricing strategy
- Plan/package discovery wizards and progressive exposure
If micromanagement of any or all of the above focal areas occurs on top of the goal, everything slows down (I’ve seen it so often). Imagine trying to get to the goal without being able to play around with different approaches to these focal areas:
- “No changes to packages/plans”
- “We need all these fields, oh and we don’t want to add any because we heard it caused friction”
- “We can’t show pricing at all”
- “Our CEO likes the package comparison table, so we need to keep this”
If you’ve been a part of an experimentation program long enough you’ve heard others or even yourself say similar things.
Experimentation Pillar 3 - Data & Tools:
THE FOCUS
Buy vs build for your testing tool. This argument completely depends on how high up in the org the company has bought into the methodology of experimentation. Who is using this methodology?
- Website product owners
- CMO
- Product
- Data
- Ops
- CFO even???
There’s a scale to the above list. It relates to what ‘problems’ are being addressed, and it gets increasingly hard to test with 3rd-party tools that only randomize assignments based on sessions or users.
A couple of weeks ago, Diana Jerman of Disney discussed 3 Key Tenets of Experimentation at Disney, all related to how they pushed on accessibility and normalization within a central testing platform across Disney+, Star+, Hulu, and ESPN+:
“We focused on building one platform that is fast and supports all teams enabling innovation experiments on millions of customers around the world. We knew the experimentation platform would also need to be easy to use and create confidence for internal teams. We set out to standardize metrics, encourage best practices, and make it easier for all teams to perform experiments at scale”
Ronny Kohavi has me thinking about this. He’s looped me in to discuss it alongside Stephen Pavlovich, André Morys, and Lukas Vermeer in a Q&A session for his testing cohort. Here is his good list of questions with some of my initial thoughts:
- Which A/B testing vendors have you worked with (ran real experiments with a paying client, not just a demo or test)?
- What are the three factors that differentiate between the vendors that prospective customers should look at, besides price?
- Are there components of a vendor solution that you recommend replacing?
- Are there add-on components that you recommend, which vendors are missing? For example, a more comprehensive AA test evaluation, or alerting on degradations.
I’ve had conversations about this with a lot of tool vendors and industry experts like Craig Sullivan and Chad Sanderson, discussed it on LinkedIn here, and will be publishing a long article on it soon.
Ultimately, the testing platform handles 3 core things:
- Assignment
- Metrics
- Measurement
Differences in how vendors or home-built solutions handle any one of these are determined by the ‘problem’ they are trying to solve first.
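To make the ‘assignment’ piece concrete, here is a minimal sketch of the kind of deterministic, hash-based bucketing most platforms (bought or built) do under the hood. The function name and 50/50 split are illustrative assumptions, not any particular vendor’s API:

```python
import hashlib

def assign_variant(user_id: str, experiment_id: str,
                   variants=("control", "treatment")) -> str:
    """Deterministically bucket a user: the same user always gets the same variant."""
    # Hash the experiment and user together so bucketing is independent across experiments.
    digest = hashlib.sha256(f"{experiment_id}:{user_id}".encode()).hexdigest()
    return variants[int(digest, 16) % len(variants)]

print(assign_variant("user-123", "homepage_hero_test"))
```

Hashing on the experiment ID plus the user ID keeps assignments sticky per user while letting each experiment randomize independently of the others.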
Experimentation Pillar 4 - People and Skills:
THE FOCUS
Experimentation design is an art. CRO/experimentation practitioners would benefit not only from courses on statistics but also from courses on experimental design, especially related to how (and WHY) participants are allocated to the different groups in an experiment.
It’s so easy to just rely on tools to do all this thinking for us instead of critically thinking about the problem we’re solving and designing a creative measurement system accordingly.
THE INSIGHT
Adam Stone of Netlify published an exhaustive (if not exhausting) article on this topic on January 6th: Why It Matters Where You Randomize Users in A/B Experiments.
I think the TL;DR ‘principle’ of the article is to be relentless about lowering the denominator of the experiment.
THE CASE STUDY
An example: if you’re on a SaaS homepage like https://miro.com/ and you want to test the signup form placement, don’t bucket all users who land on miro.com; rather, only trigger assignment when they click a CTA that leads to the signup form (a rough sketch follows below).
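Here is what that could look like, assuming a made-up experiment name and a simple hash-based assignment, purely for illustration:

```python
import hashlib

def assign_variant(user_id: str, experiment_id: str,
                   variants=("control", "treatment")) -> str:
    digest = hashlib.sha256(f"{experiment_id}:{user_id}".encode()).hexdigest()
    return variants[int(digest, 16) % len(variants)]

def on_signup_cta_click(user_id: str) -> str:
    # Assignment and exposure logging happen here, at the CTA click,
    # not on the homepage pageview. Only users who actually reach the
    # signup decision point enter the experiment's denominator.
    variant = assign_variant(user_id, "signup_form_placement")
    print(f"exposure user={user_id} experiment=signup_form_placement variant={variant}")
    return variant

on_signup_cta_click("user-123")
```

Everyone who never clicks stays out of the analysis entirely, which shrinks the denominator and concentrates statistical power on the users the change can actually affect.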
But it gets deeper. I had a great Zoom chat this week with Matt Gershoff. We were talking about the above topic of tool build vs buy. He spoke of some cool first-principles concepts related to what ALL tools are fundamentally doing, in contrast to traditional analytics, for example:
- Analytics = Just in case data
- Experiments = Just in time data
Experiments are ‘interventions’ and address the joint problem of data collection and analysis. In other words, the way tools collect data influences, and is influenced by, what problem is being addressed.
Matt called this “a tiger vs mouse problem”: the analysis is a mouse compared to the tiger that is the way the data is collected. This is why SRM (sample ratio mismatch) is such a huge issue.
Seasonality adjustments, SRM monitoring, conflicting tests, power analysis and duration planning, sample size and MDE… this is just a short list of the properties of proper experimental design that CROs need to master. I think we need a dedicated course on this. Ton Wesseling’s AB Testing Mastery course does an OK job, but I think there’s an opportunity to add focus to this “tiger” of a topic.
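To make two of those items concrete, here is a minimal sketch of an SRM check and a back-of-the-envelope sample-size calculation for a two-proportion test. The counts, thresholds, and conversion numbers are illustrative assumptions, not recommendations:

```python
import math
from scipy.stats import chisquare, norm

# 1) SRM check: does the observed split match the intended 50/50 allocation?
observed = [50_421, 49_123]          # users actually assigned to control / variant
expected = [sum(observed) / 2] * 2   # expected counts under a 50/50 split
stat, p_value = chisquare(observed, f_exp=expected)
if p_value < 0.001:                  # a common, deliberately strict alarm threshold
    print(f"Possible sample ratio mismatch (p={p_value:.2g}): investigate before reading results")

# 2) Approximate sample size per variant for a two-proportion z-test
def sample_size_per_variant(baseline_cr, relative_mde, alpha=0.05, power=0.8):
    p1 = baseline_cr
    p2 = baseline_cr * (1 + relative_mde)
    z_alpha = norm.ppf(1 - alpha / 2)
    z_beta = norm.ppf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)

# e.g. a 3% baseline conversion rate and a 10% relative MDE
print(sample_size_per_variant(0.03, 0.10), "users per variant")
```

Running the check before reading any results, and sizing the test before launching it, are the two habits that catch most of the “tiger” problems early.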
Join the monthly conversation with the leading minds in Experimentation by signing up to Briefly Experimental.