Why do tactics, best practices, and even data fail?

One of my favorite mental models provides key insight for both my personal and working life.

The model is this: 

"Knowledge equals experience plus sensitivity" 

This quote comes from Viktor Frankl, the author of Man’s Search for Meaning.

Frankl’s book is all about finding the right answers to life’s various problems, and I’ve found that this model helps you stay humble by understanding the limits of your opinions and knowledge.

Frankl’s model explained

To define what’s meant by sensitivity, let me start with a personal example. 

Recently my young son tried a sip of wine for the first time. 

His reaction? He quickly spat it out and said, “I hate that! Wine is horrible!” 

But does he really hate wine? He’s only tried it a single time. He hasn’t tried it with food, or while visiting a vineyard, or from different regions. In short, he hasn’t had the breadth of experience with wine needed to make a lifelong verdict.

Frankl’s model helps keep opinions in perspective. 

Applying Frankl’s mental model to best practices  

The experimentation industry is full of best practices and tactics. But this mental model helps us understand them in context. 

Imagine that I hand you a list of “12 guidelines for designing an eCommerce home page.” Then, I send you off to design home pages for three products: a convenience product (deodorant), a shopping product (couch), and an unsought product (coffin).

Quickly you’ll realize that the guidelines don’t necessarily translate to these specific contexts. The best practices aren’t sensitive to the different products and user perspectives. Even the most concrete set of guidelines won’t span the breadth of all user intents, shopping behaviors, and decision-making criteria. 

This is why we use data and statistics – to add context and sensitivity to our own experience-based suggestions. 

The mental model where best practices go to die

Another example is the best practice of reducing friction. General wisdom would have us believe that we should reduce customer friction at all costs. 

But like all golden rules, this one can be broken. The image below shows two versions of the mobile home page for Segment.com.

The image on the left contains an email field, which adds user friction. The image on the right does not.

Our optimization experience tells us that we should hide the email field, placing it after the CTA click interaction.

But we’ve tested this idea across a number of our clients and instead found that the email field: 

  • Promoted scrolling
  • Increased visitor interaction with the content
  • Anchored the visitor to the value offered
  • Informed the visitor exactly what to do next

The connection to the mental model is this: our best practices and heuristics will reliably work and reliably fail. They work because they are grounded in our experience, and they fail because they aren't sensitive to every situation and every factor at play.

What we 'KNOW' is thus based on the balance of the two.

Applying the mental model to data 

So, what research results are sensitive enough to lead to accurate conclusions?

Those that are statistically significant. 

By way of explanation, I’ll first show an example of what *not* to do. Can you guess what research method produced the findings below?

The research findings came out of user testing. And this list immediately raised red flags for me.

First was the inclusion of trigger words like “most” and “some” as well as percentages to conceal the small sample size. These are often a sign that dangerous generalizations are in progress.

Second was the research methodology itself. User testing is for observing behavior, not for drawing conclusions about perception.

In this case, the research team had taken a very small group of users (5 men, 4 women) and generalized from them to users as a whole. Data is dangerous. It is very easy to use incorrectly, and I'd say user testing is one of the more dangerous research methods for this reason. It isn't sensitive enough to support these types of conclusions. We all know this data trap, but this mental model explains the 'why' behind it.

Leveraging statistics

In the example above, researchers attempted to use a small sample of nine people to come to conclusions and inform actionable next steps.

But from a tiny sample of only four women, can we really extrapolate and say that all women will feel the “content is lively”? No.
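To put a number on how little four responses can tell us, here is a minimal sketch using the Wilson score interval for a proportion. The scenario is hypothetical: I'm assuming all four women agreed, which the findings don't actually state, and the function name `wilson_interval` is my own.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, n, confidence=0.95):
    """Wilson score interval for a proportion; usable even for tiny n."""
    if n <= 0:
        raise ValueError("n must be positive")
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, center - half_width), min(1.0, center + half_width)


# Hypothetical: all 4 women in the panel call the content lively.
low, high = wilson_interval(4, 4)
print(f"4 of 4:     {low:.0%} to {high:.0%}")

# The same unanimous answer from a hypothetical sample of 400.
low_big, high_big = wilson_interval(400, 400)
print(f"400 of 400: {low_big:.0%} to {high_big:.0%}")
```

Even with unanimous agreement, four responses leave the plausible range running from roughly half of all women to all of them. The larger sample narrows that range to about a percentage point.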

To really understand what works (and find true “knowledge”), we need to employ statistical analysis. Stats give us an understanding of the ‘sensitivity’. So if we want to ‘know’ from ‘an experience’, we need to understand how ‘sensitive’ the data at hand is. Thus ‘Knowledge = Experience + Sensitivity’. This is the ‘WHY’ behind statistics! 

Understanding statistics as a proxy for sensitivity means paying attention to standard statistical guardrails such as:

  • Duration
  • Power
  • Sample size

When you draw conclusions from a user’s experience, you need to be careful about p-hacking, or looking for evidence of patterns that aren’t really there. The guardrails listed above will help ensure that you aren’t finding false positives within your data.
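One common way to find patterns that aren't there is to check a running test repeatedly and stop the first time it looks significant. The sketch below simulates A/A tests, where both arms share the same true conversion rate, so every "win" is a false positive. All parameters (400 runs, 10 peeks, 100 visitors per arm per peek, a 10% rate) are illustrative assumptions, not figures from our clients.

```python
import random
from math import sqrt
from statistics import NormalDist


def p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value from a pooled two-proportion z-test."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # no variation observed, so no evidence of a difference
    z = (conv_b / n_b - conv_a / n_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def simulate_aa_tests(runs=400, peeks=10, visitors_per_peek=100,
                      rate=0.10, alpha=0.05, seed=7):
    """Return (false-positive rate when peeking, rate when testing once at the end)."""
    rng = random.Random(seed)
    hits_peeking = 0
    hits_at_end = 0
    for _run in range(runs):
        conv_a = conv_b = n = 0
        crossed = False
        p = 1.0
        for _peek in range(peeks):
            conv_a += sum(rng.random() < rate for _ in range(visitors_per_peek))
            conv_b += sum(rng.random() < rate for _ in range(visitors_per_peek))
            n += visitors_per_peek
            p = p_value(conv_a, n, conv_b, n)
            crossed = crossed or p < alpha  # "significant" at any peek so far
        hits_peeking += crossed
        hits_at_end += p < alpha  # only the final, planned look counts
    return hits_peeking / runs, hits_at_end / runs


peeking_rate, final_rate = simulate_aa_tests()
print(f"false positives when peeking:  {peeking_rate:.1%}")
print(f"false positives at the end:    {final_rate:.1%}")
```

The single planned look should land near the nominal 5%, while stopping at the first significant peek typically produces a much higher false-positive rate. That gap is the reason to fix duration and sample size before the test starts.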

If you have an adequate sample size and run a test for long enough, you’ll be in a good position to test for sensitivity and find areas with a true “statistical lift.” In other words, you’ll be in a place to ‘know’ what an experience implies.
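What counts as "adequate" can be estimated before the test starts. The sketch below uses the standard normal-approximation formula for comparing two proportions. The inputs (3% baseline conversion, a 10% relative lift, 2,000 visitors a day) are hypothetical, and this is not necessarily the method any particular calculator uses.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_variant(baseline, relative_lift, alpha=0.05, power=0.80):
    """Visitors needed per variant to detect the lift with a two-sided test."""
    p1 = baseline
    p2 = baseline * (1 + relative_lift)
    if not (0 < p1 < 1 and 0 < p2 < 1 and p1 != p2):
        raise ValueError("rates must be distinct and strictly between 0 and 1")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p2 - p1) ** 2)


def days_needed(n_per_variant, daily_visitors, variants=2):
    """Days of traffic required when visitors are split evenly across variants."""
    return ceil(n_per_variant * variants / daily_visitors)


# Hypothetical inputs: 3% baseline, 10% relative lift, 2,000 visitors per day.
n = sample_size_per_variant(0.03, 0.10)
print(f"visitors per variant: {n:,}")
print(f"days at 2,000 visitors/day: {days_needed(n, 2000)}")
```

With these inputs the requirement runs to tens of thousands of visitors per variant and well over a month of traffic. Halving the lift you want to detect roughly quadruples the sample you need, which is why small effects demand long tests.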

Only through this process can you say with confidence that you’ve found a lever that can influence and inform customer behavior. 

Putting all of this together in the context of testing and experimentation, we can say:

  • Knowledge = we want to KNOW if the variation is better than the control.
  • Experience = we see in this test that the variation is better than control right now.
  • Sensitivity = we see that the data is not very sensitive yet. With only 3 days of data, it hasn't had enough time.
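The three bullets above can be made concrete with a confidence interval on the lift. This is a minimal sketch using a Wald interval for the difference between two conversion rates. The counts are invented for illustration: a 3.0% control against a 3.9% variation, first on 1,000 visitors per arm and then on ten times as many.

```python
from math import sqrt
from statistics import NormalDist


def lift_interval(conv_control, n_control, conv_variation, n_variation,
                  confidence=0.95):
    """Wald confidence interval for the absolute difference in conversion rate."""
    p_c = conv_control / n_control
    p_v = conv_variation / n_variation
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    se = sqrt(p_c * (1 - p_c) / n_control + p_v * (1 - p_v) / n_variation)
    diff = p_v - p_c
    return diff - z * se, diff + z * se


# Hypothetical day-3 snapshot: 3.0% vs 3.9% on 1,000 visitors per arm.
early = lift_interval(30, 1000, 39, 1000)
# The same observed rates after ten times as much traffic.
later = lift_interval(300, 10000, 390, 10000)

for label, (low, high) in (("day 3", early), ("later", later)):
    verdict = "excludes zero" if low > 0 or high < 0 else "still includes zero"
    print(f"{label}: {low:+.2%} to {high:+.2%} ({verdict})")
```

The experience is identical in both cases: the variation is ahead by 0.9 percentage points. Only the width of the interval changes, and that width is the sensitivity. On day 3 the interval still includes zero, so we don't yet know that the variation is better.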

In the end, the statistics we gather are a metric used to measure sensitivity, to see how confident we are that the one experience it measures can give us *knowledge* about our population of data/customers/etc.

When you’re confronted with your next data set, remember to stop and think. Check your convictions and hunches. Seek breadth of experience over a quick sound bite. And don’t assume that the status quo will always remain static.
