Using UX Benchmarking to Unlock User Perceptions

As a self-proclaimed data nerd, to say that I enjoy exploring customer research is an understatement. Ironically, the area where I geek out the most is user perceptions - the very area that's hardest to collect data on.

If you collect data on user behavior, it's always a picture of the past. But when you try to collect data on in-the-moment user perceptions, you can’t necessarily trust the words users say. That’s because they are likely to share a rationalization rather than a true perception.

That’s why I’ve personally dedicated time to developing new techniques that gain true insight into user perceptions. 

My secret to unlocking this vault? Leveraging UX benchmarking to tap into five different dimensions of user perception.  

UX benchmarking

The goal of user experience benchmarking is simple. You want to be able to answer two important questions: 

  1. How are users evaluating your offer compared to other offers?
  2. How do users feel about your digital or product experience in general?

Perhaps the most well-known (or notorious) measurement for user experience benchmarking is the Net Promoter Score or NPS.

NPS was created by Fred Reichheld of Bain & Company in 2003 and has become one of the de facto methods of measuring perception and satisfaction.

It can certainly be powerful when used correctly, but it also only represents one dimension of user experience: customer loyalty. 
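To make the mechanics concrete: NPS is derived from the single 0-10 "how likely are you to recommend" question, by subtracting the percentage of detractors (0-6) from the percentage of promoters (9-10). A minimal sketch (illustrative data, not from any study described here):

```python
def nps(ratings):
    """Net Promoter Score from 0-10 likelihood-to-recommend ratings.

    Promoters score 9-10, detractors 0-6; passives (7-8) count only
    toward the total. Returns a score between -100 and +100.
    """
    if not ratings:
        raise ValueError("need at least one rating")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100 * (promoters - detractors) / len(ratings)

# Example: 5 promoters, 3 passives, 2 detractors out of 10 responses
print(nps([10, 9, 9, 10, 9, 8, 7, 8, 5, 3]))  # -> 30.0
```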

5 dimensions of user perception

Loyalty on its own is a narrow lens through which to view customer perception. In my work at Speero, I look at five dimensions that together form a comprehensive picture of user perception.

For each dimension, you can ask specific questions that get to the heart of the user’s perception of your website. 

  1. Loyalty: The affinity and commitment that the user has for the brand.
    - Is the user likely to visit this website again in the future?
    - How likely is the user to recommend the website to friends or colleagues?
  2. Credibility: The level of trust that the user feels.
    - How comfortable does the user feel purchasing from the website?
    - How confident is the user conducting business with the website?
  3. Appearance: The visual allure of the website.
    - Does the user find the website attractive?
    - Does the website have a clean, simple presentation?
  4. Usability: The ease of use of the interface.
    - Is the website simple to use and navigate?
  5. Clarity: The above four factors accumulate into a feeling of “clarity”. This is the “aha” moment a user gets when they find the right solution for their needs. The user understands why the website is superior to competitors.
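Rolling individual survey answers up into per-dimension scores is straightforward. Here's a hypothetical sketch - the item names and the 1-5 Likert scale are my own assumptions for illustration, not Speero's actual instrument:

```python
from statistics import mean

# Hypothetical mapping of survey items to the five dimensions
# (item names invented for this example).
DIMENSIONS = {
    "loyalty":     ["likely_to_return", "likely_to_recommend"],
    "credibility": ["comfortable_purchasing", "confident_doing_business"],
    "appearance":  ["attractive", "clean_presentation"],
    "usability":   ["easy_to_use"],
    "clarity":     ["understands_why_superior"],
}

def dimension_scores(responses):
    """Average the 1-5 Likert items belonging to each dimension.

    `responses` is a list of dicts, one per participant,
    mapping item name -> rating.
    """
    return {
        dim: round(mean(r[item] for r in responses for item in items), 2)
        for dim, items in DIMENSIONS.items()
    }

panel = [
    {"likely_to_return": 4, "likely_to_recommend": 5,
     "comfortable_purchasing": 3, "confident_doing_business": 4,
     "attractive": 5, "clean_presentation": 4,
     "easy_to_use": 4, "understands_why_superior": 3},
    {"likely_to_return": 5, "likely_to_recommend": 4,
     "comfortable_purchasing": 4, "confident_doing_business": 3,
     "attractive": 4, "clean_presentation": 5,
     "easy_to_use": 5, "understands_why_superior": 4},
]
print(dimension_scores(panel))
```

One score per dimension per site is what feeds the comparison graphs later in this post.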

Understanding SUPR-Q

The first four areas on the list above are representative of a standardized survey technique called the SUPR-Q which was developed by Dr. Jeff Sauro of MeasuringU.

The model was developed through:

“A process called psychometric qualification, in which the best 8 items from a pool of 75 candidate items were identified. The items were winnowed down by only retaining those which generated high internal reliability, loaded highest across common factors, had the highest item-total correlation, and could discriminate best between websites.”

A couple of years ago, I worked directly with Dr. Sauro to see if we could apply his method to a new use case: user perception of digital experiences.

The result of this work was the addition of the fifth and final dimension: Clarity. 

Clarity was the missing element that dug into the question of value proposition and was the key to understanding how a website’s overall messaging made it clear to users that they should buy from one website versus another.

Today, we have done UX benchmark studies for hundreds of sites across many different verticals. 

To illustrate exactly how this works, I’ll walk through one of our most recent competitive UX benchmark studies, which focused on the beauty and cosmetics space.

Case study: Benchmarking 4 mobile beauty websites

Four of the internet's biggest beauty players are Clinique, Fresh, Lush, and Sephora.

Benchmarking the mobile websites of these four brands is best thought of as user testing, at scale.

The experiment 

A standard NPS survey is usually pushed to users after they have gone through the user journey. In this case, we ran a more active version of the NPS concept.

We gathered a panel of 100 people and asked them to perform a set of identical tasks on each of the four beauty websites.

The tasks we assigned were as follows:

  1. Find a lipstick for $25 or less.
  2. Once found, compare it to similar lipsticks and choose the one you would like to buy. Add it to the shopping cart.
  3. Imagine you want to buy something as a gift for your friend. Find an item you think they would like and add it to the shopping cart.
  4. Go to your cart and “complete” the purchase (using a provided credit card number that intentionally resulted in an error message to end the assignment).

The task required our volunteer users to go through the entire funnel for each website, from landing on the site to searching for products to “completing” a transaction. 

At the conclusion of the assignment, the users were provided with our SUPR-Q survey, including our “clarity” dimension questions.

The results 

After crunching the data from the benchmarking questions, the differences and patterns in user perception across the four websites began to emerge.

We started by creating the radar graph below. Here, you can see the relative differences in how people perceive the sites, both across and within each of the UX dimensions of credibility, usability, and so on.


What can we take away from this graph?

  • Sephora takes up most of the real estate and can be seen as our overall “winner,” with the highest degree of loyalty, appearance, clarity, and credibility.
  • Clinique is the opposite; it has the most work to do relative to the other sites and appears to be underperforming in general.

Digging into the details is also enlightening. For instance, we can look closer at ‘usability’ to see the exact percentile differences among the different players. 
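It's worth spelling out what a percentile means here. A common approach (an assumption on my part - the actual benchmark database behind these scores is proprietary) is to convert a site's mean score into a percentile rank against a normative distribution of scores from many sites:

```python
from statistics import NormalDist

# Hypothetical norms: the mean and standard deviation of usability
# scores across a benchmark database of sites (illustrative numbers).
NORM = NormalDist(mu=3.9, sigma=0.4)

def percentile_rank(mean_score):
    """Percentage of benchmarked sites this mean score would beat."""
    return round(100 * NORM.cdf(mean_score), 1)

for site, score in {"Site A": 4.4, "Site B": 3.6}.items():
    print(site, percentile_rank(score))
```

A raw gap of 0.8 on a 5-point scale can translate to a very large percentile gap, which is why percentile views make the differences among players so visible.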


In addition to the Likert scale questions that comprised the bulk of this experiment, we also sent a series of open-ended questions to participants.

Looking at both response types, we began to see an anomaly within the ‘usability’ data, uncovering the reasons why users rated the sites as ‘easy to use’ versus ‘difficult’. Such insights allow the businesses to create better hypotheses and map out future user testing experiments.

Case study: An e-learning Website 

Another example comes in the form of a test we ran for a popular e-learning site. 

While this was not a competitive test like the beauty case study above, we were able to apply the same research method to help tease out and understand why the business had a low credibility score.

The percentile gauge graphs below show that this e-learning company has very strong scores when it comes to every facet of user perception - except for credibility.


What we found, in this case, was that it had to do with their credit-card-upfront trial model. Users didn’t appreciate having to hand over their credit card details so early in the process.

With that insight in hand, we were able to begin the task of helping to mitigate some of those credibility concerns. 

The insight showed us where we needed to run tests, and after each test, we ran the survey again to see whether our experimental efforts had made an impact.

Quick wins 

Applied at scale, the SUPR-Q method makes for a robust study design.

If you’re looking to get started with this technique, I recommend focusing first on the questions surrounding usability. If you only ask one question in your survey or study, ask users to rate the statement:

This website was easy to use. 


According to Dr. Sauro, this one question can explain up to 80% of the variation seen from all of the other questions combined. While this does not include our fifth “clarity” element, it’s an extremely powerful question, the answers to which can provide quick learnings.

To get started with UX benchmarking, explore working with panels such as Amazon MTurk (for the US) and Clickworker (for the EU). While you will need to filter through the noise as you seek to curate your own panel, these resources offer over 100 premium qualification criteria to help you sort through participants. This means you're not having to waste too much time setting up advanced-level screeners on your own; instead, you're able to screen down to your required demographic or firmographic targets quickly. 

Whether you are looking to understand your own brand’s user perception, or benchmark one brand competitively against others, UX benchmarking is an excellent data collection method. Not only will it allow you to get a moment-in-time snapshot that you can follow up and compare against in the future, but you can also get immediate, high-value insights about where you rank in the eyes of your target audience.

