Olga Osadcha is the Director of Growth Products at ClickUp, a B2B software platform for everything from project management to docs, reporting, whiteboards, and more. Before ClickUp partnered with Speero, the company faced several challenges:
- The testing program had no clear, standard process.
- The program was heavily dependent on their A/B testing tool.
- They couldn’t trust their A/B testing tool.
- Data was often lost, which made the team constantly question results and lowered test velocity.
During test validation, ClickUp continuously ran into unexplained issues with its A/B testing tool. ClickUp validates its tests in three ways. The first and primary one is AA tests.
The second is checking whether the analytics and testing tools report the same traffic. ClickUp would often lose a percentage of users in the testing tool without knowing why.
“Was it decay failing? Or other conditions? It was a constant mystery why a certain percentage of traffic was lost. This would slow down our testing velocity because now, we had to wait for longer to have the needed sample size.”
Olga Osadcha
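The traffic comparison and its effect on test duration can be illustrated with a minimal Python sketch. It is not ClickUp's actual tooling; the function names and every number in the usage example are hypothetical, chosen only to show the arithmetic.

```python
def traffic_loss(analytics_users: int, tool_users: int) -> float:
    """Fraction of users seen in analytics but missing from the testing tool."""
    if analytics_users <= 0:
        raise ValueError("analytics_users must be positive")
    return 1 - tool_users / analytics_users


def days_to_sample(required_sample: int, daily_users: int, loss: float) -> float:
    """Days needed to reach the required sample size when a fraction
    `loss` of daily users never makes it into the test."""
    if not 0 <= loss < 1:
        raise ValueError("loss must be in [0, 1)")
    return required_sample / (daily_users * (1 - loss))


if __name__ == "__main__":
    # Hypothetical figures: analytics reports 10,000 users, the tool only 8,000.
    loss = traffic_loss(analytics_users=10_000, tool_users=8_000)
    print(f"traffic lost in the tool: {loss:.0%}")
    print(f"days without loss: {days_to_sample(20_000, 1_000, 0.0):.1f}")
    print(f"days with loss:    {days_to_sample(20_000, 1_000, loss):.1f}")
```

Under these assumed numbers, losing a fifth of the traffic stretches a test that would need 20 days to roughly 25, which is the kind of slowdown the quote describes.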
The third validation they did was to run several AA tests simultaneously and ensure the split was orthogonal:
- One control and one treatment group in test 1.
- One control and one treatment group in test 2.
- Then split the combination of both equally.
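One way to check whether two concurrent tests are split orthogonally is a chi-square test of independence on the four group combinations. The sketch below is an assumption about how such a check could look, not the validation ClickUp or its tool actually ran; the function names and sample data are hypothetical. It tests independence between the two assignments, so checking that each individual split is even would be a separate step.

```python
from collections import Counter

GROUPS = ("control", "treatment")
# Chi-square critical value for 1 degree of freedom at the 5% level.
CRITICAL_95_DF1 = 3.841


def chi_square_2x2(assignments):
    """Chi-square statistic for independence between two tests.

    `assignments` is an iterable of (group_in_test_1, group_in_test_2)
    pairs, one pair per user, each value being "control" or "treatment".
    """
    counts = Counter(assignments)
    total = sum(counts[(a, b)] for a in GROUPS for b in GROUPS)
    row = {a: sum(counts[(a, b)] for b in GROUPS) for a in GROUPS}
    col = {b: sum(counts[(a, b)] for a in GROUPS) for b in GROUPS}
    stat = 0.0
    for a in GROUPS:
        for b in GROUPS:
            expected = row[a] * col[b] / total
            if expected == 0:
                raise ValueError("every group needs at least one user")
            stat += (counts[(a, b)] - expected) ** 2 / expected
    return stat


def looks_orthogonal(assignments) -> bool:
    """True when the data show no evidence of dependence at the 5% level."""
    return chi_square_2x2(assignments) < CRITICAL_95_DF1


if __name__ == "__main__":
    # Hypothetical healthy split: all four combinations equally common.
    healthy = [(a, b) for a in GROUPS for b in GROUPS] * 250
    # Hypothetical broken split: a user's group in test 1 fixes their group in test 2.
    broken = [("control", "control"), ("treatment", "treatment")] * 500
    print(looks_orthogonal(healthy))
    print(looks_orthogonal(broken))
```

With an orthogonal split, each of the four combinations holds about a quarter of the users and the statistic stays near zero. When assignment in one test determines assignment in the other, two of the four cells are empty and the statistic becomes very large.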
For a long time, ClickUp’s experimentation team ran this type of validation through their tool without realizing that anyone bucketed into the treatment group of one test always ended up in the treatment group of the second. At the same time, the control was a single holdout group shared across all of these tests.
ClickUp wasn’t able to see this happening. The tool simply wasn’t configured to run several concurrent tests. Despite this, ClickUp’s teams relied on the tool and its data to plan the next experiments and generate new ideas.
“When we were analyzing the results, interpreting the data, and coming up with the next steps and validation… we constantly wondered… Does this actually work? Or is this just noise? Bad data?”
Olga Osadcha