Remember, A/B testing is not just about running tests and finding the winning variation for your web page; it is also about learning how different factors contributed to the win. Before implementing the winning design on your website, it's important to take a close look at any outcome (winning, leading, or inconclusive) you obtained during your experiment. This will help you understand your audience's behavior towards a particular variation and make appropriate decisions in the future.
In Zoho PageSense, we consider several criteria that should be met to conclude your A/B test results and declare a winner. These include the number of visitors, the number of conversions, the conversion rate, improvement, significance, and other related metrics for each goal configured on your variation page. These metrics indicate the level of accuracy with which you can trust the results in your A/B test reports in PageSense.
In this article, we will look at the major parameters that contribute to a winning variation, as well as the factors you should consider while analyzing your A/B (or Split URL) experiment results in PageSense.
Declaring results using the Frequentist method
The Frequentist method of A/B testing tells you whether variation A or B will win by calculating the number of times an event occurs within a visitor sample. To be considered a winner in the Frequentist method, a variation must:
- Have reached the adequate visitor count (or sample size)
- Have a higher conversion rate on the primary goal
- Have achieved the set statistical significance level
Visitor count/Sample size
When you are running an A/B (or Split URL) test, the overall visitor traffic that you see in your original and variation version depends on the traffic percentage that you allocated to each version while setting up your experiment.
Zoho PageSense carefully monitors the new visitors coming to your page to make sure you have sufficient traffic on both the original and variation pages to attain a statistically significant result. Once the desired visitor traffic is reached, we declare the winner in your results. However, if the volume of traffic is too low, we tell you how many more visitors are required for each variation to get a reliable and actionable result.
This minimum number of visitors required for each variation to run your A/B test and get a valid result is termed the Sample Size. Sample size is the biggest factor in obtaining a conclusive test result in your A/B experiment within a short time.
For instance, say you want to predict how the population of a specific region will react to your new homepage. In this case, you cannot test the entire population coming to your site from this location, as that would be time-consuming and tedious. Instead, you can first test on a sample that is representative of the targeted population: a given number of people from that location. The more traffic (relative to the sample size) your site receives, the sooner you will have a sample large enough to determine whether the results are statistically significant.
Calculating the required sample size in advance and running your experiment on a set visitor count can help you reach the required significance level, without wasting much of your time and resources.
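A common way to calculate the required sample size in advance is the standard two-proportion formula. The sketch below is a generic illustration of that formula, not PageSense's internal calculation; the baseline rate and minimum detectable effect are values you supply from your own hypothesis.

```python
import math
from statistics import NormalDist

def required_sample_size(baseline_rate, min_detectable_effect,
                         alpha=0.05, power=0.80):
    """Visitors needed per variation to detect a relative lift of
    `min_detectable_effect` over `baseline_rate`, using the standard
    two-proportion formula (illustrative, not PageSense's own math)."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + min_detectable_effect)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = ((z_alpha + z_beta) ** 2 * variance) / (p2 - p1) ** 2
    return math.ceil(n)

# e.g. a 23.8% baseline rate and a 5% relative lift to detect
n = required_sample_size(0.238, 0.05)
```

Note that a smaller minimum detectable effect requires a much larger sample, which is why tests chasing tiny improvements take far longer to conclude.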
Info: How can sample size help you achieve the desired significance level?
In PageSense, the final results of an A/B test are declared based on the significance level achieved. However, the calculation of the significance percentage is subject to a certain degree of error depending on the visitor traffic and conversion rate obtained.
For instance, when there are fewer visitors and conversions, the significance level fluctuates within a wide margin. You may see that at one point the significance is 95%; after a few more visitors, it may drop to 70%. In this case, setting a sample size helps the statistical significance stabilize once a sufficient number of visitors has reached your variation page. The metric will remain stable from there on.
Conversion rate
The basic aim of an A/B test is to increase your conversion rate, because the measure of your goal conversions helps you determine how your variations are performing in comparison to each other. A goal conversion takes place when your visitors complete a specific action you have created on your variation page(s) -- for example, sign up for your newsletter, make a purchase, or add a product to the cart.
The conversion rate of a page is calculated (as a percentage) by dividing the number of conversions by the total number of visitors you received. Note that in PageSense the primary goal is taken as the key goal to announce whether your test has a winning variation, a leading variation, or if the result is inconclusive.
A low goal conversion rate means not many people are doing what you want them to, while a high one conversely shows what is going well on a page. This metric helps ensure that the results on a page are due to the changes you made on it and not to random factors.
For instance, let's say you ran a test on the 'Sign up' CTA of your homepage, and the conversion rate obtained for your original and variation are as follows:
- Original version: 42,000 unique visitors and 10,000 conversions, with a 23.8% conversion rate.
- Variation 1: 86,000 unique visitors and 22,000 conversions, with a 25.5% conversion rate.
In this example, you can see 'Variation 1' is more efficient than the original in terms of the conversion rate, but this metric alone is not sufficient to declare 'Variation 1' as the winner. This is because your conversion rate for a page will keep changing based on the new visitors and the test duration.
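The calculation behind these figures is straightforward; the helper below reproduces the conversion rates from the example above.

```python
def conversion_rate(conversions, visitors):
    """Conversion rate as a percentage: conversions / visitors * 100."""
    return conversions / visitors * 100

# Figures from the example above
original_rate = conversion_rate(10_000, 42_000)   # ~23.8%
variation_rate = conversion_rate(22_000, 86_000)  # ~25.6%
```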
Info: Declaring a variation the winner or loser just by looking at the highest conversion rate before the end of the test can prove wrong and inefficient, as the variation would not yet have been tested on the majority of visitors.
Significance level
Once you've looked at your visitor count and conversion rate metrics for your primary goal set in the A/B test, next you should think about how statistically reliable your results need to be. This can be determined using the statistical significance level indicated in your report.
Statistical significance is the mathematical way of proving that the results you obtained are due to the changes you made on the web page, and that you can be confident about implementing these changes in your business decisions. The significance level is set based on the type of Frequentist method you have applied to evaluate your test results:
- Quick trends: Statistical significance level > 90%
- Optimal: Statistical significance level > 95%
- High accuracy: Statistical significance level > 99%
Let's say you ran a test on the 'Sign up' CTA of your homepage, and the number of visitors and conversions for your original and variation are as follows:
- Original version: 42,000 unique visitors and 10,000 conversions, with a 23.8% conversion rate.
- Variation 1: 86,000 unique visitors and 22,000 conversions, with a 25.5% conversion rate.
By calculating the statistical significance, we notice that 'Variation 1' is more efficient than the original, with a significance level of 96%. In this case, your A/B test is a success, and you can now use 'Variation 1' on your web page.
Statistical significance depends on two variables: the number of visitors (or sample size) and the conversion rates for both the original and any variations. To ensure that your A/B test reaches the desired statistical significance level as early as possible, you need to plan your testing hypothesis with both these variables in mind.
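Assuming the standard two-proportion z-test — PageSense does not publish its exact internal formula — the significance level can be computed from these two variables like this:

```python
from math import sqrt
from statistics import NormalDist

def significance_level(conv_a, n_a, conv_b, n_b):
    """Two-sided two-proportion z-test. Returns the confidence (in %)
    that the difference in conversion rates is not due to chance.
    Illustrative sketch; PageSense's exact computation may differ."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)          # pooled rate
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = abs(p_a - p_b) / se
    p_value = 2 * (1 - NormalDist().cdf(z))           # two-sided
    return (1 - p_value) * 100

# With the example figures, the difference is highly significant
sig = significance_level(10_000, 42_000, 22_000, 86_000)
```

Running the same function with only a handful of visitors per variation yields a far lower figure, which illustrates why the significance level fluctuates until the sample size is reached.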
Below are a few general cases for declaring your A/B test results in PageSense:
If visitor count and significance are reached
If the visitor count is attained for both original and variation, then the variation with the highest conversion rate and significance level reached is declared the winner.
If visitor count and significance are not reached
PageSense will show you the remaining number of visitors required for each variation; you need to keep running your A/B test until that count is reached to get a conclusive result.
If visitor count is reached but significance is not reached
If neither variation is statistically better, even after reaching the visitor count, the test is marked as inconclusive. In this case, you can stick with the original variation or run another test. You can use the data from the failed test to help identify a new hypothesis for your next one.
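The three cases above can be sketched as a simple decision function. This is an illustrative sketch of the reporting logic described here, not PageSense's actual implementation.

```python
def declare_result(visitors, required_sample, conversion_rates,
                   significance, threshold=95.0):
    """Map the three report cases to an outcome.
    `visitors` and `conversion_rates` are dicts keyed by variation name;
    `significance` is the achieved confidence in percent."""
    # Case 2: visitor count not yet reached -> keep running
    if any(v < required_sample for v in visitors.values()):
        remaining = {name: required_sample - v
                     for name, v in visitors.items() if v < required_sample}
        return ("keep running", remaining)
    # Case 1: visitor count and significance reached -> declare a winner
    if significance >= threshold:
        winner = max(conversion_rates, key=conversion_rates.get)
        return ("winner", winner)
    # Case 3: visitor count reached but not significant -> inconclusive
    return ("inconclusive", None)
```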
Declaring results using the Bayesian method
The Bayesian method of A/B testing tells you the probability of variation A performing better than B by comparing the previous day's data collected during the testing phase. In this approach, a winner is declared when three primary rules are met:
1. Has achieved a unique visitor count of 100 or above for the variations.
2. Has gathered 50 or more unique conversions for the goals set up.
3. Has a minimal average expected loss value.
Once the primary rules are satisfied, in the Bayesian approach, we declare winners based on two other parameters:
1. To declare a variation the winner of the Experiment, the primary goal must meet the required visitor count, conversion count, and expected loss value.
2. To declare a variation the winner of a Goal, we consider that goal's visitor count, conversion count, and expected loss value.
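The primary rules above can be sketched as a simple check. Note that the negligible-loss threshold below is an assumed illustrative value, not a documented PageSense constant.

```python
def meets_bayesian_rules(unique_visitors, unique_conversions,
                         expected_loss, loss_threshold=0.01):
    """The three primary rules from the section above:
    >= 100 unique visitors per variation, >= 50 unique conversions
    on the goal, and a minimal expected loss. The loss threshold
    is an assumed value for illustration only."""
    return (unique_visitors >= 100
            and unique_conversions >= 50
            and expected_loss <= loss_threshold)
```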
Number of visitors and conversions metric
The Bayesian method of A/B testing is about calculating the probability percentage for both the original and the variation pages. This probability percentage can be quickly achieved based on two major factors: the number of visitors and the number of conversions.
The number of visitors indicates the total number of unique visitors to the corresponding variation. Ideally, your visitor count should be big enough for accurate interpretation of the A/B test results. In other words, the bigger the visitor count, the higher the probability of declaring a winner.
On the other hand, the number of conversions indicates the number of unique instances of a visitor fulfilling the desired action for a given goal. It can refer to any desired action that you want the user to perform on your web page, from clicking on a button to making a purchase and becoming a customer.
Note: In comparison to the Frequentist method of testing, Bayesian tests do not require fixing the sample size in advance to give valid results.
In PageSense, we recommend following these business rules before making a decision using the Bayesian approach:
- Wait until you have recorded at least 100 unique visitors per variation.
- Wait until you have reached at least 50 conversions on the primary goal.
- Check that the expected loss value is negligible.
- Check that the primary goal meets the required criteria to declare a winner for the Experiment.
- Check that all the mentioned values are collected and satisfied to declare a winner for a Goal.
Expected loss
Bayesian statistics use the expected loss value in the decision-making framework of your A/B testing. The expected loss combines how probable it is that variation B has a lower conversion rate than variation A with, if variation B is indeed worse, by how much on average. Note that the lower the expected loss, the higher the chances of declaring a variation the winner, and vice versa.
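Under an assumed uniform Beta(1, 1) prior on each conversion rate — a common textbook choice, not necessarily PageSense's internal model — the expected loss of choosing variation B can be estimated by Monte Carlo sampling from the posterior of each variation:

```python
import random

def expected_loss(conv_a, n_a, conv_b, n_b, samples=100_000, seed=42):
    """Monte Carlo estimate of the expected loss of choosing B:
    the average of max(p_A - p_B, 0) under Beta posteriors for each
    variation's conversion rate (uniform prior assumed)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        # Posterior draws: Beta(1 + conversions, 1 + non-conversions)
        p_a = rng.betavariate(1 + conv_a, 1 + n_a - conv_a)
        p_b = rng.betavariate(1 + conv_b, 1 + n_b - conv_b)
        total += max(p_a - p_b, 0.0)
    return total / samples
```

When B clearly outperforms A, the draws where A beats B become vanishingly rare and the expected loss approaches zero; when B is actually worse, the expected loss approaches the true gap between the two rates.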