Statistics play an important role in research projects: without them, you may as well be making decisions on a roll of the dice. The deeper you go into statistics and data, the more confusing they can seem, and it’s easy to make mistakes without ever realizing it. If you don’t already have a PhD in statistics, here are a few questions to consider when thinking about data and statistical testing, whether from your own research or something your competitor has published:
1. Do you understand the difference between correlation and causation?
The classic example used to explain the difference between correlation and causation is that as ice cream sales increase, so do shark attacks. Therefore, ice cream sales must cause shark attacks! Thankfully for ice cream lovers everywhere, this is a case where correlation does not equal causation. Correlation describes the size and direction of a relationship between two variables (the ice cream sales and shark attack rates in our example). Causation indicates that one is the result of the other.
In more practical terms, if you see your sales go up when you lower your price, you have a correlation between your sales and your price. But lowering the price may not have caused the sales increase. Other factors that could have contributed include:
- Seasonality: Is this a time of year when your sales increase anyway? (This is what happens with shark attacks and ice cream sales, which both rise in summer.)
- Competition: Did competitive actions impact your sales?
- Product: Did you introduce a new product with new features?
The point: If you see correlation, make sure you look at the bigger picture before assuming causation.
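To see how a hidden third factor can produce a correlation, here is a small sketch using simulated data. Everything in it is made up for illustration: the temperature range, the noise levels, and the idea that warm weather drives both ice cream sales and shark attacks. It compares the raw correlation with the correlation left over once temperature is accounted for.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

def residuals(ys, xs):
    """What is left of ys after removing its straight-line relationship with xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [y - (my + slope * (x - mx)) for x, y in zip(xs, ys)]

random.seed(0)  # fixed seed so the simulation is repeatable

# Simulated data: temperature drives BOTH series; neither affects the other.
temperature = [random.uniform(10, 35) for _ in range(1000)]
ice_cream_sales = [10 * t + random.gauss(0, 20) for t in temperature]
shark_attacks = [0.2 * t + random.gauss(0, 1) for t in temperature]

raw = pearson(ice_cream_sales, shark_attacks)
adjusted = pearson(residuals(ice_cream_sales, temperature),
                   residuals(shark_attacks, temperature))

print(f"Raw correlation: {raw:.2f}")
print(f"Correlation after accounting for temperature: {adjusted:.2f}")
```

In this simulation the raw correlation is strong, while the adjusted one sits close to zero. Real data is rarely this tidy, and you can only adjust for factors you thought to measure.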
2. Are you using the right terminology?
Describing results in stats testing is one case where being nit-picky about word choice is important. For example, if a measure in your research goes from 5% to 10%, that is a 5 percentage point increase, not a 5 percent increase. The two are often confused in research reports (for the record, going from 5% to 10% is a 100 percent increase). Being exact is necessary when describing data and trends, so that you can make informed decisions based on your research.
The point: Using the right terminology will allow researchers and non-researchers alike to more objectively understand the results.
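The arithmetic behind the two terms is simple enough to sketch in a few lines. The function names here are our own, chosen for illustration.

```python
def percentage_point_change(old_pct, new_pct):
    """Absolute change: the new percentage minus the old one."""
    return new_pct - old_pct

def percent_change(old_pct, new_pct):
    """Relative change: the absolute change as a percent of the starting value."""
    return (new_pct - old_pct) / old_pct * 100

# A measure that moves from 5% to 10%:
print(percentage_point_change(5, 10))  # 5 (percentage points)
print(percent_change(5, 10))           # 100.0 (percent)
```

The same movement is both a 5 point increase and a 100 percent increase, which is why stating the unit matters.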
3. Are you overdoing the stats analysis?
It’s tempting to do every analysis you can think of with your data. You want to be thorough and make sure you extract every last detail from the numbers.
Sometimes those additional analyses are completely unnecessary. Often, simple cross tabs can provide all the insights you need to make a confident decision. Think of it this way: review your objectives for the research. If analyzing cross tabs answers all of your outstanding questions, then you likely don’t need to go any further by delving into more complex statistics.
The point: Spending time on unnecessary multivariate analyses can prolong your timeline and add costs without giving you any extra benefit.
4. How large is the denominator?
You may see brands claiming things like, “It’s the fastest growing brand in the market”. Even when true, statements like that can be very misleading. For example, if a brand had 1% share and increased to 2%, that would be a 100% increase, which could make it the fastest growing brand, but only because it started with such a small share. That large increase is due to the denominator in the calculation being so small (1% in the example: (2% − 1%) / 1% = 100%). The claim sounds great and is technically correct. However, the brand still only has a 2% share, which doesn’t exactly make it a major player.
The point: Statistics can be used to manipulate takeaways. Don’t fall into this trap. Use stats transparently and honestly and your results will be replicable and reputable.
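A quick way to check a “fastest growing” claim is to put growth rate and actual share side by side. The sketch below uses two hypothetical brands with invented share figures. Brand A mirrors the 1% to 2% example, and Brand B is a made-up larger competitor.

```python
# Hypothetical market shares in percent (before, after); not real data.
shares = {
    "Brand A": (1.0, 2.0),
    "Brand B": (30.0, 33.0),
}

def relative_growth(before, after):
    """Growth as a percent of the starting share (the small-denominator effect)."""
    return (after - before) / before * 100

# The "fastest growing" brand and the largest brand are not the same thing.
fastest_growing = max(shares, key=lambda b: relative_growth(*shares[b]))
largest = max(shares, key=lambda b: shares[b][1])

for brand, (before, after) in shares.items():
    print(f"{brand}: {before}% -> {after}% share, "
          f"+{after - before:.0f} points, "
          f"{relative_growth(before, after):.0f}% growth")

print("Fastest growing:", fastest_growing)
print("Largest share:", largest)
```

Brand A wins on growth rate while Brand B gained three times as many share points and remains far larger. Reporting both numbers keeps the claim honest.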
5. What confidence level and sample size do you need?
One question our team is frequently asked is “What sample size is statistically significant?”
As your sample size increases, stat testing becomes more sensitive: smaller differences between two scores will be significant.
Technically, you can get statistical significance at almost any sample size. However, to determine the sample size you need when measuring share of preference, you need to ask two specific questions:
- What confidence level do you want? (95% and 90% confidence levels are the most commonly used)
- What size of difference between two data points do you expect, or want, to be statistically significant?
Confidence level: Confidence level is often loosely described as the repeatability of the experiment. More precisely, it describes how often the test avoids a false alarm: at a 95% confidence level, if there were truly no difference between two scores, about 5 out of every 100 repetitions of the study would still show a significant difference by chance alone. As your confidence level decreases, smaller differences reach statistical significance because the criteria relax somewhat.
Put into practice, at a 95% level of confidence with a sample size of 100, you need a difference of around 10 points for a significant difference between two market share numbers*. With a sample size of 200 and the same level of confidence, there only needs to be a difference of about 7 points. However, sample size has diminishing returns. Past a sample of 500 or so, your gains in sensitivity become very small.
Size difference: When you’re basing your study size on a necessary confidence level, it’s important to make an assumption about how close the preference levels may be. Using our example above, you can see that if you expect the gap in preference between two brands to be only 7 points, a sample size of 100 will not yield significant results at 95%. Guessing at the share difference isn’t always easy, so it may be best to consult with a statistician to determine the trade-offs between sample size, cost, and significance.
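For readers who want to see where figures of this size can come from, here is a rough sketch. It rests on an assumption of ours: that the numbers above behave like the textbook margin of error for a single percentage near 50%. The function name and defaults are illustrative, and a proper test for your own data may use a different formula.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(n, confidence=0.95, p=0.5):
    """Approximate margin of error, in percentage points, for one proportion.

    Assumes a simple random sample and the normal approximation.
    p=0.5 is the worst case, which gives the widest margin.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided critical value
    return 100 * z * sqrt(p * (1 - p) / n)

# Compare sample sizes at the two most common confidence levels.
for n in (100, 200, 500, 1000):
    at_95 = margin_of_error(n, confidence=0.95)
    at_90 = margin_of_error(n, confidence=0.90)
    print(f"n={n}: {at_95:.1f} points at 95%, {at_90:.1f} points at 90%")
```

Under these assumptions the margin is about 9.8 points at n=100 and about 6.9 points at n=200. Because it shrinks with the square root of the sample size, each additional respondent buys less sensitivity than the last, and lowering the confidence level to 90% narrows the margin at every sample size.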
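The question can also be turned around: given the smallest difference you care about, roughly how many respondents do you need? The sketch below inverts the textbook margin of error formula for a single percentage near 50%. That formula is our assumption for illustration, not necessarily the test a statistician would pick for your study, and the function name is our own.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for(points, confidence=0.95, p=0.5):
    """Approximate sample size needed for a margin of error of `points`
    percentage points, assuming a simple random sample and the normal
    approximation. p=0.5 is the most conservative choice."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided critical value
    d = points / 100  # convert percentage points to a proportion
    return ceil(z ** 2 * p * (1 - p) / d ** 2)

for points in (10, 7, 5, 3):
    print(f"{points} points -> about {sample_size_for(points)} respondents")
```

Under these assumptions, detecting a 7 point difference takes a little under 200 respondents, in line with the example above. Halving the difference you want to detect roughly quadruples the sample you need, which is where the trade-off with budget comes in.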
*The exact difference needed varies depending on the actual numbers, but that’s a stat lesson for another day. To further drive home the complexity: the difference needed to determine statistical significance can vary depending on the exact question.
The point: When you’re deciding on a sample size, the correct number for your study will become a question of how much sensitivity you want versus how much of your budget is allotted for paying for sample.
6. Which tests are you using?
T-test. ANOVA. Chi-square. 1-tail. 2-tail. Independent samples. There are numerous options when it comes to statistical testing. With so many choices, it’s easy to feel uncertain when you’re doing all the testing yourself. Partnering with a research supplier often means that there is a statistician on hand who can help guide you through your options. While we’re not going to go in-depth to explain these different tests now, you can look forward to another stats lesson in a future post.
There is a lot that goes into interpreting statistics and data from market research. While it may feel overwhelming, you usually don’t need a PhD statistician like ours to understand everything. Remember: Keeping it simple will often give you most of the data you’re looking for. If you do need more in-depth analysis, the Stevenson Company’s analysts offer a free consultation and can give you the additional resources and support you need to make informed decisions. Are there other parts of statistical analysis that are confusing you? Shoot us an email and we can help point you in the right direction.