
The Two Questions You Need to Ask Your Data Analysts

Post Author
  • Pragmatic Institute is the transformational partner for today’s businesses, providing immediate impact through actionable and practical training for product, design and data teams. Our courses are taught by industry experts with decades of hands-on experience, and include a complete ecosystem of training, resources and community. This focus on dynamic instruction and continued learning has delivered impactful education to over 200,000 alumni worldwide over the last 30 years.

An article written by Data Incubator founder Michael Li was featured on Harvard Business Review today. It can be found where it was originally posted here.

Previously posted on October 27, 2015, on The Data Incubator

 

Data scientists are in high demand. McKinsey predicts a need for 1.5 million new data professionals in the U.S. alone. As these droves of analysts join organizations, it’s critical that they know how to talk with managers about their findings. But the burden for good communication doesn’t just fall on them. For their part, managers – the consumers of the analysis – need to ask the right questions to be sure they understand the key concepts behind data analysis.

At The Data Incubator, we work with hundreds of companies looking to train their workforce in modern data analytics or hire data scientists from our selective PhD fellowship. Our clients often ask us how they should engage with their newly trained or newly hired data professionals. Here are two critical questions we suggest they ask when trying to understand the results of any data analysis.

 

How was the data collected?

Let’s say the result of your analyst’s hard work is this statement: “Customers who were shown an advertisement were twice as likely to purchase the product as those who were not. Since the ads cost less than the expected profit, we should show more customers the advertisements.” This may sound like good news to many managers, and they may be inclined to act on it quickly. But before you do, you need to understand how your analyst reached this conclusion. Not probing further could result in costly mistakes.

If the customers who were shown the ads were chosen at random, then this might be a randomized controlled trial, and the above conclusion would likely be valid. However, if those target customers were not selected at random, then the results are much less likely to be valid. For example, if the ads were shown to New York customers but not ones from Boston, then it is unclear if the customer’s city is acting as a confounding factor (is our product just more popular in New York, independent of advertising spend?). If that’s the case, showing more ads in Boston or Philadelphia may not result in more purchases.

Choosing the customers at random sets up an experiment. Experiments allow us to infer causality with a high degree of confidence and, if done right, their conclusions are very hard to dispute. On the other hand, if the customers were not chosen at random (e.g. they were chosen based on city), then this is only an observational study. Observational studies rely on post-hoc summary statistics rather than ex-ante randomization and are susceptible to the adage that “correlation is not causation.” It is less clear in these studies whether repeating the study will result in the same conclusion.
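The New York versus Boston scenario can be made concrete with a small simulation. This is only a sketch under invented assumptions: the purchase rates (20% in New York, 10% in Boston), the function name `simulate`, and the sample size are all hypothetical, and the ad is deliberately given no real effect. The point is to see how city-based assignment can manufacture an apparent "twice as likely" result that random assignment does not.

```python
import random

def simulate(assign_by_city, n=100_000, seed=0):
    """Return (purchase rate with ad, purchase rate without ad).

    Hypothetical setup: the ad has NO true effect. New York customers
    simply buy more often (20%) than Boston customers (10%).
    """
    rng = random.Random(seed)
    base_rate = {"New York": 0.20, "Boston": 0.10}  # assumed, for illustration
    tally = {True: [0, 0], False: [0, 0]}  # saw_ad -> [purchases, customers]
    for _ in range(n):
        city = rng.choice(["New York", "Boston"])
        if assign_by_city:
            saw_ad = city == "New York"   # observational: city decides who sees the ad
        else:
            saw_ad = rng.random() < 0.5   # experiment: a coin flip decides
        bought = rng.random() < base_rate[city]  # the ad plays no role here
        tally[saw_ad][0] += bought
        tally[saw_ad][1] += 1
    return tuple(tally[k][0] / tally[k][1] for k in (True, False))

with_ad, without_ad = simulate(assign_by_city=True)
print(f"by city:   ad {with_ad:.3f} vs no ad {without_ad:.3f}")
with_ad, without_ad = simulate(assign_by_city=False)
print(f"at random: ad {with_ad:.3f} vs no ad {without_ad:.3f}")
```

Under city-based assignment the ad group buys at roughly twice the rate of the no-ad group even though the ad does nothing. Under random assignment the two rates are nearly equal, which is the true (null) effect in this made-up world.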

Of course, there is more nuance to most analyses than this simple dichotomy suggests. Sometimes, for example, the bias in sampling can be more subtle, turning what we thought was an experiment into an observational study. Were the target customers selected based on high income? If so, lower-income customers may respond differently. Even if they were selected randomly, an experiment run a while ago carries an implicit sample bias: it drew from an older customer cohort, so newer customers may not respond the same way. On the flip side, the presence of instrumental variables can make what appears to be an observational study more like a quasi-experiment, strengthening the conclusions for business purposes. Were the customers shown ads based on a customer ID number? If that number was randomly generated, then this may not be an observational study but a fortuitous experiment.

Both observational studies and experiments are called “data analysis” (after all, both come from “looking at the data”). Treat the strength of the conclusions drawn from these distinct types of analyses differently, and probe your analysts to fully understand the type of analysis that was performed. This does not mean that you should ignore results from observational studies. Experiments can be expensive and time-consuming. Instead, understand the specific weaknesses of observational studies and treat the findings as a starting point for a longer conversation with your data analysts about the underlying assumptions of their analyses and the potential biases. Balance the results of any data analysis with your industry experience and the potential risks and benefits of each alternative you are evaluating.

 

What is the margin of error?

Now, let’s suppose your data analyst concludes that customers shown an advertisement are 20% more likely to purchase your product than those who were not. Depending upon the sample size and how the analysis was conducted, you may or may not feel confident in this result. One measure of such confidence is the standard error. In the example above, if the standard error is 30%, then (assuming normally distributed error) there is a substantial (25.25%) chance the advertisement does not help drive purchases, and the result may be deemed not statistically significant. Even if the standard error is only 10%, there is a small chance (2.28%) that ads do not positively impact sales. Good data analysts always report some measure of this confidence, often through error bars that reflect the standard error, and good data managers think critically about how this uncertainty affects their business.
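The 25.25% and 2.28% figures are consistent with treating the estimate as normally distributed around 20% with the stated standard error. The article does not spell out that assumption, so the sketch below is one plausible reading, not necessarily the author's method. The function name `chance_effect_not_positive` is ours.

```python
from statistics import NormalDist

def chance_effect_not_positive(estimate, standard_error):
    """P(true effect <= 0), assuming a normal distribution centered on the estimate."""
    return NormalDist(mu=estimate, sigma=standard_error).cdf(0.0)

# A 20% estimated lift with a 30% and then a 10% standard error.
for se in (0.30, 0.10):
    p = chance_effect_not_positive(0.20, se)
    print(f"standard error {se:.0%}: chance the ad does not help = {p:.2%}")
```

The calculation is just the lower tail of a normal curve: the estimate sits 20/30 ≈ 0.67 standard errors above zero in the first case and 2 standard errors above zero in the second.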

You do not need – and normally can’t demand – absolute certainty in the data before proceeding. In the above example, we don’t just care about the likelihood that an advertisement increases purchasing but we also care about the likelihood of that likelihood: what is the chance that a customer is 30% more likely to purchase? What is the chance they are 10% more likely to purchase? How does the potential upside weigh against the potential downside? While mentally tedious, this likelihood analysis is necessary to truly understand the risks of the decisions you are making.
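The same normal approximation can answer the "likelihood of that likelihood" questions. As a rough sketch, assume the 20% estimate from the example with a 10% standard error. Both the normality assumption and the helper name `chance_lift_exceeds` are ours, not the article's.

```python
from statistics import NormalDist

def chance_lift_exceeds(threshold, estimate=0.20, standard_error=0.10):
    """P(true lift > threshold), assuming a normal distribution around the estimate."""
    return 1.0 - NormalDist(mu=estimate, sigma=standard_error).cdf(threshold)

# Walk through a few thresholds a manager might care about.
for threshold in (0.0, 0.10, 0.30):
    p = chance_lift_exceeds(threshold)
    print(f"chance the lift exceeds {threshold:.0%}: {p:.1%}")
```

Under these assumptions a lift above 10% is likely (about 84%) while a lift above 30% is not (about 16%). Weighing those probabilities against the cost of the ads is one way to compare upside and downside.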

It’s rarely possible to be 100% certain about a business decision you have to make but statistics can help smart managers quantify and limit the risks of their decisions. Never take your data analyst’s conclusions at face value, however. Ask them about the methodology they used and the margin of error before using their conclusions to inform your decisions.

