Why do 80% of data science projects fail (as Gartner reports)?
Often, these projects never identified the business problem to solve with data, created a hypothesis to test or verified that the necessary data was available. What’s more, there are persistent communication gaps between business executives and the data scientists and analysts they work with, because the two groups aren’t speaking the same language. Stakeholders don’t understand the data or what questions it can answer, while data teams don’t understand what the business wants.
Organizations waste time, money and effort as a result of these disconnects. They also miss out on valuable opportunities to apply data analysis toward business needs.
That’s why Pragmatic Institute developed an approach not just for data analysis, but for businesses to define a problem and solve it with data—tackling problems that are actually going to be tangible and add value within the business. Phase by phase, let’s walk through the Pragmatic Data Analysis Model: a proven, optimized and repeatable approach for any data project or toolset.
Editor’s note: What follows is a transcript of the first section of a recent Data Chat, “Applying Data Analysis to Strategy: A Proven Approach.” It has been lightly edited and condensed for clarity. Watch the full webinar below.
Define: Focus on the specific business problem you want to solve with data
The first step is defining the business problem. One of the big problems with a lot of businesses is that they want to hit the ground running: “Let’s dive in and get to it.” Oftentimes, that means skipping a really important step: “Let’s really understand and define what it is we’re trying to accomplish.”
In data, it’s absolutely critical to really deconstruct what the business is trying to understand about itself and the goals it is trying to lay out, and then bring it back into the actual data. Time and time again, I’ve seen businesses skip the step of defining: What is the business problem? What are the hypotheses you’re trying to prove or disprove in this initial step?
They’re diving right into the analysis instead of defining the problem, working out how it is going to affect the business goals, and getting buy-in. If you’re not getting buy-in from the business stakeholders, then it really doesn’t matter what you’re analyzing or what you’re going to come up with.
You’re not setting the right parameters. You’re not setting the right expectations with the people who are going to have to do something with the results of whatever analysis you’re about to embark on.
Prepare: Explore the available data and the most useful methods
The second step is preparing for the analysis. So once you’ve defined the problem, then you’ve got to really dive in. Anybody who’s done any extensive data analysis knows that once you start really getting into the data, things start to change.
“You’re getting into data diligence to start framing all of these pieces so that you can rethink what’s actually possible and feasible with the data itself.”
You start finding that “I’m missing this piece,” or “these databases don’t connect,” or “I need to supplement this with an external piece of data.” There are a million different things that could go wrong. Once you get into the data itself, what you find is going to change the cost of the project, the time it takes, or what its potential outputs are.
So preparing does involve some part of the analysis. It doesn’t mean that you’re completing the analysis, but you’re getting into data diligence to really start framing what all of these pieces are going to be so that you can rethink what’s actually possible and feasible with the data itself.
Once you’ve done that, then you need to rethink how you define the business problem and all of the hypotheses. You need to prepare to go back and refine that with the stakeholders.
Refine: Revise questions and expectations as necessary
The next step is to go back to the stakeholders and refine those statements. This takes some courage.
This is something that data scientists and analysts are not being taught how to do. You need to actually go back to the sponsor of this project and say, “You asked for this, and we can’t deliver this with this data in this timeframe with these parameters. With the data that we have, if you want that, these are the things that need to be added to it, or we can deliver this or that with these changes in parameters.” This is extremely important. This is a step that doesn’t happen in most organizations.
This is why you get things called “scope creep” or a failed data project: it goes from a three-month project to a six-month project to a nine-month project to a 12-month project, only to deliver something that the stakeholder didn’t want and didn’t care about. All because, after you went deep into the data and understood what’s possible and feasible, you never went back to the stakeholders to refine and adjust the parameters of the project.
This is incredibly important in making sure you’re setting yourself up for success right out of the gates before you even get in and start analyzing the data.
Analyze: Build models to find actionable insights
When it comes down to being precise about what you’re trying to do, you can think about the difference between predictive analytics and prescriptive analytics. Predictive analytics, which is so common in data science, is building a model that can predict what people will do in the future, or that can predict what a human would do in the same situation.
It’s an extremely important task. It gets an enormous amount of attention. But most of the time, when a person comes to you with a business problem, they’re not asking for just a prediction.
They want to say, “Tell me what to do.” And that’s a fundamentally different question that requires a different kind of data and a different kind of analysis. And this leads into the “analyze” phase.
Most of the time, when people get a problem, they want to start jumping immediately to the models: “I’m going to build a neural network to do this. I’m going to build some machine learning algorithms to do that.” Those are very good for predictive models or correlational models. But when you’re trying to come up with a business strategy and tell people, “This is what you do as a result,” you’re going to need to simplify it to something that can be easily interpreted and put into action.
“Simple methods can sometimes be the best. ‘Analyze’ means choosing a method that is going to be most likely to give you these interpretable actionable insights.”
Two themes came up constantly as we’ve prepared this course, Business-Driven Data Analysis, for Pragmatic. As this model got developed, the idea was that we are looking for actionable insights. Your analysis needs to tell you what to do next. We’re also looking for a good return on investment. We have to show that this is going to be the best way to reach a particular outcome.
In terms of analysis, it actually often involves keeping it a lot simpler. I have worked on projects where the best insights from our data came from a spreadsheet join. It was a nonprofit that had simply never joined the data tables for their clients with the data tables for their donors. Once we joined those two, we were able to help them identify donors with high potential, which is really important for a nonprofit.
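To give a sense of how small such a join can be, here is a minimal Python sketch using only the standard library. Everything in it is hypothetical: the rows, the `email` join key, the `programs_attended` and `total_given` fields, and the rule used to flag “high potential” are assumptions made for illustration, not details of the project described above.

```python
# Hypothetical tables; field names and values are invented for illustration.
clients = [
    {"email": "ana@example.org", "programs_attended": 6},
    {"email": "ben@example.org", "programs_attended": 1},
    {"email": "cy@example.org", "programs_attended": 4},
]
donors = [
    {"email": "ana@example.org", "total_given": 50},
    {"email": "cy@example.org", "total_given": 900},
    {"email": "dee@example.org", "total_given": 300},
]


def inner_join(left, right, key):
    """Merge rows whose `key` value appears in both tables."""
    # Assumes `key` is unique within `right`.
    right_by_key = {row[key]: row for row in right}
    return [
        {**row, **right_by_key[row[key]]}
        for row in left
        if row[key] in right_by_key
    ]


joined = inner_join(clients, donors, "email")

# Assumed rule of thumb (not from the source): people who use many
# programs but have given little so far.
high_potential = [
    row["email"]
    for row in joined
    if row["programs_attended"] >= 4 and row["total_given"] < 100
]
print(high_potential)
```

In a spreadsheet the same step is a lookup on the shared column; in pandas it would typically be a `DataFrame.merge` call. The point is that the join itself is a few lines, and the insight comes from finally putting the two tables side by side.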
Another one was just a pivot table where we said, “This is where you’re getting your donors.” And they had never done that. So these simple methods can sometimes be the best. “Analyze” means choosing a method that is going to be most likely to give you these interpretable actionable insights. It’s not necessarily the standard, usual suspects in data science.
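A pivot table of this kind amounts to grouping records by one field and summarizing another. The sketch below shows the idea in plain Python; the donation records and the `source` and `amount` fields are made up for illustration and are not data from the project mentioned above.

```python
from collections import defaultdict

# Hypothetical donation records; fields and values are invented.
donations = [
    {"donor": "A", "source": "gala", "amount": 500},
    {"donor": "B", "source": "email", "amount": 50},
    {"donor": "C", "source": "gala", "amount": 250},
    {"donor": "D", "source": "referral", "amount": 100},
    {"donor": "E", "source": "email", "amount": 25},
]


def pivot_by(rows, group_key, value_key):
    """Count rows and sum `value_key` for each distinct `group_key`."""
    summary = defaultdict(lambda: {"count": 0, "total": 0})
    for row in rows:
        cell = summary[row[group_key]]
        cell["count"] += 1
        cell["total"] += row[value_key]
    return dict(summary)


by_source = pivot_by(donations, "source", "amount")

# Rank sources by total amount so the biggest channel comes first.
ranked = sorted(by_source.items(), key=lambda item: item[1]["total"], reverse=True)
for source, cell in ranked:
    print(f"{source:<10} donors={cell['count']} total={cell['total']}")
```

The output is a short table a stakeholder can read at a glance, which is what makes it actionable. With pandas, `DataFrame.pivot_table` or a `groupby` would produce the same summary.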
Present: Communicate actionable insights and next steps to stakeholders
The fifth step in this model is “present.” This is where you have to focus on actually telling the stakeholders what your actionable insights are.
I know that most analysts are not comfortable with that. But the whole point of conducting the project in the first place was to find out what to do next. Analysts need to be able to say: “This is the business solution.”
You had a business problem: not a data problem, a business problem. You are using data to help provide a business solution, which says, for instance, focus on this particular market, promote this item more, or drop this location. Whichever it is, you need to present three things as clearly as possible: the business problem, the data used to answer it, and the business solution that can be justified by the data.
* * *
To learn the Pragmatic Data Analysis Model in depth and get hands-on practice with real-world business scenarios, sign up for our new course Business-Driven Data Analysis. The course is also available as private training for data teams.