
15 Questions to Ask When Preparing Data for Analysis

Data preparation is critical

As a data practitioner, you probably spend most of your time, by commonly cited estimates as much as 80%, preparing data for analysis. That can be frustrating when you are eager to run the analysis and uncover trends and insights.

However, data preparation is an integral component of data analysis. Without proper data preparation, subsequent data analysis will be flawed. 

The preparation process includes reformatting and cleaning the data, correcting errors, deciding how to handle outliers, and combining data sets if applicable.

Effective data preparation helps organizations streamline the analysis process. It is a key step in ensuring that the available data is of good quality and that the insights derived from it are accurate and reliable.

Additionally, a thorough exploration of the data and possible methods during this phase is well worth the effort and can save a lot of time and aggravation down the line. 

 

DATA PREPARATION SCENARIO 

Imagine a mobile communications company wants to understand the characteristics of customers who churn, that is, leave the company for another provider. If you were the data analyst presented with this request, you’d want to use the data preparation phase to assess the feasibility of the request before diving into the data.

The first step is to determine how far back stakeholders want to look (e.g., 5 years) and then check whether data on who churned during that period is available. Machine learning and artificial intelligence tools may also help you assess whether the available data set is applicable to the question.
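A first feasibility check on the lookback period can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the churn dates, the `as_of` date, and the function names are hypothetical and not part of the original scenario.

```python
from datetime import date


def lookback_start(as_of, years):
    """Return the date `years` before `as_of` (Feb 29 falls back to Feb 28)."""
    try:
        return as_of.replace(year=as_of.year - years)
    except ValueError:
        return as_of.replace(year=as_of.year - years, day=28)


def covers_lookback(event_dates, as_of, years):
    """True if the earliest recorded event is on or before the window start."""
    return min(event_dates) <= lookback_start(as_of, years)


# Hypothetical churn dates pulled from a customer table.
churn_dates = [date(2021, 3, 15), date(2022, 11, 2), date(2023, 7, 9)]
as_of = date(2024, 6, 30)

five_year_ok = covers_lookback(churn_dates, as_of, years=5)   # False
three_year_ok = covers_lookback(churn_dates, as_of, years=3)  # True
```

A failed check does not prove that records are missing, since it is possible that nobody churned early in the window. It is only a prompt to ask how and when the data started being collected.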

Decisions will arise along the way. Will you delete or ignore records with missing data, or will you try to fill in missing values through imputation? If there are extreme values, will you keep or delete them?
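To make these choices concrete, here is a small sketch of the options using only the Python standard library. The monthly charges are made-up values, median imputation is just one of several possible imputation strategies, and the 1.5 × IQR rule for flagging extreme values is a common convention rather than something the scenario prescribes.

```python
from statistics import median, quantiles

# Hypothetical monthly charges for ten customers; None marks a missing value.
monthly_charges = [42.0, 55.5, None, 61.0, 48.5, 950.0, None, 52.0, 58.0, 45.0]


def drop_missing(values):
    """Option 1: ignore records with missing values."""
    return [v for v in values if v is not None]


def impute_median(values):
    """Option 2: fill missing values with the median of the observed ones."""
    fill = median(drop_missing(values))
    return [fill if v is None else v for v in values]


def flag_extremes(values, k=1.5):
    """Return values outside the fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


observed = drop_missing(monthly_charges)
imputed = impute_median(monthly_charges)
extremes = flag_extremes(observed)  # flags 950.0 for review
```

Flagging is deliberately separate from deleting here: whether 950.0 is a data-entry error or a genuine high-spending customer is a judgment call to make with stakeholders, not something the code can decide.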

 

QUESTIONS TO KEEP IN MIND FOR DATA ANALYSIS

Here is a checklist of questions to help you cover all the important bases of the data preparation phase of a data project.

This checklist helps you confirm that you have access to accurate data and surfaces other key issues from the start. It begins with general overview questions and becomes more specific and action-oriented.

You will likely want to add relevant questions of your own pertaining to your industry and organization. 

 

AT FIRST GLANCE: 

1. Does the data you need exist? 

2. Do you know how the data was generated and collected? 

3. Is there enough data to reach reliable conclusions? 

UPON FURTHER REFLECTION: 

4. Does it measure what you need? 

5. Are the variables the correct types or levels? 

6. Do you understand the labels and codes used? 

AFTER EXPLORING THE DATA:

7. Does the data include the required range and variability? 

8. Are the distributions as you would expect? 

9. Have you identified outliers or anomalies? 

CONSIDER RETURN ON INVESTMENT (ROI):

10. Are you focusing on predictors you can control? 

11. Have you identified the costs of manipulating the predictors? 

12. Have you identified the potential benefits of conducting your analysis? 

AT THE END OF YOUR PREPARATION: 

13. Are you confident that your analysis will produce the desired insights? 

14. Have you identified whether anything can be safely and usefully reduced? 

15. Can you explain and justify your conclusions and recommendations?
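Some of these questions, such as 5 (variable types), 6 (labels and codes), and 7 (range), can be partly automated. The sketch below is one possible way to do that with the Python standard library; the records, field names, and code lists are hypothetical and would come from your own data dictionary.

```python
from collections import Counter

# Hypothetical customer records; field names and codes are illustrative only.
records = [
    {"customer_id": "C001", "tenure_months": 14, "plan": "PRE", "churned": "Y"},
    {"customer_id": "C002", "tenure_months": 3, "plan": "POST", "churned": "N"},
    {"customer_id": "C003", "tenure_months": "27", "plan": "POST", "churned": "N"},
    {"customer_id": "C004", "tenure_months": 61, "plan": "XX", "churned": "Y"},
]

# Assumed data dictionary: expected type and allowed codes per field.
EXPECTED_TYPES = {"customer_id": str, "tenure_months": int, "plan": str, "churned": str}
KNOWN_CODES = {"plan": {"PRE", "POST"}, "churned": {"Y", "N"}}


def type_problems(rows):
    """Question 5: list (customer_id, field) pairs with an unexpected type."""
    return [
        (row["customer_id"], field)
        for row in rows
        for field, expected in EXPECTED_TYPES.items()
        if not isinstance(row[field], expected)
    ]


def unknown_codes(rows):
    """Question 6: list (customer_id, field, value) for codes not in the dictionary."""
    return [
        (row["customer_id"], field, row[field])
        for row in rows
        for field, allowed in KNOWN_CODES.items()
        if row[field] not in allowed
    ]


def numeric_range(rows, field):
    """Question 7: minimum and maximum of a field, skipping mistyped values."""
    values = [row[field] for row in rows if isinstance(row[field], int)]
    return min(values), max(values)


bad_types = type_problems(records)       # tenure stored as text for C003
bad_codes = unknown_codes(records)       # plan code "XX" for C004
tenure_range = numeric_range(records, "tenure_months")
churn_counts = Counter(row["churned"] for row in records)
```

Checks like these only surface candidates for review. Deciding whether "XX" is a retired plan code or a typo still requires someone who knows how the data was generated.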

 

[Want to see more data preparation questions not listed here? Download our ebook – Prepare: Avoid Common Pitfalls by Analyzing the Right Data] 

 

CONCLUSION

Data is only as useful as it is accurate. 

Organizations spend time and resources making their data accurate and reliable because an error or issue in the data can skew insights and significantly affect decision-making. Asking the right questions when preparing your data is critical to getting accurate insights. 

 


Author
  • Pragmatic Institute

    Pragmatic Institute is the transformational partner for today’s businesses, providing immediate impact through actionable and practical training for product, design and data teams. Our courses are taught by industry experts with decades of hands-on experience, and include a complete ecosystem of training, resources and community. This focus on dynamic instruction and continued learning has delivered impactful education to over 200,000 alumni worldwide over the last 30 years.
