Two Types of Data Scientists: Which Is Right for Your Needs?

Previously posted on July 16, 2015, on The Data Incubator.

There’s a lot of good discussion about technology in big data, but not enough informed discussion about the talent in the field. We usually spend more time thinking about how to optimize our MapReduce jobs than we do thinking about how to motivate the data scientists who write them. We often use the term “data scientist” to encompass two very different types of roles: data scientists who produce analytics for humans, and data scientists who produce analytics for machines. It’s an important distinction, especially because the backgrounds and skill sets necessary for success in these two roles are quite different.

Lately, I have seen growing awareness among employers of how important it is to understand data science and this division within the data science role. This certainly isn’t the only distinction among data scientists, but when it comes to formulating a successful big data strategy, it’s the most significant.

Here’s the difference between the two, along with the kinds of backgrounds and motivations an employer should look for in each type of data scientist.

 


Analytics for Humans

In the case of data scientists who produce analytics for humans, another human is the final decision maker and consumer of the analysis. This type of data scientist often has to deliver a report on her findings and answer questions such as which groups are using a product or what factors are driving user growth and retention.

Though she may sift through the same data sets as her analytics-for-machines counterparts, this type of data scientist delivers the results of her models and predictions to another human, who makes business or product decisions based on these recommendations. Often, that decision maker is not a data scientist, so the data scientist must be able to explain her results in a non-technical way, which adds a layer of complexity to the job.

Because the results must be explained, the data scientist might deliberately choose a more basic model over a more accurate but overly complex one. These data scientists must also be comfortable coming to higher-level conclusions – the “why” and “how” – that are a step removed from the raw data.

A typical background for this kind of role is that of a social or medical scientist (often at the Ph.D. level). Such scientists are trained to ask the deeper questions (the “how” and “why”), making them better suited to produce analytics for humans. They are often trained to employ “simple” models and convey the results to those without deep technical understanding, like management or sales. Data scientists with these sorts of backgrounds frequently thrive on the intellectual challenge of explaining a model to another human and drawing clarity from obscure data. They also love seeing the direct impact of their analysis on decision-making at their organization.

 


Analytics for Machines

The other major type of data scientist produces analytics for machines. In this instance, the final decision maker and consumer of the analysis is a computer. These data scientists build highly complex models that ingest vast data sets and try to extract subtle signals using machine learning and sophisticated algorithms. They tend to work in areas like algorithmic trading, online content and advertising targeting, or personalized product recommendations, to name a few. Once deployed, their models act on their own, making recommendations, choosing ads to display, or automatically trading in the stock market.

Data scientists who produce analytics for machines must have remarkably strong mathematical, computational, and statistical skills to construct models that can make quality predictions quickly. They can piece together an array of technical tricks to create sophisticated models that squeeze out the last drop of performance. They typically operate with easily measurable, unambiguous metrics from management, such as clicks, profits, and purchases. Their value lies in applying their technical virtuosity at scale, where even small gains aggregated across millions of users and trillions of events can lead to huge wins.

Data scientists who produce analytics for machines often have mathematics, natural science, or engineering backgrounds (again, often at the Ph.D. level) with the deep computational and mathematical knowledge necessary to do the high-powered work. They also have strong software engineering backgrounds that enable them to build robust large-scale systems to deploy their analyses. They thrive on the technical challenge of building these large-scale, complex systems.

 


Why the Distinction Matters

It’s rare to find someone who is well-suited for both roles, so employers would do well to figure out which role they need. An MIT-trained physicist hungry for a deep machine-learning challenge likely would not be the best fit for a role in which her models must be “simple” enough for management to understand. She also may not be as comfortable extrapolating the “why” and “how” from the data. Likewise, a Harvard-trained social scientist might be great at explaining and drawing deeper conclusions from data, but may not be as well suited to produce analytics for machines. If he lacks the necessary deep mathematical and computational skills, he may not be able to build robust systems or may engineer simplistic models that fail to capture the data’s full value.

Understanding your data science team – what makes them tick, what drives them up the wall – is just as important to the success of a big data strategy as understanding your technology stack. It’s important to figure out what you really need from a data scientist so that you can determine which backgrounds and temperaments would be best suited to getting the job done.

 
