Building a Brain: A How-to Guide on Capsule Neural Networks

By Morven Watt

 

Imagine a tree. Now name the parts.

You probably listed things like roots, trunk, branches, leaves, twigs. Ask a hundred people, and you’ll mostly get the same answers. Ask people around the world and they’ll likely name the exact same universal parts.

When asked to think about a tree, a picture of a tree with all its composite parts popped into your head without much effort. But there’s a lot more going on in your brain than populating a simple image.

 

If you look at the human brain, specifically the higher order functions of the frontal cortex, you see that it includes things like planning, language, and problem-solving. It’s also where we learn to decode the complexities of the world around us. We start problem-solving early in life and build on that existing framework as our brains grow. One of the key ways we do this is by organizing and categorizing the things in our world.

 

When we learn about the leaves on the trees, we learn them in the context of the twigs and branches they grow from. When we learn about trees, we refer back to learning about leaves and the other smaller elements, creating our own categories and distinctions. “This is a tree and I know this is a tree, even if I don’t explicitly see all of it.” We can still identify trees from a distance, or if we only see a canopy, fallen branch, or trunk.

 

To put this another way, any complex object is a hierarchy of simpler objects.

 

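This is essentially what Capsule Neural Networks (abbreviated CNN in this article) aim to do. CNNs are a type of Artificial Neural Network (ANN) designed to mimic the way the human brain categorizes and processes information, so that they can better model these hierarchical relationships. (Elsewhere, “CNN” usually stands for Convolutional Neural Network, which this article calls a ConvNet.)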

 

But how does this actually work?

 

Let’s break down the steps using our tree example.

 

Step 1. We add structures known as “capsules” to a Convolutional Neural Network (ConvNet). So “leaf” would be a capsule, as would “twig,” “branch,” “trunk,” “roots,” etc. These would be considered “lower order capsules.”

 

Step 2. We show our ConvNet an image of a tree.

 

Step 3. We use the output from the lower order capsules to form a more stable representation for higher order capsules. Our lower order capsules would identify leaves, twigs, branches, and so on, and pass them to the higher order capsule.

 

Step 4. This then allows the higher order capsule to produce an output: a vector that encodes both the probability that an object is present and the pose (such as position and orientation) of that object. In a nutshell, it allows the machine to say “Hey, this image has leaves, twigs, and branches. It’s probably a tree.”
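To make the output of Step 4 concrete, here is a minimal Python sketch of one common capsule formulation, in which the length of a capsule’s output vector is read as the probability that the object is present and its direction carries the pose. The `squash` function follows the dynamic-routing capsule design. The article does not say which formulation it has in mind, so the function names and the example vector are illustrative assumptions.

```python
import math

def squash(s):
    """Shrink vector s so its length falls in [0, 1) while keeping its direction."""
    norm_sq = sum(x * x for x in s)
    if norm_sq == 0.0:
        return [0.0 for _ in s]
    norm = math.sqrt(norm_sq)
    scale = norm_sq / (1.0 + norm_sq) / norm
    return [scale * x for x in s]

def length(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in v))

# Hypothetical raw output of a higher order "tree" capsule (assumed values).
raw_tree_capsule = [3.0, 4.0]
tree_capsule = squash(raw_tree_capsule)

# The length acts as "how sure am I that a tree is here?"
presence = length(tree_capsule)
print(f"presence score: {presence:.3f}")
```

A long raw vector is squashed to a length close to 1 (a confident detection), a short one to a length close to 0, and in both cases the direction, which stands in for the pose, is unchanged.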

 

What’s great about this type of technology is how it can be used to solve problems like the Picasso Problem. This refers to the fact that an image can have all the right parts, but not in the correct spatial relationship, as in many Picasso paintings. Or, in the context of our tree, if an image were to have unusual-looking trees, or spatial inconsistencies, image recognition becomes compromised.
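The Picasso Problem can be illustrated with a toy Python example. It is a sketch only: the part names, the coordinates, and both checker functions are hypothetical and are not taken from any real network. One checker asks only whether every part was detected, while the other also asks whether the parts sit in a plausible top-to-bottom order.

```python
REQUIRED_PARTS = {"leaves", "branches", "trunk", "roots"}

def has_all_parts(detections):
    """Return True if every required part was detected, ignoring position."""
    return REQUIRED_PARTS <= set(detections)

def has_tree_layout(detections):
    """Return True if all parts are present AND stacked in tree order.

    Each detection is (x, y) with y growing downward, as in image
    coordinates, so leaves should have the smallest y and roots the largest.
    """
    if not has_all_parts(detections):
        return False
    order = ["leaves", "branches", "trunk", "roots"]
    ys = [detections[part][1] for part in order]
    return all(upper < lower for upper, lower in zip(ys, ys[1:]))

# Assumed detections: a normal tree and a "Picasso" tree with shuffled parts.
normal_tree = {"leaves": (5, 1), "branches": (5, 3), "trunk": (5, 6), "roots": (5, 9)}
picasso_tree = {"leaves": (5, 9), "branches": (1, 6), "trunk": (8, 1), "roots": (5, 3)}

for name, image in [("normal", normal_tree), ("picasso", picasso_tree)]:
    print(name, has_all_parts(image), has_tree_layout(image))
```

The parts-only check accepts both images, while the layout-aware check rejects the shuffled one. Capsules aim for the second kind of behavior, though they learn spatial relationships rather than relying on a hand-written rule like this one.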

 

This is because max-pooling is used in traditional ConvNets. Max-pooling layers sit between convolutional layers. Each one keeps only the most active neuron in every small region of a convolutional layer’s output and passes it on to the next convolutional layer.
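As a rough illustration of max-pooling, the following Python sketch applies a 2x2 window with a stride of 2 to a small grid of activations. The window size, the function name, and the numbers are assumptions chosen for the example; real ConvNets use library implementations and may use other window sizes.

```python
def max_pool_2x2(feature_map):
    """Keep the largest value in each non-overlapping 2x2 window."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - 1, 2):
        pooled_row = []
        for c in range(0, cols - 1, 2):
            window = [
                feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1],
            ]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled

# Two assumed activation grids: the strong response (9) sits in a
# different corner of the top-left window in each one.
map_a = [[9, 0, 1, 0],
         [0, 0, 0, 2],
         [0, 3, 0, 0],
         [0, 0, 4, 0]]
map_b = [[0, 0, 1, 0],
         [0, 9, 0, 2],
         [0, 3, 0, 0],
         [0, 0, 4, 0]]

print(max_pool_2x2(map_a))
print(max_pool_2x2(map_b))
```

Both grids pool to the same 2x2 result, so the next layer can no longer tell which corner of the window the strong activation came from.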

 

1. A Convolutional Neural Network (ConvNet) is a class of deep neural network most commonly applied to analyzing visual imagery.

 

The problem? The neurons that are less active are dropped. This means spatial information gets lost as data progresses through the network. CNNs help solve this problem because they encapsulate all the important information of an image. Because capsules are independent, when multiple capsules agree, the probability of correct detection is much higher.
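The idea that agreement between capsules raises confidence can be sketched in Python. This is a toy, not the routing procedure used in real capsule networks, which iterate and use learned transformations. Here each lower order capsule is assumed to cast a “vote” for where the tree’s center should be, and the hypothetical `agreement` function scores how tightly those votes cluster.

```python
import math

def agreement(votes):
    """Return (mean_vote, score) for a list of equal-length vote tuples.

    The score is 1.0 when every vote is identical and falls toward 0
    as the votes spread out.
    """
    n = len(votes)
    dims = len(votes[0])
    mean = tuple(sum(v[i] for v in votes) / n for i in range(dims))
    spread = sum(math.dist(v, mean) for v in votes) / n
    return mean, 1.0 / (1.0 + spread)

# Assumed votes from the leaf, branch, and trunk capsules for the
# (x, y) center of the tree.
consistent_votes = [(5.0, 5.0), (5.1, 4.9), (4.9, 5.1)]
scattered_votes = [(1.0, 9.0), (8.0, 1.0), (5.0, 5.0)]

_, consistent_score = agreement(consistent_votes)
_, scattered_score = agreement(scattered_votes)
print(f"consistent: {consistent_score:.2f}, scattered: {scattered_score:.2f}")
```

When the parts point to the same tree, the score is high. When the same parts are present but point in different directions, as in a Picasso-style image, the score drops.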

 

This is the (basic) principle of CNN technology, and how it can propel future learning is fascinating. Image identification can help us understand and analyze the vast amounts of data that already exist, and the new data that is constantly being created. While it may seem that CNNs are simply identifying images, the fact is, we’re creating technologies that can operate more and more like a human brain.

 

As Pliny the Elder wrote, “The brain is the citadel of the senses; this guides the principle of thought.” When it comes to machine learning, we’re building citadels, and every data scientist has the chance to be an architect for the future of machine learning and AI.

 
