Building a Brain: A How-to Guide on Capsule Neural Networks

By Morven Watt

Imagine a tree. Now name the parts.

You probably listed things like roots, trunk, branches, leaves, twigs. Ask a hundred people, and you’ll mostly get the same answers. Ask people around the world and they’ll likely name the exact same universal parts.

When asked to think about a tree, a picture of a tree with all its composite parts popped into your head without much effort. But there’s a lot more going on in your brain than populating a simple image.

If you look at the human brain, specifically the higher order functions of the frontal cortex, you see that it includes things like planning, language, and problem-solving. It’s also where we learn to decode the complexities of the world around us. We start problem-solving early in life and build on that existing framework as our brains grow. One of the key ways we do this is by organizing and categorizing the things in our world.

When we learn about the leaves on the trees, we learn them in the context of the twigs and branches they grow from. When we learn about trees, we refer back to learning about leaves and the other smaller elements, creating our own categories and distinctions. “This is a tree and I know this is a tree, even if I don’t explicitly see all of it.” We can still identify trees from a distance, or if we only see a canopy, fallen branch, or trunk.

To put this another way, any complex object is a hierarchy of simpler objects.

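This is essentially what Capsule Neural Networks (CNN) aim to model. (This article abbreviates them as CNNs; elsewhere they are often called CapsNets, and “CNN” more commonly refers to the convolutional networks that this article calls ConvNets.) CNNs are a type of Artificial Neural Network (ANN) loosely modeled on the way the human brain categorizes and processes information, designed to better represent this hierarchy of part-whole relationships.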

But how does this actually work?

Let’s break down the steps using our tree example.

Step 1. We add structures known as “capsules” to a Convolutional Neural Network (ConvNet). So “leaf” would be a capsule, as would “twig,” “branch,” “trunk,” “roots,” etc. These would be considered “lower order capsules.”

Step 2. We show our ConvNet an image of a tree.

Step 3. We use the output from the lower order capsules to form a more stable representation for higher order capsules. Our lower order capsules would identify leaves, twigs, branches, and so on, and pass what they find to the higher order capsule.

Step 4. This then allows the higher order capsule to produce an output: a vector that encodes both the probability that the object is present and its pose (properties such as position, orientation, and scale). In a nutshell, it allows the machine to say “Hey, this image has leaves, twigs, and branches. It’s probably a tree.”
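
To make Step 4 more concrete, here is a minimal sketch of how a single vector can carry both a probability and a pose. The article does not name a specific formula, so this assumes the “squash” function commonly used in the capsule literature: it shrinks a capsule’s raw output so that its length falls between 0 and 1 (read as the probability) while its direction (read as the pose) is unchanged. The example vectors and variable names are hypothetical.

```python
import math


def squash(vector):
    """Rescale a raw capsule vector so its length lies in [0, 1), keeping its direction."""
    squared_norm = sum(x * x for x in vector)
    if squared_norm == 0.0:
        return [0.0 for _ in vector]
    norm = math.sqrt(squared_norm)
    # New length is squared_norm / (1 + squared_norm); dividing by norm keeps the direction.
    scale = squared_norm / (1.0 + squared_norm) / norm
    return [scale * x for x in vector]


def length(vector):
    """Euclidean length of a vector, read here as 'probability the object is present'."""
    return math.sqrt(sum(x * x for x in vector))


# Hypothetical raw outputs of a higher order "tree" capsule for two images.
strong_evidence = [3.0, 4.0]  # lots of support from leaf, twig, and branch capsules
weak_evidence = [0.3, 0.4]    # same direction (pose), far less support

tree_probability_strong = length(squash(strong_evidence))
tree_probability_weak = length(squash(weak_evidence))
print(round(tree_probability_strong, 3), round(tree_probability_weak, 3))
```

Both inputs point in the same direction, so they describe the same pose; only the confidence differs (roughly 0.96 versus 0.2).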

What’s great about this type of technology is how it can be used to solve problems like the Picasso Problem: an image can have all the right parts, but in the wrong spatial relationship to one another, as in many Picasso paintings. In the context of our tree, if an image contains unusual-looking trees or spatial inconsistencies, image recognition with a traditional ConvNet becomes compromised.

This is because max-pooling is used in traditional ConvNets. Max-pooling layers sit between convolutional layers. Each one keeps only the most active neuron in each small region of the previous convolutional layer’s output and passes it on to the next convolutional layer.
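
As a rough illustration of the max-pooling step described above, the sketch below pools a tiny 4x4 grid of activations with non-overlapping 2x2 windows. The grids and the function name `max_pool_2x2` are made up for this example; real ConvNet libraries provide their own pooling layers.

```python
def max_pool_2x2(feature_map):
    """Keep only the largest activation in each non-overlapping 2x2 window."""
    pooled = []
    for row in range(0, len(feature_map), 2):
        pooled_row = []
        for col in range(0, len(feature_map[0]), 2):
            window = [
                feature_map[row][col], feature_map[row][col + 1],
                feature_map[row + 1][col], feature_map[row + 1][col + 1],
            ]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


# Two hypothetical activation maps: the same two strong responses,
# but sitting at different positions inside their 2x2 windows.
parts_original = [
    [9, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 5],
    [0, 0, 0, 0],
]
parts_shifted = [
    [0, 0, 0, 0],
    [0, 9, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 5, 0],
]

pooled_original = max_pool_2x2(parts_original)
pooled_shifted = max_pool_2x2(parts_shifted)
print(pooled_original, pooled_shifted)
```

Both calls return `[[9, 0], [0, 5]]`: after pooling, the network can no longer tell where inside each window the strong responses were.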

1. A Convolutional Neural Network (ConvNet) is a class of deep neural network most commonly applied to analyzing visual imagery.

The problem? The less active neurons are dropped, so spatial information gets lost as data progresses through the network. CNNs help solve this problem because capsules encapsulate the important information about what they detect, including its pose, instead of discarding it. Because capsules are independent, when multiple capsules agree, the probability of correct detection is much higher.
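
The idea that agreement between capsules raises confidence can be sketched in a few lines. This is a deliberately simplified stand-in for the routing procedures used in real capsule networks, not the actual algorithm: each lower order capsule casts a “vote” for the orientation of the whole tree, and agreement is measured as the length of the average vote. The angles and the names `unit` and `agreement` are assumptions made for this illustration.

```python
import math


def unit(angle_degrees):
    """A unit-length 2-D vote pointing at the given angle (stand-in for a predicted pose)."""
    radians = math.radians(angle_degrees)
    return [math.cos(radians), math.sin(radians)]


def agreement(votes):
    """Length of the mean vote: near 1 when the votes line up, near 0 when they cancel out."""
    dims = len(votes[0])
    mean = [sum(vote[d] for vote in votes) / len(votes) for d in range(dims)]
    return math.sqrt(sum(x * x for x in mean))


# Hypothetical votes from "leaf", "branch", and "trunk" capsules
# for the orientation of the whole tree.
upright_tree = [unit(88), unit(90), unit(92)]     # parts imply nearly the same pose
scrambled_parts = [unit(0), unit(120), unit(240)]  # right parts, inconsistent poses

consistent = agreement(upright_tree)
inconsistent = agreement(scrambled_parts)
print(round(consistent, 3), round(inconsistent, 3))
```

In the scrambled case all the right parts are present, yet their votes cancel out, which is the kind of Picasso-style image that a pooled feature map alone cannot flag.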

This is the basic principle of CNN technology, and its potential to propel future learning is fascinating. Image identification can help us understand and analyze the vast amounts of data that already exist, as well as the new data that is constantly being created. While it may seem that CNNs are simply identifying images, the fact is that we’re creating technologies that operate more and more like a human brain.

As Pliny the Elder wrote, “The brain is the citadel of the senses; this guides the principle of thought.” When it comes to machine learning, we’re building citadels, and every data scientist has the chance to be an architect for the future of machine learning and AI.
