Building a Brain: A How-to Guide on Capsule Neural Networks

By Morven Watt

 

Imagine a tree. Now name the parts.

You probably listed things like roots, trunk, branches, leaves, twigs. Ask a hundred people, and you’ll mostly get the same answers. Ask people around the world and they’ll likely name the exact same universal parts.

When asked to think about a tree, a picture of a tree with all its composite parts popped into your head without much effort. But there’s a lot more going on in your brain than populating a simple image.

 

If you look at the human brain, specifically the higher order functions of the frontal cortex, you see that it includes things like planning, language, and problem-solving. It’s also where we learn to decode the complexities of the world around us. We start problem-solving early in life and build on that existing framework as our brains grow. One of the key ways we do this is by organizing and categorizing the things in our world.

 

When we learn about the leaves on the trees, we learn them in the context of the twigs and branches they grow from. When we learn about trees, we refer back to learning about leaves and the other smaller elements, creating our own categories and distinctions. “This is a tree and I know this is a tree, even if I don’t explicitly see all of it.” We can still identify trees from a distance, or if we only see a canopy, fallen branch, or trunk.

 

To put this another way, any complex object is a hierarchy of simpler objects.

 

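This is essentially what Capsule Neural Networks (CNNs) aim to capture. (Elsewhere they are often abbreviated CapsNets, since "CNN" more commonly refers to convolutional networks; this article uses CNN for capsule networks and ConvNet for convolutional ones.) CNNs are a type of Artificial Neural Network (ANN) loosely modeled on the way the human brain categorizes and processes information, with the goal of better representing this hierarchy of relationships.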

 

But how does this actually work?

 

Let’s break down the steps using our tree example.

 

Step 1. We add structures known as “capsules” to a Convolutional Neural Network (ConvNet). So “leaf” would be a capsule, as would “twig,” “branch,” “trunk,” “roots,” etc. These would be considered “lower order capsules.”

 

Step 2. We show our ConvNet an image of a tree.

 

Step 3. We use the output from the lower order capsules to form a more stable representation for higher order capsules. Our lower order capsules would identify leaves, twigs, branches, and so on, and pass them to the higher order capsule.

 

Step 4. This then allows the higher order capsule to produce an output: a vector that encodes both the probability that the object is present and its pose (properties such as position and orientation). In a nutshell, it allows the machine to say “Hey, this image has leaves, twigs, and branches. It’s probably a tree.”
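To make Step 4 a little more concrete, here is a minimal sketch of how a capsule's output vector can carry a probability. It assumes the "squash" function from the original capsule network paper by Sabour, Frosst, and Hinton, which this article does not name: the vector's length is squeezed into the range 0 to 1 and read as a probability, while its direction is left unchanged to describe the pose. The input numbers and variable names are made up for illustration.

```python
import math


def squash(s):
    """Shrink vector s so its length lies in [0, 1) while keeping its direction."""
    norm_sq = sum(x * x for x in s)
    if norm_sq == 0.0:
        return [0.0 for _ in s]
    norm = math.sqrt(norm_sq)
    # Scale factor: (|s|^2 / (1 + |s|^2)) applied to the unit vector s / |s|.
    scale = norm_sq / (1.0 + norm_sq) / norm
    return [scale * x for x in s]


def length(v):
    """Euclidean length of v, read here as 'probability the object is present'."""
    return math.sqrt(sum(x * x for x in v))


# Hypothetical raw outputs of a "tree" capsule for two different images.
strong_evidence = [3.0, 4.0]  # many parts found: long vector
weak_evidence = [0.3, 0.4]    # few parts found: short vector, same direction

p_strong = length(squash(strong_evidence))
p_weak = length(squash(weak_evidence))

print(f"strong evidence -> probability {p_strong:.3f}")
print(f"weak evidence   -> probability {p_weak:.3f}")
```

Both inputs point in the same direction, so they describe the same pose; only the confidence differs.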

 

What’s great about this type of technology is how it can be used to solve problems like the Picasso Problem. This refers to the fact that an image can have all the right parts, but not in the correct spatial relationship, as in many Picasso paintings. In the context of our tree, if an image contains unusual-looking trees or spatial inconsistencies, image recognition becomes unreliable.

 

This is because traditional ConvNets use max-pooling. Max-pooling layers sit between convolutional layers. Each one keeps only the most active neuron in every small region of a convolutional layer’s output and passes it on to the next convolutional layer.
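A small sketch can show what max-pooling keeps. The version below assumes non-overlapping 2x2 windows, a common choice but not one the article specifies, and the two activation grids are invented for illustration. Each grid has the same strong activations, placed at different spots inside their windows.

```python
def max_pool_2x2(grid):
    """Keep the largest value in each non-overlapping 2x2 window of grid."""
    pooled = []
    for r in range(0, len(grid) - 1, 2):
        row = []
        for c in range(0, len(grid[0]) - 1, 2):
            window = [
                grid[r][c], grid[r][c + 1],
                grid[r + 1][c], grid[r + 1][c + 1],
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled


# Two toy activation maps: same values, different positions within each window.
feature_a = [
    [9, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 5],
    [0, 0, 0, 0],
]
feature_b = [
    [0, 0, 0, 0],
    [0, 9, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 5, 0],
]

pooled_a = max_pool_2x2(feature_a)
pooled_b = max_pool_2x2(feature_b)

print(pooled_a)
print(pooled_b)
print("identical after pooling:", pooled_a == pooled_b)
```

After pooling, the two maps can no longer be told apart, even though the active neurons sat in different places.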

 

1. A Convolutional Neural Network (ConvNet) is a class of deep neural network most commonly applied to analyzing visual imagery.

 

The problem? The neurons that are less active are dropped. This means spatial information gets lost as data progresses through the network. CNNs help solve this problem because capsules keep the important information about each part of an image, including how it is positioned, instead of discarding it. Because capsules are independent, when multiple capsules agree, the probability of correct detection is much higher.
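The idea of capsules "agreeing" can be illustrated with a toy score. This is not the routing procedure used in real capsule networks; it is a simplified stand-in, and the `agreement` function, the vote format, and all numbers are hypothetical. Each part capsule (leaf, branch, trunk) casts a vote for where the whole tree should be centered. Tightly clustered votes score close to 1, while scattered votes, as in a Picasso-style scramble, score close to 0.

```python
def agreement(votes):
    """Score in (0, 1]: 1.0 when all votes coincide, lower as they spread out."""
    n = len(votes)
    mean_x = sum(v[0] for v in votes) / n
    mean_y = sum(v[1] for v in votes) / n
    # Mean squared distance of the votes from their average position.
    spread = sum((v[0] - mean_x) ** 2 + (v[1] - mean_y) ** 2 for v in votes) / n
    return 1.0 / (1.0 + spread)


# Hypothetical votes: where each part capsule thinks the tree's center is.
coherent_tree = [(5.0, 5.1), (4.9, 5.0), (5.1, 4.9)]   # parts fit together
scrambled_tree = [(1.0, 9.0), (8.0, 2.0), (5.0, 5.0)]  # right parts, wrong layout

score_coherent = agreement(coherent_tree)
score_scrambled = agreement(scrambled_tree)

print(f"coherent tree:  {score_coherent:.3f}")
print(f"scrambled tree: {score_scrambled:.3f}")
```

Both images contain the same parts, so a detector that only counts parts would treat them alike; the agreement score separates them because it depends on where the parts are.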

 

This is the basic principle of CNN technology, and its potential to propel future learning is fascinating. Image identification can help us understand and analyze the vast amounts of data that already exist, as well as the new data that is constantly being created. While it may seem that CNNs are simply identifying images, the fact is that we’re creating technologies that operate more and more like a human brain.

 

As Pliny the Elder wrote, “The brain is the citadel of the senses; this guides the principle of thought.” When it comes to machine learning, we’re building citadels, and every data scientist has the chance to be an architect for the future of machine learning and AI.

 
