How to design deep convolutional neural networks for your AI

While the term artificial intelligence (AI) enjoys universal recognition, most people would be hard-pressed to define what a deep convolutional neural network (DCNN) is. It’s time to change that.

In this blog, we’ll explore what DCNNs are, how they work, and how you can design one of your own for your AI. Let’s begin with a brief recap of the different types of neural networks you might come across and what each of them specializes in.

What are deep convolutional neural networks?

Neural networks have been around for a long time. In fact, the invention of the first one is generally credited to psychologist Frank Rosenblatt, who built the Perceptron in 1958 (building on an artificial neuron model first proposed in 1943) to model how the human brain processes visual data.

Since then, the understanding of machine learning has undergone a revolution. Generally speaking, we can categorize modern neural networks into three distinct types:

  • Feedforward neural networks (FNNs) consist of an input layer, one or more hidden layers, and an output layer. Information flows in one direction, from the input to the output layer, without loops. This kind of network is excellent for simple classification and regression jobs.
  • Recurrent neural networks (RNNs), on the other hand, have connections that loop back on themselves. This gives RNNs a memory that can retain previous inputs in their hidden state. For this reason, they’re particularly well suited to tasks such as time series forecasting and natural language processing.
  • Convolutional neural networks (CNNs) are used for pattern recognition. They achieve this by adaptively learning spatial hierarchies of features extracted from input images across many layers – each of the three types is sketched briefly in code below.
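
To make the distinction concrete, here’s a minimal example of how each type might be declared. It uses the Keras API purely as an illustration, and the layer sizes and input shapes are arbitrary placeholders rather than recommendations.

```python
# Minimal sketches of the three network families using the Keras API.
# All layer sizes and input shapes are illustrative placeholders.
from tensorflow import keras
from tensorflow.keras import layers

# Feedforward network: data flows straight from input to output.
fnn = keras.Sequential([
    layers.Input(shape=(20,)),
    layers.Dense(64, activation="relu"),  # one hidden layer
    layers.Dense(1),                      # e.g. a regression output
])

# Recurrent network: the SimpleRNN layer carries a hidden state across time steps.
rnn = keras.Sequential([
    layers.Input(shape=(None, 8)),        # variable-length sequences of 8 features
    layers.SimpleRNN(32),
    layers.Dense(1),
])

# Convolutional network: filters learn spatial patterns in images.
cnn = keras.Sequential([
    layers.Input(shape=(28, 28, 1)),      # e.g. 28x28 grayscale images
    layers.Conv2D(16, kernel_size=3, activation="relu"),
    layers.MaxPooling2D(pool_size=2),
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),
])
```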

It’s this last type that we’ll be focusing on in this article. When we add the word “deep” to the term “convolutional neural network,” we’re just flagging that we’re talking about deep learning, a machine learning technique used to build artificial intelligence systems. 

When harnessing AI for business purposes, it’s often helpful to dip a toe into the technical waters. We’ll start by looking at what you can use DCNNs for and then ‘under the hood’ at the architecture of a deep convolutional neural network.

Applications of convolutional neural networks

So, what kind of business technology solutions use deep convolutional neural networks? Unsurprisingly, CNNs are terrific for any solution that needs reliable, automatic optical recognition. These include:

Image recognition

Any application that relies on image recognition is a good candidate for the use of a CNN:

  • Security systems: Automated surveillance systems use CNNs to detect and recognize suspicious activities in real-time.
  • Automated checkouts: Some modern retail checkout solutions use CNNs to speed up self-serve checkout times.

E-commerce

  • Visual search: CNNs enable customers to search for products visually rather than typing names into a search bar.
  • Recommendation software: CNNs can suggest potential purchases that are visually similar to a customer’s previous purchases.

Quality control

  • Detecting defects: Automated systems can inspect products for defects using visual data.
  • Automated sorting tasks: Robots can use visual data to complete tasks like sorting or assembling.

How convolutional neural networks work – the architecture

Now that we know what they’re for, let’s look at a typical CNN architecture. It might seem slightly complex at first glance – but don’t worry. We’ll go through it step by step.

What we have here is a representation of the CNN process: the input image is a handwritten figure “2”. Here’s how the CNN transforms that input into the correct output.

Convolutional layer

The first stage is the convolution layer. (The word “convolution” is just the name of the mathematical operation used here – essentially, a particular way of combining elements.) This step aims to detect patterns in the input data, such as spotting edges, textures, and shapes.

The convolution layer uses a set of filters (also known as kernels), each of which is small in spatial terms but extends through the full depth of the input volume.

As the filter slides (or convolves) around the input image, it multiplies its values by the original pixel values in the image. These products are summed, resulting in a single pixel in the output array. After that, this process repeats across the entire image. Multiple filters are used in each convolutional layer, each detecting different features. 

You end up with the first raw representation of the image in pixel form.
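
If you’d like to see that sliding-and-multiplying step in code, here’s a minimal NumPy sketch. Like most deep learning libraries, it doesn’t flip the kernel (so it’s technically cross-correlation), and the 5×5 image and 3×3 vertical-edge filter are just illustrative values – in a real CNN, the filter values are learned during training rather than hand-picked.

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide a kernel over a 2D image and sum the elementwise products
    at each position (no padding, stride of 1)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    output = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw]
            output[i, j] = np.sum(patch * kernel)
    return output

# A toy 5x5 "image" with a vertical edge, and a 3x3 edge-detecting filter.
image = np.array([
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
], dtype=float)
kernel = np.array([
    [1, 0, -1],
    [1, 0, -1],
    [1, 0, -1],
], dtype=float)

print(convolve2d(image, kernel))  # large magnitudes where the edge sits
```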

ReLU activation layer

But wait! We need to eliminate any negative numbers that have cropped up, because they’re not particularly useful in this context. The Rectified Linear Unit (ReLU) activation layer accomplishes this for us by replacing any negative values with zero using the function:

f(x) = max(0, x)
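
In code, that’s a one-liner. Here’s a tiny NumPy illustration applied to a made-up feature map:

```python
import numpy as np

# ReLU replaces every negative value with zero and leaves positives untouched.
feature_map = np.array([[-3.0, 1.5],
                        [ 0.0, -0.5]])
relu = np.maximum(0, feature_map)
print(relu)  # [[0.  1.5]
             #  [0.  0. ]]
```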

Now, we need to simplify the image data, which the CNN does using a pooling layer.

Pooling layer

With many complex tech systems, such as container networking, simplifying the design is a primary goal because it can improve overall system performance. This is exactly the thinking behind the use of pooling layers in CNNs.

These layers reduce the spatial dimensions of the output handed on to the following layer, which cuts down on the number of parameters and computations needed overall. In turn, this makes the network more robust – and faster.

There are two different approaches commonly used here:

  • Max pooling: for each 2×2 array, the maximum value is retained and becomes the sole output
  • Average pooling: for each 2×2 array, the average of all four values is calculated and becomes the sole output

Opinions vary as to which is better; in practice, it depends on the data you’re dealing with. Think of it as reducing the resolution of an image by squashing each 2×2 group of pixels into a single value – because, in essence, that’s exactly what’s happening.
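
Here’s a small NumPy sketch of both approaches, collapsing each non-overlapping 2×2 block of a toy feature map into a single value (the numbers are purely illustrative):

```python
import numpy as np

def pool2x2(feature_map, mode="max"):
    """Downsample a feature map by collapsing each non-overlapping
    2x2 block into a single value (its max or its average)."""
    h, w = feature_map.shape
    blocks = feature_map[:h // 2 * 2, :w // 2 * 2].reshape(h // 2, 2, w // 2, 2)
    if mode == "max":
        return blocks.max(axis=(1, 3))
    return blocks.mean(axis=(1, 3))

feature_map = np.array([
    [1, 3, 2, 0],
    [4, 6, 1, 2],
    [0, 2, 5, 7],
    [1, 1, 3, 4],
], dtype=float)

print(pool2x2(feature_map, "max"))      # [[6. 2.]
                                        #  [2. 7.]]
print(pool2x2(feature_map, "average"))  # [[3.5  1.25]
                                        #  [1.   4.75]]
```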

Fully connected layer

After the series of convolutional and pooling layers have done their bit, CNNs typically finish with one or more fully connected layers, where every neuron is connected to every neuron in the previous layer. The purpose of this last stage is to take the features generated by the earlier layers and use them to categorize the image.

The connections between the neurons have what are known as weights. These weights are used to determine the importance of different features extracted by the convolutional layers. They’re adjusted during training to minimize prediction errors.

The fully connected layer uses the softmax function to create a probability distribution over the classes the image is likely to belong to. At the end of that, you have your final output.
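
To see how the weights and the softmax function fit together, here’s a stripped-down NumPy sketch. The flattened feature vector, the random weights, and the ten “digit” classes are all stand-ins – in a real network, the weights would be learned during training, not generated randomly.

```python
import numpy as np

def softmax(logits):
    """Turn raw class scores into a probability distribution."""
    exps = np.exp(logits - np.max(logits))  # subtract the max for numerical stability
    return exps / exps.sum()

# Toy flattened features from the pooling stage, one weight column per class,
# and one bias per class -- all illustrative values.
features = np.array([0.2, 1.1, 0.0, 0.7])
weights = np.random.default_rng(0).normal(size=(4, 10))  # 4 features -> 10 classes
biases = np.zeros(10)

logits = features @ weights + biases
probs = softmax(logits)
print(probs.argmax(), probs.max())  # the predicted class and its probability
```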

How to design your own deep convolutional neural network

Let’s get to grips with how to design your own deep convolutional neural network. First things first: what are you aiming to use it for?

For basic tasks

If you want your CNN to carry out simple tasks – like recognizing the number “2” in the earlier example – you won’t need an overly complex build. Start with the basics and work your way up. To begin with, have just one hidden layer with around ten kernels and one max pooling layer.

That’s plenty to get you off the ground. Remember that you can always add more layers as you go.
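
As a concrete starting point, here’s what that minimal build might look like in Keras, assuming 28×28 grayscale digit images (as in the classic MNIST dataset) and ten output classes:

```python
from tensorflow import keras
from tensorflow.keras import layers

# A deliberately small starting point: one convolutional layer with ten
# kernels, one max pooling layer, then a softmax classifier.
model = keras.Sequential([
    layers.Input(shape=(28, 28, 1)),               # 28x28 grayscale digits (assumed)
    layers.Conv2D(10, kernel_size=3, activation="relu"),
    layers.MaxPooling2D(pool_size=2),
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),        # one output per digit class
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
# model.fit(x_train, y_train, epochs=5, validation_split=0.1)  # once you have data
```

Adding more capacity later is simply a matter of inserting extra convolutional and pooling layers before the Flatten step.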

For more complex jobs

Clever leaders know the value of tried and trusted solutions. So, if you want to use a CNN for a complex purpose, do yourself a favor and build on pre-trained networks.

Luckily, there is something called transfer learning – using a network previously trained on other data. Yes, it’s possible. You’re basically using the existing network as a template, swapping in the layers you want, and adding your dataset. It speeds things up considerably.
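
As a rough sketch of what that looks like in practice, the example below uses Keras’s bundled MobileNetV2 weights as the pre-trained base – any comparable pre-trained network would do, and the five output classes are just a placeholder for however many categories your dataset has.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Load a network pre-trained on ImageNet, minus its original classifier head.
base = keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
base.trainable = False  # freeze the pre-trained layers

# Swap in a new head for your own classes (5 here, purely as an example).
model = keras.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(5, activation="softmax"),
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(your_dataset, epochs=3)  # train only the new head on your own data
```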

Using GitOps principles to develop a CNN for your AI

Using a GitOps model can be helpful when you’re developing your CNN. Here are a few tips for how to approach the design and training aspects:

Version control for model definitions

First, store the architecture of your CNN, its hyperparameters, and any other relevant configuration in a Git repository. Make any changes to the design or structure of your model through a Git commit.
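
One simple pattern is to keep the hyperparameters in a small configuration file that lives in the repository alongside the training script, so every design change shows up as a commit. The file name and keys below (a hypothetical params.yaml) are purely illustrative:

```python
# Hypothetical example: hyperparameters live in a Git-tracked params.yaml
# rather than being hard-coded in the training script.
import yaml  # PyYAML

with open("params.yaml") as f:  # e.g. learning_rate, conv_filters, epochs
    params = yaml.safe_load(f)

print(params["learning_rate"], params["conv_filters"], params["epochs"])
# The training script builds the model from these values, so checking out
# any commit reproduces exactly the model that was defined at that point.
```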

Automated training pipelines

Set up Continuous Integration (CI) pipelines that trigger model training whenever a new commit lands in the repository. You’ll also need Continuous Deployment (CD) to automatically deploy trained models to a staging or production environment.

Monitoring the training procedure

Monitor the training process as it runs. Once the model is deployed, monitor its performance in real time. If the model’s performance degrades, or you need to retrain your network with new data, trigger that work through a new Git commit.

Reproducibility

The great thing about storing everything in Git is that every state of your model is versioned, from the basic architecture to any trained weights. This means experiments are reproducible – and you can roll back to previous states if you have to.

Infrastructure as code

If you’re using cloud resources like Magento cloud hosting or specific hardware for training and deployment, you can also version the infrastructure setup using tools like Terraform or Ansible. This will mean that the infrastructure setup remains consistent and reproducible.

Getting started with deep convolutional neural networks

As with anything else in tech, the best way to learn is by doing. Luckily, there’s a lot of information to help guide you and plenty of options for simplifying the process, such as using a transfer learning approach.

Deep convolutional neural networks are turning out to be so incredibly useful for optical recognition applications that you’ll likely discover all kinds of handy uses for them in your work. So it’s worth spending a little time and effort to begin getting to grips with developing one for your AI.

It’s time to take back control of the job you love by embracing the power of automation.