Deep Learning: The Fastest-Growing Technology in the World

Rajtilak Pal
14 min read · Jul 21, 2020

Have you ever wondered how cars drive themselves? Or how Siri and Cortana sometimes respond to your commands sarcastically? All of these benchmark applications have a core algorithm embedded in their systems, trained on human behaviour. Those familiar with the term “Artificial Intelligence” will say that the algorithms driving these applications are “Machine Learning” algorithms, and in particular “Deep Learning” algorithms. Whenever I hear the term Deep Learning, the image of a neural network pops up in my mind. But have you ever wondered how these clever networks learn? How they are built? Or how any of it makes sense to us?

In this blog, I am going to give you a little tour of the world of Neural Networks: how they work and what they are made of. So hold on till the end to get the most out of this journey.

Table of Contents:

· Perceptron and its resemblance with Human Neurons

· A Peek inside Perceptrons

· The major fault with Perceptrons (the First AI Winter)

· Backpropagation: Algorithm that drives the learning of a multi-layer neural network architecture

· Sigmoid neurons

· How do multi-layered architectures work?

· Types of Neural Network Architectures

· Why have Artificial Neural Networks boomed in recent years?

· Recent advancements

Perceptron and its resemblance with Human Neurons

Human beings are the most intelligent creatures in the Universe (as far as I know). So, instead of creating learning algorithms from scratch, it would be nice to know what algorithm works inside the human brain to give us so much learning power. If we can figure that out, we can mimic it to create a program which can learn, walk, talk, reproduce by itself and be aware of its own existence. Those were some heavy words, but this is what humans have been trying to do since the last century.

Our nervous system is made up of cells known as neurons. These neurons are the core building blocks of the learning algorithm that works in our brain all the time. So, people have been trying hard to see if we can mimic a neuron and build a computer program which can learn on its own. These mimicked neurons are called Artificial Neurons.

One of the earliest artificial neurons, called the Perceptron, was invented by Frank Rosenblatt in 1958. Before diving into the perceptron, let’s see what a human neuron looks like.

A Peek inside Perceptron

Human Neuron

This is what a human neuron looks like. I am not a biologist, so I cannot explain every part of it, but I do know roughly how it works. The cell has a cell body, dendrites and axon terminals. The dendrites are the little connections that help the neuron connect to and receive information from other neurons; a single neuron can have thousands of them. According to the information received from neighbouring neurons, an electrochemical reaction takes place in the cell body and the neuron goes either to its active state or its inactive state, thereby firing neurotransmitters from the axon terminals (please forgive me if I have written something wrong in this explanation).

Computer scientists have tried hard to mimic this neuron, and Frank Rosenblatt came up with the structure of the Perceptron, which exhibits similar behaviour.

This perceptron is just a simple mathematical function which takes inputs x1, x2 and x3 and gives back an output y which is either 0 or 1, representing whether the neuron is active or not (similar to the human one). But we also see w1, w2 and w3 in the picture. What are these? They are the weights of the perceptron. But wait, weights? What do we need them for?

Well, the connections between human neurons change in strength: a dendrite’s connection can become strong (indicating a strong link with its neighbour) or weak (indicating a weak one). To mimic this behaviour, Rosenblatt came up with the idea of weights.

Many people get scared and lose interest when they see maths in the story but, trust me, the calculation is very simple.

We discussed that a perceptron is just a mathematical function whose inputs are x1, x2 and x3 (in this case), whose output is either 0 or 1, and which has some weights.

The inputs are multiplied with their corresponding weights and are added up. Now, if this addition results in a sum greater than a Threshold value, the output is 1 otherwise 0.

The function is mathematically written as: output = 1 if w1·x1 + w2·x2 + w3·x3 > Threshold, and 0 otherwise.

But the way a perceptron learned was not a good one. We had to manually tweak the weights and the Threshold in order to get the right outputs for the right inputs. Of course we could add more inputs and weights to the perceptron (say there are 100 inputs), but in doing so we are troubling ourselves, because tweaking 100 weights by hand becomes very tedious very quickly.
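To make the calculation concrete, here is a minimal sketch of a perceptron in Python. The inputs, weights and threshold are made-up placeholder values, exactly the kind we would have had to tweak by hand:

```python
def perceptron(x, w, threshold):
    """Fire (output 1) if the weighted sum of inputs exceeds the threshold."""
    weighted_sum = sum(wi * xi for wi, xi in zip(w, x))
    return 1 if weighted_sum > threshold else 0

# Hypothetical hand-picked values for a 3-input perceptron
x = [1, 0, 1]          # inputs x1, x2, x3
w = [0.5, 0.2, 0.4]    # weights w1, w2, w3
print(perceptron(x, w, threshold=0.7))  # weighted sum = 0.9 > 0.7, so output is 1
```

Change the threshold to 1.0 and the same inputs produce 0 — which is exactly the kind of manual fiddling that made perceptrons painful to train.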

The fault in Perceptron

The major fault in the perceptron was picked up by Marvin Minsky in 1969. He showed that the perceptron was very good at picking up linear patterns but could not approximate the XOR function, which means a single perceptron cannot capture non-linear relationships.

This was a heart-breaking moment for Artificial Intelligence and was called the “First AI winter”. The interest of researchers in the field of AI went down and AI went to sleep for 10 years.

Of course, if multiple perceptrons could be stacked together in a layered architecture, they could capture the non-linearity. In fact, given enough neurons, a multi-layer perceptron architecture can approximate essentially any function. This is called “The Universal Approximation Theorem”. But multiple layers would mean hundreds of weights, and tweaking all of them by hand to get better results was a pain in the a**. This is why the interest of researchers in the field of AI went down.
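To see the stacking idea in action, here is a tiny sketch of the classic construction in Python: XOR built from three hand-wired perceptrons in two layers (OR and NAND feeding an AND). The weights and thresholds below are hand-picked for illustration:

```python
def step(x, w, threshold):
    # A single perceptron: weighted sum compared against a threshold
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) > threshold else 0

def xor(a, b):
    # Layer 1: two perceptrons compute OR and NAND of the inputs
    or_out = step((a, b), (1, 1), 0.5)
    nand_out = step((a, b), (-1, -1), -1.5)
    # Layer 2: an AND perceptron combines them, yielding XOR
    return step((or_out, nand_out), (1, 1), 1.5)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", xor(a, b))  # 0 for (0,0) and (1,1), 1 otherwise
```

No single perceptron can compute this table, but two layers can. The catch, as noted above, is that someone still had to pick all those weights by hand.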

Backpropagation: Algorithm that drives the learning of a NNet Architecture

In 1986, Geoffrey Hinton, together with David Rumelhart and Ronald Williams, published the paper “Learning internal representations by error propagation”, which was a landmark in the field of AI and marked the end of the “First AI Winter”. The paper introduced the now-famous algorithm called Backpropagation.

Through this algorithm, neural networks could be trained in a better way, instead of by tweaking hundreds of weights randomly and manually. We now had a systematic approach to the training process. This algorithm raised the interest of researchers in the field of AI, and people started working on it again.
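As an illustration (not the paper’s original formulation), here is a minimal NumPy sketch of backpropagation: a tiny two-layer sigmoid network trained on XOR, with the error propagated backwards through the chain rule. The layer sizes, learning rate and iteration count are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)  # XOR inputs
y = np.array([[0], [1], [1], [0]], dtype=float)              # XOR targets

# A 2-8-1 network with random starting weights
W1, b1 = rng.normal(size=(2, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)

lr = 1.0
for _ in range(10000):
    # Forward pass: compute the network's current predictions
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Backward pass: propagate the output error towards the input layer
    d_out = (out - y) * out * (1 - out)    # gradient at the output layer
    d_h = (d_out @ W2.T) * h * (1 - h)     # gradient at the hidden layer (chain rule)
    # Gradient-descent updates: nudge every weight against its gradient
    W2 -= lr * h.T @ d_out;  b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;    b1 -= lr * d_h.sum(axis=0)

print(np.round(out.ravel(), 2))  # predictions should end up close to [0, 1, 1, 0]
```

The point is the systematic part: every weight gets its own gradient automatically, so nobody has to tweak anything by hand.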

Sigmoid Neurons

Nowadays, the perceptron is considered obsolete, because it has been replaced by a more advanced type of neuron: the Sigmoid Neuron.

Recall that a perceptron activated only when the weighted sum of its inputs passed a certain threshold. The case is different with sigmoid neurons: there is no hard threshold at all. Why?

The weighted sum of inputs is passed to an activation function called the sigmoid “squishification” function, sigma(z) = 1 / (1 + e^(-z)), which squeezes the entire real number line into the interval between 0 and 1. Take a look at the diagram below for a better visual representation.

Courtesy: 3Blue1Brown

This means if you passed any real number in the sigmoid function, it would give you an answer between 0 and 1.

Courtesy: 3Blue1Brown

This was helpful because each sigmoid neuron could give you a number like 0.7, which can represent a probability, instead of just saying 1 (YES) or 0 (NO) as in the case of perceptrons.

Other Activation functions

Of course, sigmoid was not the only activation function. Other functions like tanh() and ReLU also caught on.

By the way, ReLU is now the most widely used activation function in the world, because it does not saturate the way sigmoid and tanh neurons do (an advanced topic you will meet when you dive deeper into activation functions).
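A quick NumPy comparison of the three activations mentioned above. Note how sigmoid and tanh flatten out (saturate) for large inputs while ReLU does not:

```python
import numpy as np

def sigmoid(z): return 1.0 / (1.0 + np.exp(-z))
def tanh(z):    return np.tanh(z)
def relu(z):    return np.maximum(0.0, z)

z = np.array([-10.0, -1.0, 0.0, 1.0, 10.0])
print(np.round(sigmoid(z), 4))  # squashed into (0, 1); nearly flat at both extremes
print(np.round(tanh(z), 4))     # squashed into (-1, 1); also saturates
print(relu(z))                  # [ 0.  0.  0.  1. 10.]: no upper bound for z > 0
```

Saturation matters because a flat region means a near-zero gradient, which slows backpropagation down; ReLU’s constant slope for positive inputs avoids that.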

So, multi-layered architectures built from sigmoid neurons, with their learning guided by Backpropagation, help neural networks learn very sophisticated things which normal Machine Learning algorithms cannot capture.

How do Multi-Layered Architectures work?

In classical Machine Learning, the data scientist has to find the relationship between the target column and each feature by themselves. For example, in the “Titanic” dataset, we had to work through all the columns and ultimately found that the “Embarked” column was not helping us at all in predicting whether a passenger would survive. But with a Deep Learning model, we do not have to check anything. We just push everything to the neural network, and the network, by itself, finds the relationship between features and target. But how does it do it?

Well, neural networks are among the deepest mysteries we have built. They are black boxes which cannot be understood precisely; nobody fully knows what goes on inside a trained network. We can identify the input features that cause a group of neurons to fire, but we can rarely say what pattern that group of neurons is really capturing.

But from the intuition that AI researchers have developed over the years, we can say that the hidden layers of the network are responsible for guiding the network to pick up different patterns in the dataset.

For example, if the above network were to identify a handwritten digit, we could think of the first hidden layer as identifying edges and similar low-level patterns.

The second layer could be thought to combine these lowest-level features into higher-level features, like identifying loops and curves.

The third hidden layer could be thought to piece together the loops and patterns to get a representation of the number itself, and the final output layer would give us the number associated with the handwritten digit.

But these are just intuitions, and we really don’t know whether they are true. They simply give us something to grasp onto, which helps us understand these networks.

Types of Neural Network Architectures:

Dense Network

Till now we have been looking at multi-layered architectures where every neuron of a layer is connected to every neuron of the previous and the next layer. These are therefore called Dense Networks. The drawback of these networks is that they are computationally very expensive to train, which is the reason people had to come up with newer architectures. There is also another kind, called a Sparse Network, where only a limited number of connections are allowed between layers.

Convolutional Neural Networks (CNNs)

If we want to do image recognition, we have to pass every pixel value of the image to our network. If we were using a Dense network, we would need 2,073,600 input neurons for an image of resolution 1920x1080. This is huge, and training a Dense network like this is computationally very expensive. What is the solution then?

Researchers analysed the visual cortex of cats and came up with a new architecture called the Convolutional Neural Network. Convolutional layers are very helpful in identifying patterns in an image and are widely used in image recognition. They are much cheaper to train than dense layers, because each small filter is shared across the whole image, and they reduce the spatial size of the representation, which can then be passed to Dense layers for further identification.
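The saving is easy to quantify. The arithmetic below (plain Python, comparing a hypothetical 100-neuron dense layer against 100 small 3x3 convolutional filters) shows why weight sharing makes convolutional layers so much cheaper:

```python
h, w = 1080, 1920
inputs = h * w                       # 2,073,600 pixels for one greyscale image

# Dense layer with 100 neurons: every pixel connects to every neuron
dense_params = inputs * 100 + 100    # weights + biases

# Convolutional layer with 100 filters of size 3x3: each small filter
# slides over the image, so its 9 weights are shared across all positions
conv_params = 3 * 3 * 100 + 100      # weights + biases

print(f"dense: {dense_params:,}")    # dense: 207,360,100
print(f"conv:  {conv_params:,}")     # conv:  1,000
```

Over 200 million parameters versus a thousand — that is the gap between an untrainable model and a practical one.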

Recurrent Neural Networks (RNNs)

Till now we have been using feed-forward networks, where information flows only from the input layer to the output layer. But RNNs work a little differently: they are networks in which information can be passed back, like a loop.

How is this useful? Such networks can retain the context of a piece of speech or text. Take this example:

“I live in France. I speak fluent _______”

If our network were to predict the last word in the sentence, it would need to retain the context that the person lives in France; since the last word should be a language the person speaks, it is most likely “French”.

Vanilla RNNs are kind of old school these days, and newer variants like LSTMs and GRUs are more frequently used.
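The loop can be sketched in a few lines of NumPy. All the sizes and random weights below are toy placeholders; the point is only that the hidden state h is fed back in at every step, carrying a summary of the words seen so far:

```python
import numpy as np

rng = np.random.default_rng(0)
hidden_size, vocab_size = 16, 8                               # toy dimensions
Wxh = rng.normal(scale=0.1, size=(vocab_size, hidden_size))   # input -> hidden
Whh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))  # hidden -> hidden (the loop)

def rnn_step(x, h):
    # The new state depends on the current input AND the previous state
    return np.tanh(x @ Wxh + h @ Whh)

h = np.zeros(hidden_size)            # empty context before the sentence starts
sentence = [3, 1, 4, 1, 5]           # dummy word ids standing in for real words
for word_id in sentence:
    x = np.eye(vocab_size)[word_id]  # one-hot encode the current word
    h = rnn_step(x, h)               # h now summarizes everything read so far

print(h.shape)  # (16,) — a fixed-size summary, however long the sentence was
```

In the “I live in France…” example, it is this carried-over state that would let the network remember “France” by the time it has to predict the final word.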

These context retaining networks are used widely in Speech Recognition, Text Generation and many more. And these networks are responsible for those sarcastic answers that Siri / Google Home gives you.

Auto-Encoders and Decoders

Auto-encoders and decoders are another type of neural network which is not so famous but is very useful. These networks are used for noise reduction, image compression, dimensionality reduction and much more.

For example, such a network can be used for watermark removal, noise reduction, colouring black-and-white movies, and so on.
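The core idea is a narrow “bottleneck”. The untrained NumPy sketch below (all sizes and weights are arbitrary placeholders) shows the shape of the computation: an encoder compresses the input, and a decoder tries to rebuild it:

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

input_dim, code_dim = 64, 8            # the bottleneck is much smaller than the input
W_enc = rng.normal(scale=0.1, size=(input_dim, code_dim))
W_dec = rng.normal(scale=0.1, size=(code_dim, input_dim))

def encode(x):
    return sigmoid(x @ W_enc)          # compress: 64 numbers -> 8 numbers

def decode(code):
    return sigmoid(code @ W_dec)       # reconstruct: 8 numbers -> 64 numbers

x = rng.random(input_dim)              # a dummy 8x8 "image" flattened to 64 pixels
code = encode(x)
x_hat = decode(code)
print(code.shape, x_hat.shape)         # (8,) (64,)
```

Training would push decode(encode(x)) to match x, forcing the 8-number code to keep only the essential information — which is exactly what makes these networks useful for compression and noise reduction.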

Generative Adversarial Networks (GANs)

GANs are beautiful neural networks used for unsupervised learning, in particular for generating realistic data. For example, take a look at the pictures below:

You might be surprised to know that none of these people exists in the real world. These images were all generated by GANs. Such detail, right? How does it work?

I am not going to go deep into the working of the network but I can give you a little overview.

There are two parts to this network:

1. Generator, and 2. Discriminator

The responsibility of the Generator is to generate images, and the responsibility of the Discriminator is to identify whether a given image is real or fake. Once the Generator is able to fool the Discriminator with its fake images, it is ready to generate images that look completely real.
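Here is a deliberately tiny 1-D sketch of that two-player game in NumPy. Everything in it — the “real” data distribution, the one-parameter generator, the logistic discriminator, the learning rate — is a toy stand-in, not a real GAN implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# "Real" data: numbers drawn around 4.0. The generator starts near 0
# and must learn to produce numbers that look like they came from here.
def real_sample():
    return rng.normal(loc=4.0, scale=0.5)

theta = 0.0          # generator parameter: G(z) = theta + z
w, b = 1.0, 0.0      # discriminator parameters: D(x) = sigmoid(w*x + b)
lr = 0.02

for _ in range(2000):
    x_real = real_sample()
    x_fake = theta + rng.normal()          # generator's attempt at a "real" number
    # Discriminator step: push D(real) towards 1 and D(fake) towards 0
    w += lr * ((1 - sigmoid(w * x_real + b)) * x_real - sigmoid(w * x_fake + b) * x_fake)
    b += lr * ((1 - sigmoid(w * x_real + b)) - sigmoid(w * x_fake + b))
    # Generator step: shift theta so the discriminator scores fakes higher
    x_fake = theta + rng.normal()
    theta += lr * (1 - sigmoid(w * x_fake + b)) * w

print(f"theta = {theta:.2f}")  # should have drifted towards the real data's mean near 4
```

The two updates alternate forever: the discriminator sharpens its real-vs-fake boundary, and the generator chases it — the same tug-of-war that, at image scale, produces those photorealistic faces.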

Why have Artificial Neural Networks boomed in recent years?

Neural networks are old technology, in the sense that most of the architectures were researched in the 20th century. So what is the reason they have boomed so much over the last decade? Well, there are many reasons, but I will give you two of them.

1. Availability of labelled datasets: Neural networks are hungry for data. We need large datasets to train these networks to reach human-level performance. But in the 20th century, it was very hard to find labelled data, so researchers had to construct datasets on their own, which was very difficult.

With the advent of the Internet, availability increased, and giants like Google and Facebook now produce enough labelled data to work with.

2. Hardware speeds: Computers were simply not powerful enough in the 20th century to train complex neural networks. But with the advent of accelerated hardware, training times are now a tiny fraction of what they used to be.

In fact, the competition which pushed Deep Learning to its current level was the ImageNet competition of 2012. The winning team trained their neural network on 2 GPUs for several days to achieve roughly a 16% top-5 error rate, which was a landmark in the field of neural networks. Guess who was on the winning team! Geoffrey Hinton, alongside his students Alex Krizhevsky and Ilya Sutskever. Sound familiar? This is the same person who brought backpropagation into the light in 1986; without him, Deep Learning would not be at this height today.

Recent advancements

I have talked enough about the old technologies. Let’s see some recent advancements in this field.

Self-Driving Cars

Google started working on self-driving cars back in 2009, and other companies like Tesla later joined in to make their own. Today, almost every major car company is working on this technology.

Realtime Object Detection

YOLO (You Only Look Once) is another neural network which performs real-time object detection in an image. It can identify cars, people, and dozens of other classes of objects.

Speech Recognition

I have mentioned Google Home, Siri and Cortana many times in this blog. The core speech-recognition algorithms behind them are another benchmark in this field. These recognition systems are getting better and better nowadays: they can identify your voice despite noise in the background, which is so cool.

Video to Video Synthesis

This technology is being used by NVIDIA to create simulation environments for games which look completely real. I am very excited to see this technology at work!

Reinforcement Learning

Though I have not talked about Reinforcement Learning at all, it is very important to know the amazing advancements in this field too.

In 2016, Google’s DeepMind created an AI called AlphaGo, which could play the ancient game of Go (popular in China, Korea and Japan). The amazing thing about AlphaGo is that it defeated World Champion Lee Sedol of Korea, which is considered a benchmark in the field of AI.

So, what is the future?

Well, we don’t really know yet. Some people think that “AI will overtake humans”. Seeing the performance of AlphaGo, one might say that this seems somewhat true, but the reality is that superintelligence is likely decades away.

But the way Deep Learning is evolving, it seems we might see this technology embedded in every corner of the world in a couple of years. You never know: maybe ten years from now you will enter your house and your AI companion (like Jarvis) will give you a little present because of the care you have shown it.

Thank you for staying along this far. I hope I was able to give you a good tour of the world of Deep Learning and Neural Networks, and that it motivates you to join this field with me.

References:

· Powered by the Campus{X} Mentorship Programme (Program Head: Nitish Singh)

· “Neural Networks and Deep Learning”, the book by Michael Nielsen

· YouTube playlist by 3Blue1Brown on Neural Networks


Rajtilak Pal

Android App, Machine Learning and Deep Learning Integration Enthusiast