  • Riya Manchanda

How Do Artificial Intelligence Neural Networks Work? Reading Between the Lines



Artificial Intelligence is undoubtedly the need of the hour and has gained unprecedented popularity in just a matter of a few years. When one thinks of Artificial Intelligence, Machine Learning immediately comes to mind. In this post, we will be discussing one of the most potent branches of Machine Learning: Neural Networks, and going a little in-depth into how they actually work.


If you wish to learn more about the other types of Machine Learning, I suggest you read up on a previous post of mine dedicated entirely to Artificial Intelligence.


Table of Contents:

  1. What are Neural Networks?

  2. A Brief History of Neural Networks

  3. How Do Neural Networks Work?

  4. Types of Neural Networks

  5. An Example of A Neural Network

  6. Applications of Neural Networks



What are Neural Networks?


Neural Networks are artificial intelligence systems meant to replicate the computation and functioning ability of an actual human brain. The basic building blocks of such a system are artificial neurons, which seek to mimic the biological brain cells inside us mammals.


The primary motive of designing such systems is to artificially achieve the problem-solving, decision-making, and pattern-recognition capability of an average human being. Considered to be one of the most powerful forms of Artificial Intelligence, this branch of Machine Learning proves to be of great advantage when one attempts to model and interpret non-linear relationships (as most practical real-life situations are). Additionally, NNs are known for being adaptive, self-organising, and capable of filling in gaps in their knowledge.


Deriving from the broader field of Machine Learning, Neural Networks rely substantially on algorithmic computation, and on repeated training with multiple datasets to minimize error and improve accuracy. Unlike many other learning systems, Neural Networks are often assumed to have no fixed ceiling on how far they can improve.


Similar to the human brain, Neural Networks follow a complex architectural pattern which allows them to find utility in areas and fields where other computational systems fail. Each neuron in this framework is known as a node (which is essentially a small processor). At a very baseline stage, a typical framework consists of three distinct layers, each comprising its own set of nodes or artificial neurons:


  • Input Layer

  • Hidden Layer(s)

  • Output Layer


Each of these layers has its own distinct function and purpose, which we will look into later in this post. All of the layers of a Neural Network are required to work synchronously in order to obtain optimum speed, recognise patterns in raw data, and classify it. It is therefore crucial that these systems are designed with the utmost care.


We will dive deeper into the various Neural Network classifications later in this post, to develop a better understanding of each of their functions. But first, a brief look at where it all began.



A Brief History of Neural Networks


Neural Networks find their origin around 1943, and began as an idea revolving around accurately interpreting the functioning of the human brain. Understanding cognitive thinking processes was an extremely important pursuit in the 1940s. The starting point for the idea behind artificial networks is considered to be when two academic researchers published a paper about the processing performed by biological neurons.


Neurophysiologist Warren McCulloch and mathematician Walter Pitts simulated the functioning of biological neurons using an electrical circuit. Such an artificial model of intelligence was termed 'connectionism'.


Taking this idea further, Donald Hebb wrote a book (The Organization of Behavior, 1949) suggesting that neural networks harbour the ability to learn and improve over time, by strengthening connections in response to the stimulus they are exposed to. This was considered the starting point for this fascinating branch of Machine Learning.


Not more than 10 years later, around 1954, researchers at the Massachusetts Institute of Technology succeeded in developing one of the first computational systems able to mimic aspects of human learning. Following that, in 1958, Cornell psychologist Frank Rosenblatt came up with the proposition for the first perceptron, the Mark I Perceptron, which modelled a simplified version of a neuron's decision process and made use of the now widely-utilised activation-threshold concept.


And that is just a brief picture of how it all began!



How do Neural Networks Work?


As mentioned before, Neural Networks, which are the basis of Deep Learning, utilise algorithms and pre-existing data to train, learn, and improve. Although their main purpose is to model non-linear relations, the computation inside each node starts from the familiar Linear Regression concept: a weighted, linear combination of inputs. Linear Regression models are basically functions which take an input, as the independent variable, and attempt to predict the dependent variable for that particular input, based on a line of best fit.
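As a minimal illustration of the line-of-best-fit idea, here is a least-squares fit using only the Python standard library (the data points are made up for the example):

```python
def fit_line(xs, ys):
    """Return slope m and intercept b of the least-squares line y = m*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    m = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    b = mean_y - m * mean_x
    return m, b

# Points lying exactly on y = 2x + 1 recover slope 2 and intercept 1.
m, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
```

A node in a network does something closely related: it computes a weighted, linear combination of its inputs, and training adjusts those weights to fit the data.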


Just like any complex framework, the heart of a Neural Network's algorithm lies within its individual nodes. Each of these nodes is a mathematical function (you guessed it, closely related to a Linear Regression model), which works by assigning a weight value to each of its inputs and determining the output value passed to the next layer.


Before we jump into the technicalities, let us understand the aforementioned components of a Neural Network, which together form a perceptron (a perceptron is a system which takes in a binary input, and returns a binary output):

  • Input Layer: The layer that receives data input (the data on which to perform the computation).

  • Output Layer: This is the layer that determines the final output and returns it.

  • Hidden Layer(s): A hidden layer is any layer in the network which performs computation, besides the input and output layers.





Now let us look at how this system works step-by-step. First, the data is fed into the input layer (one value per node). Depending on the purpose of the network, each input node is connected to the nodes of the first hidden layer through something called a 'channel', which carries a certain weight value (these weights help identify the importance of each variable). Each of the nodes in the hidden layer also contains a value, called the 'bias', which acts as an extra parameter in the computation, shifting values to the left or right and distinguishing the nodes.


The data from the input nodes travels through the channels into the hidden layer. The value of a node in the hidden layer is then the sum of all incoming data values, each multiplied by the weight of its channel, plus the bias of the node.
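That weighted-sum-plus-bias computation can be sketched in a few lines of Python (the weights and bias here are arbitrary illustrative numbers, not from any real network):

```python
def node_value(inputs, weights, bias):
    """Each input times its channel's weight, summed, plus the node's bias."""
    return sum(x * w for x, w in zip(inputs, weights)) + bias

# Two inputs arriving over two weighted channels into one hidden node:
# 1.0 * 0.4 + 0.5 * (-0.2) + 0.1 = 0.4
value = node_value([1.0, 0.5], [0.4, -0.2], 0.1)
```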


Next, the data is passed through a threshold function before moving on to the next layer, to decide which data actually needs to be transmitted (that is, to separate important from unimportant information). The function computes a value and compares it to a threshold, and the neuron is only 'activated', meaning it passes data on, if the value exceeds the threshold. This is why the function is also known as an 'activation' function: it decides which neurons transmit data to the next layer.
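A minimal sketch of such a threshold (or 'step') activation, assuming a simple numeric threshold:

```python
def step_activation(value, threshold=0.0):
    """The neuron 'fires' (returns 1) only when its value exceeds the
    threshold; otherwise it stays silent (returns 0)."""
    return 1 if value > threshold else 0

fired = step_activation(0.7)    # above the threshold: the neuron passes data
silent = step_activation(-0.3)  # below the threshold: the neuron does not
```

In practice, smooth functions such as the sigmoid or tanh are often used instead of a hard step, but the activate-or-not idea is the same.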


The same process repeats for all of the hidden layers, until the data reaches the output layer. The numerical values in the output layer decide which qualitative output is favoured the most by the data, thus leading to a definite decision. This process, where data flows forward through the network to reach a conclusion, is known as forward propagation.
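Putting the pieces together, forward propagation through a tiny network might look like this sketch (the weights and biases are invented for illustration, and a sigmoid stands in for the activation function):

```python
import math

def sigmoid(x):
    """A smooth activation squashing any value into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, layers):
    """layers: a (weight_matrix, biases) pair for each layer past the input.
    Each layer's values are weighted sums of the previous layer's values,
    plus a bias, passed through the activation function."""
    values = inputs
    for weights, biases in layers:
        values = [sigmoid(sum(x * w for x, w in zip(values, row)) + b)
                  for row, b in zip(weights, biases)]
    return values

# A hypothetical 2-input, 2-hidden-node, 1-output network with made-up weights.
hidden = ([[0.5, -0.6], [0.3, 0.8]], [0.1, -0.1])
output = ([[1.0, -1.0]], [0.0])
result = forward([1.0, 0.0], [hidden, output])
```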


Moving on, let us take a look at the training process for such a Neural Network. Initially the network is fed an input along with a pre-determined output chosen by the engineer, so that the output from the Neural Network can be compared to the actual desired output, and adjustments can be made to the channels (the weight values) between the layers accordingly. The error values are determined and used to decide whether each weight should be shifted higher or lower, and by how much. This error information travels from the output layer back towards the input layer, and the method is thus known as the Back Propagation Algorithm.


In a similar manner, many such input-output pairs are fed to the Neural Network, until the weight values settle on a configuration that reliably produces the desired outputs. This is how a Neural Network is trained to fulfill its purpose.
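As a sketch of this training loop, here is a single sigmoid neuron learning the logical AND function by repeatedly comparing its output to the desired output and nudging its weights. This is a one-neuron simplification of the idea, not a full multi-layer backpropagation implementation:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Training data: inputs of the logical AND function and their desired outputs.
samples = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]

weights, bias, rate = [0.0, 0.0], 0.0, 0.5
for _ in range(5000):
    for inputs, target in samples:
        out = sigmoid(sum(x * w for x, w in zip(inputs, weights)) + bias)
        error = target - out             # compare output to the desired output
        grad = error * out * (1 - out)   # scale the error by sigmoid's slope
        # Shift each weight (and the bias) in the direction that reduces error.
        weights = [w + rate * grad * x for w, x in zip(weights, inputs)]
        bias += rate * grad

def predict(inputs):
    return round(sigmoid(sum(x * w for x, w in zip(inputs, weights)) + bias))
```

After training, `predict` reproduces the AND table; in a real multi-layer network the same error signal is propagated backwards through every hidden layer.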



Types of Neural Networks


The complexity of Neural Networks has allowed them to branch out into distinct types, broadly into 3 categories. Different categories of these frameworks are configured to achieve different purposes, and as a result have their own structural formation and engineering methodology. These 3 categories are namely:


  • Artificial Neural Networks

  • Recurrent Neural Networks

  • Convolutional Neural Networks


However, there are various other types of Neural Networks as well, built on the above three categories, each performing its own function guided by its own principles:


  • Feedforward Neural Network

  • Radial-Basis Function Neural Network

  • Recurrent Neural Network

  • Modular Neural Network

  • Convolutional Neural Network

  • Sequence-to-Sequence Neural Network


These different types of Neural Networks use different mechanisms to operate, and are all used for different purposes. Let us explore the most common ones in detail:


Feedforward Neural Network

This refers to a simple Neural Network with forward propagation, with no repetitive iterations or looping. This is the most basic form of Neural Network.


Radial-Basis Function Neural Network

The Radial-Basis NN is just another simple Artificial Neural Network, except that it uses Radial-Basis functions as its activation functions. A Radial-Basis function is one which returns a value based on the distance between the input and some fixed point or centre.
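A Gaussian is a common choice of radial-basis function; a sketch (the centre and width are arbitrary illustrative parameters):

```python
import math

def gaussian_rbf(x, center, width=1.0):
    """Output depends only on the distance between the input and a fixed
    centre: 1.0 at the centre, falling off smoothly with distance."""
    return math.exp(-((x - center) ** 2) / (2 * width ** 2))

at_center = gaussian_rbf(0.0, 0.0)  # maximal response at the centre
nearby = gaussian_rbf(1.0, 0.0)     # weaker response further away
far = gaussian_rbf(3.0, 0.0)        # weaker still
```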


Recurrent Neural Network

Recurrent Neural Networks are slightly different from FNNs, in the sense that the propagation of data is not unidirectional; it runs in a loop instead. This gives RNNs a form of memory (the best-known variant is called Long Short-Term Memory, or LSTM), which is very useful for dealing with sequential data. Since information cycles through an RNN, the nodes retain information about the data that came through before. Before making a decision, or deciding the output, the network considers the previous information it has seen, thus making it more accurate.
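The looping behaviour can be sketched as a single recurrent step whose hidden state carries information from earlier inputs forward in time (the weights below are made-up constants, not a trained model):

```python
import math

def rnn_step(x, h_prev, w_x=0.5, w_h=0.8, b=0.0):
    """The new hidden state mixes the current input with the previous hidden
    state, so earlier items in the sequence influence later outputs."""
    return math.tanh(w_x * x + w_h * h_prev + b)

# Feed a short sequence. Only the first item is non-zero, yet the hidden
# state remains non-zero afterwards: the network 'remembers' seeing it.
h = 0.0
for x in [1.0, 0.0, 0.0]:
    h = rnn_step(x, h)
```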


Modular Neural Network

Modular Neural Networks are not really a separate category of Neural Networks, but rather a different orientation. A Modular NN is basically a combination of networks integrated into a single one, where each specialises in performing a different part of the problem, much like the division-of-labour concept used to describe the harmony between the organ systems inside the human body. The outputs from all the different NNs are taken and the final output is generated by something called an 'Integrator'.


Convolutional Neural Network

Convolutional Neural Networks are primarily used for processing and computation concerning images. A Convolutional NN takes in the pixels of an image as input, assigns importance to the features it finds in terms of identifying the image, and then differentiates that image from others.
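The core operation of a CNN is convolution: sliding a small filter (a 'kernel') over the image and computing a weighted sum of pixels at each position. A sketch with a hand-made vertical-edge kernel (the image and kernel values are invented for illustration):

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image, producing a feature map in which
    each entry is the weighted sum of the pixels under the kernel."""
    kh, kw = len(kernel), len(kernel[0])
    feature_map = []
    for i in range(len(image) - kh + 1):
        row = []
        for j in range(len(image[0]) - kw + 1):
            row.append(sum(image[i + di][j + dj] * kernel[di][dj]
                           for di in range(kh) for dj in range(kw)))
        feature_map.append(row)
    return feature_map

# This kernel responds where pixel intensity changes from left to right,
# so it lights up exactly at the vertical edge in the middle of the image.
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]
edges = convolve2d(image, [[-1, 1],
                           [-1, 1]])
```

Stacking many such learned filters, with activations and pooling in between, is what lets a CNN build up from edges to shapes to whole objects.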



An Example of A Neural Network


An excellent example of Neural Networks is Google's Voice Search algorithm. Google makes use of Long Short-Term Memory Neural Networks, i.e., Recurrent Neural Networks, in order to implement its Voice Search.


According to Google, the biggest challenge they faced during this implementation is that speech recognition is not just about recognizing individual sounds in audio, but about identifying how these sequences of sounds form existing words, and whether these words combine logically to make sense in the English language. For this, they decided to separate their modelling into two parts: one the acoustic model, and the other the language model.


An interesting thing to know at this point is how Google trained this algorithm of theirs. Training the acoustic model was not the problem, since they could do that with the vast amount of preexisting data out there. The problem came down to training the language model, for which there was not a lot of useful data available (the data required was naturally spoken text).


For this they constructed an iterative pipeline, where they used voicemail data donated by their users to train their language model. In order to respect their users' privacy, they decided not to allow any human to listen to the voicemails, which increased the scope of error in the uncensored data. They used their new and improved acoustic algorithms to obtain fresh data for the language model to be trained with, and as the acoustic model kept improving, it kept producing newer and better versions of the same data, which were then used for training the language model. This way, the error in their model dropped drastically!


They also took care of problems like differentiating between useful user acoustic input and background gibberish, and inserting grammar and punctuation into the text perceived by the language model. The RNN is a big step-up from the Gaussian Mixture Model (used by Google in 2009) and the Deep Neural Network (2012) which were used then for the same purposes. They are constantly improving the technology and setting an example for all the up-and-coming Neural Network start-ups globally.



Applications of Neural Networks


By now, you have probably fathomed the immense scope that Neural Networks open up to us, and can start to imagine and brainstorm the various extensive applications they could have in real life. The primary selling point of Neural Net technology is that it can successfully model non-linear relationships, which has great utility, especially in science.


Presently, there exist many software products which make use of Neural Network Artificial Intelligence. The following is just a brief glance at some possibilities.



Speech and Text Interpretation

Neural Networks have vast usage in the Semantics and Linguistics field, including areas such as text-to-speech conversion, spell-check, speech recognition, semantic classification, paraphrase detection, etc.

Text detection in images basically works by identifying plausible areas of text in the image, and then identifying patterns in those regions to recognise the text. Proposals have been made to combine Convolutional Neural Networks with the long short-term memory model to make language detection more accurate.

Neural Text-to-Speech is another interesting concept, which generates a human-like voice artificially. According to RedSpeakerAI, they essentially put together three NNs in order to achieve this: one as an acoustic model (to give timbre to the voice), one as the pitch model (to set the tones and variances while speaking), and the last as the duration model (to determine the duration of each syllable or phoneme while speaking).


Facial Recognition

One very popular application of an ANN is Facial Recognition. The network is first trained to identify a particular face, and then, using an algorithm called the eigenface algorithm, the network is able to distinguish between human faces. The eigenface algorithm basically depends on converting a face into an 'eigenface' consisting of computational vectors (eigenvectors) which capture the dimensions of, and distances between, the facial features in the image, and comparing these with the data fed to it previously.
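That final comparison step can be sketched as a nearest-neighbour search in eigenface space: the face whose stored vector lies closest to the new one is the match. The names and 3-component vectors below are hypothetical placeholders, not real eigenface projections:

```python
import math

def euclidean(a, b):
    """Straight-line distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def identify(probe, known_faces):
    """Return the name whose stored face vector is closest to the probe."""
    return min(known_faces, key=lambda name: euclidean(probe, known_faces[name]))

# Hypothetical eigenface projections for two enrolled people.
known = {"alice": [0.9, 0.1, 0.3], "bob": [0.2, 0.8, 0.5]}
match = identify([0.85, 0.15, 0.25], known)
```

Real systems first project each image onto the eigenvectors (via principal component analysis) to obtain such compact vectors; the matching step itself is this simple.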


Data Mining

Data Mining is the conversion of raw data into useful information, something that Neural Networks excel at performing, due to the pattern-recognition abilities which were discussed previously. This application is an especially purposeful business tool to analyse market and customer trends, and identify important strategies going forward.


Weather Forecast

Weather forecasting also depends greatly on quality analysis of the available data. According to research, prediction of future weather is done by identifying previous patterns using a combination of parameters such as temperature, precipitation, wind speed, pressure, humidity, dew point, visibility, etc. The data for these parameters is used to train the Neural Network using the Long Short-Term Memory technique, in order to recognize relations between the parameters and make predictions for the future.


Space Exploration

Space exploration also exhibits multiple uses of Neural Network technology, for instance navigation, planet identification, recognising signs of life, and making distance-related predictions.

Space agencies like NASA have begun making extensive use of Neural Networks, one instance of which is Exoplanetary Atmospheric Retrieval. Neural Networks have also been used by NASA to simulate complex systems of celestial bodies with a high level of accuracy by recognising previous relationships. One very useful property of Neural Networks when it comes to space exploration is their ability to persist even without all the required information (since it is a form of intelligence).


 

Conclusion

Great job, you made it through the entire post! Hopefully by now you have a better understanding of, and a deeper curiosity about, the magnificent field of Neural Networks, and are considering learning more about it. I am sincerely grateful to each of you who made it this far, and if this post really helped, I would request you to let me know and motivate me by dropping your likes and comments! Before I sign off, I would like you to remember that Artificial Intelligence is a multidimensional field with a lot of scope in the future, so do not hesitate to learn more about it; maybe you will find something that suits your interests!


Until Next Time ~
