Artificial neural networks

Observe, understand, analyze, and try to reproduce: these human behaviors have been repeated for millions of years, whenever we identify a natural phenomenon that we believe can be copied. We did this with fire, when we realized that we could generate sparks from the friction between objects. We did this with airplanes, based on the analysis of bird aerodynamics. And we do this with artificial neural networks, with the intention of reproducing natural, human intelligence.

As we mentioned in our post on artificial intelligence, since the invention of computers, scientists have believed that machines would be capable of emulating human thought. However, how could we emulate it if we didn't know how it actually worked?

Thinking about thinking, more specifically about understanding, knowledge, and learning, is one of humanity's oldest pursuits, spanning different sciences, as we explained in the text on neural networks. Follow along with us in this text to learn how, based on advances in understanding how the mind works, it was possible to create artificial intelligence!

Understanding their structures

In the post on neural networks, we recalled the concept of "cell assemblies": groups of neurons that work together as a processing unit. According to Donald Hebb, who proposed the theory, “assemblies” that are repeatedly activated generate structural or metabolic changes in that set of cells, allowing memory to be stored in a stable manner. From there, it would be possible to learn.

Just like biological neural networks, artificial ones also start with the "neuron." The first mathematical model of an artificial neuron was proposed by Warren McCulloch and Walter Pitts in 1943. In 1958, Frank Rosenblatt built on this idea to create the perceptron: a linear classifier that maps input values to an output value.

The perceptron algorithm calculates an output value from a set of input values, each multiplied by a weight that expresses its importance. The perceptron computes the weighted sum of the input values, adds a bias value (a constant term that does not depend on the input values), and passes the result through an activation function, originally a simple threshold, to produce the output.
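The calculation described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `perceptron` and the weights and bias below are hypothetical values chosen by hand so that the neuron behaves like a logical AND, and the threshold at zero is the classic step activation.

```python
def perceptron(inputs, weights, bias):
    """Return 1 if the weighted sum of the inputs plus the bias is positive, else 0."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if weighted_sum > 0 else 0


# Hypothetical, hand-picked parameters: with these values the perceptron
# only "fires" when both inputs are 1 (logical AND).
and_weights = [1.0, 1.0]
and_bias = -1.5

for a in (0, 1):
    for b in (0, 1):
        print(a, b, perceptron([a, b], and_weights, and_bias))
```

The bias shifts the point at which the neuron switches from 0 to 1: with a bias of -0.5 instead of -1.5, the same weights would behave like a logical OR.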

In a fully connected multilayer network, the output value of each perceptron becomes one of the input values of every perceptron in the next layer. In this new layer, the perceptrons perform the same calculation, a weighted sum of the previous values plus a bias value, and so on, until reaching the final value of the artificial neural network.
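To make this layer-by-layer flow concrete, here is a minimal sketch of a forward pass. The names `layer_forward` and `network_forward`, the sigmoid activation, and all weights and biases are assumptions made for illustration and are not taken from any specific network.

```python
import math


def sigmoid(z):
    """Squash any number into the range (0, 1); an assumed activation function."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(inputs, weights, biases):
    """Compute the outputs of one fully connected layer.

    weights[j] is the list of weights of neuron j (one weight per input value).
    """
    return [
        sigmoid(sum(x * w for x, w in zip(inputs, neuron_weights)) + bias)
        for neuron_weights, bias in zip(weights, biases)
    ]


def network_forward(inputs, layers):
    """Feed the outputs of each layer into the next one, front to back."""
    values = inputs
    for weights, biases in layers:
        values = layer_forward(values, weights, biases)
    return values


# Hypothetical network: 2 inputs -> 2 hidden neurons -> 1 output neuron.
layers = [
    ([[0.5, -0.5], [1.0, 1.0]], [0.0, -1.0]),  # hidden layer
    ([[1.0, -1.0]], [0.0]),                    # output layer
]
result = network_forward([1.0, 0.0], layers)
print(result)
```

Every hidden output is passed to every neuron of the following layer, which is why each neuron in a layer holds one weight per value produced by the previous layer.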

The artificial neural network created by Rosenblatt is a type of single-layer feedforward neural network. Although it was able to "make decisions," it had limited learning capabilities: it could only separate classes that can be divided by a straight line (or a plane). Let us remember that in order to learn, human intelligence stores memory in a stable manner in "cell assemblies," and by persistently repeating the activation of these "assemblies," there is growth in the volume of connections and the formation of new "assemblies." In other words, for artificial neural networks to learn broadly, they would need to repeat the activity and establish new sets of connections. The calculation flow could not just move forward; it should be able to go back to be repeated and, from there, enable new connections by creating new "assemblies."

This problem was solved with the backpropagation algorithm, described in the 1986 article "Learning representations by back-propagating errors" by David Rumelhart, Geoffrey Hinton, and Ronald Williams. What the algorithm enables is for an error signal to be sent backward, or "backpropagated," through the network, informing how much and in what direction each weight contributed to the final error. The signal expresses an error sensitivity, that is, how much a small change in a value, whether an input or a weight, would change the output error. This signal is used to calculate the adjustment of the weights of each connection.
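As a rough illustration of this backward signal, the sketch below trains the smallest possible "multilayer" network: one input, one hidden neuron, and one output neuron. Everything in it is an assumption made for the example (the sigmoid activation, the squared-error measure, the starting parameters, and the learning rate); it is not the exact formulation of the original article.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, params):
    """Forward pass: one hidden sigmoid neuron followed by one linear output neuron."""
    w1, b1, w2, b2 = params
    h = sigmoid(w1 * x + b1)
    y = w2 * h + b2
    return h, y


def loss(x, target, params):
    """Squared error between the network output and the target."""
    _, y = forward(x, params)
    return (y - target) ** 2


def backward(x, target, params):
    """Send the error backward and return its sensitivity to each parameter."""
    w1, b1, w2, b2 = params
    h, y = forward(x, params)
    d_y = 2.0 * (y - target)      # sensitivity of the error to the output
    d_w2 = d_y * h                # contribution of the output weight
    d_b2 = d_y                    # contribution of the output bias
    d_h = d_y * w2                # signal passed back to the hidden neuron
    d_z = d_h * h * (1.0 - h)     # through the sigmoid (its derivative is h * (1 - h))
    d_w1 = d_z * x                # contribution of the hidden weight
    d_b1 = d_z                    # contribution of the hidden bias
    return [d_w1, d_b1, d_w2, d_b2]


# Hypothetical starting parameters and a single training example.
params = [0.5, 0.0, -0.3, 0.1]
x, target = 1.0, 1.0
learning_rate = 0.1

loss_before = loss(x, target, params)
for _ in range(50):
    grads = backward(x, target, params)
    # Adjust each parameter against the direction in which it increases the error.
    params = [p - learning_rate * g for p, g in zip(params, grads)]
loss_after = loss(x, target, params)

print(loss_before, loss_after)
```

Each repetition of the loop is one forward pass followed by one backward pass, and the error shrinks as the weights are nudged in the direction indicated by the backpropagated signal.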

With the backpropagation algorithm, multilayer artificial neural networks began to produce much more reliable results, because each weight could now be adjusted according to its measured contribution to the error, and this information was used to the network's advantage. Consequently, it was easier for these networks to learn.

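In general, the more layers an artificial neural network has, the more complex the patterns it is able to learn, although deeper networks also require more data and computation to train. Networks with several hidden layers are called deep networks, and this is the deep learning we are accustomed to hearing about. There is no strict threshold, but a few hidden layers or more is a common rule of thumb.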

Artificial neural networks of different types

From the first artificial neural network with only one layer—known as a "simple perceptron"—we have evolved to multi-layer neural networks, which are capable of deep learning.

Nowadays, it is common to categorize neural networks by application and functionality. Let's look at the most commonly used types:

Feedforward networks → This is the most traditional neural network architecture, with one or more layers. Information flows in one direction only. This type of network is widely used in pattern recognition, image classification, and time series forecasting.
Convolutional neural networks (CNN) → Designed to analyze grid-shaped data, such as images. The first layers detect simple visual patterns (edges, textures, shapes), and subsequent layers combine these patterns to recognize objects. CNNs are widely used in computer vision, where they detect objects in photos and videos, and they are also used in voice recognition.
Recurrent neural networks (RNN) → These are networks in which information can flow in multiple directions, forming cycles. Unlike feedforward networks, recurrent networks have feedback, allowing information to remain in the network longer, generating "memory." RNNs are widely used in natural language processing.
Generative adversarial networks (GAN) → Formed by two "competing" networks: the generator and the discriminator. The first creates data and the second tries to distinguish between real and generated data. In the process, both improve. GANs are widely used for generating realistic images.
Attention-based neural networks (Transformers) → Instead of processing a sequence one element at a time, attention-based neural networks process all of its elements in parallel. The attention mechanism allows the network to assign a different weight, or "attention," to each piece of information in relation to the others and thus capture context. This network model is widely used to process natural language and underlies most of the artificial intelligence assistants we know, such as ChatGPT and DeepSeek.
Autoencoder networks → Unsupervised networks that learn to encode input data into a simpler, compressed representation that keeps its most relevant characteristics, and then to reconstruct the original data from that representation. They are widely used for data compression and for cleaning (denoising) data.
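Of the mechanisms in the list above, attention is perhaps the easiest to show in a few lines. The sketch below is a simplified, single-query version of scaled dot-product attention written with plain Python lists. The function names and the toy vectors are hypothetical, and real Transformers add learned projections and multiple attention heads that are left out here.

```python
import math


def softmax(scores):
    """Turn arbitrary scores into positive weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query, keys, values):
    """Weight each value by how similar its key is to the query."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    size = len(values[0])
    output = [sum(w * value[i] for w, value in zip(weights, values)) for i in range(size)]
    return weights, output


# Hypothetical toy data: three pieces of information, each with a key and a value.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]

weights, output = attention(query, keys, values)
print(weights)
print(output)
```

The pieces of information whose keys point in the same direction as the query receive more "attention," so they contribute more to the output.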

If you want to experience the power of artificial neural networks in your company, contact us to learn how we integrate artificial intelligence into all our solutions. Click here and talk to us!