Everything you’re afraid to ask about machine learning
Artificial Intelligence is evolving by leaps and bounds and is currently one of the most complex sciences. By complexity we do not mean the difficulty of understanding and innovating in the field (although that is certainly high), but rather its degree of interrelation with other, apparently unconnected, fields.
Types of machine learning
There are two “schools” of thought on how an AI should be properly constructed:
- The Connectionists: they start from the assumption that we should draw inspiration from the neural networks of the human brain.
- The Symbolists: they prefer to start from knowledge bases and fixed rules about how the world works.
Moreover, these different ways of looking at things lead to completely different problem-solving strategies: a problem can be solved through a relatively simple algorithm whose precision improves over time (an iterative approach), or it can be broken down into smaller and smaller blocks that are solved separately (a decomposition approach).
To date, there is no clear answer as to which approach or school of thought works best, so it is worth briefly reviewing the main advances in both pure machine learning techniques and neuroscience through an agnostic lens.
Machine Learning techniques
Machine learning techniques can be roughly divided into supervised and unsupervised methods, the main difference being whether the data are labelled (supervised learning) or not (unsupervised learning). A third class is worth introducing when talking about AI: reinforcement learning (RL). RL is a learning method based on the simple idea of reward feedback: the machine acts within a specific set of circumstances with the aim of maximizing the potential future (cumulative) reward. In other words, it is a trial-and-error method that sits between supervised and unsupervised learning.
RL is often accompanied by two major problems, namely the credit assignment problem and the exploration/exploitation dilemma, in addition to a number of technical issues such as the curse of dimensionality, non-stationary environments, and partial observability. The first relates to the fact that rewards are, by definition, delayed, and a sequence of specific actions may be required to obtain them. The problem is then to identify which of the preceding actions was actually responsible for the final outcome (and thus for the reward), and to what extent. The second is a problem of optimal search: the software must map the environment as accurately as possible to discover its reward structure. This raises a problem of optimal stopping, a kind of satisficing in effect: how long should the agent continue to explore the space in search of better strategies, and when should it start exploiting the ones it already knows (and knows to work)?
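The exploration/exploitation dilemma can be seen in the simplest RL setting, a multi-armed bandit, solved here with the standard epsilon-greedy strategy. This is a minimal sketch, not a reference implementation; the arm rewards, epsilon, and step count below are arbitrary assumptions for illustration.

```python
import random

def epsilon_greedy_bandit(true_means, epsilon=0.1, steps=5000, seed=0):
    """Epsilon-greedy agent on a multi-armed bandit.

    With probability epsilon the agent explores (pulls a random arm);
    otherwise it exploits the arm with the highest estimated reward.
    """
    rng = random.Random(seed)
    n = len(true_means)
    counts = [0] * n          # pulls per arm
    estimates = [0.0] * n     # running mean reward per arm
    total_reward = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n)                           # explore
        else:
            arm = max(range(n), key=lambda a: estimates[a])  # exploit
        reward = true_means[arm] + rng.gauss(0, 1)           # noisy reward
        counts[arm] += 1
        # incremental mean update of the reward estimate
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward
    return estimates, counts, total_reward

estimates, counts, total = epsilon_greedy_bandit([0.2, 0.5, 0.9])
```

After enough steps the agent pulls the best arm (index 2) far more often than the others, while the occasional random pulls keep refining its estimates of the weaker arms.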
In addition, self-learning algorithms can be classified according to the results they produce: classification algorithms; regressions; clustering methods; density estimation; and dimensionality reduction methods.
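To make the first two of these output types concrete, here is a self-contained toy sketch (the data and function names are my own, chosen for illustration): a 1-nearest-neighbour classifier returns a discrete label, while ordinary least squares regression returns continuous coefficients.

```python
def nearest_neighbour_classify(train, query):
    """1-nearest-neighbour classification: return the label of the
    closest training point (1-D Euclidean distance here)."""
    return min(train, key=lambda pair: abs(pair[0] - query))[1]

def least_squares_fit(xs, ys):
    """Ordinary least squares for y = a*x + b (closed-form solution)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    a = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    b = mean_y - a * mean_x
    return a, b

# Classification: a discrete label out.
label = nearest_neighbour_classify([(1.0, "small"), (10.0, "large")], 2.5)  # "small"
# Regression: continuous coefficients out (data lies exactly on y = 2x + 1).
a, b = least_squares_fit([0, 1, 2, 3], [1, 3, 5, 7])  # a = 2.0, b = 1.0
```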
Neuroscience in machine learning
The standard architecture of an artificial neural network (ANN) consists of a series of nodes arranged in an input layer, an output layer, and a variable number of hidden layers (which determine the depth of the network). The inputs of each layer are multiplied by their connection weights and summed, then compared against a threshold level. The resulting signal is passed through a transfer function to produce an output, which in turn becomes an input to the next layer. Learning occurs over many iterations of this process, and amounts to choosing the weights that minimize the input-output mapping error on a given set of training data.
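The layer-by-layer computation just described can be sketched in a few lines of pure Python. The weights, biases, and the choice of a sigmoid transfer function below are illustrative assumptions, not a reference design:

```python
import math

def forward_layer(inputs, weights, biases):
    """One dense layer: for each node, a weighted sum of the inputs plus
    a bias, passed through a sigmoid transfer function."""
    outputs = []
    for node_weights, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(node_weights, inputs)) + bias
        outputs.append(1.0 / (1.0 + math.exp(-z)))  # sigmoid
    return outputs

def forward_pass(x, layers):
    """Pass the signal through each layer in turn: the output of one
    layer becomes the input of the next."""
    for weights, biases in layers:
        x = forward_layer(x, weights, biases)
    return x

# 2 inputs -> 2 hidden nodes -> 1 output (made-up weights).
layers = [
    ([[0.5, -0.6], [0.1, 0.8]], [0.0, -0.1]),  # hidden layer
    ([[1.2, -0.7]], [0.3]),                    # output layer
]
y = forward_pass([1.0, 0.5], layers)  # a single value in (0, 1)
```

Training would then adjust the weight lists to reduce the error between `y` and the desired output, which is the quantitative side of the learning described above.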
Artificial neural networks (ANNs) are a biologically inspired approach that allows software to learn from observational data; in this sense they are sometimes said to mimic the human brain. The first ANN, the Threshold Logic Unit (TLU), was introduced by McCulloch and Pitts (1943); some forty years later, Rumelhart and others (1986) pushed the field forward by designing the backpropagation training algorithm for feedforward multi-layer perceptrons (MLPs). ANNs are also often referred to as Deep Learning (DL), especially when many layers perform computational tasks. There are many types of ANNs today, but among the best known are Recurrent Neural Networks (RNNs).
RNNs use sequential information to make predictions. In a traditional ANN, all inputs are assumed independent of one another; an RNN instead performs the same task for each element of a sequence, maintaining a kind of memory of previous computations.
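This "memory" can be shown with a minimal recurrent cell, reduced here to scalar weights chosen arbitrarily for illustration:

```python
import math

def rnn_step(x, h, w_x, w_h, b):
    """One step of a minimal recurrent cell: the new hidden state mixes
    the current input with the previous hidden state (the 'memory')."""
    return math.tanh(w_x * x + w_h * h + b)

def run_sequence(sequence, w_x=0.5, w_h=0.9, b=0.0):
    """Feed a sequence one element at a time, carrying the hidden state
    forward between steps."""
    h = 0.0  # initial hidden state
    history = []
    for x in sequence:
        h = rnn_step(x, h, w_x, w_h, b)
        history.append(h)
    return history

states = run_sequence([1.0, 0.0, 0.0])
```

Even though the inputs after the first step are all zero, the hidden state remains non-zero: the network carries forward a trace of earlier elements, which is exactly what a feedforward ANN cannot do.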
DL is certainly a big step towards the creation of an AGI, but it also has limitations. The fundamental one is the exceptional amount of data needed to work properly, which is the biggest barrier to wider cross-domain application. DL is also not easy to debug, and problems are usually tackled by feeding ever more data into the network, creating a strong dependency on data.
This need for data also means that training a network takes considerable time. To speed it up, networks are often trained in parallel: either the model itself is split between different machines or GPU cards, or copies of the same model run on different machines, each on a different slice of the data, with the parameter adjustments then combined.
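The second scheme, synchronous data parallelism, can be sketched as follows. The one-parameter model, helper names, and learning rate are hypothetical, chosen only to make the gradient-averaging step visible:

```python
def local_gradient(params, data_shard):
    """Per-worker gradient of the squared error for a toy model y = w*x,
    computed only on that worker's slice of the data."""
    w = params["w"]
    return sum(2 * (w * x - y) * x for x, y in data_shard) / len(data_shard)

def data_parallel_step(params, shards, lr=0.01):
    """One synchronous update: each 'machine' computes a gradient on its
    own shard, the gradients are averaged, and the shared parameters
    are adjusted once."""
    grads = [local_gradient(params, shard) for shard in shards]
    avg = sum(grads) / len(grads)
    params["w"] -= lr * avg
    return params

# Data generated from y = 3x, split across two simulated machines.
shards = [[(1, 3), (2, 6)], [(3, 9), (4, 12)]]
params = {"w": 0.0}
for _ in range(200):
    params = data_parallel_step(params, shards)
# params["w"] converges toward 3.
```

In a real system the shards would live on separate machines and the averaging would happen over a network (e.g. via a parameter server or an all-reduce), but the arithmetic is the same.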