AI, Machine Learning and Deep Learning

AI, or Artificial Intelligence, is not new, but it has generated a lot of interest and excitement over the last few years. What drives the use of AI right now:

  • Availability of sophisticated algorithms
  • Big data
  • Abundant computing and storage resources

A few computing tasks have made remarkable progress thanks to AI, in particular through models that were pre-trained on large amounts of data:

  • Object detection, object recognition, image classification in computer vision.
  • Language translation, sentiment classification in natural language processing.

In this post, I'll explain what AI is all about and how it relates to Machine Learning and Deep Learning.

Artificial Intelligence

AI is the ability of a machine, typically a computer, to perform tasks that would usually be done by humans because they require human insight or intelligence. AI has been around since the 1950s, with varying degrees of success, and it has gained traction several times over the years: from the first demonstrations of problem solvers to the use of expert systems in specific domains around 1980. In 1997, IBM's Deep Blue defeated the reigning world chess champion. But it is Moore's Law and the availability of data that removed the last barriers to using AI. Currently, AI is considered a viable option for many new application areas.

Nowadays, when people refer to AI, they implicitly mean Machine Learning, which is a subset of AI. Machine Learning is the ability to learn from data that has already been gathered and processed, and to use what was learned to make predictions for data that has not been seen before. This implies that the rules for making those predictions do not have to be programmed explicitly. Deep Learning is a special case of Machine Learning using deep neural networks.

Let’s first explain the difference between programming and machine learning.

Programming versus Machine Learning

Computers traditionally execute tasks by running a program or application as a sequence of operations of which the execution order is explicitly specified. A program describes how a set of inputs is transformed into a set of outputs. It is the task of programmers to define the appropriate sequence of operations using a programming language. The result is then automatically translated into a sequence of low-level instructions that a computer can execute.

Machine learning works quite differently: the computer learns from data. Data for which the relation between input and output is known is used to create, or train, a model. The resulting model can then predict an output for a combination of inputs that it has not encountered during training. In Machine Learning, the inputs are often called features and the outputs are called labels. You can say that you teach the computer how to label new data based on data that it has already seen.

Summarizing, the difference between writing a program and applying machine learning is:

  • With programming, the instructions or the rules are explicitly written down to produce the desired result.
  • With machine learning, the rules are discovered from existing data and applied to new data to predict the outcome.

Machine Learning is therefore very attractive for computer tasks for which it would be extremely difficult or time-consuming to explicitly write the execution order using a programming language.
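To make the contrast concrete, here is a minimal sketch in Python. The task (deciding whether a temperature counts as "hot"), the function names and the data are all made up for illustration. The first function encodes a rule chosen by a programmer; the second derives a comparable rule, a single threshold, from labeled examples.

```python
# Programming: the rule is written down explicitly by the programmer.
def is_hot_programmed(temperature):
    return temperature >= 25  # threshold chosen by hand


# Machine learning: the rule (here a single threshold) is derived from data.
def learn_threshold(samples):
    """samples: list of (feature, label) pairs with boolean labels.

    Tries the midpoints between neighbouring feature values and keeps
    the one that misclassifies the fewest training samples.
    """
    values = sorted(x for x, _ in samples)
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def errors(threshold):
        return sum((x >= threshold) != label for x, label in samples)

    return min(candidates, key=errors)


# Hypothetical training data: (temperature in °C, "was it hot?")
training_data = [(12, False), (18, False), (21, False),
                 (27, True), (30, True), (35, True)]

threshold = learn_threshold(training_data)


def is_hot_learned(temperature):
    return temperature >= threshold


print(threshold)            # 24.0
print(is_hot_learned(29))   # True
print(is_hot_learned(15))   # False
```

Nobody typed the value 24.0 into the program: it follows from the examples, and different examples would produce a different rule.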

Machine Learning

Machine learning can be divided into three types of learning: supervised learning, unsupervised learning, and reinforcement learning. Each type can be implemented using different types of algorithms.

Supervised Learning

Data with a known relation between input and output are used to train the system: the system learns the relationship or the function between the input and output. The result is used to predict the output for inputs that the system hasn’t seen before.

Supervised learning can be divided into classification and regression. If the learning task is about determining the label of an output, that is, selecting a discrete value, it is called classification. The objective of classification is to predict the discrete value of an output with high accuracy. As an example, a system is trained using many images of apples and other fruit, each image labeled apple or other. After training, the system can predict whether a new image shows an apple or not.
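A classifier along these lines can be sketched in a few lines of Python. This is only an illustration: instead of real images it assumes that each fruit has already been reduced to two made-up features (roundness and redness, both between 0 and 1), and it uses a k-nearest-neighbours vote, which is just one of many possible classification algorithms.

```python
import math

# Hypothetical training set: (roundness, redness) features with a label.
training = [
    ((0.90, 0.80), "apple"),
    ((0.85, 0.90), "apple"),
    ((0.95, 0.70), "apple"),
    ((0.30, 0.10), "other"),  # e.g. a banana
    ((0.50, 0.20), "other"),  # e.g. a pear
    ((0.90, 0.05), "other"),  # e.g. a lime
]


def classify(features, k=3):
    """Predict a label by majority vote among the k nearest training samples."""
    nearest = sorted(training, key=lambda item: math.dist(item[0], features))[:k]
    labels = [label for _, label in nearest]
    return max(set(labels), key=labels.count)


print(classify((0.88, 0.75)))  # apple
print(classify((0.40, 0.15)))  # other
```

The output is always one of a fixed set of labels, which is what makes this a classification task.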

If the learning task is about calculating or determining the value of an output, that is, determining a value in a continuous range, it is called regression. The objective of regression is to minimise the error when predicting the output value. As an example, the system is trained on the price of houses using features like area size, house size and location. After training, the system can provide a price for other combinations of these features.
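As a rough sketch of regression, the Python snippet below fits a straight line through a handful of made-up (house size, price) pairs using ordinary least squares. It assumes a single feature and a linear relation; a real price model would use more features and usually a more flexible model.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mean_squared_error(xs, ys, slope, intercept):
    """The error that the fit minimises: average squared prediction error."""
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical training data
house_sizes = [50, 80, 100, 120, 150]   # in m²
prices = [150, 240, 310, 350, 450]      # in thousands

slope, intercept = fit_line(house_sizes, prices)

# Predict the price of a house size that is not in the training data.
predicted_price = slope * 110 + intercept
print(round(predicted_price, 1))
print(round(mean_squared_error(house_sizes, prices, slope, intercept), 1))
```

Unlike the classifier, the prediction here can be any value in a continuous range.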

Examples of applications where supervised learning is very useful are: price prediction, object detection in computer vision, spam detection, signature recognition, speech recognition, customer churn prediction, fraud identification and repair prediction.

Unsupervised Learning

Unsupervised learning is used when there is no explicit output value for input data: the learning task is to find patterns in the input data. Because there are no labels to check the result against, unsupervised learning is generally considered more challenging than supervised learning.

In the figure below, the left-hand side depicts a number of samples and the right-hand side shows the 4 groups or clusters found through unsupervised learning. Only 2 dimensions are used to demonstrate the concept. In real-life examples, there can be more than 100 dimensions across which the samples have to be clustered.

Clustering samples using unsupervised learning.

Examples of applications where unsupervised learning is very useful are: finding associations in data, feature extraction (which is then useful as a starting point for supervised learning), data visualisation and anomaly detection.
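Clustering can be sketched with k-means, one common clustering algorithm (the post does not prescribe a specific one). The points below are made up and two-dimensional, and the number of clusters is assumed to be known in advance. Note that no labels are given: the algorithm only sees the coordinates.

```python
import math


def mean(points):
    """Coordinate-wise average of a list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def farthest_first(points, k):
    """Deterministic start: pick centroids that lie far away from each other."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids


def k_means(points, k, max_iterations=100):
    centroids = farthest_first(points, k)
    labels = None
    for _ in range(max_iterations):
        # Assign every point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda i: math.dist(p, centroids[i])) for p in points
        ]
        if new_labels == labels:   # assignments are stable: done
            break
        labels = new_labels
        # Move every centroid to the mean of the points assigned to it.
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:
                centroids[i] = mean(members)
    return labels, centroids


# Hypothetical samples forming three visible groups
samples = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
           (5.0, 5.0), (5.2, 4.9), (4.8, 5.1),
           (9.0, 1.0), (9.1, 1.2), (8.9, 0.8)]

labels, centroids = k_means(samples, k=3)
print(labels)
```

Samples that end up with the same number in `labels` belong to the same cluster. The same code works for more than two dimensions, since `math.dist` accepts points of any length.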

Reinforcement Learning

Reinforcement learning is a method where the model learns from an environment that behaves according to a set of rules. The model interacts with the environment through an agent. The agent has a state and can take actions that change its state. For each action it is rewarded or punished, depending on whether the action was desired or undesired.
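The loop of state, action and reward can be sketched with tabular Q-learning, one well-known reinforcement learning algorithm. Everything in this example is an assumption made for illustration: the environment is a line of five positions with a pit at one end (punishment) and a goal at the other (reward), the agent explores by picking actions at random, and the learning rate and discount factor are arbitrary.

```python
import random

N_STATES = 5               # positions 0..4 on a line; 0 is a pit, 4 is the goal
ACTIONS = (-1, +1)         # move left or move right
ALPHA, GAMMA = 0.5, 0.9    # learning rate and discount factor (illustrative)


def step(state, action):
    """Rules of the environment: return (next_state, reward, done)."""
    next_state = state + action
    if next_state == N_STATES - 1:
        return next_state, 1.0, True     # reward for reaching the goal
    if next_state == 0:
        return next_state, -1.0, True    # punishment for falling into the pit
    return next_state, 0.0, False


def train(episodes=500, seed=0):
    rng = random.Random(seed)
    # Q-table: estimated long-term value of taking an action in a state.
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state, done = 2, False           # every episode starts in the middle
        while not done:
            # Explore by choosing an action at random.
            action = ACTIONS[0] if rng.random() < 0.5 else ACTIONS[1]
            next_state, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            # Q-learning update: nudge the estimate towards reward + future value.
            q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
            state = next_state
    return q


q_table = train()

# The learned behaviour: in each non-terminal state, pick the best-valued action.
policy = {s: max(ACTIONS, key=lambda a: q_table[(s, a)]) for s in (1, 2, 3)}
print(policy)
```

After training, the policy should prefer moving right (+1) in every position, because that leads to the reward and away from the punishment. The agent was never told this rule; it follows from the rewards it collected.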

Examples where reinforcement learning is useful: robotics, business strategy planning, machine control, autonomous vehicles, playing games and complex load-balancing problems such as the electricity grid.

Deep Learning

Deep Learning is a subset of machine learning. It is based on artificial neural networks, which try to mimic the way a brain works. The figure below is an example of a neural network with one hidden layer. Each neuron or node computes a value based on its weighted inputs (each connection has a weight parameter) and a bias. The result is then mapped to an output value by an activation function. Deep learning implies that the network has multiple hidden layers; three or more is a common rule of thumb. A key difference with other types of machine learning algorithms is that the network can also extract the features as part of the learning process, so it requires less human intervention to learn.

Example of a neural network
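The computation inside such a network can be sketched in Python as follows. The layer sizes, weights and biases are arbitrary values chosen for illustration (in practice they are learned during training), and the sigmoid function used to map each result to an output value is an assumption; other functions are common as well.

```python
import math


def sigmoid(x):
    """Maps any real number to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, mapped to an output value."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)


def layer(inputs, weight_rows, biases):
    """A layer is a list of neurons that all receive the same inputs."""
    return [neuron(inputs, weights, bias) for weights, bias in zip(weight_rows, biases)]


# Hypothetical parameters: 2 inputs -> 3 hidden neurons -> 1 output neuron
hidden_weights = [[0.5, -0.2], [0.3, 0.8], [-0.6, 0.1]]
hidden_biases = [0.0, -0.1, 0.2]
output_weights = [[1.0, -1.0, 0.5]]
output_biases = [0.0]


def forward(inputs):
    """Pass the inputs through the hidden layer and then the output layer."""
    hidden = layer(inputs, hidden_weights, hidden_biases)
    return layer(hidden, output_weights, output_biases)


print(forward([1.0, 2.0]))
```

A deep network repeats the `layer` step several times, feeding the output of one hidden layer into the next. Training consists of adjusting the weights and biases so that the outputs match the labels in the training data.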

Deep learning has triggered a lot of interest in the use of AI. It has been successful in solving complicated tasks for which it is practically impossible to write the instructions by hand. Examples are object detection in computer vision, playing games and understanding natural language. Larger networks (up to billions of parameters) trained on huge data sets (up to many millions of records) have achieved amazing results.

There are many types of artificial neural networks:

  • Convolutional Neural Networks (CNN): often used for computer vision tasks.
  • Generative Adversarial Networks (GAN): two neural networks, one of which generates new data instances while the other evaluates each new instance for authenticity. GANs are used for image, video and voice generation. It is this type of network that can generate new faces, landscapes, and art.
  • Recurrent Neural Networks (RNN): used for sequential data (for instance data with a time stamp).
  • Graph Neural Networks (GNN): many problems can be modelled as a graph and this type of network exploits this.
