Machine Learning in a nutshell

Sebastian Orozco Marin


Machine learning is a branch of artificial intelligence that studies how machines can learn, without being explicitly programmed, by making use of data, mathematics, algorithms, and statistics. In simple words, using these tools a computer is able to predict an outcome. For example, if you have a dataset that describes house prices through the years according to the neighborhood, the number of bedrooms, and other variables, you can use this data to predict house prices for the upcoming years. This is a naive example, but today there are more advanced uses, such as identifying a person from the measurements of facial characteristics in an image, or self-driving cars that analyze data from sensors and cameras to drive safely without a human in charge.

We may think these technologies are very new, but in fact their foundations go back to 1959; they could not be widely implemented due to the lack of computational power. Nowadays we have powerful tools that allow us to gather large amounts of data and process it efficiently, which has allowed machine learning techniques to flourish in recent years. When we say a machine learns, it is not exactly the same as a human learning; our way of learning is far more complicated.

What is the process for a machine to learn?

The basis for a machine to learn is reliable data based on experience: the better the data, the more accurate the result. This data can come from crawling the internet, sensor readings, images, polls, etc. The data is then divided into two sets: one to train and another to test. Next, we need a deep understanding of the task we want the machine to learn, so the best technique can be chosen to obtain the best results. The following step is to train the machine learning algorithm and then test it to measure the accuracy of its predictions. If the accuracy is not good enough, we need to optimize the model and repeat the whole process until we reach the desired outcome. It is worth clarifying that this learning process is very sensitive to changes in the task: for example, a machine trained to walk perfectly over a flat surface will do incredibly badly on an irregular surface, and this is the main difference with humans. Machine learning can handle very specific tasks, but it does not yet have the ability to adapt to changes it was never taught.
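
The steps above (split, train, test) can be sketched in a few lines of Python. This is only a minimal illustration, not a real model: the house-price figures, the 70/30 split, and the "average price per square metre" predictor are all made-up assumptions.

```python
import random

# Toy dataset: (size_m2, bedrooms) -> price in thousands.
# All figures are made up for illustration.
data = [((50, 1), 120), ((80, 2), 200), ((100, 3), 260),
        ((60, 2), 150), ((120, 4), 310), ((90, 3), 240)]

random.seed(0)
random.shuffle(data)

# 1. Divide the data: one part to train, another to test.
split = int(0.7 * len(data))
train, test = data[:split], data[split:]

# 2. "Train" a deliberately simple model: the average price per
#    square metre over the training examples.
rate = sum(price / features[0] for features, price in train) / len(train)

def predict(features):
    return rate * features[0]

# 3. Test: measure the accuracy of the predictions on unseen data
#    (here, the mean absolute error on the held-out test set).
mae = sum(abs(predict(f) - price) for f, price in test) / len(test)
print(f"price per m2: {rate:.2f}, test error: {mae:.2f}")
```

If the test error were too high, the optimization loop described above would mean choosing a better model or better data and repeating the process.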

What are the different techniques for a machine to learn?

Now let's talk about the different techniques we have for a machine to learn. These techniques are combinations of algorithms: computer programs that execute a task using mathematics such as calculus, linear algebra, and statistics. There is an active research field around these techniques, where new methods are constantly discovered. They can be grouped into three large areas: supervised learning, unsupervised learning, and reinforcement learning.

Supervised learning.

Supervised learning is a branch of machine learning and belongs to an area called statistical learning. As mentioned before, the machine learns through data, so for a supervised learning technique the computer receives information that is labeled with its desired output. For example, if we want a machine to learn to detect a cat in an image, the data the computer receives should say whether an image contains a cat or not; this is called a label. In this way the machine can learn from the different examples, searching for the minimum error, to produce a function that can be applied to new, never-before-seen images and detect with precision when a cat is present. There are several methods that use supervised learning; the most widely used are the following:

  • Linear regression
  • Support vector machine
  • Logistic regression
  • Naive Bayes
  • Linear discriminant analysis
  • Decision trees
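
To make the labeled-data idea concrete, here is a tiny "decision stump" (a one-level decision tree, the simplest member of the decision-tree family above) that learns a threshold separating cats from non-cats. The single feature and all its values are invented purely for illustration.

```python
# Toy supervised dataset: each example is (whisker_length_cm, label),
# where label 1 = cat and 0 = not a cat. Numbers are made up.
examples = [(6.0, 1), (5.5, 1), (7.1, 1), (1.0, 0), (0.5, 0), (2.0, 0)]

def train_stump(data):
    """Pick the threshold that minimises the training error."""
    best_t, best_err = None, float("inf")
    for t in sorted(x for x, _ in data):
        # Error = number of examples the rule "x >= t means cat" gets wrong.
        err = sum((x >= t) != bool(y) for x, y in data)
        if err < best_err:
            best_t, best_err = t, err
    return best_t

threshold = train_stump(examples)

def predict(x):
    return int(x >= threshold)

print(predict(6.5), predict(1.5))  # prediction on inputs never seen before
```

The learned function can now be applied to new inputs that were not in the training set, exactly as described above for the cat-image example.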

The supervised learning algorithm.

The supervised method is basically based on calculating the error between the desired output and the computed output. Usually this error is used to adjust parameters that are relevant for the precision of the learning, so by iterating between computing the error and then, according to that error, computing a new set of parameters that tends to minimize it, optimal results can be found. This can be done with different mathematical techniques such as gradient descent. We can think of gradient descent as standing on top of a mountain and computing the fastest, best way to reach its base. The base of the mountain is the minimum error, and the best path is the iterative process that finds that minimum by following the steepest slope at each step of the path.
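
The iterative error-minimisation loop can be sketched with gradient descent on a one-variable linear regression. The data points, learning rate, and iteration count are all assumptions chosen so the toy example converges.

```python
# Gradient descent for 1-D linear regression: fit y ≈ w*x + b by
# repeatedly stepping against the gradient of the mean squared error.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.1, 5.0, 6.9, 9.1]   # roughly y = 2x + 1, made-up data

w, b, lr = 0.0, 0.0, 0.01   # parameters and learning rate
for _ in range(5000):
    # Gradients of the mean squared error with respect to w and b.
    grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / len(xs)
    grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / len(xs)
    # "Walk downhill": move each parameter along the steepest slope.
    w -= lr * grad_w
    b -= lr * grad_b

print(f"w = {w:.2f}, b = {b:.2f}")
```

Each pass computes the error's slope at the current parameters and takes a small step downhill, which is exactly the mountain-descent picture described above.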

Unsupervised learning

In contrast to the previously mentioned method, this one uses data that is not classified into categories, i.e. not labeled. For example, think of this data as a big pool of animal characteristics, all mixed up and raw, so we cannot say which specific animal each characteristic belongs to. The main function of these algorithms is to find common patterns in the data and classify it accordingly. These methods are very valuable because unlabeled data is by far more abundant than labeled data. The raw data can be seen as coal, abundant and powerful but not very valuable; with this type of algorithm we can classify it, find common patterns, and shape it into valuable diamonds that can be very useful for building supervised learning algorithms. The most widely used methods are:

  • Hierarchical clustering
  • k-means clustering
  • k-nearest neighbors
  • Principal component analysis
  • Singular value decomposition
  • Independent component analysis
  • Deep learning
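
The pattern-finding idea can be shown with a miniature version of k-means clustering from the list above: unlabeled one-dimensional measurements are grouped into two clusters with no labels involved. The data points and initial centers are made-up assumptions.

```python
# Toy 1-D k-means: group unlabeled measurements into k clusters.
points = [1.0, 1.2, 0.8, 8.0, 8.3, 7.9, 8.1, 1.1]
k = 2
centers = [1.0, 8.0]  # initial guesses (often chosen at random)

for _ in range(10):
    # Assignment step: each point joins the cluster of its nearest center.
    clusters = [[] for _ in range(k)]
    for p in points:
        nearest = min(range(k), key=lambda i: abs(p - centers[i]))
        clusters[nearest].append(p)
    # Update step: each center moves to the mean of its cluster.
    centers = [sum(c) / len(c) for c in clusters]

print(sorted(centers))  # two centers, one per discovered group
```

No label ever says which group a point belongs to; the structure emerges from the data itself, which is exactly what makes these methods useful on abundant raw data.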

Now let's take a more detailed look at deep learning, because it is one of the most famous and researched fields today.

What is deep learning?

Before we dive into the concept of deep learning we need to introduce neural networks. Neural networks are computer systems inspired by the construction and operation of the human brain. They are based on a collection of units, or nodes, called perceptrons; these units are also called artificial neurons. A perceptron receives several inputs, each with a weight that represents the relative importance of that input, and converts them into a single output that can in turn feed other neurons, creating a simple net that tries to simulate how the brain's neurons are interconnected. We can see a simple representation of a perceptron, or neuron, in the following image:


A perceptron is formed by two functions. The propagation function takes the inputs, which can be documents, images, or simple numbers, and computes the sum of all its terms with their respective weights. The activation function then takes this sum and produces a single output based on a mathematical function such as a step, a sigmoid, or others; this function affects the global result and needs to be chosen carefully according to the problem. In the last image we can see a simple neural network with one input layer, one output layer, and no inner layers, called hidden layers. When we have a neural network with multiple hidden layers, we can say we have a deep learning network, as shown in the following image.
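
The two functions just described can be written directly in code. This is a bare sketch of a single perceptron; the input values, weights, and bias are arbitrary numbers chosen for illustration.

```python
import math

def perceptron(inputs, weights, bias, activation):
    # Propagation function: weighted sum of all inputs.
    s = sum(x * w for x, w in zip(inputs, weights)) + bias
    # Activation function: turn the sum into a single output.
    return activation(s)

def step(s):
    return 1.0 if s >= 0 else 0.0        # step activation

def sigmoid(s):
    return 1 / (1 + math.exp(-s))        # sigmoid activation

x = [0.5, -1.0, 2.0]
w = [0.8, 0.2, 0.1]
print(perceptron(x, w, bias=-0.1, activation=step))
print(perceptron(x, w, bias=-0.1, activation=sigmoid))
```

Note how the choice of activation changes the output for the same inputs: the step function gives a hard yes/no, while the sigmoid gives a smooth value between 0 and 1.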


Now that we know what a deep learning network is, it is time to get a broad idea of how it can be used to learn something. As you can see, there is a nest of paths: the inputs are processed by each neuron, and each path carries a weight. Inputs are processed from the first layer until we get an output. With this result an error is calculated, and then a backpropagation algorithm recomputes all the weights according to that error, working backwards until it reaches the input layer. This process is repeated iteratively until the minimum error is found, so in the end the deep learning network will have learned the most efficient path to solve the specific problem: the path with the strongest weights.
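
The forward-then-backward cycle can be demonstrated on the smallest possible network: two inputs, one hidden layer of two sigmoid neurons, and one output neuron, trained on a single example. Every number here (starting weights, input, target, learning rate) is a made-up assumption; the point is only to show the error shrinking as backpropagation repeats.

```python
import math

def sigmoid(s):
    return 1 / (1 + math.exp(-s))

# A tiny 2-2-1 network with fixed, arbitrary starting weights.
w_hidden = [[0.5, -0.4], [0.3, 0.8]]   # weights into the 2 hidden neurons
w_out = [0.6, -0.2]                    # weights into the output neuron
x, target = [1.0, 0.0], 1.0            # one training example
lr = 0.5                               # learning rate

def forward():
    # Process the inputs layer by layer until we get an output.
    h = [sigmoid(sum(wi * xi for wi, xi in zip(w, x))) for w in w_hidden]
    y = sigmoid(sum(wi * hi for wi, hi in zip(w_out, h)))
    return h, y

errors = []
for _ in range(100):
    h, y = forward()
    errors.append((y - target) ** 2)           # squared error of this pass
    # Backpropagation: push the error back through the network.
    delta_y = (y - target) * y * (1 - y)       # output-layer gradient
    for i in range(2):
        delta_h = delta_y * w_out[i] * h[i] * (1 - h[i])
        w_out[i] -= lr * delta_y * h[i]        # update output weights
        for j in range(2):
            w_hidden[i][j] -= lr * delta_h * x[j]  # update hidden weights

print(f"error before: {errors[0]:.4f}, after: {errors[-1]:.4f}")
```

After the iterations, the weights along the useful paths have been strengthened and the error is far smaller than at the start.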

To this day, deep learning algorithms are intensively researched, and they are generating very interesting and applicable results: if we add more hidden layers, change the configuration of the interconnections, or use different activation functions, more complex problems can be learned. There is also a lot of research because scientists cannot fully understand what happens in the hidden layers. Deep learning algorithms are being used in many different areas, from research and education to commercial fields, achieving very interesting results.

Reinforcement learning

This field is the newest on the block. It uses Markov decision processes and dynamic programming to balance the exploration of raw, unlabeled data against the exploitation of the knowledge already retrieved from it, turning this type of learning into something that can make decisions according to the rewards that are set. In simple words, these algorithms learn from trial and error rather than directly from an already created dataset, as in supervised and unsupervised learning. For example, agents trained this way can be made to compete against each other, and the one that performs better survives. These techniques are the tip of the iceberg when we talk about machine learning and are a field of great research with a very promising future. For example, this kind of learning was used to train the machine that beat the Go world champion. Go is considered one of the most complex games due to the huge number of moves that can be made in a single game. The machine beat the champion 4 to 1.
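
The trial-and-error idea can be sketched with tabular Q-learning, one of the classic reinforcement learning algorithms. The environment below is an invented toy: an agent on a 5-cell track earns a reward only by reaching the last cell, and learns a policy purely from the rewards it stumbles into.

```python
import random

# Tiny Q-learning sketch: the agent learns, by trial and error,
# to walk right towards the reward in the last cell of a 1-D track.
random.seed(1)
n_states, actions = 5, [-1, +1]          # actions: move left or right
Q = [[0.0, 0.0] for _ in range(n_states)]
alpha, gamma, epsilon = 0.5, 0.9, 0.2    # learning rate, discount, exploration

for _ in range(200):                      # training episodes
    s = 0
    while s != n_states - 1:
        # Balance exploration (random move) against exploitation (best move).
        a = (random.randrange(2) if random.random() < epsilon
             else max(range(2), key=lambda i: Q[s][i]))
        s2 = min(max(s + actions[a], 0), n_states - 1)
        reward = 1.0 if s2 == n_states - 1 else 0.0
        # Q-learning update: nudge the value estimate towards the reward.
        Q[s][a] += alpha * (reward + gamma * max(Q[s2]) - Q[s][a])
        s = s2

# The greedy policy after training: action index 1 means "go right".
policy = [max(range(2), key=lambda i: Q[s][i]) for s in range(n_states - 1)]
print(policy)
```

No dataset is ever given to the agent: the table of values is built entirely from the rewards discovered while acting, which is the defining trait of this family of methods.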

As we can see, machine learning is a very wide field, and the intent of this document is just to give a brief introduction and paint a general panorama that can be understood by anyone interested in the subject, regardless of background.