Neural Network (NN):
A neural network is a series of algorithms that tries to recognize underlying relationships in a set of data through a process loosely modeled on the way the human brain operates.
- Convolutional neural network (CNN) => good for image recognition
- Long short-term memory (LSTM) network => good for speech recognition
- An artificial neuron is a mathematical function that models the functioning of a biological neuron.
- It computes the weighted sum of its inputs and passes that sum through a non-linear function called the activation function (such as the sigmoid).
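A minimal sketch of a single neuron, assuming a sigmoid activation and an added bias term (the bias, the function names `sigmoid` and `neuron`, and the sample numbers are illustrative assumptions, not part of the notes above):

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through the sigmoid."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)


# Illustrative values: 0.5 * 1.0 + (-0.25) * 2.0 + 0.0 = 0.0, and sigmoid(0.0) = 0.5
output = neuron([1.0, 2.0], [0.5, -0.25], 0.0)
print(output)
```

Swapping `sigmoid` for another non-linear function (for example tanh) changes the activation without touching the weighted-sum step.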
Fig 1. Training an Artificial Neural Network
- The process of repeatedly nudging the input of a function by some multiple of the negative gradient is called gradient descent.
- When a model has one or more inputs, you can use gradient descent to optimize the values of its coefficients by iteratively minimizing the error of the model on your training data.
- A learning rate is used as a scale factor, and the coefficients are updated in the direction that reduces the error.
- This process is repeated until the sum of squared errors reaches a minimum or no further improvement is possible.
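The steps above can be sketched for a model with a single input, assuming a straight-line model `y = b0 + b1 * x` and the sum of squared errors as the error. The function name `fit_line`, the learning rate, the tolerance, and the sample data are all illustrative assumptions:

```python
def fit_line(xs, ys, learning_rate=0.01, max_steps=10000, tol=1e-12):
    """Fit y = b0 + b1 * x by gradient descent on the sum of squared errors."""
    b0, b1 = 0.0, 0.0
    prev_error = float("inf")
    for _ in range(max_steps):
        # Prediction error for every training example
        residuals = [(b0 + b1 * x) - y for x, y in zip(xs, ys)]
        error = sum(r * r for r in residuals)
        # Stop when the error no longer improves by a meaningful amount
        if prev_error - error < tol:
            break
        prev_error = error
        # Gradient of the sum of squared errors w.r.t. each coefficient
        grad_b0 = 2.0 * sum(residuals)
        grad_b1 = 2.0 * sum(r * x for r, x in zip(residuals, xs))
        # Step against the gradient, scaled by the learning rate
        b0 -= learning_rate * grad_b0
        b1 -= learning_rate * grad_b1
    return b0, b1, error


# Sample data generated from y = 1 + 2x, so the fit should approach b0 = 1, b1 = 2
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
b0, b1, final_error = fit_line(xs, ys)
print(b0, b1, final_error)
```

The learning rate matters here: a value that is too large makes the updates overshoot and the error grow, while a very small value only slows convergence.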
Fig 2. Gradient Descent Image
- Backpropagation is an algorithm for supervised learning of an ANN using gradient descent.
- Given an ANN and an error function, it calculates the gradient of the error function with respect to the network's weights.
- The partial computations of the gradient from one layer are reused in the computation of the gradient for the previous layer.
- The chain rule expressions give the derivatives that make up each component of the gradient, which is then used to minimize the cost of the network by repeatedly stepping downhill.
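A minimal sketch of this gradient computation, assuming a network with one hidden layer, sigmoid activations, no bias terms, and the error `E = 0.5 * (y - target)^2`. The function names, the network shape, and the sample weights are illustrative assumptions:

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w1, w2):
    """Return hidden activations h and output y. w1 holds one weight row per hidden unit."""
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w1]
    y = sigmoid(sum(w * hj for w, hj in zip(w2, h)))
    return h, y


def loss(x, target, w1, w2):
    _, y = forward(x, w1, w2)
    return 0.5 * (y - target) ** 2


def backprop(x, target, w1, w2):
    """Gradient of the error w.r.t. every weight, computed from the output layer backward."""
    h, y = forward(x, w1, w2)
    # Chain rule at the output: dE/dy * dy/dz, using sigmoid'(z) = y * (1 - y)
    delta_out = (y - target) * y * (1.0 - y)
    grad_w2 = [delta_out * hj for hj in h]
    # The hidden layer reuses delta_out instead of recomputing it
    delta_hidden = [delta_out * w2[j] * h[j] * (1.0 - h[j]) for j in range(len(h))]
    grad_w1 = [[dj * xi for xi in x] for dj in delta_hidden]
    return grad_w1, grad_w2


# Illustrative inputs and weights, followed by one gradient descent step
x, target = [0.5, -1.0], 1.0
w1 = [[0.1, 0.2], [-0.3, 0.4]]
w2 = [0.7, -0.5]
grad_w1, grad_w2 = backprop(x, target, w1, w2)
learning_rate = 0.5
new_w1 = [[w - learning_rate * g for w, g in zip(wr, gr)] for wr, gr in zip(w1, grad_w1)]
new_w2 = [w - learning_rate * g for w, g in zip(w2, grad_w2)]
print(loss(x, target, w1, w2), loss(x, target, new_w1, new_w2))
```

Note how `delta_out` is computed once and then reused for every hidden unit. This reuse is what keeps the gradient computation cheap as the number of layers grows.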