Linear and Non-Linear Activation Functions

What is an Activation Function?

Activation functions are mathematical functions that determine the output of a neural network. These functions are applied to each neuron in the network and decide whether it should be fired or not, based on the neuron's input.
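As a simple illustration (this small sketch is mine, not from the original post), a single neuron first computes a weighted sum of its inputs and then passes that sum through an activation function, which decides the neuron's output:

```python
import numpy as np

def neuron_output(inputs, weights, bias, activation):
    # Weighted sum of inputs plus bias, then the activation decides the output
    z = np.dot(weights, inputs) + bias
    return activation(z)

# A simple step-like activation: the neuron "fires" only if the sum is positive
fire_if_positive = lambda z: 1.0 if z > 0 else 0.0

print(neuron_output(np.array([0.5, -1.0]), np.array([0.8, 0.2]), 0.1, fire_if_positive))  # 1.0
```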

There are mainly two types of activation functions:

  1. Linear Activation Function
  2. Non-Linear Activation Function

What is a Linear Activation Function?

In a linear activation function, the output increases linearly with the input value.
Linear activation functions are generally used for regression-type problems where we have to predict a particular numeric value.
A linear activation function cannot learn complex patterns from the data, because a stack of linear layers is still just a linear mapping.
The output of a linear activation function is not bounded to any range.
A closely related example is the rectified linear unit (ReLU). ReLU is the identity for positive inputs and zero for negative inputs, so strictly speaking it is piecewise linear (and therefore non-linear overall), which is what lets networks using it learn complex patterns. ReLU is extensively used in the hidden layers.
ReLU also helps to reduce the vanishing gradient issue.

The equation for the ReLU activation function is

f(x) = max(0, x)

If the input value is negative (x < 0), the output is zero.
The derivative of ReLU when x < 0 is 0.

If the input value is positive (x > 0), the output is equal to x.
The derivative of ReLU when x > 0 is 1.
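As a minimal sketch (using NumPy; the function names here are just illustrative), ReLU and its derivative can be implemented like this:

```python
import numpy as np

def relu(x):
    # Returns x for positive inputs and 0 for negative inputs
    return np.maximum(0, x)

def relu_derivative(x):
    # Gradient is 1 where x > 0 and 0 where x < 0
    return (x > 0).astype(float)

x = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
print(relu(x))             # [0.  0.  0.  1.5 3. ]
print(relu_derivative(x))  # [0. 0. 0. 1. 1.]
```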

Refer to the plot below for more details.

[Figure: ReLU activation function and its derivative]

What is a Non-Linear Activation Function?

The output of a non-linear activation function is not a linear function of its input.
Non-linear activation functions like the sigmoid are smooth, step-like functions.
Non-linear activation functions are used in the last layer for classification problems, for example to predict whether an animal is a cat or a dog.
Non-linear activation functions also help the network learn the complex patterns present in the data.
The output of the sigmoid activation lies between 0 and 1, so it is nicely kept within a bounded range.

The equation for the sigmoid activation function is

f(x) = 1 / (1 + e^(-x))

For binary classification the sigmoid output is usually thresholded at 0.5:

When f(x) > 0.5, the prediction is class 1
When f(x) < 0.5, the prediction is class 0
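Here is a minimal NumPy sketch of the sigmoid, its derivative, and the 0.5 threshold used for binary classification (the function names are just for illustration):

```python
import numpy as np

def sigmoid(x):
    # Squashes any real input into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_derivative(x):
    # f'(x) = f(x) * (1 - f(x)); its maximum value is 0.25, at x = 0
    s = sigmoid(x)
    return s * (1.0 - s)

x = np.array([-3.0, 0.0, 3.0])
probs = sigmoid(x)
print(probs)                      # approx [0.047 0.5   0.953]
print((probs > 0.5).astype(int))  # predicted classes: [0 0 1]
print(sigmoid_derivative(x))      # approx [0.045 0.25  0.045]
```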

Refer to the plot below for more details.

[Figure: Sigmoid activation function and its derivative]

Non-linear activation functions like the sigmoid can lead to the vanishing gradient problem, because the derivative of the sigmoid only ranges from 0 to 0.25. You can refer to the previous blog for more details.
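To see why the gradient vanishes, here is a small illustrative sketch (not the exact setup from the previous blog): during backpropagation the sigmoid derivative, which is at most 0.25, gets multiplied once per layer, so the gradient factor shrinks geometrically with depth.

```python
# Illustrative only: repeatedly multiplying by the maximum sigmoid
# derivative (0.25) shows how fast the gradient signal shrinks.
max_sigmoid_grad = 0.25
gradient = 1.0
for layer in range(1, 11):
    gradient *= max_sigmoid_grad
    print(f"after layer {layer}: gradient factor = {gradient:.10f}")
# After 10 layers the factor is 0.25**10, roughly 0.00000095, i.e. effectively vanished.
```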

The END