# Activation functions — Why the need?

You might often wonder what activation functions are and why we need them. There is a plethora of information about activation functions online, but can you get a quick introduction with minimal reading and useful insights? Of course you can, and you are in the right place. I have jotted down the points that helped me understand the “why” and the “when to use” of activation functions quickly, without spending too much time on the details. Let’s get started.

Purpose: to decide whether a neuron should be activated or not.

How: by applying a non-linear transformation to the neuron’s input.

Why not linear? Because without non-linear transformations our model is just a stack of linear layers, which collapses into a single linear regression model and fails to learn the complexities of the data, just like any linear regression model. So we apply an activation function at each layer.
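To make that collapse concrete, here is a small sketch (weight shapes and values are illustrative, not from the article) showing that two stacked linear layers with no activation in between are exactly one linear layer:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two "layers" with no activation in between (random illustrative weights).
W1 = rng.normal(size=(4, 3))
W2 = rng.normal(size=(2, 4))

x = rng.normal(size=3)

# Passing the input through both layers...
deep_out = W2 @ (W1 @ x)

# ...is the same as one linear layer whose weight matrix is W2 @ W1.
shallow_out = (W2 @ W1) @ x

print(np.allclose(deep_out, shallow_out))  # True: no extra expressive power
```

An activation function between the two layers breaks this equivalence, which is what lets depth add expressive power.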

How many are there? Quite a few:

- Step function
- Sigmoid function
- Tanh
- ReLU
- Leaky ReLU
- Softmax

Great! Can we learn about each of them? Sure :)

# Step function

Essentially, if the input is greater than or equal to a threshold, the output is 1, meaning the neuron is activated; otherwise the output is 0. It is NOT used often, mainly because its gradient is zero everywhere, so gradient-based training cannot update the weights through it.
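A minimal sketch of the step function (the `threshold` parameter and function name are my own choices for illustration):

```python
import numpy as np

def step(x, threshold=0.0):
    """Binary step: 1 where x >= threshold, else 0."""
    return np.where(x >= threshold, 1, 0)

print(step(np.array([-2.0, 0.0, 3.5])))  # [0 1 1]
```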

# Sigmoid function

What this function does is take the input and squash it into a value between 0 and 1, which can be read as a probability. For that reason it is mostly used in the last layer of a binary classification network.
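The sigmoid is just 1 / (1 + e^(-x)); a small sketch:

```python
import numpy as np

def sigmoid(x):
    """Squash any real input into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

print(sigmoid(0.0))                       # 0.5 (the midpoint)
print(sigmoid(np.array([-10.0, 10.0])))   # values close to 0 and 1
```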

# Tanh function

The hyperbolic tangent (tanh) function is nothing but a scaled and slightly shifted sigmoid, so it outputs a value between -1 and +1. Because its output is zero-centered, it is mostly used in the hidden layers.
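The "scaled and shifted sigmoid" claim can be checked numerically via the identity tanh(x) = 2·sigmoid(2x) − 1 (the `sigmoid` helper below is defined for illustration):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.linspace(-3, 3, 13)

# tanh is the sigmoid scaled to (-1, 1): tanh(x) = 2 * sigmoid(2x) - 1
print(np.allclose(np.tanh(x), 2 * sigmoid(2 * x) - 1))  # True
```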

# ReLU function

The ReLU activation function is widely used in the hidden layers and is one of the most popular choices. What it does is output 0 for negative input values and simply pass positive values through unchanged: ReLU(x) = max(0, x). It may look almost linear, but the kink at zero makes it a non-linear transformation, which is what lets stacked ReLU layers model complex functions.
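A one-line sketch of ReLU:

```python
import numpy as np

def relu(x):
    """Output 0 for negative inputs, the input itself for positive inputs."""
    return np.maximum(0, x)

print(relu(np.array([-1.5, 0.0, 2.0])))  # [0.  0.  2. ]
```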