Neural network architecture

a = f(wp + b)


Here is a single-input neuron. 

p is the input. 

Here, w is the weight. The designer can use any value for the weight. 

The bias is also a weight, but with a constant input of 1; it is sometimes called the 'offset'. The bias can be omitted, or its value can be adjusted. 

w and b are scalar parameters. 

The designer chooses the activation function, while a learning algorithm adjusts the values of w and b. 

wp + b is called the net input. 

a is a function of the net input. This function is called the activation function or the transfer function; the sigmoid curve is one example. 
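
As a minimal sketch (the function name `neuron` and the sample values of p, w, and b are illustrative, not from the text), the computation a = f(wp + b) looks like this in Python:

```python
import math

def neuron(p, w, b, f):
    """Single-input neuron: compute the net input n = w*p + b, then apply f."""
    n = w * p + b            # net input
    return f(n)              # activation (transfer) function

# Illustrative values; f here is the sigmoid mentioned above.
a = neuron(p=2.0, w=0.5, b=-1.0, f=lambda n: 1.0 / (1.0 + math.exp(-n)))
print(a)  # sigmoid(0.5*2.0 - 1.0) = sigmoid(0.0) = 0.5
```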

Different types of activation functions:

They may be linear or non-linear. 

How to choose a transfer function? The neuron is trying to solve some problem, and the specifications of that problem guide the choice of transfer function. 

The hard limit transfer function


The first figure shows the hard limit transfer function. It breaks the input into two categories: if the argument is less than zero, the output is zero; if the argument is zero or greater, the output is one. 

The second figure shows the input-output characteristic of a single-input neuron that uses the hard limit transfer function. 
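
A minimal sketch of the hard limit function (the weight and bias values are illustrative):

```python
def hardlim(n):
    """Hard limit transfer function: 0 if n < 0, 1 if n >= 0."""
    return 0 if n < 0 else 1

# A single-input neuron with hardlim splits inputs into two categories.
w, b = 1.0, -0.5                      # illustrative values: boundary at p = 0.5
for p in (-1.0, 0.0, 0.5, 2.0):
    print(p, hardlim(w * p + b))      # category 0 for p < 0.5, category 1 otherwise
```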

Linear transfer function



In a linear transfer function, the output equals the input, so the equation forms a straight line. When a single-input neuron uses the linear transfer function, the weight changes the slope of the line and the bias shifts it. 
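
A small sketch of this (the values of p, w, and b are illustrative): the neuron output is a = wp + b, so changing w changes the slope and changing b shifts the line.

```python
def purelin(n):
    """Linear transfer function: the output equals the input."""
    return n

# a = w*p + b: w sets the slope, b shifts the line.
p = 1.0
for w, b in ((1.0, 0.0), (2.0, 0.0), (1.0, 1.5)):   # illustrative values
    print(f"w={w}, b={b} -> a={purelin(w * p + b)}")
```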


Log-sigmoid transfer function




The input can have any real value. The output lies between zero and one: a = 1/(1 + e^(-n)). 


Because it is differentiable, it is commonly used in multilayer networks trained with backpropagation. 
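
A minimal sketch of the log-sigmoid and its derivative (the closed form a(1 - a) is the standard one, which is what backpropagation relies on):

```python
import math

def logsig(n):
    """Log-sigmoid: squashes any real net input into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-n))

def logsig_derivative(n):
    """Derivative a*(1 - a); differentiability is why backpropagation can use it."""
    a = logsig(n)
    return a * (1.0 - a)

print(logsig(-10), logsig(0), logsig(10))   # approx 0.000045, 0.5, 0.999955
print(logsig_derivative(0))                 # 0.25, the maximum slope
```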


https://www.geeksforgeeks.org/activation-functions-neural-networks/


Linear activation function

The equation is that of a straight line, y = x. 

If every layer uses a linear activation, the final output of the last layer is just a linear function of the input to the first layer. This is true for any number of layers. 

The range is -∞ to +∞.

The linear activation function is used at the output layer only. 

Differentiating the function yields only a constant, so the gradient has no relationship with the input. 

The hidden layers therefore use non-linear functions, as the sketch below illustrates. 
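
A small numpy sketch (the shapes and random values are illustrative) of why purely linear layers do not add anything: stacking them collapses into a single linear map.

```python
import numpy as np

rng = np.random.default_rng(0)
W1, W2, W3 = (rng.standard_normal((4, 4)) for _ in range(3))
x = rng.standard_normal(4)

# Three stacked linear layers (no non-linearity between them)...
deep = W3 @ (W2 @ (W1 @ x))
# ...are equivalent to one linear layer with the combined weight matrix.
combined = (W3 @ W2 @ W1) @ x
print(np.allclose(deep, combined))  # True
```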

Multiple input neurons



The input is a vector p. The weights are collected into a matrix W (a single row for one neuron), so for one neuron the net input Wp + b is a scalar and the output is a = f(Wp + b). 
If the network has multiple neurons, then the output of the network is a vector as well, as the sketch below shows.  
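
A minimal numpy sketch of a = f(Wp + b), for one neuron and for a layer of two (the weights, biases, and input values are illustrative; f is the log-sigmoid from above):

```python
import numpy as np

logsig = lambda n: 1.0 / (1.0 + np.exp(-n))   # log-sigmoid, vectorized

def layer(W, p, b):
    """a = f(Wp + b): W has one row per neuron, so one neuron gives a
    1-element output and several neurons give an output vector."""
    return logsig(W @ p + b)

p = np.array([1.0, 2.0, 3.0])                 # input vector (illustrative)
W_single = np.array([[0.2, -0.1, 0.4]])       # one neuron: 1x3 weight matrix
W_two = np.array([[0.2, -0.1, 0.4],
                  [0.5,  0.3, -0.2]])         # two neurons: 2x3 weight matrix

print(layer(W_single, p, np.array([0.1])))        # single output
print(layer(W_two, p, np.array([0.1, -0.3])))     # output vector of length 2
```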

Recurrent networks (RNN)

Recurrent networks operate on time-series data, also called sequential data, and they learn from training data. Output from earlier steps influences both the input and the output at later steps, which sets these networks apart from traditional neural networks, where the inputs and outputs are independent of each other. In an RNN they depend on each other: the network makes predictions on the basis of prior inputs. A unidirectional RNN, however, cannot account for future events. 
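
As a minimal sketch of one recurrent step, assuming a plain unidirectional RNN with a tanh activation (the weight names Wx and Wh, the shapes, and the random values are illustrative):

```python
import numpy as np

def rnn_step(x_t, h_prev, Wx, Wh, b):
    """One recurrent step: the new state depends on the current input x_t
    AND the previous state h_prev, so earlier inputs influence later outputs."""
    return np.tanh(Wx @ x_t + Wh @ h_prev + b)

rng = np.random.default_rng(1)
Wx, Wh, b = rng.standard_normal((3, 2)), rng.standard_normal((3, 3)), np.zeros(3)

h = np.zeros(3)                             # initial state
for x_t in rng.standard_normal((5, 2)):     # a short sequence of 5 inputs
    h = rnn_step(x_t, h, Wx, Wh, b)
print(h)                                    # final state reflects the whole sequence
```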


Delay

A Delay block is a simple building block. 
a(t) is the delay output. u(t) is the delay input. 
Assumption: time is updated in discrete steps and takes on only integer values. 
a(t) = u(t-1) 

The output must be initialized at t = 0. 

Here, a(0) is not a weight; it is the initial condition. 
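
A minimal sketch of the delay block, with a(0) passed in as the initial condition (the class name and sample inputs are illustrative):

```python
class Delay:
    """Delay block: a(t) = u(t-1), with a(0) supplied as the initial condition."""
    def __init__(self, a0):
        self.state = a0          # a(0): initial condition, not a weight

    def step(self, u_t):
        a_t, self.state = self.state, u_t   # emit previous input, store current
        return a_t

d = Delay(a0=0.0)
for t, u in enumerate([1.0, 2.0, 3.0]):
    print(t, d.step(u))   # t=0 -> 0.0 (initial condition), t=1 -> 1.0, t=2 -> 2.0
```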

Integrator
The integrator is the continuous-time counterpart of the delay block. Its output is the running integral of its input plus the initial condition:

a(t) = ∫₀ᵗ u(τ) dτ + a(0)

As with the delay, a(0) is the initial condition, not a weight.
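
A minimal sketch, approximating the continuous-time integral with Euler steps (the step size dt is an illustrative choice):

```python
class Integrator:
    """Integrator block: a(t) = integral of u from 0 to t, plus a(0).
    Approximated here with Euler steps of size dt."""
    def __init__(self, a0, dt=0.01):
        self.state = a0          # a(0): the initial condition
        self.dt = dt

    def step(self, u_t):
        self.state += u_t * self.dt   # accumulate the input over time
        return self.state

integ = Integrator(a0=0.0, dt=0.1)
for _ in range(10):
    a = integ.step(1.0)   # constant input u = 1
print(a)                  # approx 1.0: the integral of 1 over [0, 1]
```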