Backward Propagation Details

Those of you who are familiar with calculus will recognize that the backward propagation equalities described in Neural Network Training involve partial derivatives, and so these equalities can actually be proved. If you are not familiar with calculus or don't care to see the proof, you can skip the following, since it is not essential to understanding how the neural network functions.

Properties

For the output layer (h = r), the output node bias change is

$$\frac{\partial C}{\partial b_r} = (a_r - y)\,a_r(1 - a_r)$$

For the hidden layers (1 < h < r), the hidden layer bias change is

$$\frac{\partial C}{\partial b_h} = \frac{\partial C}{\partial b_{h+1}}\,w_{h+1}\,a_h(1 - a_h)$$

For any h > 1, the weight change is

$$\frac{\partial C}{\partial w_h} = \frac{\partial C}{\partial b_h}\,a_{h-1}$$
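
To make the role of these formulas concrete, here is a minimal Python sketch (my illustration, not part of the original article) of one training step for the simple chain network described in Neural Network Basic Concepts, assuming one node per layer, the logistic link function, and the squared-error cost; the function name train_step, the example values, and the learning rate are arbitrary choices.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_step(x, y, w, b, lr=0.5):
    """One forward/backward pass for a chain network with one node per layer.

    w[h] and b[h] feed layer h (h = 1..r); a[0] = x is the input and
    a[r] is the network output. Index 0 of w and b is an unused placeholder.
    """
    r = len(w) - 1
    # Forward pass: a_h = sigmoid(w_h * a_{h-1} + b_h)
    a = [x] + [0.0] * r
    for h in range(1, r + 1):
        a[h] = sigmoid(w[h] * a[h - 1] + b[h])

    # Backward pass using the three properties stated above
    dC_db = [0.0] * (r + 1)
    dC_dw = [0.0] * (r + 1)
    dC_db[r] = (a[r] - y) * a[r] * (1 - a[r])                    # output node bias change
    for h in range(r - 1, 0, -1):
        dC_db[h] = dC_db[h + 1] * w[h + 1] * a[h] * (1 - a[h])   # hidden layer bias change
    for h in range(1, r + 1):
        dC_dw[h] = dC_db[h] * a[h - 1]                           # weight change

    # Standard gradient-descent update (lr is an arbitrary learning rate)
    for h in range(1, r + 1):
        w[h] -= lr * dC_dw[h]
        b[h] -= lr * dC_db[h]
    return w, b, a[r]

# Example with arbitrary values: input, one hidden node, one output node (r = 2)
w = [None, 0.4, -0.7]
b = [None, 0.1, 0.3]
w, b, out = train_step(x=0.8, y=1.0, w=w, b=b)
```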

Proof

As we see in Neural Network Basic Concepts, the link function is

$$\sigma(z) = \frac{1}{1 + e^{-z}}$$

Applying the quotient rule, we get

$$\sigma'(z) = \frac{e^{-z}}{(1 + e^{-z})^2} = \frac{1}{1 + e^{-z}} \cdot \left(1 - \frac{1}{1 + e^{-z}}\right) = \sigma(z)\,(1 - \sigma(z))$$

Since the activation at layer h is

$$a_h = \sigma(z_h)$$

as we see in Neural Network Basic Concepts, it follows that

$$\frac{\partial a_h}{\partial z_h} = \sigma'(z_h) = a_h(1 - a_h)$$
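
As a quick sanity check (my addition, not part of the original proof), the identity $\sigma'(z) = \sigma(z)(1 - \sigma(z))$ can be compared with a central finite-difference estimate of the derivative; the test points and tolerance below are arbitrary.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Compare the closed form sigma(z) * (1 - sigma(z)) with a numerical derivative
eps = 1e-6
for z in (-2.0, -0.5, 0.0, 1.0, 3.0):
    closed_form = sigmoid(z) * (1.0 - sigmoid(z))
    numeric = (sigmoid(z + eps) - sigmoid(z - eps)) / (2.0 * eps)
    assert abs(closed_form - numeric) < 1e-8, (z, closed_form, numeric)
```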

Again from Neural Network Basic Concepts, we know that

$$z_h = w_h a_{h-1} + b_h$$

from which it follows that

$$\frac{\partial z_h}{\partial b_h} = 1 \qquad \frac{\partial z_h}{\partial w_h} = a_{h-1}$$

Also

$$\frac{\partial a_h}{\partial a_{h-1}} = \frac{\partial a_h}{\partial z_h} \cdot \frac{\partial z_h}{\partial a_{h-1}} = a_h(1 - a_h)\,w_h$$

Finally, from Neural Network Basic Concepts, the cost function for a single training sample is

$$C_x = \tfrac{1}{2}(y - a_r)^2$$

from which it follows that (dropping the subscript on C and using h = r)

$$\frac{\partial C}{\partial a_h} = a_h - y$$

By the chain rule and using the above observations, we get

$$\frac{\partial C}{\partial z_h} = \frac{\partial C}{\partial a_h} \cdot \frac{\partial a_h}{\partial z_h} = (a_h - y)\,a_h(1 - a_h)$$

Using the chain rule one more time, we get

$$\frac{\partial C}{\partial b_h} = \frac{\partial C}{\partial z_h} \cdot \frac{\partial z_h}{\partial b_h} = (a_h - y)\,a_h(1 - a_h)$$

where h = r. For any h > 1, it also follows that

$$\frac{\partial C}{\partial w_h} = \frac{\partial C}{\partial z_h} \cdot \frac{\partial z_h}{\partial w_h} = \frac{\partial C}{\partial b_h}\,a_{h-1}$$

The equation

$$\frac{\partial C}{\partial b_h} = \frac{\partial C}{\partial b_{h+1}}\,w_{h+1}\,a_h(1 - a_h)$$

where h < r is proved in a similar manner using the chain rule, this time also making use of the expression for $\partial a_h/\partial a_{h-1}$ derived above.
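
To tie the proof together, here is a small numerical check (again my own illustration, with arbitrarily chosen weights, biases, input, and target) that the three formulas proved above agree with finite-difference estimates of $\partial C/\partial b_h$ and $\partial C/\partial w_h$ for a three-layer chain network with the logistic link function and squared-error cost.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def cost(w, b, x, y):
    """C = (y - a_r)^2 / 2 for the chain network a_h = sigmoid(w_h * a_{h-1} + b_h)."""
    a = x
    for h in range(1, len(w)):
        a = sigmoid(w[h] * a + b[h])
    return 0.5 * (y - a) ** 2

# Arbitrary example network: input layer plus two further layers, so r = 2
w = [None, 0.4, -0.7]   # index 0 is an unused placeholder
b = [None, 0.1, 0.3]
x, y = 0.8, 1.0

# Forward pass, keeping every activation
a = [x, 0.0, 0.0]
for h in (1, 2):
    a[h] = sigmoid(w[h] * a[h - 1] + b[h])

# Closed-form gradients from the formulas proved above
dC_db = [None, 0.0, (a[2] - y) * a[2] * (1 - a[2])]   # output node bias change (h = r = 2)
dC_db[1] = dC_db[2] * w[2] * a[1] * (1 - a[1])        # hidden layer bias change
dC_dw = [None, dC_db[1] * a[0], dC_db[2] * a[1]]      # weight changes

# Central finite-difference comparison
eps = 1e-6
for h in (1, 2):
    b_plus, b_minus = b[:], b[:]
    b_plus[h] += eps
    b_minus[h] -= eps
    num_db = (cost(w, b_plus, x, y) - cost(w, b_minus, x, y)) / (2 * eps)
    assert abs(num_db - dC_db[h]) < 1e-8

    w_plus, w_minus = w[:], w[:]
    w_plus[h] += eps
    w_minus[h] -= eps
    num_dw = (cost(w_plus, b, x, y) - cost(w_minus, b, x, y)) / (2 * eps)
    assert abs(num_dw - dC_dw[h]) < 1e-8
```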
