We explain how to train a neural network and assess the accuracy of the resulting network. We use the notation and concepts from Neural Network Basic Concepts and Neural Network Training and Backward Propagation.
Training Methodology
Recall that the objective of training is to find biases and weights that make the neural network sufficiently capable of solving the problem under consideration. Even after training, we often don’t expect the network to achieve 100% accuracy, but we do want it to reach a reasonably high level of accuracy.
Here are the steps used to train a neural network.
Step 1: Define the neural network that is appropriate for the problem being addressed. This includes the number of layers, the number of nodes in each layer, etc.
Step 2: Obtain the training data consisting of pairs (X, Y) and set the value of the training rate λ (commonly called the learning rate).
Step 3: Run the following training algorithm using the data in the training set until the desired level of accuracy is obtained.
Training Algorithm
The following is the algorithm used to train a neural network using all the (X, Y) pairs in the training set.
Step 1: Initialization: Initialize the values of the biases of all the nodes in the network and weights for all the links in the network.
Step 2: For each (X, Y) in the training set, use forward propagation to calculate the activation levels at each node and to calculate the cost function C(X, Y). Calculate the overall cost function as the average of the cost function for each of the training pairs.
Step 3: For each (X, Y) in the training set use backward propagation to calculate the Δbh(k) and Δwh(j, k) values based on that (X, Y). Then average these values to obtain the delta values for all the nodes and links in the network.
Step 4 (Loop): Update the biases and weights for all the nodes and links in the network using the averaged delta values, then repeat steps 2 and 3, followed by this update, for the desired number of iterations.
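The steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the method's definitive implementation: it assumes sigmoid activations, a quadratic cost, and an update that subtracts λ times the averaged gradient from each bias and weight. The function names (init_network, forward, backward, train) and the list-based network layout are hypothetical choices made for this example.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def init_network(sizes, seed=0):
    """Step 1: random biases and weights; one (biases, weights) pair per layer."""
    rng = random.Random(seed)
    return [([rng.uniform(-1, 1) for _ in range(n_cur)],
             [[rng.uniform(-1, 1) for _ in range(n_prev)] for _ in range(n_cur)])
            for n_prev, n_cur in zip(sizes[:-1], sizes[1:])]

def forward(net, x):
    """Forward propagation: activations of every layer, input layer first."""
    acts = [x]
    for biases, weights in net:
        prev = acts[-1]
        acts.append([sigmoid(b + sum(w * a for w, a in zip(row, prev)))
                     for b, row in zip(biases, weights)])
    return acts

def cost(output, y):
    """Quadratic cost for one (X, Y) pair (an assumption of this sketch)."""
    return 0.5 * sum((o - t) ** 2 for o, t in zip(output, y))

def backward(net, acts, y):
    """Backward propagation: cost gradients (db, dw) per layer for one pair."""
    grads = [None] * len(net)
    delta = [(a - t) * a * (1 - a) for a, t in zip(acts[-1], y)]
    for h in range(len(net) - 1, -1, -1):
        prev = acts[h]
        grads[h] = (delta[:], [[d * a for a in prev] for d in delta])
        if h > 0:  # propagate the error back to the previous layer
            weights = net[h][1]
            delta = [sum(weights[k][j] * delta[k] for k in range(len(delta)))
                     * prev[j] * (1 - prev[j]) for j in range(len(prev))]
    return grads

def train(net, data, rate, iterations):
    """Steps 2 to 4: returns the overall (average) cost at each iteration."""
    history = []
    n = len(data)
    for _ in range(iterations):
        total = 0.0
        sums = [([0.0] * len(b), [[0.0] * len(r) for r in w]) for b, w in net]
        for x, y in data:
            acts = forward(net, x)            # step 2
            total += cost(acts[-1], y)
            for (sb, sw), (db, dw) in zip(sums, backward(net, acts, y)):  # step 3
                for k in range(len(db)):
                    sb[k] += db[k]
                    for j in range(len(dw[k])):
                        sw[k][j] += dw[k][j]
        history.append(total / n)
        for (b, w), (sb, sw) in zip(net, sums):  # step 4: update in place
            for k in range(len(b)):
                b[k] -= rate * sb[k] / n
                for j in range(len(w[k])):
                    w[k][j] -= rate * sw[k][j] / n
    return history

# Illustrative data only: the logical OR function as four (X, Y) pairs.
or_data = [([0.0, 0.0], [0.0]), ([0.0, 1.0], [1.0]),
           ([1.0, 0.0], [1.0]), ([1.0, 1.0], [1.0])]
net = init_network([2, 3, 1], seed=1)
history = train(net, or_data, rate=1.0, iterations=300)
print(history[0], history[-1])  # overall cost before and after training
```

Tracking the overall cost per iteration, as the returned list does here, is one simple way to judge when the desired level of accuracy has been reached.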
Mini-batches and epochs
Since these steps can be quite time-consuming, especially for networks with lots of nodes, various simplifications can be useful. In particular, instead of using all the training data at once, we can perform the above steps on smaller subsets, called mini-batches, of the training data.
Suppose that you have 1,000 pairs of training data and you set the mini-batch size to 20, so that you have 1000/20 = 50 mini-batches. In this case, steps 2 and 3, followed by the update of the biases and weights, are performed on each of the 50 mini-batches in turn. Processing these 50 mini-batches constitutes one epoch, after which all the training data has been used. You repeat this process until a predesignated number of training epochs has been completed. After each epoch, you can randomly reorder the training data to obtain different mini-batches.
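The mini-batch and epoch bookkeeping can be sketched as follows. The names mini_batches, run_epochs, and the update callback are hypothetical; update stands for whatever routine performs steps 2 and 3 and the weight update on one mini-batch. As a simplification, this sketch reshuffles the data before every epoch, including the first.

```python
import random

def mini_batches(data, batch_size, rng):
    """Shuffle a copy of the data and split it into consecutive mini-batches."""
    shuffled = data[:]
    rng.shuffle(shuffled)
    return [shuffled[i:i + batch_size]
            for i in range(0, len(shuffled), batch_size)]

def run_epochs(data, batch_size, epochs, update, seed=0):
    """Call update(batch) once per mini-batch, for the given number of epochs."""
    rng = random.Random(seed)
    for _ in range(epochs):
        # A fresh random ordering each epoch gives different mini-batches.
        for batch in mini_batches(data, batch_size, rng):
            update(batch)

# 1,000 placeholder training items split into mini-batches of size 20.
batches = mini_batches(list(range(1000)), 20, random.Random(0))
print(len(batches))  # 1000 / 20 = 50 mini-batches per epoch
```

If the number of training pairs is not a multiple of the batch size, the last mini-batch produced by this sketch is simply smaller than the others.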
Testing Phase
Once you have trained the neural network, you need to measure how well it does at addressing the problem under study. To do this you need to collect additional (X, Y) data (i.e. test data) and see how well the network performs. If it does a good enough job for your needs, you have evidence that the network can be used for new X data where the Y values may not be known but need to be calculated via the neural network.
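One way to quantify performance on the test data is the fraction of test pairs that the network gets right. The sketch below assumes a classification problem in which each Y is a list of 0/1 targets and each network output is rounded at a threshold of 0.5; the function name accuracy and the predict callback (any function mapping X to a list of outputs) are hypothetical. Other problems, such as regression, would need a different measure.

```python
def accuracy(predict, test_data, threshold=0.5):
    """Fraction of (X, Y) test pairs whose thresholded outputs match Y exactly."""
    correct = 0
    for x, y in test_data:
        predicted = [1 if o >= threshold else 0 for o in predict(x)]
        if predicted == list(y):
            correct += 1
    return correct / len(test_data)

# Stand-in for a trained network: it just echoes its single input value.
def echo_network(x):
    return [x[0]]

test_pairs = [([0.9], [1]), ([0.2], [0]), ([0.7], [0]), ([0.1], [1])]
print(accuracy(echo_network, test_pairs))  # 2 of the 4 pairs match
```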
References
Lubick, K. (2022) Training a neural network in a spreadsheet
https://www.youtube.com/watch?v=fjfZZ6S1ad4
https://www.youtube.com/watch?v=1zwnPt73pow
Nielsen, M. (2019) Neural networks and deep learning
http://neuralnetworksanddeeplearning.com/