Learning rate: The learning rate controls the size of the step taken when updating the network's parameters at each iteration of gradient descent. Too large a value can make training diverge; too small a value makes learning slow.
Momentum: A parameter that accumulates a moving average of past gradients, which helps the optimizer escape shallow local minima and smooths the updates during gradient descent.
Number of epochs: The number of times the entire training dataset is passed through the network during training. We increase the number of epochs as long as validation accuracy keeps improving; once validation accuracy starts decreasing while training accuracy still rises, the model is overfitting and training should stop, as in the sketch below.
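
The following is a minimal PyTorch sketch that ties the three hyperparameters together: learning rate and momentum are passed to the optimizer, and the epoch loop stops when validation accuracy drops. The toy model, synthetic data, and the specific values (lr=0.01, momentum=0.9, a cap of 50 epochs) are illustrative assumptions, not prescriptions.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Toy dataset: 1000 random samples, 20 features, binary labels.
X = torch.randn(1000, 20)
y = (X.sum(dim=1) > 0).long()
train_ds = TensorDataset(X[:800], y[:800])
val_ds = TensorDataset(X[800:], y[800:])
train_loader = DataLoader(train_ds, batch_size=32, shuffle=True)
val_loader = DataLoader(val_ds, batch_size=32)

model = nn.Sequential(nn.Linear(20, 16), nn.ReLU(), nn.Linear(16, 2))
criterion = nn.CrossEntropyLoss()

# Learning rate and momentum are hyperparameters of the optimizer.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

def val_accuracy():
    model.eval()
    correct = 0
    with torch.no_grad():
        for xb, yb in val_loader:
            correct += (model(xb).argmax(dim=1) == yb).sum().item()
    return correct / len(val_ds)

# Number of epochs: keep training while validation accuracy improves;
# stop as soon as it starts to decrease (a simple early-stopping rule).
best_acc = 0.0
for epoch in range(50):
    model.train()
    for xb, yb in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(xb), yb)
        loss.backward()
        optimizer.step()
    acc = val_accuracy()
    print(f"epoch {epoch + 1}: val accuracy = {acc:.3f}")
    if acc < best_acc:
        print("validation accuracy decreased; stopping")
        break
    best_acc = acc
```

In practice the same idea is usually implemented with a small patience window rather than stopping on the first drop, but the sketch keeps the rule stated above: train for more epochs only while validation accuracy is still improving.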