What is a cost function?

A cost function is a scalar function that quantifies the error of a neural network: the lower its value, the better the network performs. For example, when classifying an image from the MNIST dataset, the input image may show the digit 2 while the neural network wrongly predicts a 3; the cost function measures the size of that error.

A cost function, also known as a loss function or objective function, is a mathematical measure that quantifies the “cost” or “error” associated with the difference between predicted values generated by a model and the actual observed values in a dataset. In the context of machine learning and optimization algorithms, the goal is to minimize this cost function to improve the performance of the model.

The choice of a specific cost function depends on the nature of the problem being solved and the characteristics of the data. For example, in regression problems, where the goal is to predict continuous values, common cost functions include mean squared error (MSE) or mean absolute error (MAE). In classification problems, where the goal is to predict discrete class labels, common cost functions include cross-entropy loss.
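As a sketch of how these cost functions are computed, the snippet below evaluates MSE, MAE, and cross-entropy on small hand-made examples (the numbers are illustrative, not from the text):

```python
import math

# Regression example: actual vs. predicted continuous values
y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]

n = len(y_true)
mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n  # mean squared error
mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n    # mean absolute error

# Classification example: true class index vs. predicted class probabilities
labels = [0, 1]                              # true class for each sample
probs = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]   # model's predicted distributions
cross_entropy = -sum(math.log(p[c]) for p, c in zip(probs, labels)) / len(labels)

print(f"MSE={mse:.3f}  MAE={mae:.3f}  cross-entropy={cross_entropy:.4f}")
# → MSE=0.375  MAE=0.500  cross-entropy=0.2899
```

Note that cross-entropy only looks at the probability the model assigned to the correct class: the more confident the model is in the right answer, the lower the loss.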

Optimizing the cost function involves adjusting the parameters of the model (e.g., weights in a neural network) through techniques like gradient descent, with the objective of reducing the discrepancy between predicted and actual values, thus improving the model’s performance.
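The gradient-descent loop described above can be sketched on a toy problem: fitting a single weight w in the model y = w·x by minimizing MSE. The data, learning rate, and iteration count here are illustrative assumptions, not from the text:

```python
# Toy gradient descent: fit y = w*x by minimizing mean squared error.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]   # data generated by the true relationship y = 2x

w = 0.0      # initial parameter guess
lr = 0.05    # learning rate (step size)
for _ in range(200):
    # Gradient of MSE with respect to w: mean of 2 * (w*x - y) * x
    grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    w -= lr * grad  # step against the gradient to reduce the cost

print(round(w, 4))  # → 2.0, the weight that minimizes the cost
```

Each iteration nudges w in the direction that most reduces the cost; the same idea, applied to millions of weights at once via backpropagation, is how neural networks are trained.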