Loss function

To train a network, you need to reduce the error it makes. The function that measures this error is called the "loss function".

In most cases, it looks like this:

$$L(\theta, \hat{\theta}) = \mathbb{E}(\|\theta - \hat{\theta}\|)$$

In this formula, $\theta$ is the value we try to predict and $\hat{\theta}$ is an estimator of $\theta$: the value predicted by the model. $x \to \|x\|$ is a norm. Depending on the task you train the model for, you will use a different norm.

Commonly used ones are:

  • The L2 norm (also known as the Euclidean distance)
  • The cross entropy (not actually a norm)

$\mathbb{E}$ is the expected value.

In simple words, the error (or loss) is the difference between what we want and what we have.
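
To make this concrete, here is a minimal sketch of the two losses above in Python with NumPy; the function names, array shapes, and example values are assumptions made for illustration, not part of the original page:

```python
import numpy as np

def l2_loss(theta, theta_hat):
    """Mean Euclidean distance between targets and predictions."""
    return np.mean(np.linalg.norm(theta - theta_hat, axis=-1))

def cross_entropy_loss(theta, theta_hat, eps=1e-12):
    """Mean cross entropy; `theta` is one-hot, `theta_hat` holds probabilities."""
    return -np.mean(np.sum(theta * np.log(theta_hat + eps), axis=-1))

# Regression example: 3 samples, 2-dimensional targets.
theta = np.array([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]])
theta_hat = np.array([[0.9, 0.1], [0.4, 0.6], [0.1, 0.8]])
print(l2_loss(theta, theta_hat))

# Classification example: 2 samples, 3 classes.
labels = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
probs = np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]])
print(cross_entropy_loss(labels, probs))
```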

The loss function is used to train a neural network.
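
As a rough sketch of what "training" means here, the following example minimizes a squared-L2 loss by gradient descent on a one-parameter linear model; the model, data, and learning rate are all assumptions chosen for illustration:

```python
import numpy as np

# Minimizing an L2 loss by gradient descent on a linear model y = w * x.
x = np.array([1.0, 2.0, 3.0])
y = 2.0 * x              # targets generated with the "true" weight 2.0
w = 0.0                  # initial weight
lr = 0.05                # learning rate

for step in range(100):
    y_hat = w * x                           # model prediction
    loss = np.mean((y - y_hat) ** 2)        # squared-L2 training loss
    grad = np.mean(-2.0 * x * (y - y_hat))  # dLoss/dw, derived by hand
    w -= lr * grad                          # step against the gradient

print(w)  # approaches 2.0 as the loss shrinks
```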

Usually, the loss function is evaluated on a given dataset. The training loss is the loss on the training dataset, while the evaluation loss (also called the test loss) is the loss computed on the testing dataset.
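
Here is a sketch of this split, where the same (assumed) squared-L2 loss is evaluated separately on a training set and a testing set; the model, data, and names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 1))
y = 3.0 * X[:, 0] + rng.normal(scale=0.1, size=100)

# Split into training and testing sets.
X_train, y_train = X[:80], y[:80]
X_test, y_test = X[80:], y[80:]

def model(X, w=3.0):
    """A fixed linear model standing in for a trained network."""
    return w * X[:, 0]

def loss_fn(targets, preds):
    """Squared-L2 loss, as above."""
    return np.mean((targets - preds) ** 2)

train_loss = loss_fn(y_train, model(X_train))  # training loss
test_loss = loss_fn(y_test, model(X_test))     # evaluation (test) loss
print(train_loss, test_loss)
```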