Intuition behind the Dice Coefficient: Why It Can Deal with Imbalanced Datasets
For image segmentation tasks in computer vision, it is quite common to use Dice loss as the loss function to deal with imbalanced datasets. It works quite well in practice, but why?
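First, let’s look at the formula for the Dice Similarity Coefficient (DSC), where A is the ground-truth region and B is the predicted region:
DSC = 2 |A ∩ B| / (|A| + |B|)
The Dice Loss is then just
Dice Loss = 1 - DSC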
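To make the definition concrete, here is a minimal NumPy sketch for binary masks. The function names `dice_coefficient` and `dice_loss` are made up for illustration, and treating two empty masks as a perfect match is an assumption, not something the formula itself defines.

```python
import numpy as np

def dice_coefficient(gt, pred):
    """DSC = 2 * |gt AND pred| / (|gt| + |pred|) for binary masks."""
    gt = np.asarray(gt, dtype=bool)
    pred = np.asarray(pred, dtype=bool)
    total = gt.sum() + pred.sum()
    if total == 0:
        # Assumption: two empty masks count as a perfect match.
        return 1.0
    overlap = np.logical_and(gt, pred).sum()
    return 2.0 * overlap / total

def dice_loss(gt, pred):
    """Dice Loss = 1 - DSC."""
    return 1.0 - dice_coefficient(gt, pred)
```

Real training code usually works on soft probabilities and adds a small smoothing term instead of hard masks, so treat this only as a way to check numbers by hand.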
Dice Loss ranges from 0 to 1. Basically, DSC measures the overlapping area between the ground truth and the prediction, relative to the total area of ground truth plus prediction. So the Dice Loss measures the ‘non-overlap’ between the ground truth and the prediction, which is the ‘loss’ of the model: the part where ground truth and prediction do not overlap. Let’s look at the example below:

The Dice Coefficient is the same for the object on the left and the object on the right: 0.25, so the Dice Loss is 1 - 0.25 = 0.75 for both (you can try the calculation yourself), even though the left object is larger than the right one. This means that the model’s prediction on the larger object has the same effect as its prediction on the smaller object, since both have the same Dice Loss.
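If you want to check the 0.25 yourself, here is a small sketch. The mask layout is hypothetical and is not taken from the figure: a 4x4 object and a 2x2 object, each with a same-sized square prediction shifted diagonally by half a side, so that a quarter of each object is covered. The helpers `dice` and `square_mask` are illustrative names.

```python
import numpy as np

def dice(gt, pred):
    # DSC = 2 * overlap / (area of gt + area of pred)
    return float(2.0 * np.logical_and(gt, pred).sum() / (gt.sum() + pred.sum()))

def square_mask(top, left, side, shape=(12, 12)):
    # Boolean image containing one filled square.
    mask = np.zeros(shape, dtype=bool)
    mask[top:top + side, left:left + side] = True
    return mask

# Larger object: 4x4 ground truth, prediction shifted 2 px down and right.
large = dice(square_mask(0, 0, 4), square_mask(2, 2, 4))

# Smaller object: 2x2 ground truth, prediction shifted 1 px down and right.
small = dice(square_mask(0, 0, 2), square_mask(1, 1, 2))

print(large, small)  # both are 0.25, so both Dice Losses are 0.75
```

The larger object overlaps in 4 pixels out of 16 + 16, the smaller one in 1 pixel out of 4 + 4, and both ratios reduce to 0.25. Only the relative overlap matters, not the absolute number of pixels.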
Let’s shift the prediction by 1 pixel to the right: