Diagram illustrating Dice Score calculation from overlapping medical segmentations

Segmentation is widely used in medical imaging to delineate anatomical structures from images by clearly defining the contours of an organ or a region of interest. 

When this process is automated, the accuracy of the results must be assessed, particularly when structures need to be analyzed, compared, or followed over time. This is where the Dice Score comes in. The metric measures how closely a segmentation overlaps a reference and makes the result easier to interpret in medical practice.

What is the Dice Score? 

The Dice Score, also known as the "Sørensen–Dice coefficient", is based on a simple principle: measuring the degree of overlap between two segmentations. In practice, it compares a reference segmentation, most often produced by an expert, with a segmentation to be evaluated, for example one generated automatically.

The more the areas defined by the two segmentations overlap, the higher the Dice Score. Conversely, differences in contours or non-overlapping regions lead to a lower value. Because the score is computed from the pixels the two segmentations share, it provides a spatial evaluation: an automated segmentation scores well only if it matches the location and shape of the organ within the image, not just its size.

The Dice Score therefore makes it possible to objectively quantify the similarity between two segmentations. 
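The overlap principle is commonly written as Dice = 2 × |A ∩ B| / (|A| + |B|), where A and B are the sets of pixels labeled by each segmentation. The following is a minimal sketch of that formula, assuming binary masks stored as nested lists of 0 and 1. The function name `dice_score`, the toy masks, and the convention of returning 1.0 when both masks are empty are illustrative assumptions, not part of any particular tool.

```python
def dice_score(reference, candidate):
    """Dice = 2 * |A ∩ B| / (|A| + |B|) for two binary masks of equal shape."""
    if len(reference) != len(candidate):
        raise ValueError("masks must have the same number of rows")
    intersection = 0  # pixels labeled in both masks
    size_ref = 0      # pixels labeled in the reference
    size_cand = 0     # pixels labeled in the candidate
    for row_ref, row_cand in zip(reference, candidate):
        if len(row_ref) != len(row_cand):
            raise ValueError("masks must have the same number of columns")
        for r, c in zip(row_ref, row_cand):
            size_ref += 1 if r else 0
            size_cand += 1 if c else 0
            intersection += 1 if (r and c) else 0
    total = size_ref + size_cand
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total


# Hypothetical toy masks: both regions contain 4 pixels,
# but the candidate is shifted one column to the right.
reference = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
candidate = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0],
]

score = dice_score(reference, candidate)
print(score)  # 2 shared pixels -> 2 * 2 / (4 + 4) = 0.5
```

The two toy regions have exactly the same size, yet the score is only 0.5 because half of the pixels are misplaced. This is the sense in which the metric evaluates location and shape rather than size alone.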

How to read a Dice Score 

The Dice Score takes a value between 0 and 1. 

  • 1.0 corresponds to a perfect overlap between two segmentations. 
  • 0.0 means there is no overlap. 
  • Intermediate values indicate partial overlap: the closer the value is to 1, the more similar the two segmentations. 
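The three cases in this list can be reproduced on toy data. The sketch below assumes each segmentation is represented as a set of (row, column) pixel coordinates; the helper name `dice_from_pixels` and the pixel sets are hypothetical and chosen only to illustrate the scale.

```python
def dice_from_pixels(a, b):
    """Dice = 2 * |A ∩ B| / (|A| + |B|) for two sets of (row, col) pixels."""
    if not a and not b:
        return 1.0  # assumed convention for two empty segmentations
    return 2.0 * len(a & b) / (len(a) + len(b))


# Hypothetical reference region of 4 pixels.
reference = {(0, 0), (0, 1), (1, 0), (1, 1)}

identical = set(reference)                    # same 4 pixels
disjoint = {(5, 5), (5, 6), (6, 5), (6, 6)}   # no pixel in common
partial = {(0, 0), (0, 1), (1, 0), (2, 2)}    # 3 of 4 pixels in common

perfect_score = dice_from_pixels(reference, identical)  # 2 * 4 / 8 = 1.0
no_overlap_score = dice_from_pixels(reference, disjoint)  # 2 * 0 / 8 = 0.0
partial_score = dice_from_pixels(reference, partial)    # 2 * 3 / 8 = 0.75

print(perfect_score, no_overlap_score, partial_score)
```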

In medical imaging, Dice Scores of roughly 0.8 to 0.9 are generally associated with a good level of accuracy, although what counts as acceptable depends on the organ studied and the clinical context. Lower values may reflect contour differences, complex regions, or natural variability in interpretation. 

How to interpret the Dice Score in clinical practice 

The Dice Score provides a numerical measure of the similarity between two segmentations and helps assess their overlap. A higher value indicates a strong overlap between the contours being compared, while a lower value highlights more pronounced differences. 

In medical practice, a perfect overlap is rarely observed. Anatomical structures often have complex contours, and natural variability exists, including between human experts. When a segmentation is produced by an artificial intelligence system, the Dice Score helps to quantify this level of overlap objectively and to position the results relative to a clinical reference. 
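One way to put an automated result in context is to compare its Dice Score against the agreement between two human experts on the same image. The sketch below is purely illustrative: the pixel sets standing in for two expert contours and a model output are invented, and `dice_from_pixels` is a hypothetical helper rather than a function from any specific library.

```python
def dice_from_pixels(a, b):
    """Dice = 2 * |A ∩ B| / (|A| + |B|) for two sets of (row, col) pixels."""
    if not a and not b:
        return 1.0  # assumed convention for two empty segmentations
    return 2.0 * len(a & b) / (len(a) + len(b))


# Hypothetical contours of the same structure, as sets of pixel coordinates.
block = {(r, c) for r in range(3) for c in range(3)}   # 3 x 3 region, 9 pixels
expert_a = set(block)                                  # 9 pixels
expert_b = block | {(0, 3), (1, 3), (2, 3)}            # one extra column, 12 pixels
model = (block - {(2, 2)}) | {(3, 0)}                  # 1 pixel missed, 1 added, 9 pixels

inter_expert = dice_from_pixels(expert_a, expert_b)    # 2 * 9 / 21
model_vs_a = dice_from_pixels(model, expert_a)         # 2 * 8 / 18
model_vs_b = dice_from_pixels(model, expert_b)         # 2 * 8 / 21

print(f"expert A vs expert B: {inter_expert:.3f}")     # 0.857
print(f"model vs expert A:    {model_vs_a:.3f}")       # 0.889
print(f"model vs expert B:    {model_vs_b:.3f}")       # 0.762
```

In this toy case the two experts themselves do not reach a score of 1.0, which illustrates why a model's score is best read against the reference it was compared with rather than against a perfect overlap.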

The Dice Score therefore stands as a key indicator for evaluating the reliability of an automated segmentation. It helps assess the quality of results produced by an AI system, while remaining complementary to the clinical judgment and expertise of healthcare professionals.

Conclusion

The Dice Score serves as a useful reference for evaluating how well a segmentation overlaps a reference and for objectively comparing different contours. When properly interpreted, it helps to better understand the quality of the results obtained and to place them within their clinical context, in support of informed medical practice.