A Norwegian retrospective study in an organized screening setting analyzed how well an AI system classifies breast cancers across mammographic density categories. The commercial AI system Transpara, developed by ScreenPoint Medical, was used to score the mammography exams, and volumetric breast density was obtained from the automated software VolparaDensity. The study, which included nearly 100,000 mammographic examinations, found that 89.7% (617/688) of screen-detected cancers and 44.6% (87/195) of interval cancers were assigned an AI score of 10, indicating the highest suspicion of malignancy. This suggests that using AI in screen reading might allow some interval cancers to be detected earlier. The highest rate of screen-detected cancers was observed for heterogeneously dense breasts (8.4 per 1000) and the lowest for almost entirely fatty breasts (4.0 per 1000). The highest interval cancer rate was observed for extremely dense breasts (4.1 per 1000) and the lowest for almost entirely fatty breasts (0.4 per 1000). When the percentage of screen-detected cancers with an AI score of 10 was compared between density categories, the highest percentage was observed for extremely dense breasts. The study showed promising results for using AI to classify cancer cases into risk score categories regardless of mammographic density. It also highlighted the impact on screen-reading workload and on missed screen-detected cancers in three triaging scenarios that used these AI scores together with mammographic density information.
Abstract
Objective
To explore the ability of artificial intelligence (AI) to classify breast cancer by mammographic density in an organized screening program.
Materials and methods
We included information about 99,489 examinations from 74,941 women who participated in BreastScreen Norway, 2013–2019. All examinations were analyzed with an AI system that assigned a malignancy risk score (AI score) from 1 (lowest) to 10 (highest) for each examination. Mammographic density was classified into Volpara density grade (VDG), VDG1–4; VDG1 indicated fatty and VDG4 extremely dense breasts. Screen-detected and interval cancers with an AI score of 1–10 were stratified by VDG.
Results
We found 10,406 (10.5% of the total) examinations to have an AI risk score of 10, of which 6.7% (704/10,406) were breast cancers. These cancers represented 89.7% (617/688) of the screen-detected and 44.6% (87/195) of the interval cancers. Of the examinations, 20.3% (20,178/99,489) were classified as VDG1 and 6.1% (6,047/99,489) as VDG4. For screen-detected cancers, 84.0% (68/81, 95% CI, 74.1–91.2) had an AI score of 10 for VDG1, 88.9% (328/369, 95% CI, 85.2–91.9) for VDG2, 92.5% (185/200, 95% CI, 87.9–95.7) for VDG3, and 94.7% (36/38, 95% CI, 82.3–99.4) for VDG4. For interval cancers, the percentages with an AI score of 10 were 33.3% (3/9, 95% CI, 7.5–70.1) for VDG1 and 48.0% (12/25, 95% CI, 27.8–68.7) for VDG4.
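The proportions and intervals above can be rechecked from the reported counts. The abstract does not say how the 95% confidence intervals were calculated. The sketch below assumes an exact binomial (Clopper-Pearson) interval, and the function names are illustrative rather than taken from the study. For the small strata, such as 3 of 9 interval cancers in VDG1, this assumption gives bounds close to the reported 7.5–70.1.

```python
from math import exp, lgamma, log


def binom_cdf(k, n, p):
    """Return P(X <= k) for X ~ Binomial(n, p), with 0 < p < 1."""
    log_p, log_q = log(p), log(1.0 - p)
    total = 0.0
    for i in range(k + 1):
        # Work in log space so large n does not overflow.
        log_pmf = (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                   + i * log_p + (n - i) * log_q)
        total += exp(log_pmf)
    return min(total, 1.0)


def root_of_increasing(f, lo=0.0, hi=1.0, iters=60):
    """Bisection for a function that increases from negative to positive."""
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided binomial interval for k successes in n trials.

    Assumption: the abstract does not name its interval method; this is
    one common choice, not necessarily the one the authors used.
    """
    half = alpha / 2.0
    # Lower bound: the p at which P(X >= k) equals alpha/2.
    lower = 0.0 if k == 0 else root_of_increasing(
        lambda p: (1.0 - binom_cdf(k - 1, n, p)) - half)
    # Upper bound: the p at which P(X <= k) equals alpha/2.
    upper = 1.0 if k == n else root_of_increasing(
        lambda p: half - binom_cdf(k, n, p))
    return lower, upper


# Counts copied from the Results paragraph: cancers with AI score 10 / all cancers.
reported = {
    "screen-detected, VDG1": (68, 81),
    "screen-detected, VDG4": (36, 38),
    "interval, VDG1": (3, 9),
    "interval, VDG4": (12, 25),
}

for label, (k, n) in reported.items():
    low, high = clopper_pearson(k, n)
    print(f"{label}: {100 * k / n:.1f}% (95% CI {100 * low:.1f} to {100 * high:.1f})")
```

The wide interval for the interval-cancer strata reflects how few cases they contain, so the between-density differences in those rows should be read with caution.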
Conclusion
The tested AI system performed well in terms of cancer detection across all density categories, especially for extremely dense breasts. The highest proportion of screen-detected cancers with an AI score of 10 was observed for women classified as VDG4.
Clinical relevance statement
Our study demonstrates that AI can correctly classify the majority of screen-detected and about half of the interval breast cancers, regardless of breast density.