PRAIM, a prospective, multicenter, real-world study in Germany, evaluated Vara (MX Healthcare GmbH), an AI tool for breast cancer detection, within a population-based mammography screening program. From July 2021 to February 2023, 463,094 women aged 50–69 were screened across 12 sites, 260,739 of them with AI support; radiologists chose voluntarily whether to use the tool. The AI followed a decision referral approach with two components: normal triaging and a safety net. Normal triaging flagged mammograms deemed highly unlikely to contain abnormalities, letting radiologists focus on higher-priority cases. The safety net alerted radiologists when the AI detected suspicious findings in an exam the radiologist had initially marked as normal, prompting reevaluation of the flagged areas.
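The decision referral approach described above can be sketched in a few lines of Python. This is only an illustration of the two mechanisms, not Vara's implementation: the score field, the two threshold values, and the function names are all hypothetical and do not come from the study.

```python
from dataclasses import dataclass

# Hypothetical thresholds chosen purely for illustration; the study
# summary does not state how the AI's operating points were set.
NORMAL_THRESHOLD = 0.05      # at or below: exam is tagged "normal"
SAFETY_NET_THRESHOLD = 0.95  # at or above: exam is eligible for the safety net


@dataclass
class Exam:
    ai_score: float  # hypothetical AI suspicion score in [0, 1]


def triage_label(exam: Exam) -> str:
    """Normal triaging: tag exams the AI considers very unlikely to be abnormal."""
    if exam.ai_score <= NORMAL_THRESHOLD:
        return "normal"
    return "no_flag"  # left entirely to the radiologist's judgment


def safety_net_alert(exam: Exam, radiologist_says_normal: bool) -> bool:
    """Safety net: alert only when the radiologist called a highly suspicious exam normal."""
    return radiologist_says_normal and exam.ai_score >= SAFETY_NET_THRESHOLD


if __name__ == "__main__":
    print(triage_label(Exam(ai_score=0.01)))
    print(safety_net_alert(Exam(ai_score=0.99), radiologist_says_normal=True))
```

In this sketch the radiologist always makes the final decision: the triage label only informs prioritization, and the safety net only prompts a second look.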
Radiologists using the AI tool achieved a cancer detection rate of 6.7 per 1,000 women, 17.6% higher (95% confidence interval: +5.7%, +30.8%) than the 5.7 per 1,000 in the control group. The recall rate in the AI group was 37.4 per 1,000 versus 38.3 per 1,000 in the control group, a relative difference of -2.5% (-6.5%, +1.7%) that met the noninferiority criterion but, with an interval spanning zero, does not establish a reduction. The positive predictive value (PPV) was also higher in the AI group, both for recalls (17.9% vs. 14.9%) and for biopsies (64.5% vs. 59.2%).
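As a rough check on how these figures relate to one another, the sketch below recomputes the relative differences and the PPV of recall from the rounded per-1,000 rates. The function names are illustrative. Because the inputs are rounded, the recomputed relative differences come out slightly different from the published 17.6% and -2.5%, which were presumably derived from unrounded or adjusted data (an assumption, not something stated here).

```python
def relative_difference(ai_rate: float, control_rate: float) -> float:
    """Percentage difference of the AI group's rate relative to the control group's."""
    return (ai_rate - control_rate) / control_rate * 100


def ppv_of_recall(detection_per_1000: float, recall_per_1000: float) -> float:
    """Share of recalled women in whom cancer was detected, as a percentage."""
    return detection_per_1000 / recall_per_1000 * 100


if __name__ == "__main__":
    # Rounded rates per 1,000 women screened, as reported in the study.
    print(round(relative_difference(6.7, 5.7), 1))    # detection rate: about 17.5
    print(round(relative_difference(37.4, 38.3), 1))  # recall rate: about -2.3
    print(round(ppv_of_recall(6.7, 37.4), 1))         # AI group: 17.9
    print(round(ppv_of_recall(5.7, 38.3), 1))         # control group: 14.9
```

The PPV of recall is simply the detection rate divided by the recall rate, which is why a higher detection rate combined with a similar recall rate yields a higher PPV in the AI group.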
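These results indicate that AI-supported double reading can raise the cancer detection rate in real-world mammography screening without increasing recalls, with normal triaging and the safety net acting as complementary safeguards. Because PRAIM was observational and radiologists chose when to use the AI, the findings show an association rather than a randomized comparison.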
Nationwide real-world implementation of AI for cancer detection in population-based mammography screening.
Nature Medicine, 2024
Abstract
Artificial intelligence (AI) in mammography screening has shown promise in retrospective evaluations, but few prospective studies exist. PRAIM is an observational, multicenter, real-world, noninferiority, implementation study comparing the performance of AI-supported double reading to standard double reading (without AI) among women (50-69 years old) undergoing organized mammography screening at 12 sites in Germany. Radiologists in this study voluntarily chose whether to use the AI system. From July 2021 to February 2023, a total of 463,094 women were screened (260,739 with AI support) by 119 radiologists. Radiologists in the AI-supported screening group achieved a breast cancer detection rate of 6.7 per 1,000, which was 17.6% (95% confidence interval: +5.7%, +30.8%) higher than and statistically superior to the rate (5.7 per 1,000) achieved in the control group. The recall rate in the AI group was 37.4 per 1,000, which was lower than and noninferior to that (38.3 per 1,000) in the control group (percentage difference: -2.5% (-6.5%, +1.7%)). The positive predictive value (PPV) of recall was 17.9% in the AI group compared to 14.9% in the control group. The PPV of biopsy was 64.5% in the AI group versus 59.2% in the control group. Compared to standard double reading, AI-supported double reading was associated with a higher breast cancer detection rate without negatively affecting the recall rate, strongly indicating that AI can improve mammography screening metrics.