Purpose: To evaluate prospectively the accuracy of a lesion classification system designed for use in a magnetic resonance (MR) imaging high-breast-cancer-risk screening study. Materials and Methods: All participating patients provided written informed consent. Ethics committee approval was obtained. The results of 1541 contrast material–enhanced breast MR imaging examinations were analyzed; 1441 screening examinations were performed in 638 women aged 24–51 years at high risk for breast cancer, and 100 examinations were performed in 100 women aged 23–81 years. Lesion analysis was performed in 991 breasts, which were divided into design (491 breasts) and testing (500 breasts) sets. The reference standard was histologic analysis of biopsy samples, fine-needle aspiration cytology, or minimal follow-up of 24 months. The scoring system involved the use of five features: morphology (MOR), pattern of enhancement (POE), percentage of maximal focal enhancement (PMFE), maximal signal intensity–time ratio (MITR), and pattern of contrast material washout (POCW). The system was evaluated by means of (a) assessment of interreader agreement, as expressed in κ statistics, for 315 breasts in which both readers analyzed the same lesion, (b) assessment of the diagnostic accuracy of the scored components with receiver operating characteristic curve analysis, and (c) logistic regression analysis to determine which components of the scoring system were critical to the final score. A new simplified scoring system developed with the design set was applied to the testing set. Results: There was moderate reader agreement regarding overall lesion outcome (ie, malignant, suspicious, or benign) (κ = 0.58) and less agreement regarding the scored components. The area under the receiver operating characteristic curve (AUC) for the overall lesion score, 0.88, was higher than the AUC for any one component. The components MOR, POE, and POCW yielded the best overall result. PMFE and MITR did not contribute to diagnostic utility. Applying a simplified scoring system to the testing set yielded a nonsignificantly (P = .2) higher AUC than did applying the original scoring system (sensitivity, 84%; specificity, 86.0%). Conclusion: Good diagnostic accuracy can be achieved by using simple qualitative descriptors of lesion enhancement, including POCW. In the context of screening, quantitative enhancement parameters appear to be less useful for lesion characterization.
- breast scoring