Performance of large language models for CAD-RADS 2.0 classification derived from cardiac CT reports

This study evaluates the performance of seven large language models (LLMs) in generating CAD-RADS 2.0 scores from cardiac CT reports, including all modifiers. The models, comprising both cloud-based and locally hosted solutions, were assessed for their ability to handle the complexity of CAD-RADS 2.0 classification, which includes plaque burden, high-risk plaque features, and ischemia. GPT-4o and Llama 3 70B demonstrated high accuracy (93% and 92.5%, respectively), while open-source models a…

Read the full article on journalofcardiovascularct.com