Existing deep learning methods have achieved remarkable results in diagnosing retinal diseases, showcasing the potential of advanced AI in ophthalmology. However, the black-box nature of these methods obscures the decision-making process, compromising their trustworthiness and acceptability. Inspired by the concept-based approaches and recognizing the intrinsic correlation between retinal lesions and diseases, we regard retinal lesions as concepts and propose an inherently interpretable framework designed to enhance both the performance and explainability of diagnostic models. Leveraging the transformer architecture, known for its proficiency in capturing long-range dependencies, our model can effectively identify lesion features. By integrating with image-level annotations, it achieves the alignment of lesion concepts with human cognition under the guidance of a retinal foundation model. Furthermore, to attain interpretability without losing lesion-specific information, our method employs a classifier built on a cross-attention mechanism for disease diagnosis and explanation, where explanations are grounded in the contributions of human-understandable lesion concepts and their visual localization. Notably, due to the structure and inherent interpretability of our model, clinicians can implement concept-level interventions to correct the diagnostic errors by simply adjusting erroneous lesion predictions. Experiments conducted on four fundus image datasets demonstrate that our method achieves favorable performance against state-of-the-art methods while providing faithful explanations and enabling conceptlevel interventions. Our code is publicly available at https://github.com/Sorades/CLAT.
Download full-text PDF |
Source |
---|---|
http://dx.doi.org/10.1109/TMI.2024.3429148 | DOI Listing |
Enter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!