CAD AND AI
Daniel B. Kopans, M.D., F.A.C.R., F.S.B.I.
Professor of Radiology Harvard Medical School
Founder-Breast Imaging Division Massachusetts General Hospital
Copyright 12/27/2024
Clearly one of the major technological advances of the past 40 years is the development of Artificial Intelligence (AI). The use of computers to help analyze mammograms dates back to the 1980s. In the beginning, computer assistance was called "Computer Aided Detection (CAD)". Before we were able to obtain digital mammograms directly, we were able to "digitize" screen/film mammograms to provide the pixel values for the computer to use. We then provided developers with "training cases". I remember spending hours reviewing normal mammograms along with those with findings and proven pathology, highlighting the findings on the mammograms and describing what we were seeing. Using the "digitized" mammograms, the developers then "taught" the computer to "look for" findings on mammograms that "looked like" what we had described in the training cases. It turned out that the CAD systems that were developed were fairly good at finding the "microcalcifications" that are frequently associated with Ductal Carcinoma In Situ (and often associated with invasive cancers), but these are also the easiest findings for the radiologist to detect. Masses and architectural distortion were more difficult for the computer to find.
The initial CAD studies were very optimistic. In several, the CAD systems were able to identify more than 70% of the cancers that had been missed by the radiologists (1,2). These were studies of "missed" cancers that were visible in retrospect. There were some "prospective" studies where CAD was used to assist radiologists, and some had promising results (3). The use of CAD was greatly facilitated by the development of Full Field Digital Mammography (FFDM), since it removed the need to digitize the screen/film images. CAD could be applied directly to the interpretation of FFDM images. Unfortunately, although many radiologists added CAD to their analyses, it is not clear that CAD has actually had much benefit in early detection.
DOUBLE READING
Since every radiologist will fail to see cancers that are visible in retrospect, and since a cancer that may be missed by the first radiologist is often seen by a second, the Europeans taught us to have two or more radiologists evaluate each screening study. In our own practice at the Massachusetts General Hospital, having a second reader increased our detection of early cancers by 5-7%. In our practice, "double reading" was superior to using a CAD system, but it added expense to screening, where the goal is to keep costs as low as possible while maximizing the early detection of cancers.
ARTIFICIAL INTELLIGENCE
More recently, AI has been developed to help detect breast cancer on mammograms. AI differs from CAD in that we (radiologists) "taught" the CAD systems to look for findings on mammography that we have defined as suggesting possible malignancy. AI systems utilize "neural networks". These are computers that are designed based on a simplified understanding of how our brains work. The computer is shown thousands of mammograms and told which mammograms are of a breast with cancer and which mammograms are of women who do not have breast cancer. The computer then "learns" to recognize the positive and negative mammograms.
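The kind of "learning" described above can be illustrated with a deliberately tiny sketch: a single artificial "neuron" (logistic regression, the simplest building block of a neural network) is shown labeled pixel vectors and nudges its weights to reduce its mistakes. The data, image size, and training parameters below are invented for illustration only; real systems train deep networks on many thousands of full-resolution mammograms.

```python
import math
import random

random.seed(0)

def make_case(label):
    # Hypothetical 8-pixel "image": positive cases are brighter on average.
    # Real mammograms are vastly larger and subtler; this is a toy stand-in.
    base = 0.7 if label == 1 else 0.3
    return [min(1.0, max(0.0, base + random.gauss(0, 0.15))) for _ in range(8)], label

# Labeled training cases: 1 = breast with cancer, 0 = no cancer.
training_cases = [make_case(i % 2) for i in range(200)]

weights = [0.0] * 8
bias = 0.0
lr = 0.5  # learning rate: how strongly each mistake adjusts the weights

def predict(pixels):
    # Weighted sum of pixel values squashed to a probability of "positive".
    z = bias + sum(w * p for w, p in zip(weights, pixels))
    return 1.0 / (1.0 + math.exp(-z))

for _ in range(100):  # repeated passes over the training cases
    for pixels, label in training_cases:
        err = predict(pixels) - label   # how wrong the prediction was
        bias -= lr * err                # nudge parameters to reduce the error
        for i in range(8):
            weights[i] -= lr * err * pixels[i]

correct = sum((predict(p) > 0.5) == (y == 1) for p, y in training_cases)
print(f"training accuracy: {correct / len(training_cases):.2f}")
```

After training, the system classifies cases it was shown, but the learned weights are just numbers: nothing in them explains *what* the network is "seeing", which is the black-box problem discussed below at scale.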
Since computers never "forget" a case, are not distracted, and do not tire, it seems to me that they should be better at finding cancers than humans, but, thus far, this has not been the case. I have been surprised that AI systems are just beginning to equal radiologists in detecting cancers, and some are claiming that they are now better at finding cancers, but I am not sure this is the case. I suspect that as computers learn from their own mistakes they will improve, but that remains to be seen.
A major problem is that computers, thus far, cannot explain why they were concerned about an area in a breast. They can highlight an area, but they cannot explain what they are "seeing". The systems are truly "black boxes". The computer is shown an image, and it "analyzes" the image and produces an answer, but it cannot tell us why it was or was not concerned. This will become more and more of a problem as we rely more and more on AI.
Another problem is that there are numerous AI systems under independent development. The results are only as good as the training studies that the computers are shown, and many of these are unique and not being shared between the various developers. Ideally, there need to be controls on the quality of the mammograms that are used by the computers so that they have the best chance of detecting cancers, and training images need to be included from women of all ethnic backgrounds. A huge file of images should be developed and shared between developers so that AI systems can accurately interpret mammograms from all groups.
I continue to have high hopes for AI. Screening the hundreds of millions of women who need to be screened is a large task. This is compounded by the fact that the vast majority of annual mammograms will be negative, and fewer than 1 woman in 100 will be found to have breast cancer each year. Unless we can determine which women will not develop breast cancer (and we cannot do this as yet and may never be able to do so), all women are at risk and we need to screen all women. What is being suggested is that the first use for AI will be to try to identify all the mammograms that show no evidence of breast cancer to either a radiologist or the computer. This would be a major advantage: if the computer can eliminate 20-30% of cases, the radiologist would not need to review them, since there is nothing on these studies that would concern a human reader. Even this, however, is an advantage that has yet to be realized. Screening mammography is a very low-yield process since, again, fewer than 1 in 100 studies will be obtained in a woman who is found to have cancer. Consequently, if the computer misses even one of these cancers that could have been detected had the study been reviewed by a human, the intense screening effort will have been compromised.
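The tradeoff described above can be made concrete with some simple arithmetic. Every number below (screening volume, cancer rate, triage fraction, triage sensitivity) is an assumed, illustrative figure, not a measured result:

```python
# Illustrative numbers only: a hypothetical screening population, a cancer
# rate of under 1 in 100, and an AI triage step that sets aside a fraction
# of studies as "negative" so the radiologist never reviews them.
screens = 100_000          # annual screening mammograms (assumed)
cancer_rate = 0.005        # 5 cancers per 1,000 screens (under 1 in 100)
triage_fraction = 0.25     # AI rules out 25% of studies (the 20-30% suggested)
triage_sensitivity = 0.99  # fraction of cancers the AI correctly keeps for human review

cancers = screens * cancer_rate
studies_removed = screens * triage_fraction
cancers_missed_by_triage = cancers * (1 - triage_sensitivity)

print(f"radiologist workload drops by {studies_removed:,.0f} studies")
print(f"but {cancers_missed_by_triage:.0f} cancers are never seen by a human reader")
```

Even with a triage sensitivity of 99% in this sketch, a handful of detectable cancers would be filtered out before any human sees them, which is exactly why the low yield of screening makes the liability question below so pointed.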
In the U.S. we have an additional problem. If computers are used to reduce the number of studies that radiologists have to review by identifying the "negative" studies, and it turns out that the computer missed a case that a human could have found, who will be responsible for the error? In the U.S. the radiologist is responsible if a cancer is missed. Who will be responsible when a computer misses a cancer?
My hope is that AI will get better and better and become sufficiently superior to human readers that we can rely on AI to interpret all the screening mammograms, so that radiologists can concentrate on "Diagnostic" imaging and procedures. To accomplish this we will need systems in which the computer can provide an understanding of how it analyzed a study and reached the conclusion that it reached. We have not yet reached these fundamentally important goals.
REFERENCES
---------------------------
1. Warren Burhenne LJ, Wood SA, D'Orsi CJ, Feig SA, Kopans DB, O'Shaughnessy KF, Sickles EA, Tabar L, Vyborny CJ, Castellino RA. Potential contribution of computer-aided detection to the sensitivity of screening mammography. Radiology. 2000;215:554-62.
2. Birdwell RL, Ikeda DM, O'Shaughnessy KF, Sickles EA. Mammographic characteristics of 115 missed cancers later detected with screening mammography and the potential utility of computer-aided detection. Radiology. 2001;219:192-202.
3. Freer TW, Ulissey MJ. Screening mammography with computer-aided detection: prospective study of 12,860 patients in a community breast center. Radiology. 2001;220:781-6.