SAVING LIVES FROM BREAST CANCER
The death rate from breast cancer in the United States had been unchanged dating back to 1940, when the Connecticut Tumor Registry began to collect data (1). There was no national database in the U.S. until the Surveillance, Epidemiology, and End Results (SEER) program began in 1974 under the guidance of the National Cancer Institute. The SEER program monitors breast cancer incidence and deaths in 19 regions around the U.S. and extrapolates these data to estimate the incidence and death rates in the entire country.
Mammography screening began in the U.S. in the mid-1980s (2). As anticipated, soon after, in 1990, deaths from breast cancer began to fall for the first time in 50 years. Since then they have continued to decline each year, so that there are now over 40% fewer women who die from breast cancer than would have died had the preceding 50-year rate continued. Hendrick et al estimate that more than 600,000 lives have been saved since 1990 (3). There is debate as to precisely why there has been such a dramatic decline in breast cancer deaths. Some claim that the decline is due to improvements in therapy. Therapy has improved, but therapy is still unable to cure advanced, metastatic breast cancer. In fact, the data strongly suggest that deaths have declined because, thanks to earlier detection through screening, breast cancers are being treated at a time in their growth when cure is possible (prior to successful metastatic spread). This is supported by the fact that the death rate for men with breast cancer has not declined at all since 1990 (4) (the treatment for male breast cancer is similar to that for women), while the death rate for women has steadily and dramatically declined (3).
The best estimate of the relative contributions of treatment alone and of early detection followed by earlier treatment to the reduction in deaths is seen in two studies from Sweden. The first looked at the incidence of death among women ages 40-69 (5) and compared those who had participated in screening with those who had not. All had access to modern therapy, and even women who did not participate in screening had a small decline in deaths, but the incidence of death at 10 years was 60% lower among women who had participated in screening than among those who had not. At 20 years the incidence of death was 47% lower for participants in screening.
In an extremely large follow-up analysis of more than 500,000 women in Sweden (6), the results were similar. “Women who participated in mammography screening had a statistically significant 41% reduction in their risk of dying of breast cancer within 10 years.” Part of this benefit was due to the fact that, among women who were screened, there had been a 25% reduction in the rate of advanced cancers.
There is no question that advances in therapy have led to delays in deaths for women with breast cancer, but therapy can cure breast cancers if they are treated early. Screening leads to early detection, which leads to fewer deaths.
RANDOMIZED CONTROLLED TRIALS ARE THE ONLY WAY TO PROVE THAT SCREENING SAVES LIVES
As discussed earlier, simply looking at how long women survive after their cancers are discovered cannot be used to prove that early detection saves lives. Numerous biases complicate any analysis based on how long women survive from the date of diagnosis to death. “Lead time bias” simply means that screening may find a cancer earlier in its course, but the woman may still die at the same time she would have died had her cancer not been found earlier. Survival time [the time between detection and death] may appear longer, but the date of death is unchanged. “Selection bias” is also problematic. Perhaps women who decide to participate in screening are more health conscious and tend to live longer with their cancers, making it appear that screening resulted in longer survival when the results are actually due to healthier women self-selecting to participate in screening. Another bias is “length bias sampling”. There is no question that periodic screening is more likely to find slower growing cancers (very fast growing cancers are less likely to be detected and often become clinically evident between screens). If we were to compare survival among women with screen detected cancers (slower growing) to survival among women with all cancers, we might also be misled into thinking that screening was influencing outcome.
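Lead time bias can be made concrete with a small arithmetic sketch. The ages and the 3-year lead time below are hypothetical values chosen only for illustration; they are not taken from any trial. The sketch assumes the worst case, in which earlier detection does not change the date of death at all.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    # Hypothetical ages, for illustration only.
    age_at_clinical_detection: float  # age at diagnosis without screening
    age_at_death: float               # assumed unchanged by earlier detection


def survival_years(age_at_diagnosis: float, age_at_death: float) -> float:
    """Survival time measured from diagnosis to death."""
    return age_at_death - age_at_diagnosis


patient = Patient(age_at_clinical_detection=60.0, age_at_death=65.0)
lead_time = 3.0  # assumed number of years by which screening advances diagnosis

survival_without_screening = survival_years(
    patient.age_at_clinical_detection, patient.age_at_death
)
survival_with_screening = survival_years(
    patient.age_at_clinical_detection - lead_time, patient.age_at_death
)

print(survival_without_screening)  # 5.0
print(survival_with_screening)     # 8.0
```

Survival appears 3 years longer even though the woman dies at the same age in both scenarios, which is why survival from diagnosis cannot by itself show that screening saves lives.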
The only way to eliminate these biases and prove that screening saves lives is through the use of randomized, controlled trials (RCT). The goal of an RCT is to develop two identical populations of women. One group will be offered screening and the other will be the “unscreened” control group. Breast cancer is a common cancer, but there are actually not that many women per 1000 in the population who develop it each year. Consequently, an RCT must target a very large population of women (tens of thousands). The women are randomly divided into two smaller groups in a blinded process, before anything is known about them that could bias the allocation to the group offered screening or to the unscreened control group. Again, if there are enough participants and the allocation is truly random, the groups will be identical. If nothing else were done, the same number of women would develop breast cancer in each group and the same number of women would die from breast cancer in each group.
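The balancing effect of random allocation can be sketched with a short simulation. Everything here is an assumption made for illustration: the function names, the 10% prevalence of a hidden high-risk characteristic, and the population of 100,000 women are all hypothetical. The point is only that when allocation ignores every characteristic of the woman, the two arms end up with nearly the same mix.

```python
import random


def randomize(n_women: int, risk_prevalence: float, seed: int = 0) -> dict:
    """Allocate women to two arms by coin flip, blind to their characteristics."""
    rng = random.Random(seed)
    arms = {"screening": [], "control": []}
    for _ in range(n_women):
        # Hidden characteristic that the allocation process never sees.
        high_risk = rng.random() < risk_prevalence
        # Allocation depends only on the coin flip, not on high_risk.
        arm = "screening" if rng.random() < 0.5 else "control"
        arms[arm].append(high_risk)
    return arms


def prevalence(flags: list) -> float:
    """Fraction of women in an arm who carry the hidden characteristic."""
    return sum(flags) / len(flags)


arms = randomize(n_women=100_000, risk_prevalence=0.10)
difference = abs(prevalence(arms["screening"]) - prevalence(arms["control"]))
print(f"screening arm: {prevalence(arms['screening']):.4f}")
print(f"control arm:   {prevalence(arms['control']):.4f}")
print(f"difference:    {difference:.4f}")
```

With smaller populations the difference between arms grows, which is one reason the trials need so many participants. An allocation process that can see anything about the woman first (an unblinded process) breaks this guarantee.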
In an RCT one group is offered screening while the other group acts as an unscreened comparison. Properly performed RCT’s eliminate all of the biases mentioned earlier. In addition to using identical groups, RCT’s do not look at survival; rather, they look at absolute deaths (mortality), which is not subject to these biases. Since women cannot be forced to be screened, the women in the screening arm are invited. The RCT’s are trials of “invitation” to screening.
CAVEAT: It is important to realize that since women are invited to be screened, not every woman agrees to participate. If a woman is randomly assigned to the screening arm, but she refuses to be screened (“noncompliance”) and she dies from breast cancer, she is still counted as having been screened. Similarly, if a woman is assigned to the control arm but she gets a mammogram outside of the trial and has her life saved, she is still counted as an unscreened control (“contamination”). This sounds unreasonable, but it is required to prevent self-selection from biasing the study. Regardless, it is important to realize that because of noncompliance and contamination RCT’s underestimate the benefit of screening.
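The dilution caused by noncompliance and contamination can be illustrated with a simplified calculation. The numbers are hypothetical (a 40% true reduction in deaths among women actually screened, 70% compliance in the invited arm, 20% contamination in the control arm), and the sketch assumes that every woman has the same baseline risk whether or not she chooses to be screened, which real populations may not satisfy.

```python
def observed_reduction(true_reduction: float,
                       compliance: float,
                       contamination: float) -> float:
    """Relative mortality reduction seen when comparing arms as randomized.

    true_reduction: fractional reduction in deaths among women actually screened
    compliance:     fraction of the invited arm that is actually screened
    contamination:  fraction of the control arm screened outside the trial
    """
    # Death rates relative to a completely unscreened population (= 1.0).
    invited_arm_rate = 1.0 - compliance * true_reduction
    control_arm_rate = 1.0 - contamination * true_reduction
    return 1.0 - invited_arm_rate / control_arm_rate


# Hypothetical inputs, not taken from any trial.
true_benefit = 0.40
diluted = observed_reduction(true_benefit, compliance=0.70, contamination=0.20)
print(f"true reduction among screened women: {true_benefit:.0%}")
print(f"reduction observed by the trial:     {diluted:.1%}")
```

Under these assumptions a true 40% reduction appears as a reduction of roughly 22% in the trial result. With perfect compliance and no contamination the function returns the true reduction unchanged.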
STATISTICAL POWER AND “PROOF” OF BENEFIT
When trials are being designed, it is critical to estimate whether the trial will include enough women so that, if there is a decline in deaths, the difference between deaths in the screening arm and deaths among the control women will be “statistically significant”. The “power calculation” is performed while designing a trial by determining what the expected difference will be between the two groups and then estimating how many women will need to participate so that, if the expected decrease in deaths is achieved, it will be statistically significant. Since trials are expensive to perform, the numbers involved are often marginal and the results can be misleading. As stated by Lachin:
“.. if the statistical test fails to reach significance, the power of the test becomes a critical factor in reaching an inference. It is not widely appreciated that the failure to achieve statistical significance may often be related more to the low power of the trial than to an actual lack of difference between the competing therapies. Clinical trials with inadequate sample size are thus doomed to failure before they begin and serve only to confuse the issue of determining the most effective therapy for a given condition.” (7).
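A power calculation of the kind described above can be sketched with the standard normal-approximation formula for comparing two proportions. This is a generic textbook method, not the calculation used by any particular screening trial, and the inputs (a 0.4% risk of breast cancer death in the control arm during follow-up, an expected 25% reduction, 5% two-sided significance, 80% power) are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def sample_size_per_arm(p_control: float, p_screened: float,
                        alpha: float = 0.05, power: float = 0.80) -> int:
    """Women needed in each arm to detect the difference with the given power."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_beta = _STD_NORMAL.inv_cdf(power)
    variance = p_control * (1 - p_control) + p_screened * (1 - p_screened)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p_control - p_screened) ** 2)


def achieved_power(n_per_arm: int, p_control: float, p_screened: float,
                   alpha: float = 0.05) -> float:
    """Approximate power of a two-sided test with n_per_arm women in each arm."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    variance = p_control * (1 - p_control) + p_screened * (1 - p_screened)
    standard_error = sqrt(variance / n_per_arm)
    return _STD_NORMAL.cdf(abs(p_control - p_screened) / standard_error - z_alpha)


# Hypothetical inputs, not taken from any trial.
p_control = 0.004               # assumed risk of breast cancer death, control arm
p_screened = p_control * 0.75   # assumed 25% reduction in the screening arm

n_needed = sample_size_per_arm(p_control, p_screened)
power_at_one_third = achieved_power(n_needed // 3, p_control, p_screened)
print(f"women needed per arm:            {n_needed}")
print(f"power with one third as many:    {power_at_one_third:.0%}")
```

Under these assumptions tens of thousands of women are needed in each arm, and a trial with only one third of that number would have well under a 50% chance of reaching statistical significance even if the benefit were real.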
In fact, this is precisely what happened in 1993 when the U.S. National Cancer Institute (NCI) concluded that there was no benefit for screening women ages 40-49. The NCI made the mistake of requiring that, in order for a trial to prove benefit, deaths had to decline within 5 years of the start of a trial. This makes no sense scientifically because of “length bias sampling” (8). They compounded the mistake by separating out the women ages 40-49 from the RCT data for all participants (ages 40-74) and analyzing them separately. The RCT’s were not designed to permit this unplanned retrospective subgroup analysis and lacked the statistical power to permit it (9): too few women ages 40-49 had participated to show, with statistical significance, the decline in deaths that the NCI required within 5 years of the start of the trials. The trials involved only one third the number of women ages 40-49 needed to show a significant reduction in deaths within 5 years of the start of the trials. This turned out to be a scientifically unsupportable requirement, since deaths are not expected to decline for at least 5 years after the start of a trial. It took several more years for the numbers to reach statistical significance (10). Nevertheless, ultimately, the trials have proven a benefit for screening women ages 40-74.
If an RCT includes enough women and if there are fewer deaths among the screened women, such that the difference in deaths between the screened women and the unscreened controls is large enough to be “statistically significant”, then the benefit of screening is proved. In fact, this is the only way to prove that a medical intervention is efficacious unless the benefit is so large that it cannot be ignored. For example, the advantage of cervical cancer screening has never been proven in an RCT, but the decline in deaths from cervical cancer related to Pap testing is so large that it could not be ignored. In one study the rate of deaths declined when Pap testing was introduced, rose again when testing was stopped, and declined once more with a resumption of testing. This is unusual. RCT’s are the best way to prove the efficacy of any intervention.
There have been 8 RCT’s of breast cancer screening (9 if the Swedish Two-County Trial is counted as two trials).
1. The Health Insurance Plan of New York (HIP)
2. The Swedish Two-County Trial – Kopparberg (now Dalarna) and Östergötland
3. The Malmö Trial
4. The Stockholm Trial
5. The Gothenburg Trial
6. The Edinburgh Trial
7. The National Breast Screening Study of Canada
8. The UK Age Trial
A good summary of the RCT’s is provided by Duffy et al (11).
The Health Insurance Plan (HIP) Trial performed in New York was the first RCT of breast cancer screening. HIP was the first to prove that early detection by screening reduced deaths for women ages 40-64 (12).
The Edinburgh trial showed a major decline in deaths from breast cancer among screened women (13), but its results have been challenged because there was an imbalance in socioeconomic factors between the screened women and the controls.
The Canadian trial (CNBSS) was actually two trials. One evaluated screening among women ages 40-49 and a second evaluated screening among women ages 50-64. Unfortunately, the trials were poorly designed and executed. In addition to using outdated mammography and untrained technologists and radiologists, they violated the major rules for RCT’s. Random allocation is critical to ensure that the two arms are equal. The CNBSS had an unblinded allocation process that resulted in more women with advanced cancers and larger cancers being placed in the screening arm (14). Consequently, just as with the Edinburgh data, the CNBSS results (which differed from the other trials and showed no benefit from screening anyone) should be excluded from consideration.
An overview of the Swedish mammography screening trials provides the best evidence of the benefit of mammography screening (15). Women in these trials were screened with mammography only. There was no clinical breast examination in the Swedish trials. Of all the trials, the Swedish Two-County Trial, which used the highest quality mammography, provided the best results for mammography screening. This trial produced a 30% reduction in deaths and also showed that the reduction in deaths persisted decades later, since breast cancers can be lethal many years after diagnosis (16).
There has been a great deal of confusion created about screening women ages 40-49 because of the inappropriate use of retrospective subgroup analysis of data lacking statistical power in the early years of follow-up of the trials. This was clarified in 1997 in an analysis by Hendrick et al (17) that showed that with longer follow-up (increasing the statistical power) the trials proved a benefit for screening women ages 40-49 that was as robust as for women ages 50 and over. There are no data supporting the use of the age of 50 as a threshold for screening. None of the parameters of screening change abruptly at the age of 50 or any other age (18).
Unfortunately, there has never been an RCT to evaluate the importance of the screening interval (time between screens). Our own modeling and logic show that the more frequently we screen, the more lives will be saved (19). Annual screening appears to be reasonable, and observational studies show that cancers are found at a smaller size and earlier stage with annual vs. biennial screening.
SCREENING HIGH RISK PATIENTS
As noted earlier, the vast majority of women (75%) have no identifiable risk for developing breast cancer beyond being female and aging. None of the RCT’s have stratified by risk and all the trials involved women from the general population. Consequently, there is no proof that screening only high risk women will save any lives. We also have no idea which women will not develop breast cancer. All women are at risk and until there are accurate ways to identify women who are not at risk all women should have access to screening.
Since some women have a significantly increased risk of developing breast cancer, it is suggested that these women, in addition to having annual mammography, might benefit (there is no proof) from screening between mammograms with Magnetic Resonance Imaging (MRI). MRI is the most sensitive way to detect breast cancers. It is recommended that all women have annual mammography and that very high risk women be screened every 6 months, alternating between mammography and MRI.
THE BOTTOM LINE
The fact is that the earlier detection of breast cancer saves lives. Mammography screening is not perfect. It does not detect all breast cancers. Some breast cancers are already metastatic before they can be found by screening. However, hundreds of thousands of lives have been saved by mammography screening and early detection. While we await a universal cure or a safe way to prevent breast cancer (none is on the horizon), the RCT’s have proven that the best way to save lives for women ages 40-74 is to detect breast cancers earlier by screening.
REFERENCES
1 Anderson WF, Jatoi I, Devesa SS. Assessing the impact of screening mammography: Breast cancer incidence and mortality rates in Connecticut (1943-2002). Breast Cancer Res Treat. 2006 Oct;99(3):333-40.
2 Kopans DB. Beyond Randomized, Controlled Trials: Organized Mammographic Scree