The estrogen receptor (ER) assay is used to answer several very important questions: which patients with early breast cancer should receive adjuvant chemotherapy; whether treatment should include hormonal therapy; and, in the advanced setting, whether chemotherapy or hormonal therapy should be given. These are all weighty decisions, and the stakes with the ER assay are high in terms of potential harm to the patient.
The ER assay is broadly accepted as the most important predictive test in all of clinical oncology from the standpoint of drug selection. The test is used to make gravely important treatment decisions, generally among cytotoxic chemotherapy, hormonal therapy, or the combination of the two. In some situations, this test is used to determine whether patients are to receive any drug treatment at all. In contrast, cell culture assays are simply used to select between treatment regimens with otherwise equal efficacy in patient populations -- situations in which the choice could be made by a coin toss or, more commonly, on the basis of remuneration to the treating physician, with equivalent results on a population basis, though certainly not at the level of the individual patient. So, if anything, the "bar" should be higher for the ER assay than for cell culture assays. What data, then, exist to "validate" the most important predictive laboratory test in clinical oncology?
The ER test was originally developed as a complicated biochemical test, generically called the radioligand binding (RLB) assay. The RLB assay was "validated" in the 1970s and very early 1980s by means of retrospective correlations with clinical outcomes for patients treated with hormonal therapy. Overall, in retrospective correlations involving hundreds (not thousands) of patients, the RLB assay was found to be about 60% accurate in predicting treatment activity and 90% accurate in predicting treatment non-activity. In other words, an RLB "positive" tumor had a 60% chance of responding to hormonal treatment, while an RLB "negative" tumor had a 10% chance of responding, and patients with negative tumors were more likely to recur after "curative" surgery. There were never any prospective randomized Phase 3 trials to prove that performing (or not performing) the test made a difference in treatment outcomes -- just retrospective studies correlating assay results with clinical response to treatment.
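As a back-of-the-envelope illustration of what those quoted figures imply, the sketch below combines the two response rates into an overall accuracy. The 60/40 split of assay-positive versus assay-negative patients is a hypothetical assumption for the example, not a number from the original reports.

```python
# Illustrative arithmetic only: the quoted 60% and 10% figures are response
# rates within assay-positive and assay-negative groups. With an assumed
# cohort split, we can see what they imply about overall accuracy.

def assay_accuracy(n_pos, n_neg, rr_pos, rr_neg):
    """n_pos/n_neg: patients testing positive/negative;
    rr_pos/rr_neg: response rates to hormonal therapy in each group."""
    correct_pos = n_pos * rr_pos        # assay said "respond" and patient responded
    correct_neg = n_neg * (1 - rr_neg)  # assay said "no response" and patient did not
    return (correct_pos + correct_neg) / (n_pos + n_neg)

# Hypothetical 100-patient cohort, 60% assay-positive (assumption)
print(round(assay_accuracy(60, 40, 0.60, 0.10), 2))  # 0.72
```

Under these assumed numbers, the assay would call the outcome correctly in roughly 72% of patients -- useful, but hardly definitive for life-and-death treatment decisions.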
In the early 1980s, the technology changed from the complicated RLB assay, which could be done in only a few highly specialized laboratories, to a much simpler immunohistochemical (IHC, microscope slide) assay, which could be done in most pathology labs. The newer assay was initially validated by comparison with the RLB assay in the specialized labs. The IHC assay was validated in studies in which archival specimens were batch processed in the same time frame by a single team of laboratory workers. These were not real-world conditions, in which specimens are accessioned, processed, stained, and read by different people, at different times, using different reagents. Formal proficiency testing studies have repeatedly shown broad variation in results between different laboratories. And yet hundreds of thousands of cancer patients have had life-and-death treatment decisions based on these tests (the IHC test for Her2/neu is an even more egregious example, and the IHC test for EGFR is more egregious still).
The new assay correlated reasonably well with the older assay and replaced it. No one ever did a prospective, or even retrospective, study to show how the newer assay correlated with and predicted response to treatment. The reasoning was simply: "the old assay works, and the new assay correlates (in a few highly specialized laboratories) with the old assay, so the new assay is OK to use."
In 2006, there was finally a study (in a highly sophisticated laboratory) showing how well the new IHC ER assay predicts clinical response to hormonal therapy (Yamashita, et al. Breast Cancer 13:74-83, 2006). It was a very small, retrospective study -- meaning the investigators could draw the best possible cut-off lines after the fact -- of a total of 75 patients. They found that ER-positive patients had a 56% response rate, while ER-negative patients had a 20% response rate. And these were data from a laboratory with above-average expertise in performing the test. These correlations are vastly inferior to those obtained in much bigger and better studies with cell culture assays.
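One rough way to compare the older RLB-era figures with the Yamashita IHC figures is the odds ratio of response between assay-positive and assay-negative patients. The sketch below works directly from the quoted response rates (group sizes within the 75 patients are not given here), so treat it as a back-of-envelope comparison, not a formal analysis.

```python
# Odds ratio of response, assay-positive vs assay-negative, from quoted
# response rates only (no group sizes available in the text).

def odds_ratio(rr_pos, rr_neg):
    """rr_pos/rr_neg: response rates in assay-positive/-negative groups."""
    return (rr_pos / (1 - rr_pos)) / (rr_neg / (1 - rr_neg))

print(round(odds_ratio(0.60, 0.10), 1))  # RLB-era figures: 13.5
print(round(odds_ratio(0.56, 0.20), 1))  # Yamashita IHC figures: 5.1
```

By this crude measure, the discrimination reported for the IHC assay (odds ratio ~5) is considerably weaker than what the older RLB correlations implied (odds ratio ~13.5), consistent with the point being made here.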
Here we have the most universally admired and utilized predictive test for treatment selection in all of clinical oncology, and it is validated only by the most retrospective and limited of data -- and, even then, the predictive accuracy of the test is vastly inferior to that of the cell culture assays being performed. What in the world is the justification for claiming that the "bar" should be higher for using cell culture assays to choose between docetaxel and 5FU (or capecitabine) in breast cancer than for using the ER IHC test to select between tamoxifen and paclitaxel/cyclophosphamide?
If a highly sophisticated lab gets such poor correlations, one can imagine the accuracy of tests done in community hospitals. And yet every patient with breast cancer gets this test, and in almost every patient the information is used to make far more critical decisions than in the case of either the Her2/neu assay or the cell culture assay.
Questions regarding the best methodology of HER2 testing as well as the clinical applications of such testing remain. Ultimately, the most useful test will be the one that correlates best with HER2-mediated cellular biology and clinical outcome. The comparison of HER2 detection with clinical end points will allow clarification of the predictive value of a particular method.