Inside the WHI E+P Methodology: What Most Summaries Skip

At a glance
| Parameter | Detail |
|---|---|
| N | 16,608 postmenopausal women |
| Intervention | Conjugated equine estrogens 0.625 mg/d + medroxyprogesterone acetate 2.5 mg/d (CEE + MPA) |
| Comparator | Matching placebo |
| Duration | Planned 8.5 years; stopped early at 5.2-year mean follow-up |
| Primary endpoints | CHD (nonfatal MI + coronary death) as primary benefit; invasive breast cancer as primary adverse outcome |
| Key result | HR 1.29 (95% CI 1.02-1.63) for CHD; HR 1.26 (95% CI 1.00-1.59) for invasive breast cancer; global index crossed pre-specified harm boundary |

Why the design matters more than the headline
The 2002 JAMA publication of the WHI estrogen-plus-progestin arm changed prescribing patterns worldwide almost overnight. Combined HRT prescriptions dropped by roughly 50% within two years. But most of the public and clinical narrative compressed the trial into a single sentence: "HRT causes breast cancer and heart attacks." That compression lost critical context embedded in the trial's methodology.
Understanding what the WHI E+P actually tested, in whom, with what dose, against what comparator, and measured by what endpoint structure, is the difference between informed prescribing and reflexive avoidance.
Enrollment and the age question
The WHI recruited women aged 50 to 79 from 40 US clinical centers between 1993 and 1998. The stated goal was to evaluate chronic disease prevention in postmenopausal women. This framing is important: the trial was not designed to evaluate symptom management in the menopausal transition.
The mean participant age was 63.3 years. Only 10.7% of participants were aged 50 to 54. Over 66% were 60 or older at randomization. The median time since menopause was 12 years. These women were, on average, more than a decade past the point when most clinicians initiate HRT for vasomotor symptoms.
Inclusion required an intact uterus (women without a uterus were assigned to the separate CEE-alone arm). Women currently using HRT underwent a 3-month washout before randomization. Exclusion criteria filtered for competing mortality risks, prior breast cancer, and conditions where HRT was strictly contraindicated, but the criteria did not exclude women with pre-existing cardiovascular disease risk factors. Roughly 36% had hypertension at baseline, 12.5% were on statins, and the mean BMI was 28.5.
This baseline risk profile matters. The enrolled population carried substantially more cardiovascular risk than a typical 51-year-old presenting to her gynecologist with hot flashes.
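To make the point about baseline risk concrete, here is a minimal sketch of how the same relative hazard translates into different absolute excess risk. The baseline rates are hypothetical placeholders, not WHI data, and the function name `excess_events_per_10k` is ours. The sketch also assumes the same hazard ratio applies at every risk level, which is itself open to question.

```python
def excess_events_per_10k(baseline_rate_per_10k: float, hazard_ratio: float) -> float:
    """Approximate extra events per 10,000 person-years for a given hazard ratio.

    Rare-event approximation: treated rate ~= baseline rate * hazard ratio.
    """
    return baseline_rate_per_10k * (hazard_ratio - 1.0)


# Hypothetical baseline CHD rates (events per 10,000 person-years). NOT WHI data.
hypothetical_baselines = {
    "lower-risk woman": 10.0,
    "higher-risk woman": 40.0,
}

HR_CHD = 1.29  # overall CHD hazard ratio from the 2002 report

for label, rate in hypothetical_baselines.items():
    extra = excess_events_per_10k(rate, HR_CHD)
    print(f"{label}: about {extra:.1f} extra CHD events per 10,000 person-years")
```

The relative effect is identical in both rows; only the assumed baseline differs, and the absolute excess scales with it.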
The intervention: fixed-dose, single-formulation
Every participant in the active arm received the same regimen: CEE 0.625 mg plus continuous MPA 2.5 mg daily. There was no dose titration, no option for cyclical progestin, no transdermal estrogen arm, and no use of micronized progesterone.
This design choice maximized internal validity at the cost of external generalizability. In clinical practice, HRT is individualized. Lower doses, transdermal delivery, and micronized progesterone (which has a different pharmacologic risk profile from MPA) are widely used. The WHI answered a question about one specific combination at one specific dose in one specific delivery route.
MPA, a synthetic progestin, has androgenic and glucocorticoid activity that micronized progesterone lacks. Observational data from the French E3N cohort later showed that estrogen combined with micronized progesterone carried no detectable breast cancer signal over a median 8.1 years of follow-up, while estrogen plus synthetic progestins did. The WHI did not and could not test this distinction.
Randomization and blinding
Randomization was computer-generated, stratified by clinical center and age decade (50-59, 60-69, 70-79). Allocation was 1:1 to active pills or matched placebo. Participants, investigators, and outcome adjudicators were blinded.
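For readers unfamiliar with stratified allocation, the following is a minimal sketch of one common way to implement it: permuted blocks within each center-by-age stratum. The block size, seed, class name, and center labels are illustrative assumptions, not details taken from the WHI protocol.

```python
import random
from collections import defaultdict

ARMS = ("active", "placebo")


class StratifiedBlockRandomizer:
    """1:1 permuted-block allocation within (center, age band) strata."""

    def __init__(self, block_size: int = 4, seed: int = 0):
        if block_size % len(ARMS) != 0:
            raise ValueError("block_size must be a multiple of the number of arms")
        self.block_size = block_size          # assumed value, for illustration
        self.rng = random.Random(seed)
        self.pending = defaultdict(list)      # unused assignments per stratum

    def assign(self, center: str, age_band: str) -> str:
        stratum = (center, age_band)
        if not self.pending[stratum]:
            # Start a fresh block with equal numbers of each arm, then shuffle it.
            block = list(ARMS) * (self.block_size // len(ARMS))
            self.rng.shuffle(block)
            self.pending[stratum] = block
        return self.pending[stratum].pop()


randomizer = StratifiedBlockRandomizer(block_size=4, seed=42)
for _ in range(4):
    print(randomizer.assign("center_01", "50-59"))
```

Because each block contains equal numbers of both arms, allocation stays balanced within every stratum, which is what keeps age and site from being confounded with treatment.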
Unblinding was permitted for safety events, and the Data and Safety Monitoring Board (DSMB) reviewed accumulating data on a pre-specified schedule. A noteworthy feature of the trial: adherence declined substantially over time. By year 6, only about 54% of women assigned to active treatment were still taking their pills. Roughly 10.7% of placebo-group women initiated open-label HRT during the study period. This crossover dilutes the treatment signal in both directions. The primary 2002 report analyzed results by intention-to-treat, which is methodologically correct but means the observed hazard ratios underestimate what compliant users actually experienced, for both harms and benefits.
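The dilution effect can be sketched with simple arithmetic. The example below assumes a hypothetical true hazard ratio among women actually taking the drug, constant adherence over the whole follow-up, rare events, and an effect that disappears as soon as pills are stopped. None of these assumptions comes from the trial; the adherence figures are borrowed from the paragraph above purely for illustration, even though the 54% figure applies to year 6 rather than the full period.

```python
def itt_rate_ratio(true_hr: float, adherent_active: float, crossover_placebo: float) -> float:
    """Approximate ITT rate ratio when only part of each arm is actually exposed.

    Event rates are expressed relative to the unexposed rate (set to 1.0).
    """
    active_rate = 1.0 + adherent_active * (true_hr - 1.0)
    placebo_rate = 1.0 + crossover_placebo * (true_hr - 1.0)
    return active_rate / placebo_rate


HYPOTHETICAL_TRUE_HR = 1.5  # assumption, not a WHI estimate

diluted = itt_rate_ratio(HYPOTHETICAL_TRUE_HR, adherent_active=0.54, crossover_placebo=0.107)
print(f"True HR among users: {HYPOTHETICAL_TRUE_HR:.2f}")
print(f"Diluted ITT ratio:   {diluted:.2f}")
```

With full adherence and no crossover the function returns the true hazard ratio unchanged; any nonadherence or crossover pulls the ITT ratio toward 1.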
The endpoint architecture
This is where the WHI's design gets genuinely unusual, and where most summaries fail their readers. The trial did not use a single primary endpoint. It used a dual-primary structure layered under a composite "global index."
The HealthRX WHI Endpoint Hierarchy Framework:
| Level | Endpoint | Role in the trial |
|---|---|---|
| Primary benefit | CHD (nonfatal MI + coronary death) | The hypothesis was that HRT would reduce CHD |
| Primary adverse | Invasive breast cancer | Pre-specified as the main expected harm |
| Global index | CHD + stroke + PE + breast cancer + endometrial cancer + colorectal cancer + hip fracture + death from other causes | Composite net-benefit/harm metric; the stopping decision was tied to this |
| Secondary | Stroke, PE, colorectal cancer, hip fracture, endometrial cancer, death | Monitored individually but not the formal basis for early termination |

The global index was the DSMB's summary monitoring tool. The board recommended termination when the test statistic for invasive breast cancer crossed its pre-defined O'Brien-Fleming stopping boundary and the global index supported harms exceeding benefits.
This structure creates an interpretive challenge. The global index treats an additional hip fracture prevented as offsetting, unit-for-unit, an additional breast cancer diagnosed. Clinically, those events are not equivalent in severity, duration of morbidity, or patient preference. The index was a statistical convenience for monitoring, not a clinical equivalence claim, but it was widely misread as one.
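The unit-for-unit problem is easier to see in code. The sketch below assumes the index is scored as time to the first component event for each participant, which is one standard way to build a composite and is our assumption here rather than something stated above. The component names and function name are hypothetical labels.

```python
from typing import Dict, Optional, Tuple

# Component labels are illustrative identifiers for the eight listed outcomes.
GLOBAL_INDEX_COMPONENTS = (
    "chd", "stroke", "pe", "breast_cancer",
    "endometrial_cancer", "colorectal_cancer", "hip_fracture", "death_other",
)


def first_global_index_event(event_years: Dict[str, float]) -> Optional[Tuple[str, float]]:
    """Return (event name, years since randomization) for the earliest component event.

    Returns None if the participant had no component event.
    """
    candidates = [
        (years, name)
        for name, years in event_years.items()
        if name in GLOBAL_INDEX_COMPONENTS
    ]
    if not candidates:
        return None
    years, name = min(candidates)
    return name, years


# A hip fracture at 3.2 years and a breast cancer at 4.5 years count as ONE index event.
print(first_global_index_event({"hip_fracture": 3.2, "breast_cancer": 4.5}))
```

Nothing in the scoring distinguishes a hip fracture from a breast cancer: each simply registers as one index event, which is exactly the equivalence the paragraph above cautions against reading clinically.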
The 2002 paper reported adjusted confidence intervals alongside the nominal ones. For the two primary endpoints, the adjustment accounted for repeated interim looks under the monitoring plan; a Bonferroni-type correction for multiple outcomes was applied to the other clinical endpoints. The breast cancer finding crossed its pre-specified boundary only at the final analysis. Neither primary endpoint, taken alone, met the conventional two-sided p < 0.05 criterion at every interim analysis, which is part of why the DSMB deliberated for months before stopping.
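To show why early interim looks rarely trigger stopping, here is a minimal sketch of the classic O'Brien-Fleming boundary shape, in which the critical z-value scales with the inverse square root of the information fraction. The final critical value of 2.0 and the four equally spaced looks are illustrative assumptions, not the WHI monitoring plan's actual constants or schedule.

```python
import math


def obf_boundary(information_fraction: float, z_final: float = 2.0) -> float:
    """Critical z-value at a given information fraction (O'Brien-Fleming shape).

    z_final is the critical value at the final analysis (assumed, for illustration).
    """
    if not 0.0 < information_fraction <= 1.0:
        raise ValueError("information_fraction must be in (0, 1]")
    return z_final / math.sqrt(information_fraction)


for fraction in (0.25, 0.50, 0.75, 1.00):
    print(f"information {fraction:.2f}: stop only if |z| >= {obf_boundary(fraction):.2f}")
```

The boundary is very demanding early and relaxes toward the conventional threshold at the end, so a finding can be nominally "significant" at an interim look without crossing the stopping line.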
The statistical framework and what it actually estimated
The primary analysis used Cox proportional hazards models, stratified by age and randomization status in the dietary modification trial (a co-running WHI component that shared participants). Hazard ratios and 95% CIs were the reported effect measures.
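A reported hazard ratio and its confidence interval carry enough information to recover the standard error on the log scale, and from it an approximate z-statistic and p-value. The sketch below applies that back-calculation to the nominal (unadjusted) figures quoted in this article. It assumes a symmetric normal-theory interval on the log scale, and the function name `summarize_hr` is ours.

```python
import math


def summarize_hr(hr: float, ci_low: float, ci_high: float, z_crit: float = 1.96):
    """Back-calculate log-scale SE, z, and two-sided p from an HR and its 95% CI."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return se, z, p


for name, hr, low, high in [
    ("CHD", 1.29, 1.02, 1.63),
    ("Invasive breast cancer", 1.26, 1.00, 1.59),
]:
    se, z, p = summarize_hr(hr, low, high)
    print(f"{name}: SE(log HR) = {se:.3f}, z = {z:.2f}, nominal p = {p:.3f}")
```

On these nominal intervals, both results sit close to the conventional 0.05 threshold, which is worth keeping in mind when reading them against a monitoring plan that adjusts for repeated looks.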
The estimand was an intention-to-treat average treatment effect: what happens when you assign a population of women aged 50 to 79 to take CEE + MPA versus placebo, regardless of actual adherence. This is the right estimand for a policy question ("should we recommend combined HRT for chronic disease prevention in all postmenopausal women?"). It is the wrong estimand for a clinical question ("what is the risk for my 52-year-old patient who will actually take the medication as prescribed for 3 to 5 years?").
Per-protocol and as-treated analyses were conducted secondarily. The as-treated analysis in the 2002 publication showed somewhat larger hazard ratios for breast cancer in adherent women, consistent with the dilution effect of nonadherence in the ITT analysis.
The timing hypothesis the original design could not test
Post-hoc age-stratified analyses, published later in Rossouw et al. 2007, revealed a pattern the original design was not powered to detect. Women aged 50 to 59 showed a non-significant trend toward reduced CHD (HR 0.93), while women aged 70 to 79 showed a clear increase (HR 1.44). For all-cause mortality, the 50-to-59 subgroup showed HR 0.70 (95% CI 0.51-0.96).
This "timing hypothesis" or "window of opportunity" concept, that HRT initiated near menopause may be cardioprotective while initiation decades later may be harmful, was not part of the original trial design. The trial was powered for the overall cohort, not for age-decade subgroups. These subgroup findings are hypothesis-generating, not confirmatory.
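One way to see why the age-decade subgroups were underpowered is Schoenfeld's approximation for the number of events a 1:1 trial needs to detect a given hazard ratio. The sketch below uses conventional defaults (two-sided alpha of 0.05, 80% power) as assumptions; these are not the WHI's own design parameters.

```python
import math
from statistics import NormalDist


def events_needed(hazard_ratio: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Schoenfeld approximation: total events required with 1:1 allocation."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(4.0 * (z_alpha + z_beta) ** 2 / math.log(hazard_ratio) ** 2)


for hr in (1.29, 0.93):
    print(f"HR {hr:.2f}: roughly {events_needed(hr)} events needed")
```

An effect as small as HR 0.93 requires thousands of events to confirm, far more than a single age decade could contribute in about five years of follow-up, which is why the subgroup estimates can suggest a pattern but cannot settle it.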
The subsequent KEEPS trial and ELITE trial were specifically designed to test early-initiation HRT in younger postmenopausal women. ELITE found that estradiol started within 6 years of menopause slowed carotid intima-media thickness progression, while estradiol started 10 or more years after menopause did not. These trials support the biologic plausibility of the timing hypothesis but remain smaller than the WHI.
Limitations the authors acknowledged
The 2002 publication itself noted several design-level limitations:
- Single formulation tested. Results cannot be extended to other estrogen types, progestins, doses, or routes.
- Adherence decline. The ITT analysis is diluted by nonadherence and crossover.
- Population age. The mean age of 63 limits applicability to typical HRT candidates in their early 50s.
- Short follow-up relative to cancer latency. 5.2 years is insufficient to fully characterize breast cancer risk, which may require 10 or more years of exposure data.
- Lack of symptomatic enrollment criterion. The trial tested disease prevention, not symptom treatment. Women were not required to have vasomotor symptoms.
The 2017 cumulative follow-up at 18 years post-randomization showed that the breast cancer signal persisted but that all-cause mortality did not differ significantly between groups (HR 1.02; 95% CI 0.96-1.08 for CEE + MPA). These longer-term data are critical context that the 2002 headline lacked.
How the comparator choice shaped the narrative
The comparator was inert placebo. There was no active-comparator arm testing, for example, lifestyle modification, statin therapy, or lower-dose HRT against full-dose CEE + MPA. In the early 1990s, when the trial was designed, the dominant clinical hypothesis, built largely on observational data, was that HRT would itself serve as cardiovascular prevention in postmenopausal women. The placebo comparison was designed to test that specific hypothesis.
By the time the results were published in 2002, statin therapy had become standard cardiovascular prevention. The clinical question had shifted from "HRT vs. nothing" to "HRT vs. modern cardiovascular risk management." The WHI could not answer this updated question.
What the FDA label reflects
Current FDA labeling for CEE + MPA products carries a boxed warning derived directly from the WHI findings. The label specifies that estrogens with progestins should not be used for cardiovascular disease prevention. It mandates using the lowest effective dose for the shortest duration consistent with treatment goals.
The 2022 Hormone Therapy Position Statement from the North American Menopause Society (NAMS) recontextualizes the WHI data: for symptomatic women under 60 or within 10 years of menopause, the benefit-risk ratio of HRT is generally favorable. This position reflects the age-stratified reanalysis, not the original headline finding.
The bottom line on methodology
The WHI E+P was a well-executed trial that answered the question it was designed to answer: should fixed-dose oral CEE + MPA be prescribed as chronic disease prevention to unselected postmenopausal women aged 50 to 79? The answer was no. The problem was not the trial. The problem was that the clinical world interpreted a prevention trial as a treatment trial, generalized a fixed-dose oral regimen to all HRT, and collapsed a nuanced age-interaction signal into a single risk number.
Reading the methodology, not just the abstract, is what separates evidence-based prescribing from headline-based avoidance.
References
- Writing Group for the Women's Health Initiative Investigators. "Risks and benefits of estrogen plus progestin in healthy postmenopausal women." JAMA. 2002;288(3):321-333.
- Rossouw JE, Prentice RL, Manson JE, et al. "Postmenopausal hormone therapy and risk of cardiovascular disease by age and years since menopause." JAMA. 2007;297(13):1465-1477.
- Hodis HN, Mack WJ, Henderson VW, et al. "Vascular effects of early versus late postmenopausal treatment with estradiol." N Engl J Med. 2016;374(13):1221-1231.
- Manson JE, Aragaki AK, Rossouw JE, et al. "Menopausal hormone therapy and long-term all-cause and cause-specific mortality: the Women's Health Initiative randomized trials." JAMA. 2017;318(10):927-938.
- The North American Menopause Society. "The 2022 hormone therapy position statement of The North American Menopause Society." Menopause. 2022;29(7):767-794.
- FDA label for Prempro (conjugated estrogens/medroxyprogesterone acetate).