WHI vs KEEPS vs ELITE: Three Studies, Three Views of Menopausal HRT

At a Glance
| Trial | N | Population (mean age) | Regimen | Follow-up | Primary Endpoint | Primary Result | Dropout / Discontinuation | Key Adverse Events |
|---|---|---|---|---|---|---|---|---|
| WHI E+P (Rossouw 2002) | 16,608 | Postmenopausal women with intact uterus; mean age 63.3 yrs | CEE 0.625 mg/day + MPA 2.5 mg/day (oral, continuous) | 5.2 yrs (stopped early) | Incident CHD (clinical outcome) | HR 1.29 (95% CI 1.02, 1.63) for CHD; trial stopped for breast cancer signal (HR 1.26) | ~42% discontinued active treatment | Breast cancer, VTE, stroke increased; colorectal cancer, hip fracture reduced |
| WHI E-alone (Anderson 2004) | 10,739 | Postmenopausal women post-hysterectomy; mean age 63.6 yrs | CEE 0.625 mg/day (oral) | 6.8 yrs (stopped early) | Incident CHD (clinical outcome) | HR 0.91 (95% CI 0.75, 1.12); no significant CHD effect | ~54% discontinued active treatment | Stroke increased (HR 1.39); breast cancer HR 0.77 (reduced, NS trend); VTE increased |
| KEEPS (Harman 2014) | 727 | Recently menopausal women; mean age 52.6 yrs; within 6 to 36 months of FMP | Oral CEE 0.45 mg/day + micronized progesterone 200 mg/day (cyclic) vs transdermal E2 0.05 mg/day + micronized progesterone vs placebo | 4 yrs | CIMT progression (surrogate) | No significant difference among arms in CIMT progression | ~17% withdrew | Mood and hot flash improvement with active arms; no excess breast cancer or CVD events at 4 yrs |
| ELITE (Hodis 2016) | 643 | Postmenopausal women stratified by time since menopause (<6 yrs vs >10 yrs); mean age 55.4 yrs | Oral 17β-estradiol 1 mg/day + vaginal progesterone gel (for uterus-intact women) vs placebo | 5 yrs (median) | CIMT progression (surrogate) | CIMT slowed in early (<6 yr) group (p=0.008); no benefit in late (>10 yr) group | ~23% withdrew | No significant increase in CVD events, breast cancer, or VTE |
Population Differences: Why These Trials Are Not Talking About the Same Woman
The single largest source of confusion in interpreting these four datasets is treating the women each trial enrolled as interchangeable. The WHI, by design and by the epidemiological wisdom of its era, targeted older postmenopausal women to maximize event rates. The mean age at randomization in both WHI arms was approximately 63 years, and roughly 70% of enrolled participants were more than 10 years past their final menstrual period at baseline. That design choice was scientifically defensible for detecting cardiovascular events in a reasonable timeframe, but it produced a population profoundly unlike the perimenopausal woman asking her clinician about symptom relief at age 51.
KEEPS enrolled women aged 42 to 58, within 6 to 36 months of their final menstrual period, with a mean age of 52.6 years. ELITE went further by explicitly stratifying its participants: women less than six years past menopause versus women more than ten years past. That stratification was the methodological heart of the timing hypothesis test, and it is what makes ELITE uniquely informative compared with trials that enrolled only one timing group or did not stratify by time since menopause.
These population differences matter for generalizability in at least three concrete ways.
Vascular biology at baseline. Early menopause is associated with a period of estrogen-responsive, relatively healthy arterial endothelium. Subclinical atherosclerosis accumulates over years without estrogen. By the time the typical WHI participant was randomized, her coronary arteries had likely accumulated years of subclinical plaque. Animal and mechanistic data suggest that estrogen may accelerate plaque instability in already-diseased arteries while slowing progression in healthy ones. The ELITE investigators articulated this directly, and their CIMT data are consistent with it.
Breast tissue biology. Breast cancer risk tied to estrogen-progestogen combinations is influenced by cumulative exposure and the baseline proliferative state of breast epithelium. Older women in the WHI may have had different baseline mammographic density and background risk than the younger women in KEEPS. Those differences make direct comparison of breast cancer event rates across trials unreliable without age-standardization, which none of the trials performed against each other.
Cognitive reserve. Neurobiological evidence suggests that estrogen receptors in hippocampal and prefrontal regions are most responsive when estrogen exposure begins before significant neuronal estrogen-deprivation occurs. The WHI Memory Study (WHIMS) found increased dementia risk with CEE+MPA in women aged 65 and older. KEEPS-Cognitive and Affective (KEEPS-Cog) enrolled women a decade younger at a fundamentally different stage of neural aging.
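The age-standardization point raised under breast tissue biology can be made concrete with a small sketch. The example below is purely illustrative: every rate, person-year count, and reference weight is invented, and none comes from WHI, KEEPS, or ELITE. It shows two hypothetical cohorts with identical age-specific incidence whose crude rates still differ because their age mixes differ, and how re-weighting to a shared reference distribution removes that artifact.

```python
# Hypothetical illustration of direct age standardization.
# All rates, person-years, and weights are invented for the example.

def crude_rate(person_years, rates):
    """Events per person-year, weighted by the cohort's own age mix."""
    events = sum(person_years[band] * rates[band] for band in person_years)
    return events / sum(person_years.values())

def standardized_rate(rates, standard_weights):
    """Age-specific rates re-weighted to a shared reference age distribution."""
    return sum(rates[band] * standard_weights[band] for band in standard_weights)

# Identical hypothetical age-specific incidence in both cohorts.
rates = {"50-59": 0.002, "60-69": 0.003, "70-79": 0.004}

# Person-years by age band: one cohort skews older, one younger.
older_cohort = {"50-59": 1000, "60-69": 3000, "70-79": 2000}
younger_cohort = {"50-59": 5000, "60-69": 1000, "70-79": 0}

# Reference distribution chosen arbitrarily for the example.
standard = {"50-59": 0.4, "60-69": 0.4, "70-79": 0.2}

crude_older = crude_rate(older_cohort, rates)
crude_younger = crude_rate(younger_cohort, rates)
adjusted = standardized_rate(rates, standard)

print(f"crude rate, older cohort  : {crude_older:.5f}")
print(f"crude rate, younger cohort: {crude_younger:.5f}")
print(f"standardized rate (both)  : {adjusted:.5f}")
```

Because the age-specific rates are the same by construction, the standardized rate is the same for both cohorts; the gap between the crude rates comes entirely from the age mix.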
Methodology Differences: Surrogates vs. Events, Formulations, and Blinding
These trials differ not only in who they studied but in what they were asking and how they measured the answer.
Clinical outcomes vs. surrogate endpoints. Both WHI arms were powered for hard clinical events: incident coronary heart disease, invasive breast cancer, stroke, pulmonary embolism, colorectal cancer, and hip fracture. Their sample sizes (over 10,000 participants in each trial) were calculated around event rates expected in older women. KEEPS and ELITE, enrolling younger women with low short-term event rates, were powered instead around carotid intima-media thickness (CIMT) progression, a validated surrogate for subclinical atherosclerosis. Slower CIMT progression does not prove a reduction in myocardial infarction or stroke; it is a plausible intermediate marker. This is a genuine limitation of KEEPS and ELITE: neither was large enough, or long enough, to detect differences in hard cardiovascular events. Surrogate-endpoint trials generate hypotheses; they do not confirm clinical benefit.
Hormone formulation. This is where the four datasets diverge most sharply in their practical implications. The WHI used conjugated equine estrogens (CEE) at 0.625 mg/day, the standard oral dose of its era, combined in the E+P arm with medroxyprogesterone acetate (MPA), a synthetic progestogen. KEEPS used two active arms, each paired with micronized progesterone: one using the WHI's oral CEE at a lower dose (0.45 mg/day), and one using transdermal 17β-estradiol (0.05 mg/day, i.e. a 50 mcg/day patch). ELITE used oral 17β-estradiol 1 mg/day, with vaginal progesterone gel for women with an intact uterus.
MPA and micronized progesterone are not biologically equivalent. MPA has glucocorticoid and partial androgen receptor activity, can oppose estrogen's favorable effects on HDL cholesterol, and is associated in observational data (particularly the Million Women Study) and the WHI itself with higher breast cancer risk than estrogen alone. Micronized progesterone has a more selective receptor profile. Whether this difference drives meaningful clinical outcome divergence is unproven in adequately powered head-to-head RCTs, but the mechanistic rationale for suspecting it does is credible.
Blinding and placebo design. All four datasets used double-blind, placebo-controlled designs, which is a shared methodological strength. Discontinuation rates differed substantially: 42 to 54 percent of participants stopped active treatment in the WHI arms, compared with 17 to 23 percent who withdrew from KEEPS and ELITE. High discontinuation in the WHI means its intention-to-treat estimates probably understate the effects of continuous use, for harms and benefits alike, and it leaves per-protocol analyses vulnerable to selection bias.
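To give a feel for how discontinuation can pull observed estimates toward the null, here is a minimal sketch under a deliberately crude assumption: the intention-to-treat log hazard ratio shrinks in proportion to the fraction of the active arm still on treatment. The function `diluted_hr` and the "true" hazard ratios of 1.50 and 0.70 are hypothetical. The model ignores placebo-arm drop-in, the timing of discontinuation, and carry-over effects, so it should be read as an illustration of direction, not as an adjustment method used by any of these trials.

```python
import math

def diluted_hr(true_hr, adherent_fraction):
    """Crude ITT approximation (an assumption for illustration only):
    the log hazard ratio is scaled by the fraction still on treatment."""
    if not 0.0 <= adherent_fraction <= 1.0:
        raise ValueError("adherent_fraction must be between 0 and 1")
    return math.exp(adherent_fraction * math.log(true_hr))

# Discontinuation fractions are those quoted in the text;
# the "true" hazard ratios are hypothetical.
scenarios = [("WHI E+P", 0.42), ("WHI E-alone", 0.54), ("KEEPS", 0.17), ("ELITE", 0.23)]

for label, discontinued in scenarios:
    adherent = 1.0 - discontinued
    harm = diluted_hr(1.50, adherent)      # hypothetical true harm
    benefit = diluted_hr(0.70, adherent)   # hypothetical true benefit
    print(f"{label}: harm 1.50 -> {harm:.2f}, benefit 0.70 -> {benefit:.2f}")
```

Under this toy model both a harmful and a protective effect move toward 1.0 as discontinuation rises, which is the sense in which dilution works against harms and benefits alike.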
Statistical approach. The WHI used a sequential monitoring design with pre-specified stopping rules and a global index summarizing the balance of risks and benefits. Both arms were stopped early: E+P at 5.2 years, after the breast cancer boundary was crossed and the global index pointed to net harm, and E-alone at 6.8 years, largely because of excess stroke without CHD benefit. Early stopping of trials tends to inflate apparent effect sizes for the outcomes that triggered stopping, a well-recognized statistical artifact. KEEPS and ELITE ran to planned completion.
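The inflation that comes with early stopping can be reproduced in a few lines of simulation. The sketch below is not the WHI monitoring plan: the boundary (z > 2.5 at any interim look), the four equally spaced looks, the information per look, and the true log hazard ratio of 0.2 are all assumed values chosen for illustration. It uses the usual large-sample approximation in which the score statistic accumulates independent normal increments.

```python
import math
import random

def simulate_trial(true_effect, info_per_look, n_looks, z_boundary, rng):
    """Simulate one trial monitored at equally spaced looks.

    Returns (estimate, stopped_early). The score statistic S satisfies
    S ~ Normal(true_effect * info, info), built from independent increments.
    """
    score = 0.0
    info = 0.0
    estimate = 0.0
    for look in range(1, n_looks + 1):
        score += rng.gauss(true_effect * info_per_look, math.sqrt(info_per_look))
        info += info_per_look
        estimate = score / info
        z = score / math.sqrt(info)
        if look < n_looks and z > z_boundary:
            return estimate, True    # boundary crossed at an interim look
    return estimate, False           # ran to planned completion

rng = random.Random(0)
TRUE_LOG_HR = 0.2   # hypothetical true effect on the log hazard scale
results = [simulate_trial(TRUE_LOG_HR, 25.0, 4, 2.5, rng) for _ in range(20000)]

stopped = [est for est, early in results if early]
completed = [est for est, early in results if not early]

mean_stopped = sum(stopped) / len(stopped)
mean_completed = sum(completed) / len(completed)

print(f"true effect            : {TRUE_LOG_HR:.3f}")
print(f"stopped early          : {len(stopped)} of {len(results)} trials")
print(f"mean, stopped early    : {mean_stopped:.3f}")
print(f"mean, ran to completion: {mean_completed:.3f}")
```

In this setup a trial can only stop early when its running estimate is well above the true effect, so the average estimate among early-stopped trials necessarily exceeds the truth. That is the artifact the paragraph above describes, and it applies to the outcome driving the stopping decision.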
Results, Matched: What Each Trial Found on Shared Outcomes
Cardiovascular Outcomes
The WHI E+P arm reported a significantly elevated CHD hazard ratio of 1.29 in the overall population. The E-alone arm did not find a significant CHD effect (HR 0.91). A post-hoc age-stratified WHI analysis published in 2007 found that women aged 50 to 59 who received CEE alone had a lower coronary calcium score and a trend toward reduced CHD, providing within-WHI support for the timing hypothesis even before KEEPS and ELITE reported.
KEEPS found no significant difference in CIMT progression between either active arm and placebo over four years. This null CIMT result has sometimes been misread as contradicting the timing hypothesis; it does not. KEEPS was not powered to detect CIMT differences that might require more than four years to become apparent in a low-risk population.
ELITE found statistically significant slowing of CIMT progression in the early-menopause stratum (p=0.008) but not in the late-menopause stratum, directly testing and supporting the timing hypothesis on its pre-specified primary endpoint. Coronary artery calcium was a secondary endpoint in ELITE; scores were lower in the early-initiation group, though that finding did not reach significance.
Breast Cancer Risk
The WHI E+P arm reported an increased invasive breast cancer risk (HR 1.26; 95% CI 1.00, 1.59) that contributed to early stopping. The WHI E-alone arm found a non-significant reduction (HR 0.77). That divergence between arms has been replicated in observational cohorts and supports the inference that the progestogen component, specifically MPA, drives most of the breast cancer signal in combined HRT.
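For readers who want to see how much statistical evidence sits behind each reported interval, the following sketch back-calculates the standard error of the log hazard ratio, a z statistic, and an approximate two-sided p-value from the hazard ratios and 95% confidence intervals quoted in this article. It assumes the intervals are symmetric on the log scale and ignores the WHI's sequential-monitoring adjustments, so the p-values are rough and nominal. The helper name `log_scale_summary` is introduced here for illustration.

```python
import math
from statistics import NormalDist

def log_scale_summary(hr, ci_low, ci_high, level=0.95):
    """Recover SE(log HR), z, and a two-sided p-value from a reported
    hazard ratio and confidence interval, assuming log-scale symmetry."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2.0)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = math.log(hr) / se
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return se, z, p

# Hazard ratios and intervals as quoted in this article.
reported = {
    "WHI E+P, CHD": (1.29, 1.02, 1.63),
    "WHI E+P, breast cancer": (1.26, 1.00, 1.59),
    "WHI E-alone, CHD": (0.91, 0.75, 1.12),
}

for label, (hr, low, high) in reported.items():
    se, z, p = log_scale_summary(hr, low, high)
    print(f"{label}: SE(log HR)={se:.3f}, z={z:.2f}, approx p={p:.3f}")
```

The breast cancer interval has a lower bound of 1.00, so its z statistic lands right at the conventional 1.96 threshold. The signal was borderline on nominal significance even though it weighed heavily in the stopping decision.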
Neither KEEPS nor ELITE was powered or long enough to evaluate breast cancer incidence. Both reported no statistically significant excess breast cancer events in their active arms, but given sample sizes under 800 and follow-up under six years, absence of a breast cancer signal in these trials is uninformative about long-term risk. Clinicians and patients should not interpret KEEPS and ELITE as breast cancer safety data.
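A rough calculation shows why the absence of a signal is uninformative here. The sketch below uses the Schoenfeld approximation for the total number of events needed to detect a given hazard ratio with a two-sided test and 1:1 allocation, then compares it with the number of breast cancer events a KEEPS-sized or ELITE-sized trial might expect. The annual incidence of 0.3% is a hypothetical placeholder, not a figure from either trial. The 1:1 allocation is also a simplification, since KEEPS had three arms.

```python
import math
from statistics import NormalDist

def required_events(hr, alpha=0.05, power=0.80):
    """Schoenfeld approximation: total events needed to detect a hazard
    ratio with 1:1 allocation and a two-sided test at level alpha."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(4.0 * (z_alpha + z_beta) ** 2 / math.log(hr) ** 2)

def expected_events(n_participants, years, annual_incidence):
    """Rough expected event count, ignoring dropout and treatment effect."""
    return n_participants * years * annual_incidence

ASSUMED_ANNUAL_INCIDENCE = 0.003  # hypothetical 0.3% per year, illustration only

needed = required_events(1.26)    # the WHI E+P breast cancer hazard ratio
keeps = expected_events(727, 4, ASSUMED_ANNUAL_INCIDENCE)
elite = expected_events(643, 5, ASSUMED_ANNUAL_INCIDENCE)

print(f"events needed to detect HR 1.26: {needed}")
print(f"expected events, KEEPS-sized trial: {keeps:.1f}")
print(f"expected events, ELITE-sized trial: {elite:.1f}")
```

Under these assumptions the required event count runs into the hundreds while each trial would expect on the order of ten events, a gap that holds across any plausible choice of baseline incidence.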
Cognitive Outcomes
The WHIMS substudy found that CEE+MPA significantly increased the risk of probable dementia (HR 2.05; 95% CI 1.21, 3.48) in women aged 65 and older, and CEE alone showed a non-significant trend in the same direction. The KEEPS-Cog substudy found that both oral CEE and transdermal estradiol improved some mood and affect measures without worsening cognitive test performance, and a follow-up KEEPS-Cog report suggested possible verbal memory benefits with oral CEE. These findings do not contradict WHIMS once age at initiation is accounted for. ELITE's cognitive substudy found no significant between-group differences on the primary cognitive composite score, though secondary measures trended favorably in the early-initiation group.
Taken together, the cognitive data suggest a window where estrogen initiation is at worst neutral and possibly beneficial for brain function, followed by a window where initiation is potentially harmful. The transition point is not precisely established.
Bone Density
All four datasets consistently found protective effects on bone. The WHI E+P arm reduced hip fracture risk (HR 0.66), one of the few outcomes that favored active treatment. The E-alone arm similarly reduced hip fracture (HR 0.61). KEEPS reported significant improvement in lumbar spine and total hip BMD in both active arms versus placebo. ELITE, though not primarily designed for bone outcomes, reported preserved BMD in the estradiol group. Bone protection from estrogen is the most consistent finding across all formulations, timing groups, and study designs represented here.
What the Trials Together Do and Do Not Establish
What they establish. Estrogen-based HRT, regardless of formulation, protects bone. That signal is consistent, dose-dependent, and clinically meaningful. The addition of a progestogen (necessary for uterine protection in women with an intact uterus) increases breast cancer risk when that progestogen is MPA; whether micronized progesterone carries the same risk at the population level remains an open question not answered by any trial reviewed here. Initiating hormone therapy more than a decade past menopause in older women does not confer cardiovascular benefit and may cause harm, particularly from stroke and VTE. That is what the WHI actually shows for its actual population.
What they do not establish. None of these trials proves that initiating estradiol-based transdermal therapy with micronized progesterone within two years of menopause in a healthy, recently menopausal woman reduces myocardial infarction, stroke, or dementia. KEEPS and ELITE provide biologically plausible surrogate-marker support for a benefit, but surrogate markers are not clinical outcomes. No adequately powered RCT has tested that specific scenario, at that specific formulation, with hard clinical event endpoints. That gap in the evidence is large and should be communicated plainly to patients.
The 2022 Menopause Society (NAMS) position statement endorses a timing-sensitive, formulation-aware approach to HRT, which is consistent with reading the four trials in combination. But that endorsement is not the same as Level A evidence from a clinical-outcomes trial in younger, recently menopausal women.
Outstanding Questions for the Next Trial
- The definitive timing-hypothesis trial. A large, adequately powered RCT enrolling women within two years of natural menopause, randomized to transdermal 17β-estradiol plus micronized progesterone versus placebo, powered for myocardial infarction and stroke, with minimum 10-year follow-up, does not exist. It should.
- Progestogen type and breast cancer. No head-to-head RCT has compared MPA versus micronized progesterone on breast cancer incidence as a primary endpoint with adequate power. Observational data are suggestive but confounded.
- Cognitive benefit confirmation. A prospective trial randomizing recently menopausal women specifically to evaluate Alzheimer's disease incidence or validated cognitive decline measures, not CIMT or surrogate scores, is needed to resolve whether the KEEPS-Cog and WHIMS signals reflect true timing-dependent neuroprotection.
- Optimal duration. None of these trials adequately characterizes risk-benefit balance beyond 10 years of continuous use in women who began therapy in early menopause.
- Subgroup heterogeneity. How BRCA1/2 carrier status, baseline cardiovascular risk scores, baseline mammographic density, or specific menopausal symptom burden should modify initiation decisions remains poorly quantified in RCT data.
References
- Rossouw JE, et al. Risks and benefits of estrogen plus progestin in healthy postmenopausal women. JAMA. 2002;288(3):321-333. https://pubmed.ncbi.nlm.nih.gov/12117397/
- Anderson GL, et al. Effects of conjugated equine estrogen in postmenopausal women with hysterectomy. JAMA. 2004;291(14):1701-1712. https://pubmed.ncbi.nlm.nih.gov/15082697/
- Harman SM, et al. KEEPS: The Kronos Early Estrogen Prevention Study. Ann Intern Med. 2014;161(4):249-260. https://pubmed.ncbi.nlm.nih.gov/25069009/
- Hodis HN, et al. Vascular effects of early versus late postmenopausal treatment with estradiol. N Engl J Med. 2016;374(13):1221-1231. https://pubmed.ncbi.nlm.nih.gov/27028912/
- Shumaker SA, et al. Estrogen plus progestin and the incidence of dementia and mild cognitive impairment in postmenopausal women: the Women's Health Initiative Memory Study. JAMA. 2003;289(20):2651-2662. https://pubmed.ncbi.nlm.nih.gov/12485966/
- Beral V; Million Women Study Collaborators. Breast cancer and hormone-replacement therapy in the Million Women Study. Lancet. 2003;362(9382):419-427. https://pubmed.ncbi.nlm.nih.gov/12927427/
- Rossouw JE, et al. Postmenopausal hormone therapy and risk of cardiovascular disease by age and years since menopause. JAMA. 2007;297(13):1465-1477. https://pubmed.ncbi.nlm.nih.gov/17380512/
- The Menopause Society. 2022 Hormone Therapy Position Statement. Menopause. 2022;29(7):767-794. https://doi.org/10.1097/GME.0000000000002028