Inside the LEADER Methodology: What Most Summaries Skip

At a glance
| Parameter | Detail |
|---|---|
| N | 9,340 |
| Intervention | Liraglutide 1.8 mg daily (subcutaneous) |
| Comparator | Matching placebo (both added to standard of care) |
| Duration | Median 3.8 years (minimum 3.5 years follow-up) |
| Primary endpoint | First occurrence of 3-point MACE (CV death, nonfatal MI, nonfatal stroke) |
| Key result | HR 0.87 (95% CI 0.78, 0.97); p <0.001 for noninferiority, p = 0.01 for superiority |

Why the Design Matters More Than the Headline
Most trial summaries stop at "13% relative risk reduction in MACE." That number, while accurate, obscures a set of design decisions that determine what the result actually means for clinical practice. The LEADER trial was constructed under the 2008 FDA guidance requiring cardiovascular outcomes trials (CVOTs) for all new diabetes drugs. That guidance was a response to the rosiglitazone controversy and set 1.3 as the noninferiority margin: the upper bound of the two-sided 95% confidence interval for the hazard ratio had to fall below it. LEADER was built to clear that bar first. Superiority was a secondary objective, tested only if noninferiority succeeded.
This distinction matters. A trial designed from the start to prove benefit would look different from one designed to rule out harm. LEADER sits in between, and its methodology reflects that tension.
Randomization and Stratification
Participants were randomized 1:1 to liraglutide or placebo using an interactive voice/web response system. Randomization was stratified by two variables: whether a participant had prior cardiovascular disease, and whether estimated glomerular filtration rate (eGFR) was above or below 60 mL/min/1.73 m². These strata were not arbitrary. About 81% of the enrolled population had established cardiovascular disease at baseline, which concentrated events in the group most likely to generate them.
The stratification by renal function was clinically well motivated. Patients with CKD stage 3 or worse metabolize GLP-1 receptor agonists differently, and renal impairment is an independent cardiovascular risk amplifier. By stratifying rather than excluding these patients, the investigators ensured balanced distribution across arms without sacrificing generalizability.
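To make the mechanics concrete, here is a minimal sketch of stratified 1:1 allocation using the two strata described above. The permuted-block scheme, the block size of 4, and the class and method names are illustrative assumptions; the trial's actual voice/web system and its allocation algorithm are not described here.

```python
import random
from collections import defaultdict

ARMS = ("liraglutide", "placebo")


class StratifiedBlockRandomizer:
    """Toy stratified permuted-block randomizer (hypothetical, for illustration)."""

    def __init__(self, block_size=4, seed=0):
        if block_size % len(ARMS) != 0:
            raise ValueError("block_size must be a multiple of the number of arms")
        self.block_size = block_size
        self.rng = random.Random(seed)
        # One queue of not-yet-used assignments per stratum.
        self.pending = defaultdict(list)

    def assign(self, prior_cvd, egfr):
        # Stratum = (prior CV disease?, eGFR below 60 mL/min/1.73 m^2?)
        stratum = (bool(prior_cvd), egfr < 60)
        if not self.pending[stratum]:
            # Start a new balanced block for this stratum and shuffle it.
            block = list(ARMS) * (self.block_size // len(ARMS))
            self.rng.shuffle(block)
            self.pending[stratum] = block
        return self.pending[stratum].pop()


if __name__ == "__main__":
    randomizer = StratifiedBlockRandomizer(block_size=4, seed=42)
    for egfr in (45, 52, 38, 58):
        print(egfr, randomizer.assign(prior_cvd=True, egfr=egfr))
```

Because every completed block within a stratum contains equal numbers of each arm, the arms cannot drift far apart on either stratification variable, which is the balance property the paragraph above describes.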
One point rarely discussed: the LEADER protocol allowed investigators to reduce the dose from 1.8 mg to 1.2 mg if side effects were intolerable. About 6% of liraglutide-treated patients used the lower dose. Because this happened post-randomization, the intention-to-treat analysis includes these patients at whatever dose they received. The effect estimate therefore represents a slightly diluted version of full-dose therapy.
Blinding and the Open-Label Problem
LEADER was double-blind with matching placebo pens. The pens were visually identical, and the injection volume was the same. This is a stronger blinding approach than some CVOTs that used oral placebos for injectable drugs.
But here is where things get complicated. Standard of care was open-label. Both groups received whatever glucose-lowering therapy their physicians chose, excluding other GLP-1 receptor agonists, DPP-4 inhibitors, and pramlintide. Investigators were encouraged to treat to glycemic targets per local guidelines. The result was a 0.4 percentage point HbA1c separation between groups at 36 months (7.0% vs. 7.4%).
This glycemic gap creates an interpretive problem. Was the MACE reduction driven by liraglutide's cardiovascular biology, by better glucose control, or by some combination? The ACCORD trial had shown that aggressive glycemic reduction in a similar population did not reduce MACE and may have increased mortality. That context argues against a purely glycemic explanation for LEADER's result. The separation also narrowed over time as clinicians in the placebo group added more therapy, suggesting that the early glycemic difference may have mattered less than the drug's direct vascular effects.
Inclusion and Exclusion Criteria: Who Was Actually Studied
The enrollment criteria shaped the population in ways that limit extrapolation:
- Age ≥50 with established CV disease (coronary, cerebrovascular, peripheral vascular, CKD stage 3+, or heart failure NYHA class II or III)
- Age ≥60 with CV risk factors (microalbuminuria, hypertension plus LVH, LV dysfunction, or ankle-brachial index <0.9)
- HbA1c ≥7.0%
- Excluded: type 1 diabetes, recent acute coronary or cerebrovascular event within 14 days, planned revascularization, dialysis patients, family or personal history of medullary thyroid carcinoma or MEN2
The exclusion of recent ACS patients is standard for safety but means LEADER says nothing about liraglutide in the immediate post-MI period. The MTC/MEN2 exclusion reflects the rodent thyroid C-cell tumor signal from preclinical studies, which remains in the Victoza label as a boxed warning despite no confirmed human cases.
The practical takeaway: LEADER's population was older (mean age 64), had long diabetes duration (mean 12.8 years), and was heavily burdened with prior CV events. Applying these results to a 45-year-old with newly diagnosed T2D and no cardiovascular history requires a leap the data do not directly support.
Primary Endpoint Definition and Adjudication
The primary composite was time to first occurrence of any component of 3-point MACE: cardiovascular death, nonfatal myocardial infarction, or nonfatal stroke. Each event was adjudicated by an independent committee blinded to treatment assignment, using standardized definitions from the Standardized Data Collection for Cardiovascular Trials Initiative.
Breaking the composite apart reveals an uneven pattern:
| Component | Liraglutide (n) | Placebo (n) | HR (95% CI) |
|---|---|---|---|
| CV death | 219 | 278 | 0.78 (0.66, 0.93) |
| Nonfatal MI | 281 | 317 | 0.88 (0.75, 1.03) |
| Nonfatal stroke | 159 | 177 | 0.89 (0.72, 1.11) |
| Composite MACE | 608 | 694 | 0.87 (0.78, 0.97) |

The composite is driven primarily by cardiovascular death. Nonfatal MI and nonfatal stroke showed directionally favorable but non-significant point estimates. This is clinically important: liraglutide appeared to reduce the hardest endpoint (death) more than the softer ones. Some trialists view this as a strength, because mortality is the least susceptible to ascertainment bias. Others note that a drug reducing death without clearly reducing events raises mechanistic questions about whether it prevents events or improves survival after them.
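As a rough cross-check on the event counts, the sketch below computes crude risk ratios and absolute risk reductions for the composite and for cardiovascular death. The arm sizes (4,668 liraglutide, 4,672 placebo) come from the primary publication rather than from the table above, and the function name is hypothetical. A crude ratio ignores time to event, so it only approximates the Cox hazard ratio, and the number needed to treat applies to the whole follow-up period.

```python
import math

# Arm sizes from the primary publication (assumption: not stated in the table).
N_LIRAGLUTIDE = 4668
N_PLACEBO = 4672

# (liraglutide events, placebo events)
EVENTS = {
    "composite_mace": (608, 694),
    "cv_death": (219, 278),
}


def crude_summary(events_trt, events_ctl, n_trt=N_LIRAGLUTIDE, n_ctl=N_PLACEBO):
    """Crude risk ratio, absolute risk reduction, and NNT over the full follow-up."""
    risk_trt = events_trt / n_trt
    risk_ctl = events_ctl / n_ctl
    arr = risk_ctl - risk_trt
    return {
        "risk_ratio": risk_trt / risk_ctl,
        "abs_risk_reduction": arr,
        "nnt": math.ceil(1 / arr),  # rounded up by convention
    }


if __name__ == "__main__":
    for name, (trt, ctl) in EVENTS.items():
        s = crude_summary(trt, ctl)
        print(f"{name}: RR={s['risk_ratio']:.2f}, "
              f"ARR={100 * s['abs_risk_reduction']:.2f} pp, NNT={s['nnt']}")
```

The crude ratios land close to the reported hazard ratios, which is expected when the arms are nearly equal in size and follow-up time.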
Statistical Architecture
The statistical plan used a hierarchical (closed) testing procedure. Noninferiority for the primary composite was tested first at the one-sided 2.5% level against a margin of 1.30. Only if noninferiority was confirmed could superiority be tested, at the two-sided 5% level. Because the second test is gated on the first, this sequence preserved the overall type I error rate without splitting alpha between the two hypotheses.
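The gatekeeping logic can be sketched from the published hazard ratio and confidence interval alone. This is an approximation under stated assumptions: the standard error is back-calculated from the rounded interval on the log scale, a normal approximation is used, and the function name and return keys are illustrative rather than taken from the trial's analysis plan.

```python
import math
from statistics import NormalDist

_NORMAL = NormalDist()


def gatekeeping_test(hr, ci_low, ci_high, ni_margin=1.30, alpha_two_sided=0.05):
    """Noninferiority first; superiority is tested only if noninferiority holds."""
    z_crit = _NORMAL.inv_cdf(1 - alpha_two_sided / 2)
    # Back-calculate the standard error of log(HR) from the reported CI.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)

    # Step 1: one-sided test of H0: HR >= margin.
    z_ni = (math.log(hr) - math.log(ni_margin)) / se
    p_ni = _NORMAL.cdf(z_ni)
    noninferior = p_ni < alpha_two_sided / 2

    # Step 2: two-sided test of H0: HR == 1, run only if step 1 succeeded.
    p_sup = None
    superior = False
    if noninferior:
        z_sup = math.log(hr) / se
        p_sup = 2 * _NORMAL.cdf(-abs(z_sup))
        superior = p_sup < alpha_two_sided and hr < 1

    return {
        "noninferior": noninferior,
        "p_noninferiority": p_ni,
        "superior": superior,
        "p_superiority": p_sup,
    }


if __name__ == "__main__":
    print(gatekeeping_test(0.87, 0.78, 0.97))
```

Fed the primary result (HR 0.87, 95% CI 0.78 to 0.97), the sketch clears both gates, with p-values in line with the reported p <0.001 and p = 0.01. Small differences from the published values are expected because the inputs are rounded.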
The trial was event-driven: it required at least 611 primary-outcome events for 90% power to demonstrate noninferiority against the 1.30 margin. Superiority is more demanding: under the standard Schoenfeld approximation, 80% power to detect a true hazard ratio of 0.85 takes roughly 1,190 events. The final dataset included 1,302 first MACE events, giving the trial considerably more power than the minimum for noninferiority and adequate power for the superiority test.
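The event arithmetic can be checked with the Schoenfeld approximation for a 1:1 trial. The sketch assumes a one-sided alpha of 2.5%, equal allocation, and, for the noninferiority calculation, a true hazard ratio of 1.0; these are modeling assumptions, and the function names are hypothetical.

```python
import math
from statistics import NormalDist

_NORMAL = NormalDist()


def required_events(hr_null, hr_true, power, alpha_one_sided=0.025):
    """Events needed to reject hr_null when the true HR is hr_true (1:1 allocation)."""
    z_alpha = _NORMAL.inv_cdf(1 - alpha_one_sided)
    z_beta = _NORMAL.inv_cdf(power)
    delta = math.log(hr_null) - math.log(hr_true)
    return math.ceil(4 * (z_alpha + z_beta) ** 2 / delta ** 2)


def power_at(events, hr_null, hr_true, alpha_one_sided=0.025):
    """Approximate power with a given number of events (1:1 allocation)."""
    z_alpha = _NORMAL.inv_cdf(1 - alpha_one_sided)
    delta = abs(math.log(hr_null) - math.log(hr_true))
    return _NORMAL.cdf(delta * math.sqrt(events) / 2 - z_alpha)


if __name__ == "__main__":
    # Noninferiority: reject HR >= 1.30, assuming the true HR is 1.0.
    print(required_events(hr_null=1.30, hr_true=1.00, power=0.90))
    # Power actually available with the 1,302 observed first events.
    print(round(power_at(1302, hr_null=1.30, hr_true=1.00), 3))
    print(round(power_at(1302, hr_null=1.00, hr_true=0.85), 3))
```

Under these assumptions the noninferiority calculation reproduces the 611-event minimum, and 1,302 events give power above 80% for a true hazard ratio of 0.85.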
Time-to-event analysis used Cox proportional hazards models, stratified by the same two randomization variables (prior CV disease and baseline eGFR). The proportional hazards assumption was tested and held. Kaplan-Meier curves separated gradually after about 12 months, consistent with a treatment effect that accumulates with exposure rather than appearing immediately.
One nuance: the protocol specified an "in-trial" estimand, meaning all events counted regardless of whether the participant was still taking the drug. About 26% of liraglutide patients and 30% of placebo patients discontinued study drug prematurely. Because these patients were still followed for events, the intention-to-treat analysis captures real-world adherence patterns. A per-protocol or on-treatment analysis would likely show a larger effect size, but the ITT result is more conservative and more generalizable.
What the Estimand Framework Reveals
The ICH E9(R1) addendum on estimands was published after LEADER, but applying its framework retrospectively is instructive. LEADER's primary estimand is a "treatment policy" strategy: it asks, "What happens when patients are assigned to liraglutide, regardless of whether they keep taking it?"
This is the right estimand for a regulatory question ("Is this drug safe to approve?") but arguably the wrong one for a clinical question ("Will my patient benefit if she takes liraglutide and stays on it?"). A "hypothetical" estimand, modeling what would have happened had everyone adhered, would answer the clinical question. Post-hoc on-treatment analyses from LEADER substudies suggest the on-treatment HR was closer to 0.83, a somewhat stronger signal diluted by the ~28% discontinuation rate in the ITT population.
Clinicians prescribing liraglutide should recognize that the 13% reduction is the conservative, intent-to-treat estimate. Patients who tolerate the drug and remain adherent may derive more benefit, though that inference comes with the usual caveats of on-treatment analysis.
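A back-of-envelope way to see the dilution is to shrink the on-treatment effect on the log-hazard scale in proportion to the share of follow-up spent off drug. This is a deliberately crude heuristic, not the trial's method: it assumes no benefit at all once the drug is stopped, and it uses the discontinuation rate as a stand-in for off-drug person-time, which overstates it because patients stop at different points. The function name is hypothetical.

```python
import math


def diluted_hr(on_treatment_hr, off_drug_fraction):
    """Crude ITT hazard ratio implied by an on-treatment HR and off-drug share.

    Assumes zero treatment effect off drug and linear scaling on the log scale.
    """
    if not 0.0 <= off_drug_fraction <= 1.0:
        raise ValueError("off_drug_fraction must be between 0 and 1")
    return math.exp((1.0 - off_drug_fraction) * math.log(on_treatment_hr))


if __name__ == "__main__":
    # On-treatment HR of 0.83 and ~28% discontinuation, as quoted above.
    print(round(diluted_hr(0.83, 0.28), 2))
```

With the figures quoted above, the heuristic lands near the reported ITT hazard ratio, which is consistent with the dilution story but does not prove it.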
Comparator Choice and Standard of Care
The comparator was placebo plus standard of care, not an active GLP-1 comparator or another glucose-lowering drug class. This was dictated by the FDA CVOT framework, which aims to isolate the cardiovascular effect of the test drug from background therapy.
Standard of care evolved during the trial. Metformin use was ~76% at baseline. Insulin use was ~45%. SGLT2 inhibitors were first approved during the trial's follow-up, after enrollment had closed, and a small percentage of patients received them. This creates a moving standard-of-care problem: the control arm received progressively better therapy over the trial's duration, which would, if anything, bias against liraglutide by improving outcomes in the placebo group.
The absence of an active comparator means LEADER cannot answer whether liraglutide is better than an SGLT2 inhibitor for cardiovascular protection. The SUSTAIN-6 trial later showed semaglutide (a more potent GLP-1 RA) achieved a larger MACE reduction (26%), but cross-trial comparisons are unreliable. The 2018 ADA/EASD consensus subsequently recommended GLP-1 RAs or SGLT2 inhibitors as second-line therapy after metformin in patients with established atherosclerotic disease, giving LEADER direct guideline impact.
Limitations the Authors Acknowledged
The original publication and supplementary materials note several limitations:
- High discontinuation rate (~26 to 30%), typical for long injectable trials but diluting the treatment effect in ITT analysis
- Glycemic separation between arms, making it impossible to fully distinguish glucose-mediated from direct cardiovascular effects
- Population skewed toward secondary prevention (81% with prior CVD), limiting applicability to primary prevention
- Single GLP-1 RA tested, so results are not automatically class-wide
- Open-label rescue therapy, potentially introducing bias through differential co-intervention
A limitation less discussed: the trial ran across 32 countries with varying standards of cardiovascular care. Regional heterogeneity in background therapy, event rates, and clinical practice introduces noise that the stratification variables only partially address.
Clinical Translation
LEADER changed practice. The FDA added a cardiovascular risk reduction indication to the Victoza label in 2017, making liraglutide the first GLP-1 RA approved specifically for cardiovascular benefit. The trial's methodology, while built for a noninferiority regulatory question, generated a superiority result strong enough to shift guidelines and prescribing patterns worldwide.
The critical lesson is that the 13% MACE reduction is not a single number but a product of specific design choices: the population enrolled, the comparator used, the estimand selected, the discontinuation rate tolerated, and the statistical hierarchy applied. Each of these choices is defensible, but each also constrains interpretation. Reading the trial without reading the methods means missing half the story.
References
- Marso SP, Daniels GH, Poulter NR, et al. Liraglutide and cardiovascular outcomes in type 2 diabetes. N Engl J Med. 2016;375(4):311-322.
- U.S. Food and Drug Administration. Guidance for industry: diabetes mellitus, evaluating cardiovascular risk in new antidiabetic therapies to treat type 2 diabetes. December 2008.
- Victoza (liraglutide) prescribing information. Novo Nordisk. Revised 2017.
- Marso SP, Bain SC, Consoli A, et al. Semaglutide and cardiovascular outcomes in patients with type 2 diabetes (SUSTAIN-6). N Engl J Med. 2016;375(19):1834-1844.
- Davies MJ, D'Alessio DA, Fradkin J, et al. Management of hyperglycemia in type 2 diabetes, 2018. A consensus report by the ADA and EASD. Diabetes Care. 2018;41(12):2669-2701.
- ACCORD Study Group. Effects of intensive glucose lowering in type 2 diabetes. N Engl J Med. 2008;358(24):2545-2559.