Honest Criticisms and Limitations of the REWIND Trial


At a glance

| Detail | Value |
|---|---|
| N | 9,901 |
| Intervention | Dulaglutide 1.5 mg weekly |
| Comparator | Placebo (both added to standard of care) |
| Duration | Median 5.4 years |
| Primary endpoint | First occurrence of 3-point MACE (CV death, nonfatal MI, nonfatal stroke) |
| Key result | HR 0.88 (95% CI 0.79, 0.99; p = 0.026) |

Why a Limitations Page Matters

Most summaries of REWIND stop at the headline: dulaglutide cut MACE by 12% in type 2 diabetes patients, including those without established cardiovascular disease. That result is real, but a single hazard ratio never tells the full story. Clinical decisions depend on understanding who was enrolled, what was measured, how the analysis was run, and what the investigators themselves flagged as weaknesses.

This page catalogs those weaknesses honestly, not to dismiss REWIND, but to clarify where its evidence is strong and where clinicians should exercise caution.

Enrollment Biases and Population Selection

The HealthRX Enrollment Bias Checklist for CVOTs

We apply four questions to every cardiovascular outcomes trial to assess how enrollment choices shape results:

  1. Risk-floor question. Did the trial require established CVD, or did it allow primary-prevention patients? If both, what was the mix?
  2. Baseline therapy question. Were participants already on guideline-directed medical therapy (statins, ACE inhibitors, antiplatelets), or was background treatment suboptimal?
  3. Run-in filter question. Did a placebo run-in period exclude non-adherent patients before randomization?
  4. Geographic representation question. Did recruitment sites reflect the global burden of the disease, or were certain regions overrepresented?

Applying this to REWIND:

Risk floor. REWIND is unusual among GLP-1 CVOTs because only 31.5% of participants had prior cardiovascular events at baseline. The majority qualified through cardiovascular risk factors alone (Gerstein et al., Lancet 2019). This is often cited as a strength, but it also means the absolute event rate was lower (2.7 events per 100 person-years in placebo), which compresses absolute risk reduction. In practical terms, the number needed to treat (NNT) over 5 years is approximately 60, substantially higher than in trials like LEADER (liraglutide), which enrolled a sicker cohort.
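The link between baseline event rate and absolute benefit can be sketched numerically. The example below is a rough illustration, not the trial's own calculation: it assumes a constant hazard (exponential survival) and applies the hazard ratio directly to the placebo event rate. The function names and the 5.0 events per 100 person-years comparison cohort are hypothetical. Because of these simplifications, the output lands in the same range as the figures quoted above but does not reproduce them exactly.

```python
import math


def cumulative_risk(rate_per_100py: float, years: float, hazard_ratio: float = 1.0) -> float:
    """Cumulative event risk under a constant hazard (an assumption)."""
    hazard = rate_per_100py / 100.0 * hazard_ratio
    return 1.0 - math.exp(-hazard * years)


def arr_and_nnt(rate_per_100py: float, years: float, hazard_ratio: float) -> tuple[float, float]:
    """Absolute risk reduction and number needed to treat over `years`."""
    control = cumulative_risk(rate_per_100py, years)
    treated = cumulative_risk(rate_per_100py, years, hazard_ratio)
    arr = control - treated
    return arr, 1.0 / arr


# Inputs taken from the text: 2.7 events per 100 person-years, HR 0.88.
rewind_arr, rewind_nnt = arr_and_nnt(2.7, 5.0, 0.88)

# Hypothetical higher-risk cohort with the same HR, for contrast only.
high_arr, high_nnt = arr_and_nnt(5.0, 5.0, 0.88)

print(f"REWIND-like cohort: ARR {rewind_arr:.3%}, NNT {rewind_nnt:.0f}")
print(f"Higher-risk cohort: ARR {high_arr:.3%}, NNT {high_nnt:.0f}")
```

With the same relative effect, the higher-risk cohort shows a larger absolute risk reduction and a smaller NNT, which is the compression effect described above.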

Baseline therapy. Over 80% of REWIND participants were on statins and antihypertensives at enrollment. Background care was relatively optimized, meaning the incremental benefit of dulaglutide was layered on top of aggressive medical management. Whether the same relative risk reduction would hold in populations with less access to baseline therapies remains unknown.

Run-in period. REWIND used a 3-week single-blind placebo run-in and randomized only participants who adhered to the injections (Gerstein et al., Lancet 2019). Run-in phases of this kind, also used in LEADER, filter out early dropouts before randomization, so adherence and tolerability in the trial may look better than in routine practice.

Geography. The trial enrolled participants across 24 countries, with substantial representation from Latin America and Asia-Pacific alongside North America and Europe. Geographic diversity is a strength, though regional differences in standard-of-care intensity can introduce heterogeneity in background treatment effects.

Follow-Up Duration: Strength and Liability

REWIND's median follow-up of 5.4 years is the longest among completed GLP-1 receptor agonist CVOTs. Longer follow-up increases statistical power and allows slow-acting benefits (atherosclerotic plaque stabilization, weight reduction) to accumulate. It also, however, introduces challenges.

High discontinuation. Approximately 24% of participants in each arm discontinued the study drug prematurely, though vital-status follow-up remained above 97% (Gerstein et al., 2019). Drug discontinuation dilutes the treatment effect in an intention-to-treat analysis. The reported 12% reduction is therefore a conservative estimate of the on-treatment effect, but it also means the trial reflects real-world adherence imperfectly, since clinical practice discontinuation patterns may differ from those in a monitored trial setting.

Event-driven design ambiguity. The protocol targeted at least 1,200 MACE events. The final count was 1,257. The Kaplan-Meier curves separated gradually over time, with no clear early divergence in the first 12 to 18 months. This pattern is consistent with an anti-atherosclerotic mechanism rather than an acute cardioprotective effect. It also means the result depended on accumulating events late in follow-up: the confidence interval sits close enough to unity that it would have crossed 1.0 had fewer events accrued or had the point estimate been slightly weaker.
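How close the interval sits to unity depends on the number of events. The sketch below is a rough illustration under two simplifying assumptions that are not the trial's actual Cox model: the standard error of the log hazard ratio is approximated as sqrt(4 / events), a common approximation for 1:1 randomization, and the point estimate is held fixed at 0.88. The function name is hypothetical. This approximation gives a slightly narrower interval than the published one.

```python
import math


def ci_upper_bound(hazard_ratio: float, events: int, z: float = 1.96) -> float:
    """Approximate upper 95% bound of the HR for a 1:1 trial.

    Uses SE(log HR) ~= sqrt(4 / events), which is an approximation.
    """
    se_log_hr = math.sqrt(4.0 / events)
    return hazard_ratio * math.exp(z * se_log_hr)


# 1,257 is the observed event count; the smaller counts are hypothetical.
for events in (1257, 1100, 1000, 900, 800):
    upper = ci_upper_bound(0.88, events)
    status = "excludes 1.0" if upper < 1.0 else "crosses 1.0"
    print(f"{events:>5} events: upper bound {upper:.3f} ({status})")
```

The upper bound rises as the event count falls, so the same point estimate becomes non-significant once enough events are removed.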

Statistical Considerations and Effect Size

The primary result (HR 0.88; 95% CI 0.79, 0.99; p = 0.026) meets the conventional threshold for statistical significance, but several statistical features merit attention.

Marginal significance

The upper bound of the confidence interval is 0.99. This is technically below 1.00, but just barely. The trial was designed with 90% power to detect a 15% reduction; the observed reduction was 12%. Had the effect been even slightly smaller, the trial would have been formally "negative" by its own statistical plan.
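The margin can be checked from the published numbers alone. The sketch below assumes a normal approximation on the log hazard ratio scale and works from the rounded confidence limits, so the recovered values are approximate. The function names are hypothetical. It recovers the standard error from the interval, recomputes the two-sided p-value, and estimates the weakest hazard ratio that would still have been significant at the same precision.

```python
import math


def se_from_ci(lower: float, upper: float, z: float = 1.96) -> float:
    """Standard error of log(HR) implied by a 95% confidence interval."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)


def two_sided_p(hazard_ratio: float, se_log_hr: float) -> float:
    """Two-sided p-value from a normal approximation on log(HR)."""
    z_stat = math.log(hazard_ratio) / se_log_hr
    return math.erfc(abs(z_stat) / math.sqrt(2.0))


se = se_from_ci(0.79, 0.99)
p_value = two_sided_p(0.88, se)

# Largest HR whose upper 95% bound would still sit at or below 1.0,
# assuming the same standard error.
threshold_hr = math.exp(-1.96 * se)

print(f"Implied SE of log(HR): {se:.4f}")
print(f"Recomputed p-value:    {p_value:.3f}")
print(f"Significance threshold HR: {threshold_hr:.3f}")
```

Under these assumptions the recomputed p-value agrees with the reported 0.026, and the threshold hazard ratio sits only slightly above the observed 0.88.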

Multiplicity and secondary endpoints

The protocol pre-specified a hierarchical testing procedure. Because the primary endpoint was significant, secondary endpoints could be tested. Among components of the composite:

| MACE component | HR (95% CI) |
|---|---|
| Cardiovascular death | 0.91 (0.78, 1.06) |
| Nonfatal myocardial infarction | 0.96 (0.79, 1.16) |
| Nonfatal stroke | 0.76 (0.61, 0.95) |

The composite was driven primarily by nonfatal stroke reduction. Neither cardiovascular death nor nonfatal MI reached significance individually. This does not invalidate the composite, but it raises a question: is the clinical significance of a stroke-driven MACE reduction the same as one driven by mortality?
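The component-level reading can be made explicit with a short check of which confidence intervals exclude 1.0. This is a minimal sketch using only the hazard ratios and intervals quoted above. The data structure and function name are illustrative, and the check says nothing about multiplicity adjustment.

```python
# (HR, lower 95% bound, upper 95% bound) as quoted in the component table.
components = {
    "Cardiovascular death": (0.91, 0.78, 1.06),
    "Nonfatal myocardial infarction": (0.96, 0.79, 1.16),
    "Nonfatal stroke": (0.76, 0.61, 0.95),
}


def excludes_unity(lower: float, upper: float) -> bool:
    """True when the interval lies entirely on one side of 1.0."""
    return upper < 1.0 or lower > 1.0


significant = [
    name for name, (_, lower, upper) in components.items()
    if excludes_unity(lower, upper)
]

for name, (hr, lower, upper) in components.items():
    flag = "excludes 1.0" if excludes_unity(lower, upper) else "includes 1.0"
    print(f"{name}: HR {hr:.2f} ({lower:.2f}, {upper:.2f}) -> {flag}")
```

Only the nonfatal stroke interval excludes 1.0, which is why the composite is described as stroke-driven.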

Subgroup consistency

Pre-specified subgroup analyses showed a consistent direction of effect across age, sex, BMI, HbA1c, and baseline cardiovascular status categories. The interaction p-values were non-significant. The subgroups with and without established cardiovascular disease had the same point estimate (HR 0.87 in each), though both had wide confidence intervals when analyzed separately.

Generalizability Gaps

HbA1c range

Mean baseline HbA1c was 7.3%, and the entry criterion allowed HbA1c values up to 9.5%. The population was, on average, reasonably well controlled. Patients with more severe hyperglycemia (HbA1c >9.5%) were excluded. Whether dulaglutide's cardiovascular effect extends to patients with poorly controlled diabetes was not studied in this trial.

BMI and weight

Mean baseline BMI was approximately 32 kg/m². GLP-1 receptor agonists produce weight loss, and weight reduction is a plausible mediator of cardiovascular benefit. The applicability of REWIND's results to patients with normal or low BMI is uncertain. Patients with BMI >40 kg/m² were also underrepresented, so the size of the effect in severe obesity is equally uncertain.

Renal function

Patients with eGFR <15 mL/min/1.73m² were excluded. Post-hoc analyses of REWIND have explored renal outcomes (Gerstein et al., Lancet 2019, secondary renal analysis), showing favorable effects on a composite renal endpoint. These post-hoc findings are hypothesis-generating, not confirmatory.

Race and ethnicity

The trial enrolled a majority-white population (approximately 76%). Hispanic/Latino participants made up about 20%, and Black participants represented a small minority. Given that cardiovascular risk profiles and GLP-1 receptor agonist pharmacokinetics may differ across racial and ethnic groups, this limits generalizability for underrepresented populations.

Conflict-of-Interest and Sponsor Involvement

REWIND was funded by Eli Lilly and Company. The dulaglutide (Trulicity) prescribing information now prominently features the cardiovascular indication derived from this trial.

Specific concerns:

  • Trial design. Eli Lilly employees were involved in study design, data collection, and data analysis, as disclosed in the publication (Gerstein et al., 2019).
  • Steering committee. The academic steering committee included investigators with financial relationships with Lilly, though the statistical analysis was performed at the Population Health Research Institute (PHRI) at McMaster University, which provides some analytical independence.
  • Publication timing. The primary results were presented at the American Diabetes Association 2019 Scientific Sessions and published simultaneously in the Lancet. This is standard practice for major CVOTs but means the peer-review timeline was compressed.

None of these facts invalidates the data. They do, however, place a responsibility on the reader to examine the results independently rather than accepting the framing at face value. The 2019 ADA Standards of Medical Care and later the 2022 ADA/EASD consensus report incorporated REWIND into their recommendations for GLP-1 RA use in patients with type 2 diabetes and cardiovascular risk.

What Post-Publication Commentary Raised

Letters to the editor and subsequent editorials highlighted several points:

  1. Absolute vs. relative risk. The 12% relative reduction translates to an absolute risk difference of roughly 1.5 percentage points over 5.4 years. For a primary-prevention patient, the absolute benefit is smaller still. Editorials in the Lancet cautioned against conflating relative and absolute benefit when counseling patients.

  2. Composite endpoint composition. The heavy contribution of nonfatal stroke to the composite signal prompted discussion about whether dulaglutide should be positioned specifically as stroke-protective rather than broadly cardioprotective. No standalone stroke prevention indication has been pursued.

  3. Comparisons to other GLP-1 RAs. REWIND's HR of 0.88 is numerically less impressive than LEADER's 0.87 for liraglutide or SUSTAIN-6's 0.74 for semaglutide. Cross-trial comparisons are methodologically inappropriate given different populations and designs, but they inevitably influence clinical preference, especially given semaglutide's larger effect size in a shorter trial.

  4. Dose ceiling. REWIND tested only dulaglutide 1.5 mg weekly, the highest approved dose at the time of enrollment. Higher doses (3.0 mg and 4.5 mg) were later approved for glycemic control. Whether higher doses would yield greater cardiovascular benefit is unknown.

Putting REWIND in Context

REWIND contributed meaningfully to the evidence base for GLP-1 receptor agonists in cardiovascular risk reduction. Its inclusion of primary-prevention patients and extended follow-up filled gaps left by earlier CVOTs. At the same time, the modest effect size, high discontinuation rate, stroke-driven composite, and standard sponsor involvement all warrant transparency in clinical discussions.

For prescribers, the practical takeaway is straightforward: dulaglutide likely reduces MACE in type 2 diabetes patients with cardiovascular risk factors, but the absolute benefit is modest for lower-risk individuals. Shared decision-making should incorporate these nuances rather than defaulting to the headline hazard ratio.

References

  • Gerstein HC, Colhoun HM, Dagenais GR, et al. Dulaglutide and cardiovascular outcomes in type 2 diabetes (REWIND): a double-blind, randomised placebo-controlled trial. Lancet. 2019;394(10193):121-130.
  • Trulicity (dulaglutide) prescribing information. Eli Lilly and Company.
  • American Diabetes Association. Standards of Medical Care in Diabetes, 2019. Diabetes Care. 2019;42(Suppl 1).
  • Davies MJ, Aroda VR, Collins BS, et al. Management of hyperglycaemia in type 2 diabetes, 2022. A consensus report by the ADA and EASD. Diabetologia. 2022;65(12):1925-1966.
  • Marso SP, Daniels GH, Poulter NR, et al. Liraglutide and cardiovascular outcomes in type 2 diabetes (LEADER). N Engl J Med. 2016;375(4):311-322.