Honest Criticisms and Limitations of the DAPA-HF Trial

At a glance

| Parameter | Detail |
|---|---|
| N | 4,744 |
| Intervention | Dapagliflozin 10 mg once daily |
| Comparator | Placebo |
| Duration | Median 18.2 months |
| Primary Endpoint | Composite of worsening heart failure (hospitalization or urgent visit requiring IV therapy) or cardiovascular death |
| Key Result | HR 0.74 (95% CI 0.65 to 0.85; P <0.001), a 26% relative risk reduction |

Why a Limitations Page Exists

The DAPA-HF trial changed clinical practice. Within two years of publication, SGLT2 inhibitors earned guideline-recommended status for heart failure with reduced ejection fraction (HFrEF) in patients with and without type 2 diabetes. That speed of adoption makes rigorous scrutiny more important, not less. Below is a structured evaluation of what the trial did not prove, where its design introduces interpretive caution, and what the post-publication discourse actually said.

Enrollment Demographics and Selection Bias

DAPA-HF enrolled patients from 410 sites across 20 countries, yet the population skewed heavily toward white European men. Roughly 70% of participants were white, 24% were Asian, and 5% were Black. Women comprised only 23.4% of the cohort. For a condition that disproportionately affects Black Americans and older women, these gaps matter.

The trial also required a left ventricular ejection fraction (LVEF) of 40% or below and elevated NT-proBNP levels (≥600 pg/mL, or ≥400 pg/mL with recent hospitalization). Patients with eGFR <30 mL/min/1.73 m² or symptomatic hypotension (systolic BP <95 mmHg) were excluded. These cutoffs removed the sickest renal patients and those with marginal hemodynamics, two groups clinicians encounter regularly.

Background therapy was well optimized: approximately 94% were on ACE inhibitors, ARBs, or sacubitril/valsartan, and 96% were on beta-blockers. That level of optimization does not reflect real-world heart failure clinics, where medication titration often remains incomplete. The 2022 AHA/ACC/HFSA Guideline acknowledged this gap when recommending SGLT2 inhibitors alongside, rather than conditional on, full neurohormonal blockade.

Follow-Up Duration: 18 Months Is Not Enough

The median follow-up of 18.2 months is short for a chronic disease with a 5-year mortality rate exceeding 50%. The primary publication reported event curves that separated early and remained parallel, which is encouraging. But parallel curves are not the same as sustained or widening benefit. Several post-publication letters raised the question: does dapagliflozin's hemodynamic unloading effect plateau, or does it continue to accrue benefit over years?

The DAPA-HF extension study attempted to address durability, but open-label extensions introduce their own biases (patients who tolerate the drug self-select into continuation). No placebo-controlled data beyond 18 months exist from this trial.

The HealthRX Limitation-Severity Framework for DAPA-HF

To organize these criticisms beyond a simple list, we scored each limitation on two axes: (1) how likely it is to change the clinical conclusion if addressed, and (2) how feasible it is to resolve with existing or near-term data.

| Limitation | Conclusion Risk | Resolvability | Net Concern |
|---|---|---|---|
| Short follow-up (18.2 mo) | Moderate | Low (no long-term RCT planned) | High |
| Composite endpoint weighting | Low-Moderate | High (component data published) | Moderate |
| Demographic skew (race, sex) | Moderate | Moderate (DELIVER, registry data) | Moderate |
| eGFR <30 exclusion | Moderate | Low (safety concern limits enrollment) | High |
| Industry sponsorship/COI | Low | N/A (structural) | Low-Moderate |
| Background therapy optimization | Low | Moderate (pragmatic trials possible) | Low |

This framework helps clinicians triage which limitations should temper confidence the most. Short follow-up and renal exclusion stand out as the least resolvable gaps.
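The triage step can be sketched in a few lines of Python. This is a minimal illustration, not part of the framework itself: the rows are transcribed from the table above, while the `Limitation` class, the `CONCERN_RANK` ordering, and the `by_net_concern` helper are hypothetical names introduced here, and the ordinal ranking of the labels is an assumption made only so the rows can be sorted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limitation:
    name: str
    conclusion_risk: str
    resolvability: str
    net_concern: str


# Rows transcribed from the framework table (parenthetical notes omitted).
LIMITATIONS = [
    Limitation("Short follow-up (18.2 mo)", "Moderate", "Low", "High"),
    Limitation("Composite endpoint weighting", "Low-Moderate", "High", "Moderate"),
    Limitation("Demographic skew (race, sex)", "Moderate", "Moderate", "Moderate"),
    Limitation("eGFR <30 exclusion", "Moderate", "Low", "High"),
    Limitation("Industry sponsorship/COI", "Low", "N/A", "Low-Moderate"),
    Limitation("Background therapy optimization", "Low", "Moderate", "Low"),
]

# Assumption: an ordinal ranking of the net-concern labels, used for sorting only.
CONCERN_RANK = {"Low": 0, "Low-Moderate": 1, "Moderate": 2, "High": 3}


def by_net_concern(rows):
    """Return rows ordered from highest to lowest net concern (stable sort)."""
    return sorted(rows, key=lambda r: CONCERN_RANK[r.net_concern], reverse=True)


ranked = by_net_concern(LIMITATIONS)
top = [r.name for r in ranked if r.net_concern == "High"]

for r in ranked:
    print(f"{r.net_concern:<13} {r.name}")
```

Here `top` holds the two limitations rated High, both of which pair a Moderate conclusion risk with Low resolvability.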

Composite Endpoint Design: Unpacking the 26%

The primary endpoint combined three components: heart failure hospitalization, an urgent heart failure visit requiring intravenous therapy, and cardiovascular death. Composite endpoints boost statistical power but can obscure which component drives the result.

In DAPA-HF, the composite was driven primarily by reductions in heart failure hospitalization (HR 0.70) and cardiovascular death (HR 0.82). The urgent-visit component contributed relatively few events. The cardiovascular death reduction, while directionally favorable, carried a wider confidence interval (95% CI 0.69 to 0.98) and was analyzed as a component of the composite rather than as a standalone endpoint in the pre-specified hierarchical testing plan described in the trial protocol. Its nominal significance is therefore supportive rather than confirmatory.

This matters clinically. A drug that prevents hospitalizations is valuable. A drug that prevents death is transformational. DAPA-HF demonstrated the former convincingly but left the latter as probable rather than proven.
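The headline primary-endpoint figures quoted in the at-a-glance table (HR 0.74, 95% CI 0.65 to 0.85) can be sanity-checked with a short calculation. The sketch below assumes the interval is a standard Wald interval, symmetric on the log scale; that is a common convention for Cox models, but it is an assumption here rather than something the trial report is quoted as stating, and the variable names are illustrative.

```python
import math

# Figures quoted in the at-a-glance table.
hr, ci_low, ci_high = 0.74, 0.65, 0.85

# Relative risk reduction implied by the hazard ratio.
rrr = 1.0 - hr

# Assumption: the 95% CI is symmetric on the log scale (Wald interval),
# so its width equals 2 * 1.96 standard errors of log(HR).
se_log_hr = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)

# Wald z statistic and two-sided p-value from the standard normal distribution.
z = math.log(hr) / se_log_hr
p_two_sided = math.erfc(abs(z) / math.sqrt(2))

print(f"RRR = {rrr:.0%}")
print(f"SE(log HR) = {se_log_hr:.3f}")
print(f"z = {z:.2f}, two-sided p = {p_two_sided:.1e}")
```

Under that assumption the recovered p-value falls well below 0.001, consistent with the reported result. The same back-calculation applied to a component with a wider interval yields a larger standard error and a weaker z statistic, which is the arithmetic behind treating component results more cautiously than the composite.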

Diabetes Subgroup: Strength or Limitation?

About 45% of DAPA-HF participants had type 2 diabetes and 55% did not. The interaction p-value for diabetes status was non-significant, and the authors reported consistent benefit regardless of glycemic status. This finding was the trial's most practice-changing contribution.

But each subgroup on its own (n = 2,139 with diabetes, n = 2,605 without) yields a wider confidence interval for the treatment effect than the full cohort does. The point estimate actually favored a slightly larger effect in non-diabetic patients (HR 0.73 vs. 0.75), but statistical imprecision makes it difficult to claim the benefit is truly identical. The EMPEROR-Reduced trial with empagliflozin later confirmed the non-diabetic benefit, which strengthened the DAPA-HF finding through replication rather than through DAPA-HF's own statistical power.

Conflicts of Interest and Sponsorship

AstraZeneca funded DAPA-HF, provided the study drug, and employed several co-authors. The steering committee included both academic investigators and AstraZeneca employees. This is standard for large cardiovascular outcome trials, but standard does not mean without consequence.

Industry-sponsored cardiovascular trials are more likely to report positive primary endpoints than investigator-initiated studies, according to a systematic review in BMJ. The data monitoring committee was independent, and the statistical analysis was performed at the Glasgow Clinical Trials Unit, which provides some insulation. Still, AstraZeneca controlled site selection, data collection infrastructure, and had input on the analysis plan.

Post-publication, at least two editorials noted that the trial's stopping rules and event-driven design meant AstraZeneca could influence the effective follow-up duration. A faster enrollment pace or earlier-than-expected event accrual could allow the trial to report positive results at a time point that maximizes the observed treatment effect.

Generalizability Gaps the Trial Acknowledged

To the authors' credit, the DAPA-HF publication listed several limitations directly:

  • Patients with type 1 diabetes were excluded.
  • Patients with eGFR <30 mL/min/1.73 m² were excluded, leaving the sickest CKD patients unstudied.
  • NYHA Class IV patients made up only 1% of enrollees, despite being the group with the highest unmet need.
  • HFpEF (preserved ejection fraction) patients were not included. The DELIVER trial later addressed this gap.

What the authors did not emphasize: geographic representation was heavily European. Latin American and African sites were underrepresented relative to global heart failure burden. Patients on sacubitril/valsartan at baseline comprised only about 11%, reflecting the drug's limited uptake in 2017-2018. Whether dapagliflozin adds incremental benefit on top of fully optimized sacubitril/valsartan remains less certain from this dataset alone.

What Post-Publication Commentary Surfaced

Several letters to the editor and rapid responses after the September 2019 NEJM publication raised points that did not appear in the original manuscript:

  1. Diuretic effect confounding. SGLT2 inhibitors cause osmotic diuresis. Some commentators argued that the reduction in heart failure hospitalization could partly reflect a diuretic-like volume effect rather than true disease modification. The counter-argument, supported by biomarker substudies, is that NT-proBNP reductions and the cardiovascular death trend suggest effects beyond volume management.

  2. Lack of active comparator. DAPA-HF compared dapagliflozin to placebo, not to intensified diuretic therapy or other interventions. A patient receiving an extra 20 mg of furosemide might achieve similar short-term volume reduction. No head-to-head data addressed this.

  3. Ketone body hypothesis. Some researchers proposed that SGLT2 inhibitors improve cardiac energetics by shifting fuel substrate toward ketone bodies. This mechanism remains speculative and was not tested in DAPA-HF. The trial tells us "what" but not "why," which limits mechanistic confidence.

Safety Signals Worth Noting

Dapagliflozin was well tolerated in the trial: discontinuation rates were similar between groups (10.5% on dapagliflozin vs. 10.9% on placebo). Volume depletion events occurred in 7.5% of the dapagliflozin group vs. 6.8% on placebo, a small absolute difference. Diabetic ketoacidosis occurred in 3 patients on dapagliflozin, all with type 2 diabetes.

The concern is what the trial could not detect. With only 18 months of exposure and exclusion of patients with severe renal impairment, long-term renal safety, Fournier's gangrene incidence, and bone fracture risk could not be adequately assessed. The FDA label for Farxiga carries warnings for all three based on data pooled across indications.

Placing DAPA-HF in Context

DAPA-HF was the first completed SGLT2 inhibitor trial in HFrEF. It opened a therapeutic class. Subsequent trials confirmed and extended the finding: EMPEROR-Reduced for empagliflozin in HFrEF, DELIVER for dapagliflozin in HFpEF, and SOLOIST-WHF for sotagliflozin in acute decompensated heart failure. The cumulative evidence is strong. But the cumulative evidence is not the same as the DAPA-HF evidence alone.

Clinicians citing DAPA-HF to justify dapagliflozin in a 75-year-old Black woman with NYHA Class IV symptoms and eGFR 25 are extrapolating well beyond what this trial tested. That extrapolation may be reasonable, but it should be acknowledged as extrapolation.

References

  • McMurray JJV, Solomon SD, Inzucchi SE, et al. Dapagliflozin in Patients with Heart Failure and Reduced Ejection Fraction. N Engl J Med. 2019;381(21):1995-2008.
  • Packer M, Anker SD, Butler J, et al. Cardiovascular and Renal Outcomes with Empagliflozin in Heart Failure (EMPEROR-Reduced). N Engl J Med. 2020;383(15):1413-1424.
  • Solomon SD, McMurray JJV, Claggett B, et al. Dapagliflozin in Heart Failure with Mildly Reduced or Preserved Ejection Fraction (DELIVER). N Engl J Med. 2022;387(12):1089-1098.
  • Heidenreich PA, Bozkurt B, Aguilar D, et al. 2022 AHA/ACC/HFSA Guideline for the Management of Heart Failure. Circulation. 2022;145(18):e895-e1032.
  • FDA. Farxiga (dapagliflozin) Prescribing Information. 2020.