Honest Criticisms and Limitations of the STEP-2 Trial

At a glance

| Detail | Value |
|---|---|
| N | 1,210 |
| Intervention | Semaglutide 2.4 mg subcutaneous once weekly |
| Comparator | Semaglutide 1.0 mg and placebo (both with lifestyle intervention) |
| Duration | 68 weeks |
| Primary endpoint | Percentage change in body weight from baseline |
| Key result | −9.6% with semaglutide 2.4 mg vs −3.4% with placebo |

What STEP-2 Set Out to Do

The STEP-2 trial was the second in Novo Nordisk's Semaglutide Treatment Effect in People with Obesity program. While STEP-1 enrolled adults without diabetes, STEP-2 specifically recruited adults with a BMI of 27 or higher who also carried a diagnosis of type 2 diabetes. Participants received either semaglutide 2.4 mg, semaglutide 1.0 mg (the dose already approved for glycemic control), or placebo, all layered on top of a lifestyle intervention consisting of dietary counseling and increased physical activity.

The primary endpoint was straightforward: percentage change in body weight at week 68. A co-primary endpoint asked whether participants achieved at least 5% weight reduction. The headline, a 9.6% mean weight loss with the higher semaglutide dose, looked impressive. But the headline is never the full story.

Enrollment Biases That Shaped the Cohort

Clinical trials select for a population that often does not mirror the patients sitting in real-world exam rooms. STEP-2 is no exception. The following framework organizes the major enrollment biases by category and clinical consequence.

The HealthRX STEP-2 Enrollment Bias Matrix

| Bias Category | Specific Concern | Clinical Consequence |
|---|---|---|
| BMI floor | Required BMI ≥27 kg/m²; mean baseline BMI was ~35.7 | Excludes leaner T2D patients common in South and East Asian populations |
| HbA1c cap | Enrolled only HbA1c 7.0 to 10.0% | Removes both well-controlled and severely uncontrolled patients |
| Diabetes duration | Mean duration ~8 years | Excludes newly diagnosed patients and those with >20-year disease courses |
| Background medications | Allowed 1 to 3 oral antidiabetics; excluded insulin and GLP-1 RA users | The very patients most likely to be prescribed semaglutide in practice were filtered out |
| Geographic skew | 149 sites across 12 countries, but weighted toward North America and Europe | Under-represents populations with the steepest T2D incidence growth |
| Run-in compliance | Required completion of a dose-escalation schedule | Self-selects for tolerability; real-world patients who quit at 0.5 mg never enter the dataset |

The HbA1c window of 7.0% to 10.0% is particularly noteworthy. Clinicians frequently encounter patients with A1c values above 10% who need both glycemic and weight management. STEP-2 tells us nothing about how semaglutide 2.4 mg performs in that group. Similarly, the exclusion of anyone already taking insulin or a GLP-1 receptor agonist removed the patients with the most advanced metabolic disease, the ones who arguably need a potent weight-loss intervention most.
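To make the filtering concrete, the criteria in the matrix can be written as a simple screen. The Python sketch below is illustrative only: the `Patient` fields and the `step2_exclusion_reasons` helper are hypothetical names, and the thresholds are the simplified ones quoted in this article, not the full trial protocol.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Patient:
    bmi: float          # kg/m^2
    hba1c: float        # percent
    oral_agents: int    # number of background oral antidiabetic drugs
    on_insulin: bool
    on_glp1_ra: bool


def step2_exclusion_reasons(p: Patient) -> List[str]:
    """Return the reasons a patient would fall outside the simplified
    STEP-2 criteria described in this article (empty list = eligible).
    Assumption: only the thresholds quoted in the bias matrix are checked."""
    reasons = []
    if p.bmi < 27:
        reasons.append("BMI below 27 kg/m^2")
    if not 7.0 <= p.hba1c <= 10.0:
        reasons.append("HbA1c outside 7.0 to 10.0%")
    if p.oral_agents > 3:
        reasons.append("more than 3 oral antidiabetics")
    if p.on_insulin:
        reasons.append("insulin-treated")
    if p.on_glp1_ra:
        reasons.append("already on a GLP-1 receptor agonist")
    return reasons


# A hypothetical lean, insulin-treated patient with an HbA1c of 11.2%
example = Patient(bmi=24.5, hba1c=11.2, oral_agents=2,
                  on_insulin=True, on_glp1_ra=False)
for reason in step2_exclusion_reasons(example):
    print(reason)
```

Running a clinic's own panel through a screen like this gives a rough sense of how many patients the trial evidence actually covers.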

Discontinuation and Missing Data

Roughly 19% of participants in the semaglutide 2.4 mg arm did not complete 68 weeks of treatment. The trial used a treatment-policy estimand (the intention-to-treat principle) for the primary analysis, meaning participants were analyzed as randomized whether or not they stayed on the drug, with missing week-68 weights filled in by multiple imputation. This approach is standard and appropriate, but it also means the 9.6% weight loss figure blends participants who stayed on treatment with observed or imputed values for people who stopped injecting months earlier.

The trial also reported a "trial product estimand" that estimated the effect in participants who stayed on drug. Under that lens, weight loss was closer to 10.6%. The gap between 9.6% and 10.6% may seem small, but it reflects a real dilution from non-adherence. In clinical practice, where adherence barriers include cost, needle aversion, and GI side effects without the support structure of a trial, the dilution would likely be larger.
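A back-of-envelope way to see how non-adherence dilutes the average is to treat the overall result as a weighted mix of two groups. This is a deliberate simplification: the trial's estimands come from statistical models, not from a two-group average, and the function names below are illustrative. The inputs (19% off drug, 10.6% on-drug loss, 9.6% overall) are the figures quoted above.

```python
def blended_mean(frac_off_drug: float, on_drug_mean: float,
                 off_drug_mean: float) -> float:
    """Population-average weight loss (%) as a weighted mix of two groups."""
    return (1 - frac_off_drug) * on_drug_mean + frac_off_drug * off_drug_mean


def implied_off_drug_mean(overall_mean: float, frac_off_drug: float,
                          on_drug_mean: float) -> float:
    """Solve the mixture equation for the off-drug group's mean loss."""
    return (overall_mean - (1 - frac_off_drug) * on_drug_mean) / frac_off_drug


# Toy assumption: 19% off drug, 10.6% loss on drug, 9.6% overall
off_mean = implied_off_drug_mean(9.6, 0.19, 10.6)
print(f"Implied mean loss in the off-drug group: {off_mean:.1f}%")

# What the population average would look like if more patients stopped
for frac in (0.19, 0.30, 0.40):
    print(f"{frac:.0%} off drug -> {blended_mean(frac, 10.6, off_mean):.1f}%")
```

Under these toy assumptions, raising the off-drug fraction from 19% to 40% pulls the population average down by roughly one percentage point, which is the direction of the real-world dilution described above.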

Worth noting: discontinuation due to adverse events was 5.9% in the 2.4 mg group versus 2.8% with placebo, per the published data. Gastrointestinal complaints (nausea, vomiting, diarrhea) drove most of these exits. The trial's dose-escalation protocol was designed to mitigate GI symptoms, yet nearly 1 in 17 participants on the higher dose still could not tolerate it well enough to continue.
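The "1 in 17" figure and the excess risk over placebo follow directly from the two percentages quoted above. The sketch below shows the arithmetic; the function names are illustrative, and the number-needed-to-harm framing is an addition for context, not a statistic reported by the trial.

```python
def one_in_n(rate_pct: float) -> float:
    """Convert a percentage rate into a '1 in N' figure."""
    return 100.0 / rate_pct


def number_needed_to_harm(treated_pct: float, control_pct: float) -> float:
    """Patients treated per one additional adverse-event discontinuation,
    relative to the control arm."""
    excess = treated_pct - control_pct
    if excess <= 0:
        raise ValueError("treated rate must exceed control rate")
    return 100.0 / excess


print(round(one_in_n(5.9), 1))                    # about 1 in 17
print(round(number_needed_to_harm(5.9, 2.8), 1))  # about 32
```

Put differently, on these figures roughly one extra patient in every 32 treated with the 2.4 mg dose stopped because of an adverse event who would not have stopped on placebo.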

Duration: 68 Weeks Is Not Long Enough

Type 2 diabetes is a lifelong condition. Obesity in the context of T2D is also chronic. A 68-week trial tells us what semaglutide does during the first 16 months, but nothing about years two through ten. Several questions remain unanswered by the STEP-2 dataset:

  • Weight regain after discontinuation. The STEP-1 extension data published later showed that participants regained roughly two-thirds of lost weight within a year of stopping semaglutide. STEP-2 did not include a structured withdrawal phase, so we lack parallel data in the T2D population.
  • Durability of HbA1c improvements. Mean HbA1c dropped by 1.6 percentage points in the 2.4 mg arm. Whether this holds at three or five years, or whether beta-cell decline erodes the benefit, is unknown from this dataset.
  • Long-term safety signals. Pancreatitis, gallbladder disease, and thyroid C-cell concerns require longer observation windows than 68 weeks. The Wegovy prescribing information carries a boxed warning about medullary thyroid carcinoma risk based on rodent data. STEP-2's follow-up was too short to add human evidence on this question.

Statistical Considerations Worth Examining

The primary analysis used an analysis of covariance (ANCOVA) with multiple imputation for missing data; a mixed model for repeated measures (MMRM) was used for the trial product estimand. The trial was adequately powered, and the primary comparisons were statistically significant at p < 0.0001. However, several statistical nuances deserve attention.

Multiplicity adjustments. STEP-2 tested two doses against placebo across two co-primary endpoints and several secondary endpoints. A hierarchical testing procedure controlled the family-wise error rate for the endpoints inside the hierarchy, but any analyses outside that hierarchy carry no such protection, and the sheer number of comparisons raises the probability that at least some secondary findings (body composition subanalyses, quality-of-life scores) cleared nominal significance by chance.
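The intuition behind the multiplicity concern can be sketched with the standard formula for unadjusted, independent tests: the chance of at least one false positive is 1 − (1 − α)^k. The number of tests used below is hypothetical, not a count taken from the STEP-2 publication, and real endpoints are correlated, so this is an upper-end illustration and not an estimate for the trial.

```python
def prob_at_least_one_false_positive(n_tests: int, alpha: float = 0.05) -> float:
    """Family-wise error rate for n independent, unadjusted tests when
    every null hypothesis is true."""
    return 1 - (1 - alpha) ** n_tests


# Hypothetical numbers of unadjusted secondary comparisons
for k in (1, 5, 10, 20):
    print(k, round(prob_at_least_one_false_positive(k), 2))
```

With ten unadjusted comparisons the chance of at least one spurious "significant" result is already about 40%, which is why findings outside a pre-specified testing hierarchy deserve extra skepticism.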

Subgroup heterogeneity. Pre-specified subgroup analyses showed that patients with baseline HbA1c above 8.5% lost less weight than those with lower HbA1c values. Patients on three background oral antidiabetics also showed attenuated weight loss relative to those on one drug. These interactions were not powered for statistical testing, and the trial publication appropriately noted they should be considered hypothesis-generating. Clinicians should be cautious about expecting uniform 9.6% reductions across all T2D phenotypes.

Placebo response magnitude. A 3.4% weight loss with placebo plus lifestyle intervention is not trivial. The lifestyle component included monthly counseling sessions and a 500 kcal/day deficit recommendation. Subtracting the placebo arm leaves a drug-attributable difference of about 6.2 percentage points (9.6 minus 3.4). In routine care without such structured support, the lifestyle contribution is likely to be smaller, so readers who credit the full 9.6% to the drug overstate its specific benefit.

Conflict of Interest and Sponsor Involvement

Novo Nordisk funded STEP-2, provided the study drug, and was involved in study design, data collection, data analysis, and manuscript preparation. The trial's steering committee included academic investigators, but the statistical analysis was performed by the sponsor. Several authors disclosed consulting fees, advisory board participation, or equity in Novo Nordisk or competing pharmaceutical companies.

This does not automatically invalidate the findings. Industry-funded trials follow the same regulatory standards as investigator-initiated studies, and the data were reviewed by independent regulatory agencies including the FDA and EMA before approval. But sponsor involvement in statistical analysis means the academic community is trusting the company's analysts to make modeling decisions (handling of outliers, choice of imputation method, sensitivity analysis specifications) that can shift results by fractions of a percentage point. Independent replication of the statistical analysis from patient-level data has not been publicly reported.

The American Diabetes Association's Standards of Care recommend semaglutide for weight management in T2D based heavily on the STEP program. This is reasonable given the totality of evidence, but the circular dependency (sponsor funds trial, trial informs guidelines, guidelines drive prescribing, prescribing generates revenue) is a structural feature of modern drug development that readers should keep in mind.

Post-Publication Commentary

Several points emerged in the scientific discourse following STEP-2's publication:

Comparator choice criticism. Some commentators noted the absence of an active comparator arm using an established weight-loss medication (such as liraglutide 3.0 mg, which was already approved for obesity at the time of enrollment). Placebo-controlled designs establish efficacy against placebo; head-to-head trials establish clinical positioning. The semaglutide 1.0 mg arm was a dose comparison within the same molecule, not a comparison with an approved weight-loss drug, so prescribers cannot directly conclude from STEP-2 that semaglutide 2.4 mg outperforms other pharmacologic options in T2D patients.

Body composition concerns. Weight loss with GLP-1 receptor agonists includes both fat mass and lean mass reduction. STEP-2 did not mandate DEXA scanning across all participants, limiting the ability to characterize the ratio of fat-to-lean tissue loss. Subsequent analyses from the STEP program and independent studies (including work by Wilding et al. in NEJM from STEP-1) suggested lean mass losses of roughly 40% of total weight lost, a proportion that raises concerns about sarcopenia risk in older or already-frail T2D patients.
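To put the lean-mass proportion in absolute terms, the sketch below applies it to a hypothetical 100 kg patient who achieves the trial-average 9.6% loss. The 40% lean fraction is the rough STEP-1-derived figure quoted above, used here as an assumption because STEP-2 itself did not measure it across all participants.

```python
def lean_mass_lost_kg(baseline_kg: float, pct_weight_loss: float,
                      lean_fraction: float = 0.40) -> float:
    """Estimated lean mass lost (kg), assuming a fixed share of total
    weight loss comes from lean tissue."""
    total_lost_kg = baseline_kg * pct_weight_loss / 100
    return total_lost_kg * lean_fraction


# Hypothetical 100 kg patient with the trial-average 9.6% weight loss
print(round(lean_mass_lost_kg(100, 9.6), 1))  # roughly 3.8 kg of lean mass
```

For a frail older patient, a loss of nearly 4 kg of lean tissue is the kind of figure that motivates the sarcopenia concern.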

Cost-effectiveness gap. At a U.S. list price exceeding $1,300 per month for branded Wegovy, several health technology assessment bodies questioned whether the magnitude of weight loss in T2D (less than in the non-diabetic STEP-1 population) justified the cost, particularly when generic metformin and SGLT2 inhibitors offer weight-neutral or modest weight-loss alternatives at a fraction of the price.
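A rough sense of the scale behind this criticism comes from dividing drug spending over the trial period by the placebo-adjusted weight loss. The sketch below uses $1,300 per month as a floor for the list price and ignores rebates, insurance coverage, the dose-escalation period, and discontinuation, so it is an order-of-magnitude illustration and not a formal cost-effectiveness analysis. The function names are illustrative.

```python
def drug_cost_usd(weeks: float, monthly_price_usd: float) -> float:
    """Total list-price spend over a treatment period (12 months = 52 weeks)."""
    return weeks * monthly_price_usd * 12 / 52


def cost_per_percentage_point(total_cost_usd: float,
                              placebo_adjusted_loss_pct: float) -> float:
    """List-price spend per percentage point of placebo-adjusted weight loss."""
    return total_cost_usd / placebo_adjusted_loss_pct


total = drug_cost_usd(weeks=68, monthly_price_usd=1300)
per_point = cost_per_percentage_point(total, 9.6 - 3.4)
print(f"68-week list-price spend: ${total:,.0f}")
print(f"Per percentage point of placebo-adjusted loss: ${per_point:,.0f}")
```

On these assumptions, 68 weeks of treatment costs about $20,400 at list price, or a little over $3,000 per percentage point of weight loss beyond placebo.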

What STEP-2 Does and Does Not Prove

STEP-2 provides solid evidence that semaglutide 2.4 mg produces clinically meaningful weight loss in a selected population of adults with T2D and overweight or obesity over 68 weeks. It does not prove that these results generalize to insulin-treated patients, those with very high or very low HbA1c, lean individuals with T2D, or patients outside of a structured clinical trial environment. It does not address long-term safety, weight maintenance after drug withdrawal, or superiority over active comparators.

These are not fatal flaws. They are the ordinary boundaries of a single Phase 3 trial. The problem is not that STEP-2 has limitations. The problem is that marketing narratives and headline-driven coverage often present the 9.6% figure as if it were a universal guarantee rather than a population-average estimate from a carefully selected cohort studied for just over a year.

References

  • Davies M, et al. Semaglutide 2.4 mg once a week in adults with overweight or obesity, and type 2 diabetes (STEP 2): a randomised, double-blind, double-dummy, placebo-controlled, phase 3 trial. Lancet. 2021;397(10278):971-984.
  • Wilding JPH, et al. Once-weekly semaglutide in adults with overweight or obesity. N Engl J Med. 2021;384(11):989-1002.
  • Wegovy (semaglutide) prescribing information. U.S. Food and Drug Administration.
  • American Diabetes Association Professional Practice Committee. Standards of Care in Diabetes, 2023. Diabetes Care. 2023;46(Suppl 1).
  • Lincoff AM, et al. Semaglutide and cardiovascular outcomes in obesity without diabetes (SELECT). N Engl J Med. 2023;389(24):2221-2232.