Honest Criticisms and Limitations of the STEP-3 Trial

At a glance
| Detail | Value |
|---|---|
| N | 611 (randomized 2:1) |
| Intervention | Semaglutide 2.4 mg subcutaneous weekly + IBT (30 counseling sessions) |
| Comparator | Placebo subcutaneous weekly + IBT (30 counseling sessions) |
| Duration | 68 weeks (initial 8-week low-calorie diet phase, then 60-week treatment) |
| Primary endpoint | Percentage change in body weight from baseline to week 68 |
| Key result | -16.0% semaglutide+IBT vs -5.7% placebo+IBT (estimated treatment difference: -10.3 percentage points, 95% CI -12.0 to -8.6; p <0.001) |
| ClinicalTrials.gov | NCT03611582 |
What STEP-3 Actually Tested
STEP-3 was one of four phase 3 trials in the Semaglutide Treatment Effect in People with Obesity (STEP) program. Its distinguishing feature was pairing semaglutide 2.4 mg with IBT, a structured counseling program involving 30 sessions over 68 weeks. Both arms received the same IBT protocol, plus an initial 8-week low-calorie diet (1,000 to 1,200 kcal/day using meal replacements). The trial randomized 611 adults without diabetes at 41 U.S. sites.
The design aimed to answer whether semaglutide adds clinically meaningful weight loss on top of an already intensive lifestyle program. It did. But the design also introduced several layers of complexity that limit how cleanly the results translate to typical clinical practice.
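As a rough sanity check on the headline estimate, the standard error and an approximate two-sided p-value can be back-calculated from the reported 95% CI. This is a minimal sketch that assumes the interval is symmetric and based on a normal approximation; the trial's actual analysis model may differ, and the variable names are illustrative.

```python
import math

# Values quoted in the at-a-glance table (percentage points).
ETD = -10.3                    # estimated treatment difference
CI_LOW, CI_HIGH = -12.0, -8.6  # reported 95% confidence interval

Z_95 = 1.959964                # normal quantile for a two-sided 95% interval

# Assumption: symmetric Wald-type interval, so half-width = Z_95 * SE.
se = (CI_HIGH - CI_LOW) / (2 * Z_95)
z = ETD / se

# Two-sided p-value under the normal approximation.
p_two_sided = math.erfc(abs(z) / math.sqrt(2))

print(f"Implied SE: {se:.2f} percentage points")
print(f"Implied z:  {z:.1f}")
print(f"p < 0.001:  {p_two_sided < 0.001}")
```

Under these assumptions the implied z-statistic is roughly -12, which is consistent with the reported p <0.001. The check says nothing about generalizability, only that the reported interval and p-value agree with each other.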
Enrollment Bias: Who Was Actually Studied
The participant pool raises immediate generalizability questions. According to the published baseline characteristics, 77% of enrolled participants were women and 84% were white. Hispanic/Latino participants accounted for roughly 14% of the cohort. Black participants comprised approximately 10%, even though CDC prevalence data show obesity affecting Black Americans at higher rates than any other racial group.
HealthRX Enrollment Representativeness Scorecard
| Dimension | STEP-3 Cohort | U.S. Obesity Population | Gap Assessment |
|---|---|---|---|
| Sex (% female) | 77% | ~55% | Over-represented |
| Race (% white) | 84% | ~58% | Over-represented |
| Race (% Black) | ~10% | ~22% | Under-represented |
| Ethnicity (% Hispanic) | ~14% | ~18% | Slightly under-represented |
| Type 2 diabetes | Excluded | ~30-40% of obese adults | Entirely excluded |
| Mean BMI | 38.0 kg/m² | Varies widely | Mid-range class II |
| Age range (mean) | ~46 years | Spans 18-75+ | Narrower window |
| Geographic scope | 41 U.S. sites only | Global burden | U.S.-centric |
This scorecard highlights that the trial population was meaningfully whiter, more female, and metabolically healthier than the real-world population most likely to receive semaglutide prescriptions. Excluding type 2 diabetes was intentional (STEP-2 covered that population), but the cumulative demographic skew limits confidence in extrapolating effect sizes across groups.
All 41 trial sites were located in the United States. This is not inherently a flaw, but it constrains generalizability to populations with different dietary patterns, healthcare access models, and obesity phenotypes. The Wegovy prescribing information does not restrict use by race or geography, yet the evidence base supporting it leans on populations that do not mirror its likely global user base.
The IBT Confound: Isolating Drug Effect Is Harder Than It Looks
Both arms received 30 sessions of structured behavioral counseling. This was the point: STEP-3 tested semaglutide as an add-on to IBT. But this design creates an interpretive challenge.
The placebo+IBT arm lost 5.7% of body weight. That is a clinically meaningful result on its own, exceeding the 5% weight-loss threshold that the FDA's 2007 obesity drug guidance treats as clinically meaningful. So clinicians reading STEP-3 need to ask: how much of the 16.0% in the semaglutide arm reflects the drug alone, how much reflects IBT alone, and how much reflects a synergistic interaction?
STEP-1, which tested semaglutide 2.4 mg against placebo with standard lifestyle counseling (not IBT), showed 14.9% weight loss in the drug arm versus 2.4% for placebo. Comparing across trials is imprecise, but the rough arithmetic suggests IBT may have added approximately 1 percentage point to semaglutide's effect while contributing roughly 3.3 additional percentage points in the placebo arm. This pattern hints that the drug-IBT interaction may not be as synergistic as the headline number implies. IBT helps, but it helps the placebo group proportionally more.
The trial did not include a semaglutide-without-IBT arm. Without that control, the additive contribution of IBT on top of semaglutide remains an estimate, not a measured quantity.
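The cross-trial arithmetic above can be written out explicitly. This is a minimal sketch using only the percentages quoted in this article; the function names are illustrative, and the subtraction is a naive comparison across two separate trials, not a randomized estimate of the IBT effect.

```python
# Mean % body weight loss as quoted in the text.
STEP1 = {"semaglutide": 14.9, "placebo": 2.4}  # standard lifestyle counseling
STEP3 = {"semaglutide": 16.0, "placebo": 5.7}  # intensive behavioral therapy (IBT)

def ibt_increment(arm: str) -> float:
    """Naive cross-trial estimate of what IBT added in one arm (percentage points)."""
    return round(STEP3[arm] - STEP1[arm], 1)

def drug_minus_placebo(trial: dict) -> float:
    """Raw between-arm difference within a single trial (percentage points)."""
    return round(trial["semaglutide"] - trial["placebo"], 1)

ibt_on_drug = ibt_increment("semaglutide")
ibt_on_placebo = ibt_increment("placebo")

print(f"IBT increment, semaglutide arm: {ibt_on_drug} pp")
print(f"IBT increment, placebo arm:     {ibt_on_placebo} pp")
print(f"Drug minus placebo, STEP-1:     {drug_minus_placebo(STEP1)} pp")
print(f"Drug minus placebo, STEP-3:     {drug_minus_placebo(STEP3)} pp")
```

The raw subtractions here are simple differences of the quoted means, so they can differ slightly from model-based treatment differences reported in the publications. Differences in populations, sites, and the meal-replacement lead-in mean the two increments are suggestive at best.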
Duration and Durability: The 68-Week Ceiling
Obesity is a chronic disease. The Endocrine Society's 2024 clinical practice guideline explicitly frames pharmacotherapy as long-term, potentially lifelong treatment. STEP-3's 68-week window captures the weight-loss phase but tells us very little about what happens at years two, three, or five.
The STEP-1 extension study demonstrated significant weight regain after semaglutide discontinuation: participants regained approximately two-thirds of their lost weight within one year of stopping the drug. STEP-3 did not include a structured off-treatment follow-up period, so clinicians are left extrapolating from adjacent data.
Three specific durability questions remain unanswered by STEP-3:
- Weight trajectory beyond 68 weeks. Weight loss curves in the semaglutide arm were still trending downward at week 68, suggesting the nadir may not have been reached. Whether additional weight loss or a plateau follows is unknown from this trial.
- IBT adherence decay. Thirty counseling sessions over 68 weeks is resource-intensive, and real-world IBT completion rates are substantially lower than those achieved under trial conditions. STEP-3 provides no data on outcomes when patients complete only half the sessions.
- Metabolic parameter stability. Improvements in waist circumference, blood pressure, and lipids tracked with weight loss. Whether these improvements persist, plateau, or reverse after week 68 was not assessed.
Statistical Considerations and Missing Data
The trial used a treatment-policy estimand (intention-to-treat principle) as the primary analysis, which includes data from all randomized participants regardless of treatment adherence. This is the correct regulatory approach. But the supplementary estimand (in which data after treatment discontinuation were excluded) showed an even larger treatment effect, meaning the ITT result is conservative relative to the on-treatment effect.
Discontinuation rates matter here. Approximately 5.8% of semaglutide participants stopped treatment due to adverse events, primarily gastrointestinal. The multiple imputation methods used to handle missing data assume data are missing at random. If participants who dropped out had systematically different trajectories (which GI-intolerant patients plausibly would), that assumption is violated and the estimate could be biased in either direction.
The trial's co-primary endpoints were percentage body weight change and the proportion achieving at least 5% weight loss. Both were met with p <0.001. Secondary endpoints (10%, 15%, 20% weight loss thresholds) were tested in a hierarchical fashion. All were statistically significant, which is a strong signal, but under hierarchical gatekeeping each threshold can be claimed as confirmatory only if every endpoint ahead of it in the sequence was also significant. That condition held here, but the structure is worth noting for methodological transparency.
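The gatekeeping logic can be illustrated with a fixed-sequence testing sketch. This is a generic illustration, not the trial's statistical analysis plan: the endpoint order, the alpha level, and every p-value below are hypothetical placeholders chosen only to show how the gate behaves.

```python
from typing import List, Tuple

def fixed_sequence(tests: List[Tuple[str, float]], alpha: float = 0.05) -> List[Tuple[str, str]]:
    """Test endpoints in a pre-specified order; stop confirming at the first failure."""
    results = []
    gate_open = True
    for name, p_value in tests:
        if not gate_open:
            # An earlier endpoint failed, so this one cannot be claimed as confirmatory.
            results.append((name, "not tested"))
        elif p_value < alpha:
            results.append((name, "confirmed"))
        else:
            results.append((name, "failed"))
            gate_open = False
    return results

# HYPOTHETICAL p-values: every endpoint passes, mirroring the pattern described in the text.
all_pass = [(">=10% loss", 0.0005), (">=15% loss", 0.0005), (">=20% loss", 0.0005)]

# HYPOTHETICAL counterfactual: the middle endpoint misses, closing the gate for the last one.
middle_fails = [(">=10% loss", 0.0005), (">=15% loss", 0.20), (">=20% loss", 0.0005)]

for label, sequence in [("all pass", all_pass), ("middle fails", middle_fails)]:
    print(label, fixed_sequence(sequence))
```

In the counterfactual, the last endpoint has a small p-value but still cannot be claimed, because the endpoint before it failed. That is the sense in which each result depends on the ones ahead of it.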
Adverse Events and the GI Tolerability Problem
Gastrointestinal adverse events occurred in 82.8% of the semaglutide arm versus 63.2% of the placebo arm. Nausea affected 57.1% of semaglutide-treated participants. These rates are high enough to question real-world adherence outside of a clinical trial's structured support.
| Adverse Event | Semaglutide + IBT | Placebo + IBT |
|---|---|---|
| Any GI event | 82.8% | 63.2% |
| Nausea | 57.1% | 28.3% |
| Diarrhea | 34.5% | 21.6% |
| Vomiting | 21.9% | 7.4% |
| Constipation | 29.5% | 14.9% |
| Discontinuation due to AE | 5.8% | 2.9% |
The 8-week initial low-calorie diet phase may have contributed to GI symptoms in both arms, making it difficult to attribute all gastrointestinal events solely to the drug. This confound was not fully addressed in the published analysis.
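Because both arms shared the low-calorie diet phase, the between-arm difference is arguably more informative than the raw semaglutide rates. The sketch below converts the table's percentages into absolute risk differences, relative risks, and a crude number needed to harm (NNH) over the trial period. It uses only the rates quoted above, ignores confidence intervals, and the helper name `summarize` is illustrative.

```python
# Event rates (%) as quoted in the adverse event table:
# (semaglutide + IBT, placebo + IBT).
RATES = {
    "Any GI event": (82.8, 63.2),
    "Nausea": (57.1, 28.3),
    "Diarrhea": (34.5, 21.6),
    "Vomiting": (21.9, 7.4),
    "Constipation": (29.5, 14.9),
    "Discontinuation due to AE": (5.8, 2.9),
}

def summarize(drug_pct: float, placebo_pct: float) -> dict:
    """Crude between-arm comparison from two percentages (no uncertainty estimate)."""
    risk_diff = drug_pct - placebo_pct
    return {
        "risk_difference_pp": round(risk_diff, 1),
        "relative_risk": round(drug_pct / placebo_pct, 2),
        # NNH = 1 / absolute risk difference, with the difference as a proportion.
        "nnh": round(100 / risk_diff, 1),
    }

summary = {event: summarize(*pair) for event, pair in RATES.items()}

for event, s in summary.items():
    print(f"{event}: +{s['risk_difference_pp']} pp, RR {s['relative_risk']}, NNH ~{s['nnh']}")
```

Read this way, nausea shows an excess of about 29 percentage points over placebo+IBT, or roughly one additional case for every three to four people treated. These are point estimates from published percentages, not a substitute for the trial's own safety analysis.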
Conflict of Interest and Funding
Novo Nordisk funded STEP-3. Novo Nordisk employees co-designed the protocol, participated in data analysis, and co-authored the publication. Several academic investigators reported consulting fees, advisory board payments, or research grants from Novo Nordisk or competing pharmaceutical companies.
This does not automatically invalidate the results. Industry-funded trials with pre-registered protocols and independent statistical analysis can produce reliable data. But the funding structure matters for two reasons. First, endpoint selection, dose escalation schedules, and comparator design reflect sponsor priorities. Second, post-hoc analyses and publication timing can be influenced by commercial considerations. The STEP-3 protocol was registered prospectively, and the endpoints match the registration. That is reassuring but does not eliminate all bias.
What Post-Publication Commentary Surfaced
Several letters to the editor and invited commentaries raised points after the 2021 JAMA publication:
- The active comparator question. Multiple commentators noted that comparing semaglutide+IBT to placebo+IBT does not tell clinicians how semaglutide+IBT compares to semaglutide alone or to semaglutide plus standard (non-intensive) counseling. This is arguably the question most relevant to clinical decision-making and resource allocation.
- The meal replacement lead-in. The 8-week low-calorie diet using meal replacements is not standard of care in most obesity clinics. Starting all participants from a diet-induced weight-loss baseline may have inflated the total weight change numbers relative to what a patient starting semaglutide from their habitual diet would experience.
- Cost-effectiveness silence. STEP-3 reported efficacy but not cost data. Given Wegovy's list price (approximately $1,300/month at U.S. launch) plus the cost of 30 IBT sessions, the total treatment cost is substantial. No formal cost-effectiveness analysis was embedded in the trial or published alongside it.
- Cardiovascular outcomes. STEP-3 was not powered for cardiovascular events. The subsequent SELECT trial addressed this gap, but SELECT enrolled a different population: adults with established cardiovascular disease and without diabetes. Whether the STEP-3 population specifically derives cardiovascular benefit remains unproven.
Bottom Line for Clinicians
STEP-3 demonstrated that semaglutide 2.4 mg adds approximately 10 percentage points of weight loss on top of intensive behavioral therapy. That is a real and clinically meaningful effect. The criticisms here do not erase that finding. They define its boundaries.
The trial tells you what happens in predominantly white American women without diabetes who complete 30 counseling sessions over 68 weeks in an industry-sponsored protocol. It does not tell you what happens in the broader population, over longer time horizons, or in healthcare systems where IBT at this intensity is unavailable. Prescribers should apply the effect estimate with appropriate uncertainty for patients who fall outside the trial's narrow demographic window.
References
- Wadden TA, Bailey TS, Billings LK, et al. Effect of subcutaneous semaglutide vs placebo as an adjunct to intensive behavioral therapy on body weight in adults with overweight or obesity: the STEP 3 randomized clinical trial. JAMA. 2021;325(14):1403-1413.
- Wilding JPH, Batterham RL, Calanna S, et al. Once-weekly semaglutide in adults with overweight or obesity (STEP 1). N Engl J Med. 2021;384(11):989-1002.
- Lincoff AM, Brown-Frandsen K, Colhoun HM, et al. Semaglutide and cardiovascular outcomes in obesity without diabetes (SELECT). N Engl J Med. 2023;389(24):2221-2232.
- Wegovy (semaglutide) prescribing information. U.S. Food and Drug Administration.
- Heymsfield SB, Hu HH, Engel SB. Obesity pharmacotherapy: current status and emerging approaches. Endocr Rev. 2024;45(4):489-512.