Inside the STEP-3 Methodology: What Most Summaries Skip

At a glance

| Parameter | Detail |
|---|---|
| Trial | STEP-3 (Semaglutide Treatment Effect in People with Obesity 3) |
| N | 611 randomized (407 semaglutide, 204 placebo) |
| Intervention | Semaglutide 2.4 mg subcutaneous weekly + intensive behavioral therapy (IBT) including initial 8-week low-calorie diet |
| Comparator | Placebo injection weekly + identical IBT program |
| Duration | 68 weeks (16-week dose escalation, 52 weeks at maintenance dose) |
| Primary endpoint | Percentage change in body weight from randomization to week 68 |
| Key result | −16.0% (semaglutide + IBT) vs −5.7% (placebo + IBT); estimated treatment difference −10.3 percentage points (Wadden et al., JAMA 2021) |

Why the Comparator Matters More Than the Drug

Most trial summaries lead with the 16% weight loss figure. That number is real, but it is inseparable from the comparator design. STEP-3 did not test semaglutide against diet advice and a pamphlet. Both arms received 30 individual counseling sessions over 68 weeks, plus an initial 8-week low-calorie diet phase (1,000 to 1,200 kcal/day using meal replacements). The placebo arm lost 5.7% of body weight, a result that would be clinically meaningful on its own in many obesity trials (Wadden et al., JAMA 2021).

This design choice was intentional. The STEP program included four key trials, each isolating a different clinical question. STEP-1 tested semaglutide against placebo, with both arms receiving less intensive lifestyle counseling. STEP-3 asked a narrower question: for patients already receiving the most intensive non-surgical behavioral treatment available, does adding semaglutide produce additional weight loss? The answer is yes, about 10 extra percentage points, but that framing changes how the result should be applied.

A clinician reading only the headline might assume 16% weight loss is the default expectation for any patient starting semaglutide. It is not. That figure reflects a combination of pharmacotherapy, caloric restriction through meal replacements, and high-frequency behavioral counseling. Participants who received semaglutide without the IBT component (as in STEP-1) lost 14.9%. Cross-trial comparisons are not randomized and should be read cautiously, but the gap suggests that IBT and the low-calorie diet added only about 1 percentage point on top of the drug effect seen with standard counseling.
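The two comparisons in this section carry very different evidential weight, and the arithmetic is easy to check. The short Python sketch below uses only the point estimates quoted above; the variable names are illustrative, and the simple subtraction is a stand-in for the model-based estimates reported in the papers.

```python
# Point estimates quoted in the text (percent change in body weight).
step3_semaglutide_ibt = -16.0   # STEP-3, semaglutide + IBT
step3_placebo_ibt = -5.7        # STEP-3, placebo + IBT
step1_semaglutide = -14.9       # STEP-1, semaglutide + less intensive counseling

# Within-trial comparison: both arms were randomized in the same study,
# so this difference is what semaglutide adds on top of IBT.
within_trial_difference = round(step3_semaglutide_ibt - step3_placebo_ibt, 1)

# Cross-trial comparison: different studies and populations, so this gap
# is only a rough indication of what IBT adds on top of the drug.
cross_trial_gap = round(step3_semaglutide_ibt - step1_semaglutide, 1)

print(within_trial_difference)  # -10.3
print(cross_trial_gap)          # -1.1
```

The first number is a randomized estimate; the second is not, which is why it is best treated as a rough guide rather than a measured IBT effect.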

Randomization and Blinding Architecture

STEP-3 used a 2:1 randomization ratio favoring semaglutide over placebo. This is common in obesity trials where recruitment is competitive and retention matters. Participants know they have a higher probability of receiving the active drug, which can improve enrollment and reduce early dropout (Wadden et al., JAMA 2021).
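Unequal allocation has a modest cost in precision. For a fixed total sample size, the standard error of a difference in means is proportional to the square root of (1/n₁ + 1/n₂), which is smallest when the arms are equal. The sketch below compares the STEP-3 allocation with a hypothetical 1:1 split of the same 611 participants, assuming equal outcome variance in both arms (an assumption made here for illustration, not a figure from the trial).

```python
import math

def relative_se(n_active, n_control):
    """Standard error of a difference in means, in units of the common SD."""
    return math.sqrt(1 / n_active + 1 / n_control)

# Actual STEP-3 allocation (2:1) vs a hypothetical 1:1 split of 611 people.
se_2_to_1 = relative_se(407, 204)
se_1_to_1 = relative_se(306, 305)

# Ratio > 1 means the 2:1 design gives a wider confidence interval.
inflation = se_2_to_1 / se_1_to_1
print(f"SE inflation from 2:1 allocation: {inflation:.3f}")
```

Under these assumptions the standard error is roughly 6% larger than with equal allocation, a small price when the expected effect is large.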

Randomization was stratified by two factors: race (White vs other) and the presence of prediabetes at screening (yes vs no). The race stratification variable reflected known differences in GLP-1 receptor agonist response across populations and ensured balanced representation. The prediabetes stratification accounted for the metabolic heterogeneity that affects both weight loss trajectories and adverse event profiles.
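To make the mechanics concrete, here is a minimal sketch of stratified permuted-block randomization at a 2:1 ratio, taking the two stratification factors as described above. The block size of 3, the seed, and the function names are assumptions for illustration; the trial's actual randomization system is not described in this article.

```python
import random
from collections import defaultdict

def make_allocator(block=("semaglutide", "semaglutide", "placebo"), seed=2021):
    """Build an assign() function that uses permuted blocks within each stratum."""
    rng = random.Random(seed)
    pending = defaultdict(list)  # stratum -> assignments left in the current block

    def assign(race_white: bool, prediabetes: bool) -> str:
        stratum = (race_white, prediabetes)
        if not pending[stratum]:
            # Start a new block: two active, one placebo, in random order.
            new_block = list(block)
            rng.shuffle(new_block)
            pending[stratum] = new_block
        return pending[stratum].pop()

    return assign

assign = make_allocator()
# Six participants from the same stratum fill exactly two blocks.
arms = [assign(race_white=True, prediabetes=False) for _ in range(6)]
print(arms.count("semaglutide"), arms.count("placebo"))  # 4 2
```

Because each stratum fills its own blocks, the 2:1 ratio holds within every combination of race and prediabetes status, not just overall.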

Blinding relied on matched placebo injections. The semaglutide pen and placebo pen were visually identical. However, the trial acknowledged a practical limitation: gastrointestinal side effects (nausea, diarrhea, vomiting) occurred at substantially higher rates in the semaglutide arm. In practice, both participants and investigators can sometimes infer treatment assignment from side-effect burden. The FDA's clinical review for semaglutide 2.4 mg noted this issue across the STEP program, though formal unblinding assessments were not reported.

Inclusion and Exclusion: Who Was (and Wasn't) in This Trial

The enrollment criteria defined a specific population. Participants needed a BMI of 30 kg/m² or higher, or 27 kg/m² or higher with at least one weight-related comorbidity, plus at least one self-reported unsuccessful dietary effort to lose weight. There was no upper BMI limit; mean BMI at baseline was approximately 38 kg/m².

Key exclusions shaped the generalizability:

| Exclusion criterion | Clinical implication |
|---|---|
| Type 2 diabetes | Removes ~40% of real-world obesity patients; STEP-2 addressed this population separately |
| Prior bariatric surgery | Altered GI anatomy changes GLP-1 pharmacokinetics; these patients were studied elsewhere |
| GLP-1 RA use within 90 days | Prevents carryover effects but excludes patients switching from liraglutide |
| HbA1c ≥ 6.5% | Reinforces the non-diabetic focus; borderline patients were excluded even without a diabetes diagnosis |
| Psychiatric disorders deemed unstable | Standard safety exclusion, but removes a population with high obesity prevalence |

The trial population was 81% female and 83% White. Mean age was 46 years. These demographics are important for external validity. Real-world obesity populations are more diverse, older on average, and more likely to have type 2 diabetes. The American Gastroenterological Association's 2022 obesity guideline emphasizes that trial populations often under-represent the patients most commonly seen in clinical practice.

The Estimand Framework: Two Ways to Read the Same Data

STEP-3 pre-specified two estimands, and this is the part of the methodology that most summaries skip. Understanding the difference between them is essential to interpreting the result honestly.

Treatment policy estimand: Includes all randomized participants regardless of whether they completed treatment or adhered to the protocol. If a patient stopped semaglutide at week 20 and regained weight, that regain is counted in the semaglutide arm's result. This reflects what happens in a real healthcare system where some patients discontinue.

Trial product estimand: Estimates the treatment effect assuming all participants remained on their assigned drug for the full 68 weeks. This uses a mixed model for repeated measures (MMRM) that handles missing data under a missing-at-random assumption. It answers the question: what would the weight loss be if everyone stayed on the drug?

The primary analysis used the treatment policy estimand. The headline −16.0% figure comes from this analysis. The trial product estimand yielded −16.8% for semaglutide, a slightly larger effect as expected (because it mathematically removes the dilution from dropouts who regained weight).

Here is why this matters for clinical translation:

| Estimand | Semaglutide + IBT | Placebo + IBT | Difference | What it answers |
|---|---|---|---|---|
| Treatment policy | −16.0% | −5.7% | −10.3 pp | What happens in a population prescribed this regimen? |
| Trial product | −16.8% | −6.2% | −10.6 pp | What happens if the patient stays on the drug the full course? |

The gap between the two estimands is modest in STEP-3 (about 0.8 percentage points), which indicates that treatment discontinuation was relatively low and weight regain among discontinuers was limited within the trial's timeframe. In trials with higher dropout rates, this gap widens considerably.
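A rough way to see why the gap widens with dropout is to treat the treatment policy result as a blend of on-treatment and off-treatment outcomes, weighted by the share who discontinued. The sketch below is a deliberate simplification: the real estimands are model-based, and the −17.0%, −12.0%, 10%, and 40% inputs are hypothetical values chosen for illustration, not trial data.

```python
def treatment_policy_mean(on_treatment_change, off_treatment_change,
                          discontinued_fraction):
    """Blend on- and off-treatment outcomes by the share who discontinued."""
    stayed = 1.0 - discontinued_fraction
    return stayed * on_treatment_change + discontinued_fraction * off_treatment_change

# Gap between the two estimands for semaglutide, from the table above.
reported_gap = round(-16.0 - (-16.8), 1)

# Hypothetical scenarios: same outcomes, different discontinuation rates.
low_dropout = treatment_policy_mean(-17.0, -12.0, 0.10)
high_dropout = treatment_policy_mean(-17.0, -12.0, 0.40)

print(reported_gap)            # 0.8
print(round(low_dropout, 1))   # -16.5
print(round(high_dropout, 1))  # -15.0
```

With the on-treatment outcome held fixed, quadrupling the discontinuation rate moves the population-level result from about 0.5 to 2.0 percentage points away from the on-treatment figure.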

Statistical Approach and Multiplicity Control

The two co-primary endpoints (percent weight change and the proportion achieving ≥5% weight loss) were tested under a closed testing procedure that controls for multiplicity. The trial used a gatekeeping strategy: the percent weight change endpoint had to achieve significance before the categorical endpoint was formally tested. Both achieved p < 0.001.
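The gatekeeping logic can be sketched as a fixed-sequence test: hypotheses are evaluated in their pre-specified order, and once one fails, everything after it is treated as not significant regardless of its p-value. The function name, the 0.05 significance level, and the placeholder p-values (chosen only to satisfy p < 0.001) are assumptions for illustration.

```python
def fixed_sequence_test(ordered_p_values, alpha=0.05):
    """Test hypotheses in pre-specified order; stop at the first failure."""
    results = {}
    gate_open = True
    for name, p_value in ordered_p_values:
        significant = gate_open and p_value < alpha
        results[name] = significant
        gate_open = significant  # a failure closes the gate for all later tests
    return results

# Placeholder p-values consistent with "both achieved p < 0.001".
step3_like = fixed_sequence_test([
    ("percent weight change", 0.0009),
    ("at least 5% weight loss", 0.0009),
])

# Counter-example: the first endpoint fails, so the second is not tested.
gate_closed = fixed_sequence_test([
    ("percent weight change", 0.20),
    ("at least 5% weight loss", 0.0009),
])
print(step3_like)
print(gate_closed)
```

Because each hypothesis is tested at the full significance level only if every earlier one succeeded, the overall type I error rate stays at alpha without splitting it across endpoints.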

Missing data handling deserves specific attention. The treatment policy estimand used multiple imputation based on data from participants in the same treatment arm who discontinued treatment early. This approach assumes that a patient who drops out will follow a weight trajectory similar to other dropouts in the same arm, not similar to completers. It is a conservative assumption that pulls the semaglutide arm's result toward a smaller effect.
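The direction of that effect can be illustrated with a deliberately simplified sketch. Here, missing week-68 values are filled by resampling from same-arm participants who discontinued but were still measured, the analysis is repeated many times, and the point estimates are averaged. This is a stand-in for the trial's actual imputation model, which is not detailed in this article; all data values and names below are hypothetical.

```python
import random
import statistics

def pooled_mean_with_imputation(observed, retrieved_dropouts, n_missing,
                                n_imputations=200, seed=0):
    """Average the arm mean over many imputed datasets."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_imputations):
        # Fill each missing value with a draw from same-arm dropouts.
        imputed = [rng.choice(retrieved_dropouts) for _ in range(n_missing)]
        estimates.append(statistics.mean(observed + imputed))
    # Pooled point estimate: the mean across imputed datasets.
    return statistics.mean(estimates)

# Hypothetical percent weight changes at week 68 for one arm.
completers = [-18.0, -17.0, -16.0, -19.0]
retrieved_dropouts = [-8.0, -10.0]       # discontinued, but still measured
observed = completers + retrieved_dropouts

pooled = pooled_mean_with_imputation(observed, retrieved_dropouts, n_missing=2)
completers_only = statistics.mean(completers)
print(round(completers_only, 1), round(pooled, 1))
```

Because the imputed values look like dropouts rather than completers, the pooled estimate shows less weight loss than a completers-only average would.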

The trial product estimand used an MMRM with treatment, stratification factors, visit, and treatment-by-visit interaction as fixed effects, with baseline body weight as a covariate. This standard approach is well-validated for longitudinal weight data but depends on the missing-at-random assumption holding. If patients who dropped out had systematically worse (or better) outcomes than predicted by their observed data, the estimate could be biased in either direction.

The 8-Week Low-Calorie Diet: Methodological Feature or Confound?

The initial 8-week low-calorie diet (LCD) phase is a distinctive feature of STEP-3 that does not appear in the other STEP trials. For the first 8 weeks after randomization, participants in both arms consumed 1,000 to 1,200 kcal/day using meal replacements. Baseline body weight was measured at randomization (week 0), before the LCD began.

This means the LCD is part of the measured effect, not a run-in that precedes it. Both the −16.0% and the −5.7% figures include whatever weight was lost through caloric restriction during those 8 weeks. Because both arms followed the same diet, the −10.3 percentage point treatment difference remains a valid estimate of what semaglutide adds, but the absolute figure in each arm cannot be attributed to the drug or the counseling alone.

The LCD phase also fell within the semaglutide dose-escalation period (weeks 0 to 16). During escalation, semaglutide's appetite-suppressive effects are still building. The combination of caloric restriction and emerging GLP-1 activity may explain the steep early weight-loss curve in STEP-3 relative to STEP-1, where no LCD phase was used.

Whether this design feature enhances or limits generalizability depends on clinical context. Obesity medicine practices that pair an initial LCD with the start of pharmacotherapy will find that STEP-3 maps closely to their workflow. For primary care physicians who prescribe semaglutide without structured dietary intervention, STEP-1 or STEP-5 (long-term data) may be more relevant reference points.

Adverse Events: What the Numbers Actually Show

Gastrointestinal events dominated the safety profile, consistent with the broader STEP program. In the semaglutide arm, 82.8% of participants reported at least one gastrointestinal adverse event vs 63.2% in the placebo arm. Nausea occurred in 53.4% of semaglutide participants vs 19.6% of placebo participants. Most GI events were mild to moderate and occurred during dose escalation.

Treatment discontinuation due to gastrointestinal adverse events was 3.4% in the semaglutide group and 0% in the placebo group. This low discontinuation rate is notable and may reflect the IBT component: regular counseling sessions could have helped participants manage side effects and persist through the nausea-heavy early weeks.

Limitations the Authors Acknowledged

The published paper and supplementary materials list several limitations:

  • The 68-week duration does not address long-term weight maintenance after treatment cessation. STEP-4 and STEP-5 partially fill this gap.
  • The IBT program is resource-intensive and not widely available outside academic obesity centers.
  • The 2:1 randomization ratio, while improving enrollment, reduces statistical power for detecting differences in rare adverse events.
  • The population was predominantly White and female, limiting generalizability to more diverse populations.
  • The initial LCD was delivered alongside the first 8 weeks of dosing, so its contribution to total weight loss cannot be separated from that of the drug or the counseling.

References

  1. Wadden TA, Bailey TS, Billings LK, et al. Effect of subcutaneous semaglutide vs placebo as an adjunct to intensive behavioral therapy on body weight in adults with overweight or obesity: the STEP 3 randomized clinical trial. JAMA. 2021;325(14):1403-1413.
  2. Wilding JPH, Batterham RL, Calanna S, et al. Once-weekly semaglutide in adults with overweight or obesity (STEP 1). N Engl J Med. 2021;384(11):989-1002.
  3. FDA. Wegovy (semaglutide) prescribing information. Revised 2021. accessdata.fda.gov
  4. Garvey WT, Batterham RL, Bhatt DL, et al. Two-year effects of semaglutide in adults with overweight or obesity: the STEP 5 trial. Nat Med. 2022;28(10):2083-2091.
  5. Grunvald E, Shah R, Hernaez R, et al. AGA clinical practice guideline on pharmacological interventions for adults with obesity. Gastroenterology. 2022;163(5):1198-1225.