How does STEP-3 differ from STEP-1?

STEP-1 tested semaglutide 2.4 mg against lifestyle counseling alone, without an intensive behavioral therapy (IBT) program or an initial low-calorie diet phase. STEP-3 added both to each arm, making the comparator arm more active. The semaglutide arm lost 16.0% in STEP-3 vs 14.9% in STEP-1. This is a cross-trial comparison rather than a randomized one, but it suggests IBT adds a modest additional benefit on top of the drug.

What was the low-calorie diet phase in STEP-3?

Participants consumed 1,000 to 1,200 kcal/day using meal replacements for the first 8 weeks after randomization. This phase overlapped with the semaglutide dose-escalation period. The reported 16% weight loss is measured from the randomization (week 0) weight, so it includes the weight lost during the low-calorie diet phase rather than sitting on top of it.

What is an estimand framework in a clinical trial?

An estimand defines precisely what treatment effect the trial is estimating, including how events like treatment discontinuation are handled. STEP-3 used two: the treatment policy estimand (counts all patients regardless of adherence) and the trial product estimand (estimates effect if all patients completed therapy). The distinction affects how large the reported effect appears.

Why was 2:1 randomization used instead of 1:1?

A 2:1 ratio gives more participants the active drug, which can improve enrollment and retention in trials where patients are motivated to receive treatment. The tradeoff is slightly less statistical power for detecting rare adverse events in the smaller placebo group.

Were patients with type 2 diabetes included in STEP-3?

No. Patients with an HbA1c of 6.5% or higher or a diagnosis of type 2 diabetes were excluded. The STEP-2 trial specifically addressed the diabetic population. This exclusion means STEP-3 results do not directly apply to the large population of people with both obesity and type 2 diabetes.

How intensive was the behavioral therapy in STEP-3?

Participants received 30 individual counseling sessions over 68 weeks, focused on diet, physical activity, and behavioral strategies. This is substantially more contact than standard lifestyle advice in most obesity trials and far exceeds what most patients receive in routine clinical care.

Did the placebo group lose a clinically meaningful amount of weight?

Yes. The placebo arm (which still received the full IBT program) lost 5.7% of body weight, a result that would meet the FDA's threshold for clinically meaningful weight loss in many contexts. This underscores that the behavioral component alone was therapeutically active.

What happened to participants who stopped semaglutide early?

Under the treatment policy estimand, their subsequent weight data (including any regain) were included in the semaglutide arm's result. The trial used multiple imputation to estimate the trajectories of participants with missing data. Because discontinuation rates were low (3.4% of the semaglutide arm stopped because of gastrointestinal adverse events), the impact on the overall result was small.

Is the 16% weight loss achievable without intensive behavioral therapy?

STEP-1, which used standard lifestyle counseling rather than IBT, showed 14.9% weight loss with semaglutide 2.4 mg. The difference suggests IBT contributes roughly 1 to 2 additional percentage points. The drug accounts for the majority of the effect, but the behavioral program provides a measurable incremental benefit.

How does the STEP-3 population compare to real-world obesity patients?

The trial enrolled predominantly White (83%), female (81%) participants with a mean age of 46 and no type 2 diabetes. Real-world obesity populations are more diverse, older, and more medically complex. Clinicians should consider these differences when projecting STEP-3 results onto their patient panels.

Inside the STEP-3 Methodology: What Most Summaries Skip

By HealthRX.com Medical Team

Published May 25, 2026. Updated May 25, 2026. Last reviewed May 25, 2026.


At a glance

| Parameter | Detail |
|---|---|
| Trial | STEP-3 (Semaglutide Treatment Effect in People with Obesity 3) |
| N | 611 randomized (407 semaglutide, 204 placebo) |
| Intervention | Semaglutide 2.4 mg subcutaneous weekly + intensive behavioral therapy (IBT) including initial 8-week low-calorie diet |
| Comparator | Placebo injection weekly + identical IBT program |
| Duration | 68 weeks (16-week dose escalation, 52 weeks at maintenance dose) |
| Primary endpoint | Percentage change in body weight from randomization to week 68 |
| Key result | −16.0% (semaglutide + IBT) vs −5.7% (placebo + IBT); estimated treatment difference −10.3 percentage points (Wadden et al., JAMA 2021) |

Why the Comparator Matters More Than the Drug

Most trial summaries lead with the 16% weight loss figure. That number is real, but it is inseparable from the comparator design. STEP-3 did not test semaglutide against diet advice and a pamphlet. Both arms received 30 individual counseling sessions over 68 weeks, plus an initial 8-week low-calorie diet phase (1,000 to 1,200 kcal/day using meal replacements). The placebo arm lost 5.7% of body weight, a result that would be clinically meaningful on its own in many obesity trials (Wadden et al., JAMA 2021).

This design choice was intentional. The STEP program included four key trials, each isolating a different clinical question. STEP-1 tested semaglutide against lifestyle counseling alone. STEP-3 asked a narrower question: for patients already receiving the most intensive non-surgical behavioral treatment available, does adding semaglutide produce additional weight loss? The answer is yes, about 10 extra percentage points, but that framing changes how you apply the result.

A clinician reading only the headline might assume 16% weight loss is the default expectation for any patient starting semaglutide. It is not. That figure reflects a combination of pharmacotherapy, caloric restriction through meal replacements, and high-frequency behavioral counseling. Participants who received semaglutide without the IBT component (as in STEP-1) lost 14.9%, which suggests IBT added roughly 1 to 2 percentage points on top of the drug effect alone, with the caveat that this is a cross-trial comparison.

Randomization and Blinding Architecture

STEP-3 used a 2:1 randomization ratio favoring semaglutide over placebo. This is common in obesity trials where recruitment is competitive and retention matters. Participants know they have a higher probability of receiving the active drug, which can improve enrollment and reduce early dropout (Wadden et al., JAMA 2021).
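The statistical cost of unequal allocation can be put in rough numbers. The sketch below assumes equal outcome variance in both arms, so the standard error of a difference in means scales with the square root of (1/n_active + 1/n_control). The function name `se_scale` and the hypothetical 1:1 split are illustrative assumptions, not part of the trial's analysis plan.

```python
import math


def se_scale(n_active: float, n_control: float) -> float:
    """Factor that multiplies the outcome SD to give the SE of a difference in means.

    Assumes equal variance in both arms (an assumption, not a trial result).
    """
    return math.sqrt(1.0 / n_active + 1.0 / n_control)


# Arm sizes reported for STEP-3.
actual = se_scale(407, 204)

# Hypothetical 1:1 split of the same 611 participants.
balanced = se_scale(611 / 2, 611 / 2)

# A ratio above 1 means the 2:1 design yields a wider confidence interval.
inflation = actual / balanced
print(f"SE inflation from 2:1 allocation: {inflation:.3f}")
```

Under these assumptions the 2:1 design widens the standard error by about 6% relative to an equal split of the same sample, which is a small price for the efficacy endpoint. The cost is larger for rare events, where the absolute number of placebo participants matters more than the ratio.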

Randomization was stratified by two factors: race (White vs other) and the presence of prediabetes at screening (yes vs no). The race stratification variable reflected known differences in GLP-1 receptor agonist response across populations and ensured balanced representation. The prediabetes stratification accounted for the metabolic heterogeneity that affects both weight loss trajectories and adverse event profiles.
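One common way to implement stratified 2:1 allocation is permuted-block randomization within each stratum. The sketch below is an illustration only: the block size of 3, the use of permuted blocks, and the names `make_allocator` and `assign` are assumptions, since the source states only the ratio and the two stratification factors.

```python
import random
from collections import Counter


def make_allocator(seed: int = 0):
    """Return an assign() function using permuted blocks of 3 (2 active : 1 placebo).

    Block size and the permuted-block method are illustrative assumptions.
    """
    rng = random.Random(seed)
    pending = {}  # stratum -> assignments left in that stratum's current block

    def assign(is_white: bool, has_prediabetes: bool) -> str:
        stratum = (is_white, has_prediabetes)
        if not pending.get(stratum):
            # Start a fresh shuffled block once the previous one is used up.
            block = ["semaglutide", "semaglutide", "placebo"]
            rng.shuffle(block)
            pending[stratum] = block
        return pending[stratum].pop()

    return assign


assign = make_allocator(seed=42)

# Thirty hypothetical participants from one stratum: White, no prediabetes.
counts = Counter(assign(True, False) for _ in range(30))
print(counts)
```

Because each stratum fills its own blocks, the 2:1 ratio holds within every combination of race and prediabetes status after each complete block, not just across the trial as a whole.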

Blinding relied on matched placebo injections. The semaglutide pen and placebo pen were visually identical. However, the trial acknowledged a practical limitation: gastrointestinal side effects (nausea, diarrhea, vomiting) occurred at substantially higher rates in the semaglutide arm. In practice, both participants and investigators can sometimes infer treatment assignment from side-effect burden. The FDA's clinical review for semaglutide 2.4 mg noted this issue across the STEP program, though formal unblinding assessments were not reported.

Inclusion and Exclusion: Who Was (and Wasn't) in This Trial

The enrollment criteria defined a specific population. Participants needed a BMI of 30 kg/m² or higher, or 27 kg/m² or higher with at least one weight-related comorbidity. They had to have at least one self-reported unsuccessful dietary effort. The upper BMI was not capped, but the median BMI at baseline was approximately 38 kg/m².

Key exclusions shaped the generalizability:

| Exclusion criterion | Clinical implication |
|---|---|
| Type 2 diabetes | Removes ~40% of real-world obesity patients; STEP-2 addressed this population separately |
| Prior bariatric surgery | Altered GI anatomy changes GLP-1 pharmacokinetics; these patients were studied elsewhere |
| GLP-1 RA use within 90 days | Prevents carryover effects but excludes patients switching from liraglutide |
| HbA1c ≥ 6.5% | Reinforces the non-diabetic focus; borderline patients were excluded even without a diabetes diagnosis |
| Psychiatric disorders deemed unstable | Standard safety exclusion, but removes a population with high obesity prevalence |

The trial population was 81% female and 83% White. Mean age was 46 years. These demographics are important for external validity. Real-world obesity populations are more diverse, older on average, and more likely to have type 2 diabetes. The American Gastroenterological Association's 2022 obesity guideline emphasizes that trial populations often under-represent the patients most commonly seen in clinical practice.

The Estimand Framework: Two Ways to Read the Same Data

STEP-3 pre-specified two estimands, a detail most summaries skip. Understanding the difference between them is essential to interpreting the result honestly.

Treatment policy estimand: Includes all randomized participants regardless of whether they completed treatment or adhered to the protocol. If a patient stopped semaglutide at week 20 and regained weight, that regain is counted in the semaglutide arm's result. This reflects what happens in a real healthcare system where some patients discontinue.

Trial product estimand: Estimates the treatment effect assuming all participants remained on their assigned drug for the full 68 weeks. This uses a mixed model for repeated measures (MMRM) that handles missing data under a missing-at-random assumption. It answers the question: what would the weight loss be if everyone stayed on the drug?

The primary analysis used the treatment policy estimand. The headline −16.0% figure comes from this analysis. The trial product estimand yielded −16.8% for semaglutide, a slightly larger effect as expected (because it mathematically removes the dilution from dropouts who regained weight).

Here is why this matters for clinical translation:

| Estimand | Semaglutide + IBT | Placebo + IBT | Difference | What it answers |
|---|---|---|---|---|
| Treatment policy | −16.0% | −5.7% | −10.3 pp | What happens in a population prescribed this regimen? |
| Trial product | −16.8% | −6.2% | −10.6 pp | What happens if the patient stays on the drug the full course? |

The gap between the two estimands is modest in STEP-3 (about 0.8 percentage points), which indicates that treatment discontinuation was relatively low and weight regain among discontinuers was limited within the trial's timeframe. In trials with higher dropout rates, this gap widens considerably.
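The arithmetic behind the table can be checked in a few lines. This sketch uses only the percentages quoted in this article; the dictionary layout and the helper `treatment_difference` are illustrative names, not anything from the trial's statistical code.

```python
# Percent change in body weight at week 68, as quoted in the table above.
estimands = {
    "treatment_policy": {"semaglutide": -16.0, "placebo": -5.7},
    "trial_product": {"semaglutide": -16.8, "placebo": -6.2},
}


def treatment_difference(arms: dict) -> float:
    """Semaglutide minus placebo, in percentage points, rounded to one decimal."""
    return round(arms["semaglutide"] - arms["placebo"], 1)


differences = {name: treatment_difference(arms) for name, arms in estimands.items()}

# Gap between estimands within the semaglutide arm (the 0.8 pp cited in the text).
semaglutide_gap = round(
    estimands["trial_product"]["semaglutide"]
    - estimands["treatment_policy"]["semaglutide"],
    1,
)

print(differences)
print(semaglutide_gap)
```

Note that the 0.8 percentage point gap refers to the semaglutide arm alone. The between-arm difference moves by a smaller amount (from −10.3 to −10.6), because the placebo arm's estimate also shifts under the trial product estimand.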

Statistical Approach and Multiplicity Control

Multiplicity across the two co-primary endpoints (percent weight change and the proportion achieving ≥5% weight loss) was controlled with a closed testing procedure. The trial used a gatekeeping strategy: the percent weight change endpoint had to achieve significance before the categorical endpoint was formally tested. Both achieved p < 0.001.
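The gatekeeping logic can be sketched as a simple fixed-sequence procedure. This is a simplified illustration, assuming a significance level of 0.05 and a strict ordering of hypotheses; the trial's actual hierarchy may have been more elaborate. The p-values are placeholders consistent with the reported p < 0.001, not the trial's exact values.

```python
def fixed_sequence(tests, alpha=0.05):
    """Test hypotheses in order; stop formal testing after the first failure."""
    outcomes = {}
    gate_open = True
    for name, p_value in tests:
        if not gate_open:
            outcomes[name] = "not tested"
            continue
        if p_value < alpha:
            outcomes[name] = "significant"
        else:
            outcomes[name] = "not significant"
            gate_open = False  # every later hypothesis is now blocked
    return outcomes


# Placeholder p-values (the source reports only p < 0.001 for both endpoints).
step3_like = [
    ("percent weight change", 0.0005),
    ("proportion with >=5% loss", 0.0005),
]
print(fixed_sequence(step3_like))

# Counterexample: a failed first test blocks the second, however small its p-value.
blocked = fixed_sequence([("first", 0.20), ("second", 0.0001)])
print(blocked)
```

The counterexample shows why the ordering matters: each endpoint can be tested at the full alpha only because a failure earlier in the sequence forfeits any formal claim on the endpoints after it.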

Missing data handling deserves specific attention. The treatment policy estimand used multiple imputation based on data from participants in the same treatment arm who discontinued treatment early. This approach assumes that a patient who drops out will follow a weight trajectory similar to other dropouts in the same arm, not similar to completers. It is a conservative assumption that pulls the semaglutide arm's result toward a smaller effect.
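The direction of this effect can be illustrated with a toy version of arm-specific imputation. The sketch below is a deliberately simplified resampling scheme, not the trial's imputation model: all numbers are synthetic, the function name `pooled_arm_mean` is hypothetical, and a real analysis would draw from a fitted model and combine variances with Rubin's rules.

```python
import random
import statistics


def pooled_arm_mean(on_treatment, retrieved_dropouts, n_missing,
                    n_imputations=100, seed=0):
    """Average arm mean over repeated imputations of the missing outcomes.

    Missing values are drawn from observed dropouts in the same arm,
    mirroring the idea that dropouts resemble other dropouts, not completers.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_imputations):
        imputed = [rng.choice(retrieved_dropouts) for _ in range(n_missing)]
        estimates.append(
            statistics.fmean(on_treatment + retrieved_dropouts + imputed)
        )
    return statistics.fmean(estimates)


# Synthetic percent changes at week 68, invented purely for illustration.
on_treatment = [-18.0, -17.0, -16.0, -19.0]
retrieved_dropouts = [-6.0, -8.0, -7.0]  # stopped the drug but returned for weighing

pooled = pooled_arm_mean(on_treatment, retrieved_dropouts, n_missing=3)
completers_only = statistics.fmean(on_treatment)
print(f"completers only: {completers_only:.1f}, with imputation: {pooled:.1f}")
```

In this toy data the imputed arm mean sits well below the completers-only mean in magnitude, which is the conservative pull described above.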

The trial product estimand used an MMRM with treatment, stratification factors, visit, and treatment-by-visit interaction as fixed effects, with baseline body weight as a covariate. This standard approach is well-validated for longitudinal weight data but depends on the missing-at-random assumption holding. If patients who dropped out had systematically worse (or better) outcomes than predicted by their observed data, the estimate could be biased in either direction.

The 8-Week Low-Calorie Diet Phase: Methodological Feature or Confound?

The initial 8-week low-calorie diet (LCD) phase is a distinctive feature of STEP-3 that does not appear in the other STEP trials. Participants in both arms consumed 1,000 to 1,200 kcal/day using meal replacements for the first 8 weeks after randomization. Baseline body weight was measured at randomization (week 0), before the LCD phase began.

This means the 16% reduction already includes whatever weight was lost during the LCD phase; it is not measured on top of it. Some participants may have lost 3 to 5% during those 8 weeks. Because the placebo arm followed the same diet, the −10.3 percentage point treatment difference still isolates the effect of semaglutide, but the absolute figure in each arm is likely larger than it would have been without the meal-replacement phase.

The LCD phase also overlapped with the semaglutide dose-escalation period (weeks 0 to 16). During escalation, semaglutide's appetite-suppressive effects are building. The combination of caloric restriction and emerging GLP-1 activity may explain the steep early weight-loss curve in STEP-3 relative to STEP-1, where no LCD phase was used.

Whether this design feature enhances or limits generalizability depends on clinical context. Many obesity medicine practices do use initial LCD phases to build momentum before starting pharmacotherapy. For those clinicians, STEP-3 maps closely to their workflow. For primary care physicians who prescribe semaglutide without structured dietary intervention, STEP-1 or STEP-5 (long-term data) may be more relevant reference points.

Adverse Events: What the Numbers Actually Show

Gastrointestinal events dominated the safety profile, consistent with the broader STEP program. In the semaglutide arm, 82.8% of participants reported at least one gastrointestinal adverse event vs 63.2% in placebo. Nausea occurred in 53.4% of semaglutide participants vs 19.6% of placebo participants. Most GI events were mild to moderate and occurred during dose escalation.

Treatment discontinuation due to gastrointestinal adverse events was 3.4% in the semaglutide group and 0% in the placebo group. This low discontinuation rate is notable and may reflect the IBT component: regular counseling sessions may have helped participants manage side effects and persist through the nausea-heavy early weeks.

Limitations the Authors Acknowledged

The published paper and supplementary materials list several limitations:

The 68-week duration does not address long-term weight maintenance after treatment cessation. STEP-4 and STEP-5 partially fill this gap.
The IBT program is resource-intensive and not widely available outside academic obesity centers.
The 2:1 randomization ratio, while improving enrollment, reduces statistical power for detecting differences in rare adverse events.
The population was predominantly White and female, limiting generalizability to more diverse populations.
Because every participant received both the low-calorie diet and IBT, the trial cannot separate the contribution of the diet phase from that of the counseling.

Frequently asked questions


References

Wadden TA, Bailey TS, Billings LK, et al. Effect of subcutaneous semaglutide vs placebo as an adjunct to intensive behavioral therapy on body weight in adults with overweight or obesity: the STEP 3 randomized clinical trial. JAMA. 2021;325(14):1403-1413.
Wilding JPH, Batterham RL, Calanna S, et al. Once-weekly semaglutide in adults with overweight or obesity (STEP 1). N Engl J Med. 2021;384(11):989-1002.
FDA. Wegovy (semaglutide) prescribing information. Revised 2021. accessdata.fda.gov
Garvey WT, Batterham RL, Bhatta M, et al. Two-year effects of semaglutide in adults with overweight or obesity: the STEP 5 trial. Nat Med. 2022;28(10):2083-2091.
Grunvald E, Shah R, Hernaez R, et al. AGA clinical practice guideline on pharmacological interventions for adults with obesity. Gastroenterology. 2022;163(5):1198-1225.