Inside the STEP-8 Methodology: What Most Summaries Skip

At a glance

| Parameter | Detail |
|---|---|
| N | 338 (randomized 3:1:3:1 across four arms) |
| Intervention | Semaglutide 2.4 mg subcutaneous once weekly |
| Active comparator | Liraglutide 3.0 mg subcutaneous once daily |
| Placebo arms | Matched placebo for each active drug |
| Duration | 68 weeks (16-week escalation + 52-week maintenance) |
| Primary endpoint | Percent change in body weight from baseline to week 68 |
| Key result | -15.8% (semaglutide) vs -6.4% (liraglutide); estimated treatment difference -9.4 percentage points (95% CI, -12.0 to -6.8) |
| ClinicalTrials.gov | NCT04074161 |

Why the Design Matters More Than the Result

Most coverage of STEP-8 stops at the headline: semaglutide beat liraglutide by roughly 9 percentage points. That number is real, but the trial was engineered with specific design choices that clinicians should understand before translating it into prescribing decisions.

Three choices in particular deserve close attention: the randomization ratio, the blinding structure, and the estimand framework.

Randomization: A 3:1:3:1 Split and What It Means

STEP-8 randomized 338 adults across four arms in a 3:1:3:1 ratio:

  • Semaglutide 2.4 mg once weekly (n = 126)
  • Semaglutide-matched placebo once weekly (n = 42)
  • Liraglutide 3.0 mg once daily (n = 127)
  • Liraglutide-matched placebo once daily (n = 43)

This is not a simple two-arm trial. Each participant followed only the injection schedule of their assigned drug pair (weekly for the semaglutide pair, daily for the liraglutide pair), and a matched placebo preserved blinding within each pair. The trial did not use a double-dummy design, which is why the between-drug comparison was open-label. The 3:1 ratio within each pair was chosen to maximize power for the active-vs-active comparison while still generating placebo-controlled safety data for each drug separately.

The practical consequence: each placebo arm had only ~42 participants. That is enough to confirm general placebo-subtracted efficacy patterns but too small for strong subgroup analyses within placebo arms. The trial was powered for the head-to-head comparison, not for independent placebo-controlled conclusions about either drug.
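
To make the allocation arithmetic concrete, the following is a minimal Python sketch of permuted-block randomization in a 3:1:3:1 ratio. It is an illustration only: the block size of 8, the seed, the arm labels, and the function name `permuted_block_allocation` are assumptions, and the sketch does not claim to reproduce the trial's actual randomization system.

```python
import random
from collections import Counter

# Allocation weights for the four arms (3:1:3:1), as described in the text.
RATIO = {
    "semaglutide": 3,
    "semaglutide_placebo": 1,
    "liraglutide": 3,
    "liraglutide_placebo": 1,
}

def permuted_block_allocation(n_participants, ratio=RATIO, seed=0):
    """Assign participants in shuffled blocks that each respect the ratio.

    Assumption: one block is the smallest unit that satisfies the ratio
    (3 + 1 + 3 + 1 = 8 slots); real trials may use other block sizes.
    """
    rng = random.Random(seed)
    block = [arm for arm, weight in ratio.items() for _ in range(weight)]
    assignments = []
    while len(assignments) < n_participants:
        shuffled = block[:]
        rng.shuffle(shuffled)
        assignments.extend(shuffled)
    # Truncate the final block if enrollment stops mid-block.
    return assignments[:n_participants]

counts = Counter(permuted_block_allocation(338))
print(counts)
```

With 338 participants the sketch fills 42 complete blocks plus a partial one, so each arm lands within a participant or two of the reported 126/42/127/43 split.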

Blinding: Double-Blind Within Pairs, Open-Label Between Drugs

This is the single most important methodological nuance in STEP-8, and most summaries skip it entirely.

Participants were blinded to whether they received active drug or placebo within their assigned pair. A person in the semaglutide group did not know if the weekly injection was semaglutide or placebo. But the trial was open-label with respect to which drug pair a participant was assigned to. The injection devices for semaglutide (FlexTouch pen) and liraglutide (a different pen device) are visually distinct, and daily vs weekly dosing schedules differ.

This means a participant could reasonably infer whether they were in the semaglutide or liraglutide group based on injection frequency and device appearance, even though they could not tell if they received active drug within that group.

Why this matters clinically: Performance bias, the phenomenon where knowledge of treatment assignment changes behavior, is a real concern. If participants in the semaglutide arm believed they were receiving the "stronger" drug (press coverage of the STEP program was extensive by 2020-2021), adherence to lifestyle modifications might have differed between arms. The published paper acknowledges this as a limitation.

HealthRX Assessment Framework: Blinding Adequacy in STEP-8

| Layer | Blinded? | Risk |
|---|---|---|
| Patient: drug vs placebo within pair | Yes | Low |
| Patient: semaglutide group vs liraglutide group | No (open-label) | Moderate performance bias |
| Investigator: drug vs placebo within pair | Yes | Low |
| Investigator: which drug pair | No | Moderate detection bias for subjective outcomes |
| Outcome assessor (body weight) | Objective measure | Low regardless of blinding |

Because the primary endpoint (body weight on a scale) is an objective measurement, the open-label design between drug pairs has less impact on the primary result than it would for subjective endpoints like quality of life. Still, secondary endpoints such as patient-reported outcomes and adverse event reporting are more vulnerable.

Dose Escalation Asymmetry

Both drugs followed their respective FDA-approved escalation schedules:

  • Semaglutide: 0.25 mg weekly for 4 weeks, then 0.5 mg for 4 weeks, then 1.0 mg for 4 weeks, then 1.7 mg for 4 weeks, then the target 2.4 mg from week 16 onward. Total escalation: 16 weeks.
  • Liraglutide: 0.6 mg daily for 1 week, then increases of 0.6 mg per week until reaching 3.0 mg by approximately week 4-5. Total escalation: ~4 weeks.

This means semaglutide participants spent their first 16 weeks on sub-therapeutic doses, while liraglutide participants reached full dose within roughly 5 weeks. By week 16, the liraglutide group had been on the maintenance dose for ~11 weeks while semaglutide participants were just arriving at 2.4 mg.

This asymmetry does not invalidate the 68-week result, since 52 weeks of full-dose semaglutide is ample time for weight trajectory separation. But it does mean that early time-point comparisons (weeks 4-20) favor liraglutide's head start at target dose. The published weight-change curves show the separation between drugs widening primarily after week 20, consistent with semaglutide needing time at full dose.
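
The two escalation schedules listed above can be written as simple lookup functions, which makes the head start easy to count. This is a sketch under stated assumptions: weeks are numbered from 1, so week 17 is the first week at 2.4 mg of semaglutide, and liraglutide is assumed to step up by exactly 0.6 mg each week with no delays for tolerability. The function names are illustrative.

```python
def semaglutide_dose_mg(week):
    """Weekly semaglutide dose for a 1-indexed trial week."""
    # (last week at this step, dose in mg), per the schedule in the text.
    steps = [(4, 0.25), (8, 0.5), (12, 1.0), (16, 1.7)]
    for last_week, dose in steps:
        if week <= last_week:
            return dose
    return 2.4  # target dose after 16 weeks of escalation

def liraglutide_dose_mg(week):
    """Daily liraglutide dose: 0.6 mg in week 1, +0.6 mg weekly, capped at 3.0 mg."""
    return round(min(0.6 * week, 3.0), 1)

def weeks_at_target(dose_fn, target, through_week):
    """Count weeks spent at the target dose up to and including through_week."""
    return sum(1 for w in range(1, through_week + 1) if dose_fn(w) == target)

sema_weeks = weeks_at_target(semaglutide_dose_mg, 2.4, 16)
lira_weeks = weeks_at_target(liraglutide_dose_mg, 3.0, 16)
print(sema_weeks, lira_weeks)
```

Under these assumptions the liraglutide arm has about 12 weeks at target dose by the end of week 16, in line with the ~11 weeks cited above, while the semaglutide arm has none.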

Inclusion and Exclusion Criteria: Who Was Actually Studied

STEP-8 enrolled adults aged 18 and older with:

  • BMI ≥30, or BMI ≥27 with at least one weight-related comorbidity (hypertension, dyslipidemia, obstructive sleep apnea, cardiovascular disease)
  • At least one self-reported unsuccessful dietary effort to lose weight
  • HbA1c <6.5% (people with diagnosed type 2 diabetes were excluded)

Key exclusions:

  • Prior bariatric surgery
  • Use of GLP-1 receptor agonists within 90 days
  • Body weight change of more than 5 kg within 90 days before screening
  • History of pancreatitis or personal/family history of medullary thyroid carcinoma or MEN2

The diabetes exclusion is important for clinical translation. Semaglutide 2.4 mg and liraglutide 3.0 mg are both used in people with type 2 diabetes in clinical practice, but STEP-8 excluded this population entirely. The relative efficacy gap between the two drugs may differ in patients with diabetes, given known differences in GLP-1 receptor agonist response by glycemic status. STEP-2 (semaglutide in type 2 diabetes) showed a smaller absolute weight loss than STEP-1 (without diabetes), and no head-to-head data exist for these drugs in that population.

The Estimand Framework: Two Numbers, Two Questions

STEP-8 used the ICH E9(R1) estimand framework and reported results under two estimands:

  1. Treatment policy estimand: Weight change regardless of treatment discontinuation or rescue intervention. This answers: "What happens to patients assigned to this drug in real practice, including those who stop?"
  2. Trial product estimand: Weight change assuming all participants remained on treatment for the full 68 weeks without intercurrent events. This answers: "How much weight do you lose if you actually stay on the drug?"

Results under both estimands:

| Estimand | Semaglutide | Liraglutide | Difference |
|---|---|---|---|
| Treatment policy | -15.8% | -6.4% | -9.4 pp (95% CI: -12.0 to -6.8) |
| Trial product | -17.4% | -7.1% | -10.3 pp |

The gap between estimands reflects treatment discontinuation: roughly 13% of the semaglutide arm and 28% of the liraglutide arm stopped treatment. Because the treatment policy estimand includes data from people who stopped, it pulls both arms toward smaller losses (by about 1.6 percentage points for semaglutide and 0.7 for liraglutide). The net effect is a slightly narrower between-drug gap, 9.4 rather than 10.3 percentage points, than under the per-protocol-like trial product estimand.

This matters for formulary decisions: the treatment policy estimand is more relevant for health plans estimating population-level outcomes, while the trial product estimand is more relevant for counseling an individual patient who asks "how much weight will I lose if I stick with this?"
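
The arithmetic behind the two estimands can be checked directly from the table above. The sketch below is illustrative only: the differences in the paper come from statistical models rather than simple subtraction, and the 100 kg baseline weight is a hypothetical patient, not a trial value.

```python
# Percent change in body weight at week 68, copied from the estimand table.
RESULTS = {
    "treatment_policy": {"semaglutide": -15.8, "liraglutide": -6.4},
    "trial_product": {"semaglutide": -17.4, "liraglutide": -7.1},
}

def gap_pp(arms):
    """Between-drug difference in percentage points (semaglutide minus liraglutide)."""
    return round(arms["semaglutide"] - arms["liraglutide"], 1)

def attenuation_pp(drug):
    """How much smaller the loss looks under treatment policy than trial product."""
    policy = RESULTS["treatment_policy"][drug]
    product = RESULTS["trial_product"][drug]
    return round(policy - product, 1)

def kg_lost(percent_change, baseline_kg):
    """Convert a percent change into kilograms for a given baseline weight."""
    return round(-percent_change / 100 * baseline_kg, 1)

for name, arms in RESULTS.items():
    print(name, gap_pp(arms))

# Hypothetical 100 kg patient, for counseling-style framing only.
print(kg_lost(RESULTS["trial_product"]["semaglutide"], 100))
print(kg_lost(RESULTS["treatment_policy"]["semaglutide"], 100))
```

For that hypothetical 100 kg patient, the difference between the two framings is the difference between quoting about 17 kg and about 16 kg of expected loss on semaglutide.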

Statistical Approach

The primary analysis used a mixed model for repeated measures (MMRM) for the trial product estimand and a pattern-mixture model using multiple imputation for the treatment policy estimand, following methods described in the trial protocol supplement.

Superiority of semaglutide over liraglutide was tested at a two-sided alpha of 0.05. Multiplicity across the primary endpoint (percent weight change) and the confirmatory secondary endpoints (the proportions achieving ≥10%, ≥15%, and ≥20% weight loss) was handled with a step-down procedure that controlled the family-wise error rate.

Sample size: 126 per active arm provided >90% power to detect a 5-percentage-point difference in body weight change, assuming a standard deviation of ~10% based on prior STEP trials. The observed difference of 9.4 percentage points was nearly double what the trial was powered to detect, making the statistical result unambiguous (P < .001).
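
The power claim can be sanity-checked with a normal-approximation calculation. This is a rough sketch, not the trial's actual sample-size method: it assumes a simple two-sample comparison with equal variances, ignores dropout and the multiplicity procedure, and back-calculates the standard error from the reported 95% CI.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

def two_sample_power(n_per_arm, delta, sd, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test (equal n, equal SD)."""
    se = sd * sqrt(2 / n_per_arm)
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    # The negligible probability of rejecting in the wrong direction is ignored.
    return STD_NORMAL.cdf(abs(delta) / se - z_crit)

# Planning assumptions quoted in the text: 126 per arm, 5 pp difference, SD ~10.
power = two_sample_power(n_per_arm=126, delta=5.0, sd=10.0)

# Back-calculate the standard error from the reported CI (-12.0 to -6.8).
se_observed = (-6.8 - (-12.0)) / (2 * STD_NORMAL.inv_cdf(0.975))
z_observed = abs(-9.4) / se_observed
p_two_sided = 2 * (1 - STD_NORMAL.cdf(z_observed))

print(round(power, 3), round(z_observed, 2), p_two_sided)
```

Under these assumptions the power comes out at roughly 0.98, consistent with the ">90%" statement, and the observed difference sits about 7 standard errors from zero, which is why the P value is far below .001.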

Adverse Events and Tolerability Differences

GI adverse events were common in both arms but more frequent and more persistent with semaglutide:

| Event | Semaglutide (n=126) | Liraglutide (n=127) |
|---|---|---|
| Nausea | 44% | 38% |
| Diarrhea | 30% | 21% |
| Constipation | 26% | 16% |
| Vomiting | 24% | 14% |
| Discontinued due to AEs | 3.2% | 12.6% |

The seemingly paradoxical finding: semaglutide caused more GI symptoms but fewer discontinuations. This likely reflects the slower 16-week dose escalation for semaglutide compared to liraglutide's rapid 4-5 week ramp. Participants on liraglutide hit full dose (and peak GI side effects) faster, leading to more early dropouts before tolerance could develop. The Wegovy prescribing information specifies the gradual escalation schedule precisely for this reason.
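
To put the discontinuation gap in absolute terms, here is a small sketch of a risk difference with a Wald 95% confidence interval. The event counts (4 of 126 and 16 of 127) are back-calculated from the percentages in the table and are an assumption, not figures taken from the paper. A Wald interval is also a crude choice with so few events, and the trial was not powered for this comparison, so treat the output as descriptive.

```python
from math import sqrt

def risk_difference(events_a, n_a, events_b, n_b, z=1.96):
    """Risk difference (b minus a) with a Wald confidence interval."""
    p_a, p_b = events_a / n_a, events_b / n_b
    diff = p_b - p_a
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return diff, (diff - z * se, diff + z * se)

# Assumed counts: 4/126 is about 3.2% (semaglutide), 16/127 is about 12.6% (liraglutide).
rd, (ci_low, ci_high) = risk_difference(4, 126, 16, 127)
print(f"Risk difference: {rd:.1%} (95% CI {ci_low:.1%} to {ci_high:.1%})")
```

On these assumed counts, the absolute excess of adverse-event discontinuations with liraglutide is roughly 9 percentage points, with an interval that excludes zero but remains wide.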

Limitations the Authors Acknowledged

The STEP-8 publication lists several limitations:

  • Open-label design between drug groups (discussed above)
  • Relatively short treatment duration; no data on whether the efficacy gap persists beyond 68 weeks
  • Limited racial and ethnic diversity: approximately 75% of participants were White
  • Exclusion of people with type 2 diabetes limits generalizability to a major target population for both drugs
  • No assessment of whether switching from liraglutide to semaglutide yields additional benefit (a common clinical question)
  • Funded by Novo Nordisk, which manufactures both drugs but markets semaglutide (Wegovy) as the newer, higher-revenue product

Clinical Translation: What This Trial Does and Does Not Tell You

It does tell you: In adults without diabetes who have obesity or overweight with comorbidities, semaglutide 2.4 mg weekly produces roughly 2.5 times the weight loss of liraglutide 3.0 mg daily over 68 weeks. The result is statistically and clinically significant by any reasonable threshold.

It does not tell you: Whether patients already responding well to liraglutide should switch. Whether the gap holds in people with type 2 diabetes. Whether cost-per-kilogram-lost favors semaglutide given the price difference. Or whether tirzepatide, which was not yet approved when STEP-8 enrolled, would change the comparison entirely. The SURMOUNT-1 trial later showed tirzepatide achieving 20-22% weight loss, shifting the competitive context substantially.

References

  • Rubino DM, Greenway FL, Khalid U, et al. Effect of weekly subcutaneous semaglutide vs daily liraglutide on body weight in adults with overweight or obesity without diabetes: the STEP 8 randomized clinical trial. JAMA. 2022;327(2):138-150.
  • Davies M, Færch L, Jeppesen OK, et al. Semaglutide 2.4 mg once a week in adults with overweight or obesity, and type 2 diabetes (STEP 2): a randomised, double-blind, double-dummy, placebo-controlled, phase 3 trial. Lancet. 2021;397(10278):971-984.
  • Jastreboff AM, Aronne LJ, Ahmad NN, et al. Tirzepatide once weekly for the treatment of obesity. N Engl J Med. 2022;387(3):205-216.
  • ICH E9(R1) Expert Working Group. ICH E9(R1) addendum on estimands and sensitivity analysis in clinical trials.
  • Wegovy (semaglutide) prescribing information. Novo Nordisk.
  • Saxenda (liraglutide) prescribing information. Novo Nordisk.