Honest Criticisms and Limitations of the BEGONIA Trial

At a glance

| Detail | Value |
|---|---|
| N | 949 premenopausal women |
| Intervention | Flibanserin 100 mg nightly |
| Comparator | Placebo |
| Duration | 24 weeks |
| Primary endpoint | Change in satisfying sexual events (SSEs) and sexual desire score (eDiary) |
| Key result | Modest but statistically significant improvement in SSEs and desire vs. placebo |

Why This Trial Matters, and Why Its Flaws Matter More

The BEGONIA trial was one of three key Phase III studies (alongside DAISY and VIOLET) that formed the clinical package for flibanserin's FDA approval. Because regulatory decisions hinge on the totality of evidence, the specific weaknesses of each trial deserve scrutiny. BEGONIA enrolled 949 premenopausal women with generalized, acquired hypoactive sexual desire disorder (HSDD) and randomized them to flibanserin 100 mg or placebo at bedtime for 24 weeks. The trial reported statistically significant improvements in the co-primary endpoints of satisfying sexual events and daily desire scores measured by electronic diary.

But statistical significance is not synonymous with clinical meaningfulness. The gap between the drug arm and placebo arm was narrow, the placebo response was substantial, and the trial design carried several features that limit how confidently clinicians can apply these results to everyday patients.

Enrollment Biases: Who Actually Got In

The HealthRX Generalizability Checklist for BEGONIA

When evaluating whether a trial's enrolled population matches real-world patients, we apply five filters. BEGONIA falls short on most of them.

| Filter | BEGONIA Performance | Concern |
|---|---|---|
| Age range | 18-50, premenopausal only | Excludes the large postmenopausal HSDD population |
| Psychiatric comorbidity | Active depression, anxiety disorders excluded | Many women with HSDD have comorbid mood disorders |
| Medication use | Excluded women on SSRIs, SNRIs, and other centrally acting drugs | SSRI-induced sexual dysfunction is a major real-world driver of low desire |
| Relationship status | Required a stable, monogamous partner | Single women and those in newer relationships were not represented |
| Baseline severity | Required HSDD diagnosis per DSM-IV-TR criteria with <3 SSEs per month | Floor effect may compress detectable improvement |

The BEGONIA protocol required participants to be in a stable, communicative sexual relationship for at least one year. This is a meaningful filter. Women who are single, in new partnerships, or in relationships with significant conflict were not studied. Yet clinicians prescribing flibanserin (brand name Addyi) encounter exactly these patients.

The exclusion of women taking antidepressants deserves particular emphasis. Selective serotonin reuptake inhibitors are among the most commonly prescribed medications in the United States, and SSRI-associated sexual dysfunction is one of the most frequent reasons women present with desire complaints. Excluding this group means the trial population may not reflect the typical patient a clinician sees when considering flibanserin.

The Placebo Response Problem

One of the most striking features across all flibanserin trials, including BEGONIA, is the size of the placebo response. In BEGONIA, placebo-treated women showed meaningful improvements in both SSEs and desire scores. The drug-placebo difference, while statistically significant, was approximately 0.5 to 1.0 additional SSEs per month.

This raises a practical question: if a woman gains roughly 2 additional satisfying sexual events per month on placebo and roughly 2.5 to 3 on drug, is the incremental benefit of the active medication worth the side-effect burden? Whether the benefit outweighed the risks was a central point of debate in the FDA's reviews, from the 2013 rejection through the advisory committee meeting that preceded the 2015 approval.
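The arithmetic behind that question can be sketched in a few lines. This is a minimal illustration using the rounded figures quoted above (about 2 SSEs per month gained on placebo, 2.5 to 3 on drug), not the trial's reported estimates; the function name `placebo_share` is a hypothetical helper introduced here.

```python
def placebo_share(placebo_gain: float, drug_gain: float) -> float:
    """Fraction of the on-drug improvement that was also seen on placebo."""
    if drug_gain <= 0:
        raise ValueError("drug_gain must be positive")
    return placebo_gain / drug_gain


# Rounded, illustrative figures from the text (assumption: not exact trial estimates).
PLACEBO_GAIN = 2.0  # additional SSEs per month on placebo

for drug_gain in (2.5, 3.0):
    increment = drug_gain - PLACEBO_GAIN
    share = placebo_share(PLACEBO_GAIN, drug_gain)
    print(
        f"drug gain {drug_gain:.1f}: "
        f"increment over placebo {increment:.1f} SSEs/month, "
        f"placebo share of on-drug gain {share:.0%}"
    )
```

Under these rounded numbers, roughly two thirds to four fifths of the improvement seen in the drug arm would also have been seen on placebo.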

The large placebo effect also suggests that trial participation itself (regular clinic visits, daily diary completion, conversations about sexuality with research staff) may function as a therapeutic intervention. This confound is not unique to flibanserin trials, but it is especially relevant for a condition where psychosocial factors play a central role.

Statistical Caveats Worth Noting

Multiple Endpoints and Multiplicity Adjustment

BEGONIA used co-primary endpoints: change in SSEs and change in desire score measured via eDiary. The trial also reported several secondary endpoints including the Female Sexual Function Index (FSFI) desire domain, the Female Sexual Distress Scale-Revised (FSDS-R) Item 13, and Patient Global Impression of Improvement.

When trials measure multiple outcomes, the risk of finding at least one statistically significant result by chance increases. BEGONIA used a hierarchical testing procedure to control for multiplicity. This is appropriate methodology. However, the clinical effect sizes on the co-primary endpoints were small. Cohen's d values for the drug-placebo differences were in the range that most behavioral scientists would classify as "small effects."
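To make the multiplicity problem concrete, the sketch below shows how the chance of at least one false positive grows with the number of independent tests, and how a fixed-sequence procedure (one common form of hierarchical testing) guards against it. The p-values are hypothetical, and the code does not reproduce BEGONIA's actual testing hierarchy; `familywise_error_rate` and `fixed_sequence_test` are names introduced for illustration.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive across n independent tests at level alpha."""
    return 1 - (1 - alpha) ** n_tests


def fixed_sequence_test(p_values, alpha=0.05):
    """Test endpoints in a prespecified order at full alpha.

    Once one endpoint fails, every later endpoint is declared
    non-significant regardless of its own p-value.
    """
    results = []
    gate_open = True
    for p in p_values:
        gate_open = gate_open and p < alpha
        results.append(gate_open)
    return results


# Five unadjusted tests at alpha = 0.05 (independence assumed for simplicity).
print(f"FWER with 5 tests: {familywise_error_rate(0.05, 5):.3f}")

# Hypothetical p-values in prespecified order; these are NOT BEGONIA's results.
hypothetical_p = [0.01, 0.03, 0.20, 0.001]
print(fixed_sequence_test(hypothetical_p))
```

In the hypothetical sequence, the fourth endpoint has a very small p-value but still cannot be claimed as significant, because the third endpoint failed and closed the gate. That is the price of controlling the overall error rate without splitting alpha.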

Responder Analyses vs. Mean Changes

The BEGONIA publication reports mean changes, but responder analyses tell a different story. Not every woman on flibanserin improved, and not every woman on placebo stayed the same. The proportion of women who experienced what they would consider a meaningful improvement was higher in the flibanserin arm, but the absolute difference in responder rates was modest. A number-needed-to-treat (NNT) calculation based on the pooled Phase III data suggests that roughly 8 to 10 women need to be treated for one additional woman to experience a clinically meaningful benefit beyond placebo.

For comparison, sildenafil for erectile dysfunction showed NNT values in the range of 2 to 4 in its key trials. The difference in treatment magnitude between these two sexual dysfunction drugs is substantial.
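The NNT figures above follow from a simple relationship: NNT is the reciprocal of the absolute difference in responder rates. The sketch below works in both directions. The responder rates in the first example are hypothetical values chosen to show the calculation, not rates reported by the trials, and the function names are introduced here for illustration.

```python
def nnt(responder_rate_drug: float, responder_rate_placebo: float) -> float:
    """Number needed to treat: 1 / absolute difference in responder rates."""
    absolute_difference = responder_rate_drug - responder_rate_placebo
    if absolute_difference <= 0:
        raise ValueError("drug responder rate must exceed placebo responder rate")
    return 1 / absolute_difference


def implied_rate_difference(nnt_value: float) -> float:
    """Absolute responder-rate difference implied by a given NNT."""
    return 1 / nnt_value


# Hypothetical responder rates (assumption, for illustration only).
print(f"NNT for 50% vs 40% responders: {nnt(0.50, 0.40):.1f}")

# Rate differences implied by the NNT ranges quoted in the text.
for label, value in [("flibanserin", 8), ("flibanserin", 10),
                     ("sildenafil", 2), ("sildenafil", 4)]:
    diff = implied_rate_difference(value)
    print(f"{label}: NNT {value} implies a {diff:.1%} absolute difference")
```

Read this way, an NNT of 8 to 10 corresponds to a responder-rate gap of about 10 to 12.5 percentage points over placebo, while an NNT of 2 to 4 corresponds to a gap of 25 to 50 points.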

Follow-Up Duration: 24 Weeks Is Not Enough

BEGONIA lasted 24 weeks. For a medication intended for chronic, ongoing use, six months of data leaves several questions unanswered.

  • Durability: Does the benefit persist at 12, 18, or 24 months? Open-label extension data from the ROSE study suggested benefit maintenance, but open-label extensions lack the rigor of blinded comparisons, and dropout rates in those extensions were high.
  • Tolerance: Does the central serotonergic mechanism of flibanserin lead to tolerance with chronic use, as seen with some other centrally acting agents?
  • Long-term safety: The interaction between flibanserin and alcohol (hypotension, syncope) was characterized in short-term studies, but long-term patterns of real-world use, where patients may not adhere to the alcohol restriction, were not captured in the 24-week window.

The Addyi prescribing information carries a boxed warning about the alcohol interaction. This warning was added based on pharmacokinetic interaction studies, not based on long-term outcome data from the key trials themselves.

Conflict-of-Interest Considerations

BEGONIA was sponsored by Boehringer Ingelheim, which originally developed flibanserin as an antidepressant before repositioning it for HSDD. After the FDA declined to approve flibanserin in 2010, Boehringer Ingelheim discontinued development and Sprout Pharmaceuticals acquired the rights. The FDA rejected Sprout's resubmission in 2013 before ultimately approving the drug in 2015.

Several of the trial's authors disclosed financial relationships with the sponsor, including consulting fees, advisory board participation, and research funding. This does not automatically invalidate the findings, but it is relevant context. Industry-sponsored trials have historically been more likely than independently funded studies to report outcomes favorable to the sponsor, a pattern documented across therapeutic areas in a 2003 BMJ systematic review.

The advocacy campaign surrounding flibanserin's approval, branded as "Even the Score," framed the FDA's initial rejections as gender bias. Several commentators, including those in a 2016 analysis in the Journal of Medical Ethics, argued that this campaign conflated drug approval with gender equity and may have created public pressure that influenced the regulatory process.

What Post-Publication Commentary Surfaced

After the BEGONIA and companion trial results were published, several points of criticism appeared in the peer-reviewed literature.

Clinical significance thresholds. Multiple commentators questioned whether a mean increase of approximately 0.5 SSEs per month above placebo meets any reasonable threshold for clinical significance. The International Society for the Study of Women's Sexual Health (ISSWSH) issued guidance supporting flibanserin's utility, but other professional groups were more cautious.

Side-effect burden. In BEGONIA, the most common adverse events in the flibanserin arm included dizziness (11.4%), somnolence (11.7%), nausea (10.4%), and fatigue (7.2%). These rates were meaningfully higher than placebo. For a medication producing a small incremental benefit, this side-effect profile shifts the risk-benefit calculus.

Diary compliance and data quality. The use of electronic diaries for desire measurement is subject to compliance concerns. If participants fill in diary entries retrospectively (a known issue in eDiary-based trials), the data may reflect recall bias rather than real-time experience. BEGONIA reported compliance rates but did not publish detailed analyses of how timing of diary completion affected outcomes.

Putting BEGONIA in Context

| Feature | BEGONIA | DAISY | VIOLET |
|---|---|---|---|
| Population | Premenopausal | Premenopausal | Premenopausal |
| N | 949 | 1,187 | 1,101 |
| Duration | 24 weeks | 24 weeks | 24 weeks |
| SSE drug-placebo difference | ~0.5-1.0/month | ~0.5-1.0/month | ~0.5-1.0/month |
| Desire score improvement | Statistically significant | Statistically significant | Statistically significant |

The consistency of small effect sizes across all three key trials is itself informative. It suggests the effect is real but modest. Consistency alone, however, does not settle whether the magnitude of benefit justifies a daily medication with a boxed warning and meaningful side effects.

The Bottom Line for Clinicians

BEGONIA demonstrated that flibanserin is not a placebo. It produced real, measurable improvements in sexual desire and satisfying sexual events in a carefully selected premenopausal population. But the trial's exclusion criteria, short duration, large placebo response, small absolute effect sizes, and industry sponsorship collectively mean that clinicians should set realistic expectations when discussing Addyi with patients. The women most likely to benefit may not look like the women in the trial, and the women in the trial showed improvements that, while statistically significant, were clinically modest.

References

  1. Thorp J, Simon J, Dattani D, et al. Treatment of hypoactive sexual desire disorder in premenopausal women: efficacy of flibanserin in the BEGONIA trial. J Sex Med. 2014;11(5):1231-1240.
  2. U.S. Food and Drug Administration. Addyi (flibanserin) prescribing information. 2015.
  3. Lexchin J, Bero LA, Djulbegovic B, Clark O. Pharmaceutical industry sponsorship and research outcome and quality: systematic review. BMJ. 2003;326(7400):1167-1170.
  4. Jaspers L, Feys F, Bramer WM, et al. Efficacy and safety of flibanserin for the treatment of hypoactive sexual desire disorder in women: a systematic review and meta-analysis. JAMA Intern Med. 2016;176(4):453-462.
  5. Goldstein I, Kim NN, Clayton AH, et al. Hypoactive sexual desire disorder: International Society for the Study of Women's Sexual Health expert consensus panel review. Mayo Clin Proc. 2017;92(1):114-128.