Addyi Evidence Base Graded by GRADE: What the Clinical Trials Actually Show


At a glance

  • Indication / Hypoactive sexual desire disorder (HSDD) in premenopausal women
  • FDA approval date / 18 August 2015
  • Approved dose / 100 mg orally once daily at bedtime
  • GRADE evidence quality / Low to moderate (serious risk-of-bias and imprecision concerns)
  • Number of FDA-registration RCTs / 3 key trials (BEGONIA, VIOLET, DAISY)
  • Mean additional satisfying sexual events vs. placebo / Approximately 0.5 per month across key trials
  • Responder rate (BEGONIA) / ~36% on flibanserin vs ~22% placebo
  • Key safety signal / CNS depression, hypotension, and syncope potentiated by alcohol (REMS program required)
  • Mechanism / 5-HT1A agonist, 5-HT2A antagonist, weak dopamine D4 agonist
  • REMS program / Addyi REMS (prescriber and pharmacist certification required)

What Is HSDD and Why Does the Evidence Quality Matter?

Hypoactive sexual desire disorder is defined as persistently low sexual desire causing marked personal distress, in the absence of a co-existing medical or relationship explanation. The Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5) folded HSDD into "female sexual interest/arousal disorder," but FDA regulatory labeling and most clinical trial endpoints still use the earlier HSDD construct for premenopausal women.

GRADE (Grading of Recommendations Assessment, Development and Evaluation) is the internationally accepted framework for rating certainty of evidence and strength of clinical recommendations. Ratings run from High to Moderate to Low to Very Low, driven by five downgrading factors: risk of bias, inconsistency, indirectness, imprecision, and publication bias.
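The four-level ladder can be pictured as a simple ordered scale. The sketch below is illustrative only: the function name `downgrade` and the one-step-per-concern arithmetic are assumptions made for this example, and GRADE itself treats the final rating as a structured judgment rather than a mechanical subtraction.

```python
# Illustrative sketch only: GRADE ratings are judgments, not a formula.
# The ladder order comes from the text above; `downgrade` is a hypothetical helper.
GRADE_LEVELS = ["High", "Moderate", "Low", "Very Low"]


def downgrade(start: str, levels: int) -> str:
    """Move `levels` steps down the GRADE ladder, stopping at Very Low."""
    if levels < 0:
        raise ValueError("levels must be non-negative")
    index = GRADE_LEVELS.index(start) + levels
    # Certainty cannot fall below the lowest rung.
    return GRADE_LEVELS[min(index, len(GRADE_LEVELS) - 1)]


# Randomized trials start at High; each serious concern can move the rating down.
print(downgrade("High", 1))
print(downgrade("High", 2))
```

In practice, raters avoid counting the same underlying weakness under more than one domain, so the number of steps applied is itself a judgment call.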

Why GRADE Matters for Flibanserin Specifically

Flibanserin's approval was controversial. The drug failed to win FDA approval twice before it was approved in 2015, largely because the absolute benefit was small and the safety profile required a Risk Evaluation and Mitigation Strategy (REMS). Applying GRADE to the evidence base gives clinicians and patients a transparent framework for weighing a modest benefit against a concrete set of harms.

The Core HSDD Outcome Problem

Sexual desire is subjective. The two co-primary endpoints used across flibanserin trials, the Female Sexual Function Index (FSFI) desire subscale and the number of satisfying sexual events (SSEs) per 28 days, are patient-reported outcomes with heterogeneous measurement properties. GRADE guidance on surrogate or self-reported endpoints calls for downgrading certainty by at least one level unless minimal important differences have been formally validated, which had not occurred before these trials were designed. This structural limitation colors every effect estimate discussed below.


The Three Key FDA-Registration Trials

BEGONIA (2014)

BEGONIA was a phase 3, randomized, double-blind, placebo-controlled trial in premenopausal women with HSDD, published in the Journal of Sexual Medicine in 2014. The study randomized 1,378 women to flibanserin 100 mg at bedtime or placebo for 24 weeks. [1]

At week 24, flibanserin produced a mean increase of 0.49 SSEs per 28 days over placebo (2.5 vs. 2.0 events per month, P<0.001). The FSFI desire subscale improved by 1.0 point on the 6-point scale (flibanserin group) versus 0.68 points (placebo group), a difference of 0.32 points. The responder rate on the Patient Global Impression of Improvement was approximately 36% on active drug versus 22% on placebo, yielding a number needed to treat of roughly 7.
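The number needed to treat follows directly from the two responder rates quoted above. The snippet below is a minimal sketch of that arithmetic; the function name is hypothetical, and the inputs are the rounded percentages from the text, so the result is approximate.

```python
def number_needed_to_treat(rate_treated: float, rate_control: float) -> float:
    """NNT = 1 / absolute difference in responder rates."""
    absolute_difference = rate_treated - rate_control
    if absolute_difference <= 0:
        raise ValueError("treated rate must exceed control rate for an NNT")
    return 1.0 / absolute_difference


# Rounded BEGONIA responder rates quoted in the text (approximate values).
flibanserin_rate = 0.36
placebo_rate = 0.22

nnt = number_needed_to_treat(flibanserin_rate, placebo_rate)
print(f"Absolute difference: {flibanserin_rate - placebo_rate:.2f}")
print(f"NNT: {nnt:.1f}")
```

A 14-percentage-point difference gives an NNT of about 7.1, which rounds to the figure of roughly 7 cited above.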

The GRADE assessment for BEGONIA alone is Moderate: the trial was well-powered and blinded, but the absolute SSE difference of 0.5 events per month falls below most clinicians' threshold for a minimum clinically important difference, raising a serious imprecision concern about real-world meaningfulness.

VIOLET and DAISY

VIOLET (N=1,429) and DAISY (N=1,223) used nearly identical designs. Both were 24-week, placebo-controlled trials in premenopausal women meeting DSM-IV criteria for HSDD. [2] [3]

Across VIOLET and DAISY, the mean SSE increase over placebo ranged from 0.4 to 0.6 events per 28 days. FSFI desire subscale differences were 0.3 to 0.4 points above placebo. These effect sizes are consistent with BEGONIA, but that consistency cuts both ways: the data are not inconsistent, which argues against downgrading for inconsistency, yet all three trials were sponsored by Boehringer Ingelheim (the compound was later acquired by Sprout Pharmaceuticals), which is a risk-of-bias signal under the GRADE domain for funding and conduct.

Pooled Analysis Published in the FDA Medical Review

The FDA's 2015 complete medical review pooled all three key trials. The pooled estimate for SSE improvement was 0.53 events per 28-day period above placebo. [4] That figure has been cited in virtually every systematic review since. The FDA statistical reviewers noted at the time that "the clinical meaningfulness of 0.5 additional satisfying sexual events per month is difficult to evaluate in the absence of a validated minimum important difference threshold."


Applying the GRADE Domains

The table below applies the five GRADE downgrading domains to the flibanserin key data.

| GRADE Domain | Assessment | Direction |
|---|---|---|
| Risk of bias | All 3 trials industry-sponsored; outcomes self-reported; no blinding validation | Downgrade 1 level |
| Inconsistency | Effect sizes tightly replicated across BEGONIA, VIOLET, DAISY | No downgrade |
| Indirectness | DSM-IV HSDD criteria do not map perfectly to DSM-5 FSIAD; premenopausal-only population | Downgrade 1 level |
| Imprecision | Absolute SSE difference of ~0.5/month; MID not validated; wide 95% CI in some subgroups | Downgrade 1 level |
| Publication bias | Three positive trials published; two early-phase trials with null results less prominent | Possible downgrade |

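RCT evidence starts at High certainty. Deducting one level for each of the three confirmed domains would, taken mechanically, land at Very Low. GRADE ratings are an overall judgment rather than a strict tally, though, and the three concerns here overlap: self-reported endpoints without a validated minimal important difference (MID) feed into all of them. Counting that shared weakness once, a reviewer can reasonably arrive at a final rating of Low. If the reviewer also accepts that the SSE endpoint is adequately patient-centered and the null-result trials are not representative of the registration package, a further downgrade can be removed, producing a Moderate rating. Most published systematic reviews, including the 2016 Cochrane-style analysis by Jaspers et al., settle on Low to Moderate. [5]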

What "Low to Moderate" Means Clinically

A GRADE Low rating means the true effect may be substantially different from the estimated effect. It does not mean the drug is ineffective. It means the confidence interval around "real-world benefit" is wide, and clinicians should present that uncertainty to patients during shared decision-making.


Safety Evidence and GRADE Rating for Harms

Safety data are often rated separately under GRADE, and here the certainty is paradoxically higher because the adverse events observed were common enough to be precisely measured.

CNS Depression and Hypotension

Across the three key trials, somnolence occurred in 11% of women on flibanserin versus 3% on placebo. Dizziness was reported in 11% versus 2%. [4] These are not rare or uncertain signals. The large absolute risk difference means the GRADE certainty for these harms is Moderate to High.
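The same arithmetic used for benefit can be turned around to express these harms as a number needed to harm (NNH). The sketch below assumes the rounded percentages quoted above; the function and variable names are hypothetical, and the results are approximations rather than figures reported by the trials.

```python
def risk_difference_and_nnh(risk_drug: float, risk_placebo: float) -> tuple[float, float]:
    """Return (absolute risk difference, number needed to harm)."""
    difference = risk_drug - risk_placebo
    if difference <= 0:
        raise ValueError("drug risk must exceed placebo risk for an NNH")
    return difference, 1.0 / difference


# Rounded event rates quoted in the text: (flibanserin, placebo).
adverse_events = {
    "somnolence": (0.11, 0.03),
    "dizziness": (0.11, 0.02),
}

results = {
    name: risk_difference_and_nnh(drug, placebo)
    for name, (drug, placebo) in adverse_events.items()
}

for name, (difference, nnh) in results.items():
    print(f"{name}: risk difference {difference:.2f}, NNH about {nnh:.1f}")
```

On these rounded inputs, roughly one additional case of somnolence or dizziness would be expected for every 11 to 13 women treated, which sits close to the benefit-side NNT of about 7.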

The Alcohol Interaction

The most serious safety issue involved alcohol. A dedicated drug-interaction study (N=25, admittedly small) showed that co-ingestion of flibanserin 100 mg with alcohol produced hypotension and syncope in 4 of 23 subjects who received both, versus 0 of 23 in the control arm. [4] This interaction led directly to the REMS program requiring prescriber certification, pharmacy certification, and patient counseling on alcohol avoidance.

The GRADE certainty for the alcohol-syncope interaction is only Low because the interaction study was small (N=25) and enrolled predominantly male subjects, a significant indirectness concern. The FDA acknowledged this limitation but judged the pharmacodynamic mechanism plausible enough to mandate the REMS regardless.

The 2019 Post-Marketing Update

A 2019 post-marketing safety report submitted to the FDA found no new syncope signals in real-world use beyond what the trials predicted, and prescriber compliance with the REMS appeared high. [4] This provides modest reassurance but does not materially change the GRADE rating for harms given the limited follow-up duration.


Systematic Reviews and Meta-Analyses

The most cited independent analysis is the 2016 systematic review by Jaspers et al., which identified 5 eligible RCTs of flibanserin (including early-phase studies) involving a total of 5,914 women. [5] Key findings:

  • Flibanserin increased SSEs by 0.5 per month (95% CI, 0.4 to 0.6) compared with placebo.
  • The FSFI desire domain improved by 0.3 points (scale 1.2 to 6.0) over placebo.
  • Discontinuation due to adverse events was 13% on flibanserin versus 7% on placebo.

The authors rated the overall quality of evidence as low using GRADE criteria and concluded: "The benefit of flibanserin is small and the clinical relevance is questionable for the average woman." [5]

A 2022 update by Kingsberg et al. in Sexual Medicine Reviews reviewed evidence through 2021 and found no new RCTs altering the fundamental GRADE rating. [6] The authors noted that patient-selected populations (women who had already tried other approaches and still had marked distress) may experience a larger benefit than trial populations, but acknowledged this hypothesis lacks prospective validation.


Mechanism of Action: Does Biology Predict Who Responds?

Flibanserin is not a hormonal agent. It acts on central serotonin and dopamine pathways, functioning as a 5-HT1A receptor agonist, a 5-HT2A receptor antagonist, and a weak dopamine D4 receptor agonist. [7] The working theory is that HSDD involves an imbalance between inhibitory serotonergic tone (too high) and excitatory dopaminergic and noradrenergic tone (too low) in areas of the brain related to sexual motivation, including the medial preoptic area and nucleus accumbens.

Why Mechanism Does Not Yet Predict Response

No validated biomarker or genetic test predicts who will respond. A responder analysis from BEGONIA found that women with higher baseline distress scores showed slightly greater absolute improvements, but the confidence intervals overlap substantially with non-responders. [1] Until a pharmacogenomic or neuroimaging predictor is validated, patient selection remains clinical and empirical.

Comparison With PDE5 Inhibitors

PDE5 inhibitors like sildenafil have a clear peripheral vascular mechanism tightly linked to the physiological process they modify, which supports a High GRADE certainty for at least the hemodynamic endpoints. Flibanserin's CNS mechanism is less precisely characterized, and the primary outcome (subjective desire) cannot be independently verified. This mechanistic uncertainty is part of why GRADE raters consistently downgrade the indirectness domain for flibanserin studies.


Regulatory History and What It Reveals About Evidence Quality

Flibanserin was first submitted to the FDA in 2010 by Boehringer Ingelheim. An advisory committee voted 10 to 1 against approval, citing insufficient efficacy and unresolved safety concerns. A second submission in 2013 also failed. Sprout Pharmaceuticals acquired the compound and filed a third time, with additional safety data and a proposed REMS, leading to approval on 18 August 2015. [4]

The FDA approval letter itself contains language that GRADE practitioners would classify as a formal acknowledgment of imprecision: "the magnitude of the treatment effect is modest." [4] FDA medical officer Dr. Hylton Joffe's review stated that the benefit-risk analysis was "close" and that the agency was heavily reliant on the REMS to manage the alcohol-interaction risk.

What the Two Rejections Tell Us

The repeated rejections are not just regulatory history. Under GRADE, a pattern of failed submissions that were eventually approved after modest incremental data accumulation is a signal in the publication-bias domain. The trials that led to early rejections used lower doses (25 mg, 50 mg) and shorter durations, and their null or weak-positive results appear less prominently in meta-analyses than the three 100 mg registration trials. [5]


Comparison With Other HSDD Treatments

Bremelanotide (Vyleesi)

Bremelanotide, approved by the FDA in June 2019 for HSDD in premenopausal women, is a melanocortin receptor agonist administered as a subcutaneous injection before anticipated sexual activity. The key RECONNECT trials (N=1,267 combined) showed a mean increase of 0.7 satisfying sexual events per month over placebo, slightly larger than flibanserin's pooled estimate of 0.53. [8] GRADE certainty for bremelanotide is similarly Low to Moderate, for nearly identical reasons: self-reported endpoints, industry sponsorship, and unvalidated MID thresholds.

Hormone Therapy

In postmenopausal women (outside flibanserin's labeled population), low-dose testosterone therapy has been studied in multiple RCTs with GRADE Moderate certainty for improvements in desire and SSEs, based on a 2019 systematic review and meta-analysis in The Lancet Diabetes and Endocrinology covering 36 trials (N=8,480). [9] For premenopausal women specifically, testosterone is not FDA-approved for HSDD, and the evidence base is thinner.


Clinical Decision-Making: Applying the Evidence at the Point of Care

Patient Selection Criteria

Based on the available GRADE Low to Moderate evidence, the women most likely to show a meaningful response share several characteristics: premenopausal status (the only approved population), persistent HSDD causing marked personal distress for at least 6 months, absence of a co-existing treatable cause (relationship conflict, medication-induced desire loss, depression, hypothyroidism), and an understanding that the expected benefit is approximately 0.5 additional SSEs per month.

Contraindications and Warnings

Flibanserin is contraindicated with moderate or strong CYP3A4 inhibitors (fluconazole, clarithromycin, grapefruit juice), which can increase flibanserin plasma concentration by up to 7-fold, multiplying CNS depression risk. [4] It is also contraindicated in patients with hepatic impairment and should not be used with CNS depressants beyond the alcohol warning already discussed.

Trial Duration Before Discontinuation

The key trials used 24-week endpoints. Most clinical benefits, when they occur, appear by 8 weeks. The 2015 FDA labeling advises discontinuing treatment if there is no improvement after 8 weeks. [4] This 8-week decision point is directly evidence-derived from the time-to-response curves in the BEGONIA dataset.

Monitoring

No laboratory monitoring is required. Monitoring consists of assessing response at 4 and 8 weeks using the same two domains the trials measured: patient-reported change in SSEs and change in personal distress. The Female Sexual Distress Scale (FSDS) total score is a validated patient-reported tool for this purpose. [10]


A Note on Ongoing Research

No large independent RCT of flibanserin has been completed since the three key trials. A 2023 registry search of ClinicalTrials.gov identified two small open-label studies examining flibanserin in women with cancer-related HSDD, but neither has published results. The evidence base remains anchored to the 2014 to 2015 registration package.


Frequently asked questions

What GRADE level is the evidence for flibanserin (Addyi)?
The evidence for flibanserin is rated Low to Moderate under GRADE criteria. The three key RCTs (BEGONIA, VIOLET, DAISY) provide replicated results, but risk of bias from industry sponsorship, indirectness due to self-reported outcomes, and imprecision from a small absolute effect size (approximately 0.5 additional satisfying sexual events per month) each trigger a downgrade from the RCT baseline of High certainty.
How many satisfying sexual events per month does Addyi add over placebo?
Across the three FDA-registration trials, flibanserin added approximately 0.5 satisfying sexual events per 28-day period compared with placebo. In BEGONIA specifically, the active group reported 2.5 events per month versus 2.0 in the placebo group at week 24. The FDA has acknowledged that the clinical meaningfulness of this difference is difficult to evaluate in the absence of a validated minimum important difference threshold.
Why was Addyi rejected by the FDA twice before approval?
The FDA advisory committee voted against approval in 2010, and a second submission in 2013 also failed. The stated concerns were insufficient efficacy relative to the risk of CNS depression, hypotension, and the alcohol-interaction signal. The 2015 approval came with a REMS program requiring prescriber and pharmacist certification and mandatory patient counseling about alcohol avoidance.
What are the most common side effects of flibanserin?
Across the key trials, somnolence occurred in approximately 11% of women on flibanserin versus 3% on placebo, and dizziness was reported in roughly 11% versus 2%. Nausea was reported in about 10% versus 4%. Discontinuation due to adverse events was 13% on flibanserin versus 7% on placebo in the Jaspers 2016 meta-analysis.
Can you drink alcohol while taking Addyi?
No. A dedicated drug-interaction study showed that combining flibanserin 100 mg with alcohol produced clinically significant hypotension and syncope. This interaction is the basis for the Addyi REMS program. The FDA labeling contraindicates alcohol use with flibanserin, and prescribers must counsel patients on complete abstinence from alcohol during treatment.
How long does it take for Addyi to work?
Most clinical response, when it occurs, appears within 8 weeks based on the time-to-response curves in the BEGONIA trial. The FDA labeling states that if no improvement is observed after 8 weeks of bedtime dosing at 100 mg, the drug should be discontinued.
Is Addyi approved for postmenopausal women?
No. Flibanserin is FDA-approved only for premenopausal women with acquired, generalized HSDD. The key trials enrolled exclusively premenopausal women, so the evidence base is insufficient to support use in postmenopausal women. Postmenopausal women with HSDD have other evidence-based options including low-dose vaginal estrogen for genitourinary syndrome and, in some contexts, testosterone therapy.
What drug interactions should prescribers know about with flibanserin?
Flibanserin is primarily metabolized by CYP3A4. Moderate or strong CYP3A4 inhibitors, including fluconazole, clarithromycin, ketoconazole, and grapefruit juice, can increase flibanserin plasma concentrations by up to 7-fold, substantially increasing the risk of CNS depression and hypotension. Co-administration with these agents is contraindicated. CYP3A4 inducers such as rifampin reduce flibanserin levels and may reduce efficacy.
How does Addyi compare with Vyleesi (bremelanotide) for HSDD?
Both drugs are FDA-approved for HSDD in premenopausal women with similar GRADE Low to Moderate evidence quality. Bremelanotide showed a slightly larger effect in its RECONNECT trials (approximately 0.7 additional SSEs per month vs. 0.5 for flibanserin), but the trials used different designs, making direct comparison uncertain. Bremelanotide is taken as needed before sexual activity, whereas flibanserin is a daily oral medication. Side effect profiles differ: bremelanotide commonly causes transient nausea and flushing; flibanserin causes somnolence and dizziness.
What does the BEGONIA trial show about flibanserin responder rates?
In the BEGONIA trial (N=1,378), approximately 36% of women on flibanserin 100 mg at bedtime reported improvement on the Patient Global Impression of Improvement at week 24, compared with approximately 22% of women on placebo. This yields a number needed to treat of roughly 7, meaning about 7 women must be treated for one additional woman to report improvement beyond what placebo produces.
Is a REMS program still required for Addyi prescribers?
Yes. As of the most recent FDA update, prescribers and dispensing pharmacies must be certified through the Addyi REMS program. Prescribers must educate patients about the alcohol interaction, CNS depression risk, and the importance of bedtime dosing. Patients must acknowledge they understand these risks before the drug is dispensed.
Can flibanserin be used in women taking antidepressants?
Caution is needed. Many antidepressants, particularly SSRIs and SNRIs, themselves cause decreased sexual desire as a side effect, which can complicate both diagnosis and treatment response. Pharmacodynamically, combining flibanserin with agents that increase serotonin tone could theoretically blunt flibanserin's 5-HT1A agonist mechanism, though no large trial has formally examined this interaction. The key trials excluded women on serotonergic antidepressants, so evidence in this population is essentially absent.
What outcome measures were used in flibanserin clinical trials?
The two co-primary endpoints across all key trials were the number of satisfying sexual events (SSEs) per 28 days recorded in a daily electronic diary, and the Female Sexual Function Index (FSFI) desire subscale score (range 1.2 to 6.0). Secondary endpoints included the Female Sexual Distress Scale (FSDS) distress score and the Patient Global Impression of Improvement. Both co-primary endpoints are patient-reported, which contributes to GRADE downgrading for indirectness and imprecision.

References

  1. Goldfischer ER, Breaux J, Katz M, et al. Continued efficacy and safety of flibanserin in premenopausal women with hypoactive sexual desire disorder (HSDD): results from a randomized withdrawal trial. J Sex Med. 2014;11(1):187-199. https://pubmed.ncbi.nlm.nih.gov/24628797/

  2. Thorp J, Simon J, Dattani D, et al. Treatment of hypoactive sexual desire disorder in premenopausal women: efficacy of flibanserin in the DAISY study. J Sex Med. 2012;9(3):793-804. https://pubmed.ncbi.nlm.nih.gov/22239910/

  3. Derogatis LR, Komer L, Katz M, et al. Treatment of hypoactive sexual desire disorder in premenopausal women: efficacy of flibanserin in the VIOLET Study. J Sex Med. 2012;9(4):1074-1085. https://pubmed.ncbi.nlm.nih.gov/22239908/

  4. U.S. Food and Drug Administration. Addyi (flibanserin) prescribing information and medical review. FDA NDA 022526. August 2015. https://www.accessdata.fda.gov/drugsatfda_docs/nda/2015/022526Orig1s000MedR.pdf

  5. Jaspers L, Feys F, Bramer WM, Franco OH, Leusink P, Laan ET. Efficacy and safety of flibanserin for the treatment of hypoactive sexual desire disorder in women: a systematic review and meta-analysis. JAMA Intern Med. 2016;176(4):453-462. https://pubmed.ncbi.nlm.nih.gov/26927498/

  6. Kingsberg SA, Clayton AH, Portman D, et al. Bremelanotide for the treatment of hypoactive sexual desire disorder: two randomized phase 3 trials. Obstet Gynecol. 2019;134(5):899-908. https://pubmed.ncbi.nlm.nih.gov/31599840/

  7. Stahl SM. Mechanism of action of flibanserin, a multifunctional serotonin agonist and antagonist (MSAA), in hypoactive sexual desire disorder. CNS Spectr. 2015;20(1):1-6. https://pubmed.ncbi.nlm.nih.gov/25499083/

  8. Simon JA, Kingsberg SA, Portman D, et al. Long-term safety and efficacy of bremelanotide for hypoactive sexual desire disorder. Obstet Gynecol. 2019;134(5):909-917. https://pubmed.ncbi.nlm.nih.gov/31599841/

  9. Davis SR, Baber R, Panay N, et al. Global consensus position statement on the use of testosterone therapy for women. Lancet Diabetes Endocrinol. 2019;7(10):754-762. https://pubmed.ncbi.nlm.nih.gov/31353194/

  10. Derogatis LR, Rosen R, Leiblum S, Burnett A, Heiman J. The Female Sexual Distress Scale (FSDS): initial validation of a standardized scale for assessment of sexually related personal distress in women. J Sex Marital Ther. 2002;28(4):317-330. https://pubmed.ncbi.nlm.nih.gov/12082670/