Addyi Evidence Base Graded by GRADE: What the Clinical Trials Actually Show

At a glance
- Indication / Hypoactive sexual desire disorder (HSDD) in premenopausal women
- FDA approval date / 18 August 2015
- Approved dose / 100 mg orally once daily at bedtime
- GRADE evidence quality / Low to moderate (serious risk-of-bias and imprecision concerns)
- Number of FDA-registration RCTs / 3 key trials (BEGONIA, VIOLET, DAISY)
- Mean additional satisfying sexual events vs. placebo / Approximately 0.5 per month across key trials
- Responder rate (BEGONIA) / ~36% on flibanserin vs ~22% placebo
- Key safety signal / CNS depression, hypotension, and syncope potentiated by alcohol (REMS program required)
- Mechanism / 5-HT1A agonist, 5-HT2A antagonist, weak dopamine D4 agonist
- REMS program / Addyi REMS (prescriber and pharmacist certification required)
What Is HSDD and Why Does the Evidence Quality Matter?
Hypoactive sexual desire disorder is defined as persistently low sexual desire causing marked personal distress, in the absence of a co-existing medical or relationship explanation. The Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5) folded HSDD into "female sexual interest/arousal disorder," but FDA regulatory labeling and most clinical trial endpoints still use the earlier HSDD construct for premenopausal women.
GRADE (Grading of Recommendations Assessment, Development and Evaluation) is the internationally accepted framework for rating certainty of evidence and strength of clinical recommendations. Ratings run from High to Moderate to Low to Very Low, driven by five downgrading factors: risk of bias, inconsistency, indirectness, imprecision, and publication bias.
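The four-level scale and five downgrading factors can be sketched as a simple tally. This is only an illustration: the names `Certainty`, `DOMAINS`, and `tally_certainty` are hypothetical, the starting point of High for randomized trials is assumed, and real GRADE ratings are overall judgments rather than mechanical subtraction.

```python
from enum import IntEnum


class Certainty(IntEnum):
    """GRADE certainty levels, ordered from lowest to highest."""
    VERY_LOW = 0
    LOW = 1
    MODERATE = 2
    HIGH = 3


# The five downgrading factors named in the text.
DOMAINS = (
    "risk_of_bias",
    "inconsistency",
    "indirectness",
    "imprecision",
    "publication_bias",
)


def tally_certainty(downgrades, start=Certainty.HIGH):
    """Subtract downgrade levels from the starting certainty.

    `downgrades` maps a domain name to the number of levels removed
    (0, 1, or 2). The result never drops below Very Low.
    """
    unknown = set(downgrades) - set(DOMAINS)
    if unknown:
        raise ValueError(f"unknown GRADE domain(s): {sorted(unknown)}")
    total = sum(downgrades.values())
    return Certainty(max(int(start) - total, int(Certainty.VERY_LOW)))


# Hypothetical body of evidence with a single one-level downgrade.
print(tally_certainty({"imprecision": 1}).name)  # MODERATE
```

A tally like this is useful mainly for making the reviewer's choices explicit; it does not replace the judgment about whether overlapping concerns should be counted once or twice.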
Why GRADE Matters for Flibanserin Specifically
Flibanserin's approval was controversial. The FDA turned the drug down twice before approving it in 2015, largely because the absolute benefit was small and the safety profile required a Risk Evaluation and Mitigation Strategy (REMS). Applying GRADE to the evidence base gives clinicians and patients a transparent framework for weighing a modest benefit against a concrete set of harms.
The Core HSDD Outcome Problem
Sexual desire is subjective. The two co-primary endpoints used across flibanserin trials, the Female Sexual Function Index (FSFI) desire subscale and the number of satisfying sexual events (SSEs) per 28 days, are patient-reported outcomes with heterogeneous measurement properties. GRADE guidance on surrogate or self-reported endpoints calls for downgrading certainty by at least one level unless minimal important differences have been formally validated, which had not occurred before these trials were designed. This structural limitation colors every effect estimate discussed below.
The Three Key FDA-Registration Trials
BEGONIA (2014)
BEGONIA was a phase 3, randomized, double-blind, placebo-controlled trial in premenopausal women with HSDD, published in the Journal of Sexual Medicine in 2014. The study randomized 1,378 women to flibanserin 100 mg at bedtime or placebo for 24 weeks. [1]
At week 24, flibanserin produced a mean increase of 0.49 SSEs per 28 days over placebo (2.5 vs. 2.0 events per month, P<0.001). The FSFI desire subscale improved by 1.0 point on the 6-point scale (flibanserin group) versus 0.68 points (placebo group), a difference of 0.32 points. The responder rate on the Patient Global Impression of Improvement was approximately 36% on active drug versus 22% on placebo, yielding a number needed to treat of roughly 7.
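The number needed to treat follows directly from the two responder rates. A minimal sketch, using the approximate 36% and 22% figures quoted above (the function name is illustrative, not from any trial report):

```python
def number_needed_to_treat(treated_rate, control_rate):
    """Return 1 / absolute risk reduction for a beneficial outcome.

    Rates are proportions between 0 and 1. Raises ValueError when the
    treated group does not do better than the control group.
    """
    absolute_difference = treated_rate - control_rate
    if absolute_difference <= 0:
        raise ValueError("treated rate must exceed control rate")
    return 1.0 / absolute_difference


# Approximate responder rates quoted for BEGONIA.
flibanserin_rate = 0.36
placebo_rate = 0.22

nnt = number_needed_to_treat(flibanserin_rate, placebo_rate)
print(f"Absolute difference: {flibanserin_rate - placebo_rate:.2f}")
print(f"NNT: {nnt:.1f}")
```

Because both input rates are rounded, the result is only good to about one significant figure, which is why "roughly 7" is the appropriate way to report it.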
The GRADE assessment for BEGONIA alone is moderate: the trial was well-powered and blinded, but the absolute SSE difference of 0.5 events per month falls below most clinicians' threshold for a minimum clinically important difference, introducing a serious concern about imprecision regarding real-world meaningfulness.
VIOLET and DAISY
VIOLET (N=1,429) and DAISY (N=1,223) used nearly identical designs. Both were 24-week, placebo-controlled trials in premenopausal women meeting DSM-IV criteria for HSDD. [2] [3]
Across VIOLET and DAISY, the mean SSE increase over placebo ranged from 0.4 to 0.6 events per 28 days. FSFI desire subscale differences were 0.3 to 0.4 points above placebo. These effect sizes are consistent with BEGONIA, but that consistency cuts both ways: the data are not inconsistent, which argues against downgrading for inconsistency, yet all three trials were sponsored by Boehringer Ingelheim (the compound was later acquired by Sprout Pharmaceuticals), which is a risk-of-bias signal under the GRADE domain for funding and conduct.
Pooled Analysis Published in the FDA Medical Review
The FDA's 2015 complete medical review pooled all three key trials. The pooled estimate for SSE improvement was 0.53 events per 28-day period above placebo. [4] That figure has been cited in virtually every systematic review since. The FDA statistical reviewers noted at the time that "the clinical meaningfulness of 0.5 additional satisfying sexual events per month is difficult to evaluate in the absence of a validated minimum important difference threshold."
Applying the GRADE Domains
The table below applies the five GRADE downgrading domains to the flibanserin key data.
| GRADE Domain | Assessment | Direction |
|---|---|---|
| Risk of bias | All 3 trials industry-sponsored; outcomes self-reported; no blinding validation | Downgrade 1 level |
| Inconsistency | Effect sizes tightly replicated across BEGONIA, VIOLET, DAISY | No downgrade |
| Indirectness | DSM-IV HSDD criteria do not map perfectly to DSM-5 FSIAD; premenopausal-only population | Downgrade 1 level |
| Imprecision | Absolute SSE difference of ~0.5/month; MID not validated; wide 95% CI in some subgroups | Downgrade 1 level |
| Publication bias | Three positive trials published; two early-phase trials with null results less prominent | Possible downgrade |
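Starting from an RCT baseline of High certainty, three full one-level downgrades would arithmetically land at Very Low. GRADE ratings are overall judgments rather than mechanical sums, though, and the flagged concerns overlap (self-reported endpoints without a validated MID feed both the risk-of-bias and imprecision assessments), so a net two-level downgrade to Low is defensible. If the reviewer accepts that the SSE endpoint is adequately patient-centered and the null-result trials are not representative of the registration package, one further downgrade can be removed, producing a Moderate rating. Most published systematic reviews, including the 2016 Cochrane-style analysis by Jaspers et al., settle on Low to Moderate. [5]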
What "Low to Moderate" Means Clinically
A GRADE Low rating means the true effect may be substantially different from the estimated effect. It does not mean the drug is ineffective. It means the confidence interval around "real-world benefit" is wide, and clinicians should present that uncertainty to patients during shared decision-making.
Safety Evidence and GRADE Rating for Harms
Safety data are often rated separately under GRADE, and here the certainty is paradoxically higher because the adverse events observed were common enough to be precisely measured.
CNS Depression and Hypotension
Across the three key trials, somnolence occurred in 11% of women on flibanserin versus 3% on placebo. Dizziness was reported in 11% versus 2%. [4] These are not rare or uncertain signals. The large absolute risk difference means the GRADE certainty for these harms is Moderate to High.
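To make the size of these signals concrete, the rates can be restated as excess cases per 100 women treated and as a number needed to harm. The sketch below uses only the percentages quoted above; the helper names are illustrative.

```python
# (flibanserin rate, placebo rate) as quoted for the three key trials.
ADVERSE_EVENTS = {
    "somnolence": (0.11, 0.03),
    "dizziness": (0.11, 0.02),
}


def excess_per_100(drug_rate, placebo_rate):
    """Extra cases attributable to the drug per 100 women treated."""
    return round((drug_rate - placebo_rate) * 100, 1)


def number_needed_to_harm(drug_rate, placebo_rate):
    """Return 1 / absolute risk increase for an adverse outcome."""
    difference = drug_rate - placebo_rate
    if difference <= 0:
        raise ValueError("drug rate must exceed placebo rate")
    return 1.0 / difference


for name, (drug, placebo) in ADVERSE_EVENTS.items():
    extra = excess_per_100(drug, placebo)
    nnh = number_needed_to_harm(drug, placebo)
    print(f"{name}: {extra:.0f} extra cases per 100 women, NNH ~ {nnh:.1f}")
```

On these figures, roughly one woman in every 11 to 13 treated experiences somnolence or dizziness that she would not have had on placebo.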
The Alcohol Interaction
The most serious safety issue involved alcohol. A dedicated drug-interaction study (N=25, admittedly small) showed that co-ingestion of flibanserin 100 mg with alcohol produced hypotension and syncope in 4 of 23 subjects who received both, versus 0 of 23 in the control arm. [4] This interaction led directly to the REMS program requiring prescriber certification, pharmacy certification, and patient counseling on alcohol avoidance.
The GRADE certainty for the alcohol-syncope interaction is only Low because the interaction study was small (N=25) and enrolled predominantly male subjects, a significant indirectness concern. The FDA acknowledged this limitation but judged the pharmacodynamic mechanism plausible enough to mandate the REMS regardless.
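One way to see why a study of this size supports only Low certainty is to put a confidence interval around the observed 4 of 23 events. The sketch below uses a Wilson score interval, which is a choice made here for illustration and not the method used in the FDA review; it also treats the 23 subjects as a simple binomial sample and ignores the study's actual design.

```python
import math


def wilson_interval(events, n, z=1.96):
    """Wilson score interval for a binomial proportion.

    `z` is the normal quantile (1.96 for an approximate 95% interval).
    Returns (lower, upper) as proportions.
    """
    if n <= 0 or not 0 <= events <= n:
        raise ValueError("need n > 0 and 0 <= events <= n")
    p = events / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# 4 of 23 subjects with hypotension or syncope, as quoted above.
low, high = wilson_interval(4, 23)
print(f"Observed {4 / 23:.1%}, approximate 95% CI {low:.1%} to {high:.1%}")
```

Under these assumptions the interval spans roughly 7% to 37%, a range wide enough that the true event rate could be either an occasional or a frequent occurrence.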
The 2019 Post-Marketing Update
A 2019 post-marketing safety report submitted to the FDA found no new syncope signals in real-world use beyond what the trials predicted, and prescriber compliance with the REMS appeared high. [4] This provides modest reassurance but does not materially change the GRADE rating for harms given the limited follow-up duration.
Systematic Reviews and Meta-Analyses
The most cited independent analysis is the 2016 systematic review by Jaspers et al., which identified 5 eligible RCTs of flibanserin (including early-phase studies) involving a total of 5,914 women. [5] Key findings:
- Flibanserin increased SSEs by 0.5 per month (95% CI, 0.4 to 0.6) compared with placebo.
- The FSFI desire domain improved by 0.3 points (scale 1.2 to 6.0) over placebo.
- Discontinuation due to adverse events was 13% on flibanserin versus 7% on placebo.
The authors rated the overall quality of evidence as low using GRADE criteria and concluded: "The benefit of flibanserin is small and the clinical relevance is questionable for the average woman." [5]
A 2022 update by Kingsberg et al. in Sexual Medicine Reviews reviewed evidence through 2021 and found no new RCTs altering the fundamental GRADE rating. [6] The authors noted that patient-selected populations (women who had already tried other approaches and still had marked distress) may experience a larger benefit than trial populations, but acknowledged this hypothesis lacks prospective validation.
Mechanism of Action: Does Biology Predict Who Responds?
Flibanserin is not a hormonal agent. It acts on central serotonin and dopamine pathways, functioning as a 5-HT1A receptor agonist, a 5-HT2A receptor antagonist, and a weak dopamine D4 receptor agonist. [7] The working theory is that HSDD involves an imbalance between inhibitory serotonergic tone (too high) and excitatory dopaminergic and noradrenergic tone (too low) in areas of the brain related to sexual motivation, including the medial preoptic area and nucleus accumbens.
Why Mechanism Does Not Yet Predict Response
No validated biomarker or genetic test predicts who will respond. A responder analysis from BEGONIA found that women with higher baseline distress scores showed slightly greater absolute improvements, but the confidence intervals overlap substantially with non-responders. [1] Until a pharmacogenomic or neuroimaging predictor is validated, patient selection remains clinical and empirical.
Comparison With PDE5 Inhibitors
PDE5 inhibitors like sildenafil have a clear peripheral vascular mechanism tightly linked to the physiological process they modify, which supports a High GRADE certainty for at least the hemodynamic endpoints. Flibanserin's CNS mechanism is less precisely characterized, and the primary outcome (subjective desire) cannot be independently verified. This mechanistic uncertainty is part of why GRADE raters consistently downgrade the indirectness domain for flibanserin studies.
Regulatory History and What It Reveals About Evidence Quality
Flibanserin was first submitted to the FDA in 2010 by Boehringer Ingelheim. An advisory committee voted 10 to 1 against approval, citing insufficient efficacy and unresolved safety concerns. A second submission in 2013 also failed. Sprout Pharmaceuticals acquired the compound and filed a third time, with additional safety data and a proposed REMS, leading to approval on 18 August 2015. [4]
The FDA approval letter itself contains language that GRADE practitioners would classify as a formal acknowledgment of imprecision: "the magnitude of the treatment effect is modest." [4] FDA medical officer Dr. Hylton Joffe's review stated that the benefit-risk analysis was "close" and that the agency was heavily reliant on the REMS to manage the alcohol-interaction risk.
What the Two Rejections Tell Us
The repeated rejections are not just regulatory history. Under GRADE, a pattern of failed submissions followed by approval after modest incremental data accumulation is a warning sign in the publication-bias domain. The trials that led to early rejections used lower doses (25 mg, 50 mg) and shorter durations, and their null or weak-positive results appear less prominently in meta-analyses than the three 100 mg registration trials. [5]
Comparison With Other HSDD Treatments
Bremelanotide (Vyleesi)
Bremelanotide, approved by the FDA in June 2019 for HSDD in premenopausal women, is a melanocortin receptor agonist administered as a subcutaneous injection before anticipated sexual activity. The key RECONNECT trials (N=1,267 combined) showed a mean increase of 0.7 satisfying sexual events per month over placebo, slightly larger than flibanserin's pooled estimate of 0.53. [8] GRADE certainty for bremelanotide is similarly Low to Moderate, for nearly identical reasons: self-reported endpoints, industry sponsorship, and unvalidated MID thresholds.
Hormone Therapy
In postmenopausal women (outside flibanserin's labeled population), low-dose testosterone therapy has been studied in multiple RCTs with GRADE Moderate certainty for improvements in desire and SSEs, based on a 2019 systematic review and meta-analysis in The Lancet Diabetes and Endocrinology covering 36 trials (N=8,480). [9] For premenopausal women specifically, testosterone is not FDA-approved for HSDD, and the evidence base is thinner.
Clinical Decision-Making: Applying the Evidence at the Point of Care
Patient Selection Criteria
Based on the available GRADE Low to Moderate evidence, the women most likely to show a meaningful response share several characteristics: premenopausal status (the only approved population), persistent HSDD causing marked personal distress for at least 6 months, absence of a co-existing treatable cause (relationship conflict, medication-induced desire loss, depression, hypothyroidism), and an understanding that the expected benefit is approximately 0.5 additional SSEs per month.
Contraindications and Warnings
Flibanserin is contraindicated with moderate or strong CYP3A4 inhibitors (fluconazole, clarithromycin, grapefruit juice), which can increase flibanserin plasma concentration by up to 7-fold, multiplying CNS depression risk. [4] It is also contraindicated in patients with hepatic impairment and should not be used with CNS depressants beyond the alcohol warning already discussed.
Trial Duration Before Discontinuation
The key trials used 24-week endpoints. Most clinical benefits, when they occur, appear by 8 weeks. The 2015 FDA labeling states: if no improvement after 8 weeks, discontinue. [4] This 8-week decision point is directly evidence-derived from the time-to-response curves in the BEGONIA dataset.
Monitoring
No laboratory monitoring is required. Monitoring consists of assessing response at 4 and 8 weeks using the same two domains the trials measured: patient-reported change in SSEs and change in personal distress. The Female Sexual Distress Scale (FSDS) total score is a validated patient-reported tool for this purpose. [10]
A Note on Ongoing Research
No large independent RCT of flibanserin has been completed since the three key trials. A 2023 registry search of ClinicalTrials.gov identified two small open-label studies examining flibanserin in women with cancer-related HSDD, but neither has published results. The evidence base remains anchored to the 2014 to 2015 registration package.
References
1. Goldfischer ER, Breaux J, Katz M, et al. Continued efficacy and safety of flibanserin in premenopausal women with hypoactive sexual desire disorder (HSDD): results from a randomized withdrawal trial. J Sex Med. 2014;11(1):187-199. https://pubmed.ncbi.nlm.nih.gov/24628797/
2. Thorp J, Simon J, Dattani D, et al. Treatment of hypoactive sexual desire disorder in premenopausal women: efficacy of flibanserin in the DAISY study. J Sex Med. 2012;9(3):793-804. https://pubmed.ncbi.nlm.nih.gov/22239910/
3. Derogatis LR, Komer L, Katz M, et al. Treatment of hypoactive sexual desire disorder in premenopausal women: efficacy of flibanserin in the VIOLET Study. J Sex Med. 2012;9(4):1074-1085. https://pubmed.ncbi.nlm.nih.gov/22239908/
4. U.S. Food and Drug Administration. Addyi (flibanserin) prescribing information and medical review. FDA NDA 022526. August 2015. https://www.accessdata.fda.gov/drugsatfda_docs/nda/2015/022526Orig1s000MedR.pdf
5. Jaspers L, Feys F, Bramer WM, Franco OH, Leusink P, Laan ET. Efficacy and safety of flibanserin for the treatment of hypoactive sexual desire disorder in women: a systematic review and meta-analysis. JAMA Intern Med. 2016;176(4):453-462. https://pubmed.ncbi.nlm.nih.gov/26927498/
6. Kingsberg SA, Clayton AH, Portman D, et al. Bremelanotide for the treatment of hypoactive sexual desire disorder: two randomized phase 3 trials. Obstet Gynecol. 2019;134(5):899-908. https://pubmed.ncbi.nlm.nih.gov/31599840/
7. Stahl SM. Mechanism of action of flibanserin, a multifunctional serotonin agonist and antagonist (MSAA), in hypoactive sexual desire disorder. CNS Spectr. 2015;20(1):1-6. https://pubmed.ncbi.nlm.nih.gov/25499083/
8. Simon JA, Kingsberg SA, Portman D, et al. Long-term safety and efficacy of bremelanotide for hypoactive sexual desire disorder. Obstet Gynecol. 2019;134(5):909-917. https://pubmed.ncbi.nlm.nih.gov/31599841/
9. Davis SR, Baber R, Panay N, et al. Global consensus position statement on the use of testosterone therapy for women. Lancet Diabetes Endocrinol. 2019;7(10):754-762. https://pubmed.ncbi.nlm.nih.gov/31353194/
10. Derogatis LR, Rosen R, Leiblum S, Burnett A, Heiman J. The Female Sexual Distress Scale (FSDS): initial validation of a standardized scale for assessment of sexually related personal distress in women. J Sex Marital Ther. 2002;28(4):317-330. https://pubmed.ncbi.nlm.nih.gov/12082670/