Honest Criticisms and Limitations of the Testosterone Trials (T-Trials)

What Are the Real Limitations of the Testosterone Trials (T-Trials)?

At a glance

| Parameter | Detail |
|-----------|--------|
| N | 790 men aged 65+ with low testosterone (<275 ng/dL) |
| Intervention | Transdermal testosterone gel (AndroGel 1.62%), dose-titrated to mid-normal range |
| Comparator | Placebo gel |
| Duration | 12 months |
| Primary endpoints | Sexual function (PDQ-Q4), physical function (6-minute walk), vitality (FACIT-Fatigue) |
| Key result | Statistically significant improvement in sexual function; small walking-distance gain (below the MCID); modest vitality gains |

Trial Architecture: What Was Actually Tested

The Testosterone Trials were coordinated across 12 U.S. academic medical centers from 2010 to 2014. The study enrolled men 65 years or older who had a serum testosterone below 275 ng/dL on two morning draws and who reported symptoms consistent with one of three functional domains: sexual dysfunction, reduced vitality, or limited physical mobility.

The design was unusual. Rather than testing a single primary endpoint, the T-Trials comprised seven nested sub-trials (Sexual Function, Physical Function, Vitality, Cognitive Function, Anemia, Bone, and Cardiovascular). The first three were powered as co-primary endpoints in the initial NEJM publication. The remaining four were reported separately.

This architecture matters for interpretation. Each sub-trial had its own inclusion criteria layered on top of the common testosterone threshold, creating overlapping but non-identical cohorts within the 790 participants.

Enrollment Biases and Who Was Excluded

The HealthRX Enrollment Bias Framework for the T-Trials

We identify five categories of selection pressure that shaped who entered this trial and, by extension, who the results apply to:

1. Age floor without ceiling pressure. All participants were 65+, but the mean age was 72. Men in their late 50s and early 60s, who represent the largest TRT-prescribing demographic in clinical practice, were entirely absent.

2. Testosterone threshold artifact. The <275 ng/dL cutoff on two morning samples excluded men with borderline values (275-350 ng/dL) who may be symptomatic. The American Urological Association guideline later set its threshold at <300 ng/dL (the Endocrine Society's 2018 guideline cites a harmonized lower limit of 264 ng/dL), meaning the T-Trials population was slightly more hypogonadal than the AUA-defined group.

3. Cardiovascular exclusion. Men with recent MI, stroke, or uncontrolled heart failure were excluded. This creates a healthy-participant selection bias: the enrolled cohort was healthier than the average 72-year-old man with low testosterone. Post-hoc cardiovascular findings from this trial cannot be projected onto higher-risk patients.

4. BMI and comorbidity filtering. Although obesity was common (mean BMI ~30), men with severe obesity (BMI >40) or uncontrolled diabetes (A1c >8%) were generally excluded by site investigators during screening, though no hard protocol cutoff existed for BMI.

5. Volunteerism bias. Recruitment required motivation to attend 9+ clinic visits over 12 months. Men with depression, cognitive decline, or limited mobility (the very populations who might benefit most from TRT) were underrepresented.

Duration: Twelve Months Is Not Long Enough

The T-Trials lasted one year. For a therapy many clinicians prescribe indefinitely, this constrains interpretation in three ways.

First, prostate safety. The FDA label for testosterone products carries warnings about prostate effects and recommends monitoring. Twelve months is insufficient to detect clinically meaningful changes in prostate cancer incidence, which typically requires 3-5 years of observation.

Second, cardiovascular risk. The subsequent TRAVERSE trial (2023, N=5,246) was specifically designed to address the cardiovascular question, with a median follow-up of 33 months. TRAVERSE found testosterone non-inferior to placebo for major adverse cardiac events but showed a signal for increased atrial fibrillation and pulmonary embolism. The T-Trials' 12-month window was too narrow to detect these events with statistical confidence.

Third, durability of benefit. The T-Trials publication showed sexual function improvements were maximal at 3 months and sustained at 12 months. Whether these gains persist at 24 or 36 months, or whether tachyphylaxis occurs, remains unknown from this dataset.

Statistical Design Caveats

Multiple Co-Primary Endpoints Without Multiplicity Adjustment

The three co-primary endpoints (sexual function, physical function, vitality) were each evaluated at alpha = 0.05, without formal correction for multiple comparisons. The investigators justified this by noting that the endpoints were pre-specified and that each addressed a distinct clinical domain. Critics pointed out that with three bites at the apple, the family-wise error rate is about 14% (1 - 0.95^3 ≈ 0.143, if the three tests are treated as independent).
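The 14% figure can be reproduced with a short calculation. The sketch below is illustrative only: it assumes the three tests are statistically independent (in practice the endpoints are likely correlated, which would make the true rate somewhat lower), and the function names are hypothetical helpers, not anything taken from the trial's statistical analysis plan.

```python
def family_wise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive across n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-test alpha that keeps the family-wise rate at or below alpha."""
    return alpha / n_tests


# Three co-primary endpoints, each tested at alpha = 0.05 (assumed independent).
fwer = family_wise_error_rate(alpha=0.05, n_tests=3)
print(f"Unadjusted family-wise error rate: {fwer:.4f}")  # 0.1426

# A Bonferroni correction is one common (conservative) adjustment.
adjusted = bonferroni_alpha(alpha=0.05, n_tests=3)
print(f"Bonferroni per-test alpha: {adjusted:.4f}")  # 0.0167
```

Bonferroni is shown here only as a familiar reference point for what a multiplicity adjustment would have required; it is not a claim about what the investigators should have used.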

Effect Sizes in Context

| Endpoint | Treatment effect | Clinical significance threshold |
|----------|-----------------|-------------------------------|
| PDQ-Q4 (sexual desire) | +0.58 points (0-4 scale) | Not pre-defined; MCID uncertain |
| 6-minute walk distance | +6.3 meters | Established MCID: 20-30 meters |
| FACIT-Fatigue | +2.41 points | Established MCID: 3-4 points |

The walking distance improvement of 6.3 meters fell well below the accepted minimal clinically important difference for the 6-minute walk test. The vitality score similarly did not reach established MCID thresholds. Only sexual function showed an effect size that most participants would notice subjectively, as reported in the original T-Trials results.
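To make the gap between observed effects and clinical thresholds concrete, the following sketch expresses each treatment effect as a fraction of the lower bound of its MCID range, using only the figures in the table above. The `Endpoint` class and its method names are hypothetical, and comparing against the lower MCID bound is an assumption chosen to be as generous as possible to the treatment effect.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Endpoint:
    name: str
    effect: float
    # (low, high) MCID range; None when no MCID has been established.
    mcid_range: Optional[Tuple[float, float]]

    def fraction_of_mcid(self) -> Optional[float]:
        """Effect as a fraction of the lower MCID bound, or None if undefined."""
        if self.mcid_range is None:
            return None
        low, _high = self.mcid_range
        return self.effect / low

    def meets_mcid(self) -> Optional[bool]:
        """True if the effect reaches the lower MCID bound; None if no MCID."""
        fraction = self.fraction_of_mcid()
        return None if fraction is None else fraction >= 1.0


# Values copied from the table above.
endpoints = [
    Endpoint("PDQ-Q4 (sexual desire)", 0.58, None),
    Endpoint("6-minute walk distance (m)", 6.3, (20.0, 30.0)),
    Endpoint("FACIT-Fatigue (points)", 2.41, (3.0, 4.0)),
]

for ep in endpoints:
    fraction = ep.fraction_of_mcid()
    if fraction is None:
        print(f"{ep.name}: no established MCID")
    else:
        print(f"{ep.name}: {fraction:.1%} of lower MCID bound")
```

On these numbers the walking-distance effect is roughly a third of the lowest MCID estimate (6.3 / 20) and the fatigue effect about four-fifths (2.41 / 3), while the sexual desire score cannot be benchmarked this way at all.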

Responder Analysis Was Post-Hoc

The publication reports mean changes but does not include pre-specified responder analyses. How many individual men experienced meaningful improvement versus how many showed no change is unclear from the primary report. Subsequent sub-analyses addressed this partially, but without the statistical rigor of a pre-specified threshold.

Generalizability Gaps

The enrolled population was 88% white, drawn exclusively from U.S. academic centers. Racial and ethnic minorities, who may have different testosterone-symptom relationships and different baseline cardiovascular risk profiles, were underrepresented.

The intervention (AndroGel 1.62%) represents one specific delivery system. Injections (testosterone cypionate or enanthate), which account for the majority of TRT prescriptions by volume, were not studied. Pharmacokinetic differences between gels and injections (steady-state vs. peaks and troughs) may produce different symptomatic and safety profiles.

Men with known causes of hypogonadism (pituitary tumors, Klinefelter syndrome, prior orchidectomy) were excluded. The trial specifically targeted "age-related" low testosterone, sometimes called "late-onset hypogonadism," a category whose validity as a distinct clinical entity remains debated in European Association of Urology guidelines.

Conflict of Interest and Funding

The T-Trials were funded by the National Institute on Aging, with testosterone gel provided by AbbVie (manufacturer of AndroGel). AbbVie had no role in study design, data collection, or analysis according to the published disclosures.

However, several investigators disclosed consulting fees or grants from AbbVie and other testosterone manufacturers. Letters to the editor in NEJM following publication raised the concern that the trial's framing emphasized the positive sexual function result while the null physical function and borderline vitality findings received comparatively less attention in the abstract and conclusions.

The NIA funding insulates the T-Trials from the most common criticism of industry-sponsored TRT research, but the product supply relationship and individual investigator conflicts warranted transparency, which the authors did provide in supplementary materials.

What Subsequent Commentary Identified

Post-publication letters and editorials raised several additional points:

The accompanying NEJM editorial noted that the trial was "not designed or powered to assess the long-term risks" and cautioned against interpreting the results as evidence that TRT is safe.

A 2017 sub-study from the T-Trials (the Cardiovascular Trial) found that noncalcified coronary artery plaque volume, measured by CT angiography, increased more with testosterone than with placebo. This finding, published in JAMA, introduced uncertainty about cardiovascular safety that the 12-month primary endpoints could not address.

The bone density sub-trial showed improvements in volumetric BMD, but without fracture endpoints. Whether the BMD gains translate to fracture prevention in this population remains speculative.

What This Means for Clinical Practice

The T-Trials remain the most rigorous evidence that testosterone gel improves sexual function in older hypogonadal men over 12 months. That finding is real. But prescribers should understand what the trial did not establish: long-term safety, benefit in men under 65, benefit from injectable formulations, clinically meaningful improvements in physical function or energy, and applicability to racially diverse populations.

The trial is best understood as proof-of-concept for symptomatic relief in a carefully selected population, not as a blanket endorsement of TRT in aging men with borderline testosterone levels.

References

  1. Snyder PJ, Bhasin S, Cunningham GR, et al. Effects of Testosterone Treatment in Older Men. N Engl J Med. 2016;374(7):611-624. https://pubmed.ncbi.nlm.nih.gov/26886521/

  2. Budoff MJ, Ellenberg SS, Lewis CE, et al. Testosterone Treatment and Coronary Artery Plaque Volume in Older Men With Low Testosterone. JAMA. 2017;317(7):708-716. https://pubmed.ncbi.nlm.nih.gov/28241244/

  3. Bhasin S, Brito JP, Cunningham GR, et al. Testosterone Therapy in Men With Hypogonadism: An Endocrine Society Clinical Practice Guideline. J Clin Endocrinol Metab. 2018;103(5):1715-1744. https://pubmed.ncbi.nlm.nih.gov/29562364/

  4. Lincoff AM, Bhasin S, Flevaris P, et al. Cardiovascular Safety of Testosterone-Replacement Therapy. N Engl J Med. 2023;389(2):107-117. https://pubmed.ncbi.nlm.nih.gov/37334136/

  5. FDA. AndroGel (testosterone gel) 1.62% Prescribing Information. 2018. https://www.accessdata.fda.gov/drugsatfda_docs/label/2018/021229s052lbl.pdf