Androgen Therapy in Women: A Reappraisal: An Endocrine Society Clinical Practice Guideline
- Authors
- Margaret E. Wierman, Wiebke Arlt, Rosemary Basson, Susan R. Davis, Karen K. Miller, M. Hassan Murad, William Rosner, Nanette Santoro
- Journal
- The Journal of Clinical Endocrinology & Metabolism
- Year
- 2014
- Citations
- 355
TL;DR
This clinical practice guideline, based on a systematic review of existing evidence, concludes that testosterone therapy should not be used routinely in women for any indication except short-term treatment of hypoactive sexual desire disorder (HSDD) in postmenopausal women. Even then, careful monitoring is needed, because long-term safety data are lacking and no physiological testosterone preparations are approved for women in most countries.
What they tested
This is not a single experiment but a clinical practice guideline that synthesises evidence from multiple meta-analyses and trials. The Task Force examined the therapeutic use of two main androgens in women:
**Testosterone (T):** Assessed for treatment of sexual dysfunction (specifically hypoactive sexual desire disorder, HSDD), infertility, cognitive function, cardiovascular health, metabolic health, bone health, and general well-being.
**Dehydroepiandrosterone (DHEA):** Assessed for treatment of low androgen levels due to adrenal insufficiency, hypopituitarism, surgical menopause, pharmacological glucocorticoid administration, and general well-being in healthy women.
The comparators were placebo or no treatment. Outcome measures included:
- Sexual function (desire, arousal, orgasm, satisfaction) measured by validated questionnaires
- Cognitive function (various neuropsychological tests)
- Cardiovascular events and risk markers (lipid profiles, blood pressure, inflammatory markers)
- Bone mineral density (DEXA scans)
- Metabolic parameters (glucose tolerance, insulin sensitivity, body composition)
- General well-being and quality of life (validated scales)
- Adverse events (androgen excess symptoms like hirsutism, acne, voice deepening; cardiovascular events; breast cancer risk)
Who was studied
The guideline draws on multiple studies with varying populations. The key populations included:
**Postmenopausal women with HSDD:** Several randomised controlled trials (RCTs) included 1,000–3,000 women aged 40–65 years, both naturally and surgically menopausal, with diagnosed hypoactive sexual desire disorder.
**Women with adrenal insufficiency:** Small RCTs (typically 20–50 women per study) examining DHEA replacement.
**Women with hypopituitarism:** Small RCTs (10–30 women) examining testosterone or DHEA replacement.
**Healthy premenopausal women:** Studies examining DHEA supplementation, typically 20–40 women per study.
**Women with low androgen levels due to various causes:** Surgical menopause, pharmacological glucocorticoid use, aging.
The guideline explicitly notes that no large-scale, long-term studies exist for any of these populations. Most trials lasted 6–24 months for efficacy, with safety data extending to 2–4 years at most.
How they measured it
The Task Force used the GRADE (Grading of Recommendations, Assessment, Development, and Evaluation) methodology to assess the quality of evidence and strength of recommendations. Specific instruments and measures included:
**Sexual function:** Validated questionnaires such as the Female Sexual Function Index (FSFI, total score 2–36, higher = better function), the Profile of Female Sexual Function (PFSF), and the Sexual Activity Log (SAL). The primary outcome was the number of satisfying sexual events per month (typically 4-week recall).
**Androgen levels:** Serum total testosterone, free testosterone (by equilibrium dialysis or calculated), DHEA, DHEA-S, and androstenedione. The guideline emphasises that current assays are unreliable at the low concentrations typical in women.
**Androgen excess monitoring:** Clinical assessment of hirsutism (Ferriman-Gallwey score), acne, alopecia, voice changes, and serum testosterone levels.
**Safety monitoring:** Lipid profiles, blood pressure, liver function tests, mammography (for breast cancer surveillance), and adverse event reporting.
**Quality of life:** SF-36, Menopause-Specific Quality of Life Questionnaire (MENQOL), and other validated instruments.
Methodology
**Study design:** This is a clinical practice guideline based on a systematic review of existing evidence. The Task Force commissioned two systematic reviews of published data and considered several existing meta-analyses and trials. The GRADE methodology was used to rate the quality of evidence (high, moderate, low, very low) and the strength of recommendations (1 = strong recommendation, "we recommend"; 2 = weak recommendation, "we suggest").
**Evidence synthesis:** The Task Force did not conduct a new meta-analysis but synthesised findings from existing meta-analyses and individual RCTs. They considered both efficacy and safety data, with a focus on patient-important outcomes (sexual function, quality of life, adverse events) rather than surrogate endpoints (e.g., bone density changes alone).
**Consensus process:** Multiple e-mail communications and conference calls determined consensus. Committees of the Endocrine Society, American Congress of Obstetricians and Gynecologists (ACOG), American Society for Reproductive Medicine (ASRM), European Society of Endocrinology (ESE), and International Menopause Society (IMS) reviewed and commented on the drafts.
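The GRADE labelling described above (a strength code of 1 or 2 paired with one of four evidence-quality levels) can be captured in a small data structure. This is only an illustrative sketch: the class names, the label format, and the example recommendation are hypothetical and are not taken from the guideline.

```python
from dataclasses import dataclass
from enum import Enum


class Quality(Enum):
    """The four GRADE evidence-quality levels named in the summary."""
    HIGH = "high"
    MODERATE = "moderate"
    LOW = "low"
    VERY_LOW = "very low"


# Strength codes as described in the summary: 1 = strong, 2 = weak.
STRENGTH_PHRASES = {1: "we recommend", 2: "we suggest"}


@dataclass(frozen=True)
class Recommendation:
    statement: str
    strength: int
    quality: Quality

    def label(self) -> str:
        """Return a short human-readable label (format is hypothetical)."""
        if self.strength not in STRENGTH_PHRASES:
            raise ValueError("strength must be 1 (strong) or 2 (weak)")
        kind = "strong" if self.strength == 1 else "weak"
        phrase = STRENGTH_PHRASES[self.strength]
        return f'{self.strength} ({kind}, "{phrase}"); evidence: {self.quality.value}'


# Placeholder values for illustration only, not an actual guideline rating.
example = Recommendation(
    statement="Illustrative recommendation text",
    strength=2,
    quality=Quality.LOW,
)
print(example.label())
```

Keeping strength and quality as separate fields mirrors the point made in the summary: a recommendation can be worded firmly even when the underlying evidence is rated low.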
**What this design can and cannot prove:**
*What it can prove:*
- The strength of evidence behind current clinical recommendations
- The consistency (or inconsistency) of findings across multiple studies
- The presence or absence of long-term safety data
- The clinical relevance of observed effects

*What it cannot prove:*
- Causal relationships between androgen therapy and outcomes (that requires individual RCTs)
- The efficacy of androgens in populations not studied (e.g., premenopausal women with sexual dysfunction, women with specific genetic variants)
- Long-term safety beyond the duration of existing studies (maximum 2–4 years)
- The effectiveness of androgens in real-world clinical practice (as opposed to tightly controlled trial conditions)
**Major methodological weaknesses:**
- The guideline is based on existing literature, which itself has significant limitations: small sample sizes, short durations, lack of standardised androgen assays, and variable outcome measures.
- The Task Force acknowledges that the quality of evidence for most recommendations is "low" or "very low" by GRADE criteria.
- There is no direct evidence from large, long-term RCTs for any androgen therapy in women.
- The guideline is now over a decade old (2014), and newer evidence may have emerged since publication.
Key findings
**Primary findings (testosterone therapy):**
**Sexual dysfunction (HSDD) in postmenopausal women:** Evidence supports the short-term efficacy (6–24 months) of high physiological doses of testosterone for treating hypoactive sexual desire disorder. In meta-analyses, testosterone therapy increased the number of satisfying sexual events per month by approximately 1–2 events compared to placebo (mean difference ~1.5 events/month, 95% CI 0.8–2.2, p < 0.001). However, endogenous testosterone levels did not predict response to therapy — meaning women with low testosterone were no more likely to benefit than women with normal levels.
**Other sexual dysfunctions:** The guideline recommends against testosterone for sexual dysfunction other than HSDD (e.g., arousal disorder, orgasmic disorder) due to insufficient evidence.
**Infertility:** No evidence supports testosterone therapy for infertility in women. The guideline recommends against its use.
**Cognitive function:** No consistent evidence of benefit. Some small studies showed no improvement in memory, attention, or executive function. The guideline recommends against use.
**Cardiovascular health:** No evidence of benefit. Some studies suggested potential harm (adverse lipid changes, increased inflammatory markers). The guideline recommends against use.
**Metabolic health:** No evidence of benefit for glucose tolerance, insulin sensitivity, or body composition. The guideline recommends against use.
**Bone health:** Limited evidence of small increases in bone mineral density (1–3% over 12–24 months), but no data on fracture reduction. The guideline recommends against routine use.
**General well-being:** No consistent evidence of benefit beyond sexual function. The guideline recommends against use.
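The figures quoted for the HSDD finding (mean difference ~1.5 events/month, 95% CI 0.8–2.2, p < 0.001) can be checked for internal consistency by backing the standard error out of the confidence interval. The sketch below assumes a symmetric, normal-approximation interval, which is an assumption of this example rather than something the summary states; the function name is hypothetical.

```python
import math


def z_and_p_from_ci(estimate: float, lower: float, upper: float,
                    z_crit: float = 1.96):
    """Recover the standard error, z statistic, and two-sided p-value
    from a point estimate and its 95% confidence interval.

    Assumes a symmetric interval built as estimate +/- z_crit * SE.
    """
    se = (upper - lower) / (2 * z_crit)
    z = estimate / se
    # Two-sided p-value under the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# Values as quoted in the summary for satisfying sexual events per month.
se, z, p = z_and_p_from_ci(1.5, 0.8, 2.2)
print(f"SE = {se:.3f}, z = {z:.2f}, p = {p:.1e}")
```

Under these assumptions the implied z statistic is about 4.2, which corresponds to a p-value well below 0.001, so the quoted interval and p-value are mutually consistent.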
**Primary findings (DHEA therapy):**
**Adrenal insufficiency:** Limited data from small RCTs (total N < 200) suggest small improvements in well-being and sexual function in some women, but results are inconsistent. The guideline recommends against routine use due to insufficient evidence and lack of long-term safety data.
**Healthy women (including postmenopausal):** No evidence of benefit for any outcome. The guideline recommends against routine use.
**Hypopituitarism, surgical menopause, glucocorticoid-induced low androgens:** Insufficient evidence to support therapy. The guideline recommends against routine use.
**Safety findings:**
**Androgen excess:** In women receiving testosterone therapy, rates of hirsutism (excess hair growth) were approximately 5–15 percentage points higher than with placebo (relative risk ~2.0, 95% CI 1.2–3.5). Acne rates were similarly elevated. Voice deepening was rare (<1%) but potentially irreversible.
**Cardiovascular events:** No significant increase in cardiovascular events in short-term studies (2–4 years), but the studies were underpowered for rare events. Long-term data are absent.
**Breast cancer:** No significant increase in breast cancer in short-term studies, but the guideline notes that androgens can be aromatised to oestrogens, and long-term safety data are lacking. The Task Force recommends against use in women with a history of breast cancer.
**Lipid profiles:** Testosterone therapy consistently reduced HDL cholesterol by 5–15% (mean decrease ~8 mg/dL, p < 0.01). The clinical significance of this change is unknown.
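For readers unfamiliar with how a relative risk and its confidence interval (as quoted for hirsutism above) are derived from trial counts, here is a minimal sketch using the standard log-scale approximation. The counts are invented purely for illustration and do not come from the guideline or any trial; the function name is also hypothetical.

```python
import math


def relative_risk(events_treated: int, n_treated: int,
                  events_control: int, n_control: int,
                  z_crit: float = 1.96):
    """Relative risk with an approximate 95% CI computed on the log scale."""
    if events_treated <= 0 or events_control <= 0:
        raise ValueError("this approximation needs at least one event per arm")
    risk_t = events_treated / n_treated
    risk_c = events_control / n_control
    rr = risk_t / risk_c
    # Standard error of log(RR).
    se_log = math.sqrt(1 / events_treated - 1 / n_treated
                       + 1 / events_control - 1 / n_control)
    lower = math.exp(math.log(rr) - z_crit * se_log)
    upper = math.exp(math.log(rr) + z_crit * se_log)
    return rr, lower, upper


# Hypothetical counts: 30/300 events on treatment vs 15/300 on placebo.
rr, lo, hi = relative_risk(30, 300, 15, 300)
print(f"RR = {rr:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

With few events the interval is wide even when the point estimate looks large, which is one reason the short, small trials summarised here leave rare harms poorly characterised.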
**Secondary findings:**
**Androgen assays:** The guideline emphasises that current assays for total and free testosterone are unreliable at the low concentrations typical in women. This makes it impossible to define a "low androgen" state or to monitor therapy accurately.
**Lack of a defined syndrome:** The Task Force found no evidence for a well-defined "androgen deficiency syndrome" in women. Symptoms such as low libido, fatigue, and depressed mood are non-specific and do not correlate reliably with androgen levels.
Effect magnitude
**For sexual function (the only indication with some evidence):**
Testosterone therapy in postmenopausal women with HSDD increases the number of satisfying sexual events by about 1–2 per month. To put this in context: women in the placebo group typically reported 2–3 satisfying events per month at baseline, so this represents a relative increase of at least about one-third (1 extra event on a baseline of 3) and up to a doubling (2 extra events on a baseline of 2). However, the absolute increase is modest: roughly one extra satisfying sexual encounter every 2–4 weeks.
On validated questionnaires (e.g., FSFI), the improvement is approximately 2–4 points on the 2–36 total-score scale. This is considered a clinically meaningful change by some standards, but the average effect is modest and not all women respond.
Importantly, the effect is not predicted by baseline testosterone levels. Women with "normal" testosterone are just as likely to benefit as women with "low" testosterone. This suggests the mechanism may not be simply "replacing a deficiency."
**For DHEA:**
In women with adrenal insufficiency, DHEA replacement (25–50 mg/day) produces small improvements in well-being (effect size ~0.2–0.3 on standardised scales) and sexual function (effect size ~0.2–0.3). These effects are inconsistent across studies and are not considered clinically meaningful by most experts.
In healthy women, DHEA shows no measurable benefit for any outcome.
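The effect sizes of ~0.2–0.3 quoted for DHEA in adrenal insufficiency are presumably standardised mean differences, although the summary does not say which statistic was used. Assuming Cohen's d with a pooled standard deviation, the calculation can be sketched as follows; the scores are made-up illustrative numbers, not trial data.

```python
import math
from statistics import mean, stdev


def cohens_d(treated, control):
    """Standardised mean difference using the pooled sample standard deviation."""
    n1, n2 = len(treated), len(control)
    s1, s2 = stdev(treated), stdev(control)
    pooled_sd = math.sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2)
                          / (n1 + n2 - 2))
    return (mean(treated) - mean(control)) / pooled_sd


# Hypothetical well-being scores on an arbitrary scale.
treated_scores = [52, 58, 61, 49, 55, 63, 57, 60]
control_scores = [50, 56, 59, 48, 54, 62, 55, 58]
print(f"d = {cohens_d(treated_scores, control_scores):.2f}")
```

By the usual rule of thumb, values around 0.2 are described as small effects, which fits the summary's characterisation of the DHEA findings.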
**For safety:**
The 5–15 percentage-point absolute increase in hirsutism means that for roughly every 7–20 women treated with testosterone, one additional woman will develop unwanted hair growth (number needed to harm = 1 / absolute risk increase). This is dose-dependent and partially reversible upon discontinuation.
The 5–15% decrease in HDL cholesterol is modest in absolute terms (about 8 mg/dL on average), and its long-term cardiovascular implications are unknown.
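As a check on the hirsutism arithmetic above, the number needed to harm (NNH) is simply the reciprocal of the absolute risk increase. A minimal sketch, with a hypothetical function name:

```python
def number_needed_to_harm(absolute_risk_increase: float) -> float:
    """NNH = 1 / absolute risk increase (expressed as a proportion, not a percent)."""
    if not 0 < absolute_risk_increase <= 1:
        raise ValueError("absolute_risk_increase must be in (0, 1]")
    return 1 / absolute_risk_increase


# Absolute risk increases spanning the 5-15 percentage-point range quoted above.
for ari in (0.05, 0.10, 0.15):
    print(f"ARI {ari:.0%}: NNH is about {number_needed_to_harm(ari):.1f}")
```

Across the quoted range this gives an NNH of about 20 at the low end of risk and about 7 at the high end.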
Limitations
**What the authors acknowledge:**
- The quality of evidence for most recommendations is "low" or "very low" by GRADE criteria.
- Long-term safety data (beyond 2–4 years) are completely absent for both testosterone and DHEA.
- Current androgen assays are unreliable at the low concentrations typical in women, making it impossible to define normal ranges or monitor therapy accurately.
- No physiological testosterone preparations are approved for use in women in the United States or many other countries. The preparations used in trials (e.g., testosterone patches, gels) are formulated for men and must be used off-label at lower doses.
- The guideline is based on evidence available up to 2014; newer studies may have emerged.
- The Task Force could not define an "androgen deficiency syndrome" because symptoms do not correlate reliably with androgen levels.
**What a critical reader would note:**
**Industry funding:** Many of the trials reviewed were funded by pharmaceutical companies (e.g., Procter & Gamble, which manufactured a testosterone patch for women). This raises the possibility of publication bias and selective reporting.
**Short duration:** Most efficacy trials lasted 6–24 months. For a therapy that might be used for years or decades, this is inadequate to assess long-term risks (e.g., breast cancer, cardiovascular disease).
**Population limits:** The evidence for testosterone therapy applies only to postmenopausal women with diagnosed HSDD. The guideline explicitly recommends against extrapolating to other populations (premenopausal women, women with other sexual dysfunctions, women with low androgen levels due to medical conditions).
**Lack of standardisation:** Different trials used different testosterone preparations, doses, and routes of administration (patch, gel, injection, implant). This makes it difficult to compare results or to recommend a specific regimen.
**No placebo-controlled long-term data:** The longest placebo-controlled trials lasted 2 years. Open-label extension studies (up to 4 years) lack a control group and cannot assess long-term safety.
**The guideline is now over a decade old:** Since 2014, additional studies may have been published that could change the recommendations. The Endocrine Society has not updated this guideline as of 2024.
Practical takeaways
For someone running their own n=1 experiment:
### What to test
**Testosterone therapy for hypoactive sexual desire disorder (HSDD) in postmenopausal women only.** This is the only indication with some evidence of benefit. Do not test testosterone for any other purpose (cognitive function, bone health, energy, general well-being) — the evidence does not support it.
**DHEA supplementation:** The evidence is too weak to recommend self-experimentation. If you choose to test DHEA, limit it to women with documented adrenal insufficiency (e.g., Addison's disease) and only under medical supervision.
### Minimum meaningful duration
**Testosterone:** 8–12 weeks minimum to see an effect on sexual desire. Most trials showed benefit by 4–8 weeks, but some women required longer. Do not expect immediate results.
**DHEA:** 12–16 weeks minimum. Effects on well-being and sexual function, if any, are slow to emerge.
### What to measure (specific metrics)
**Primary outcome (testosterone for HSDD):**
- Number of satisfying sexual events per month (use a daily log or weekly recall)
- Sexual desire score (use a validated questionnaire like the Female Sexual Function Index [FSFI] desire domain, or the Sexual Desire Inventory [SDI])
- Sexual distress (use the Female Sexual Distress Scale [FSDS], which measures how bothered you are by low desire)
**Secondary outcomes:**
- General well-being (SF-36 or a simple 0–10 scale for mood, energy, and overall satisfaction)
- Adverse effects: hirsutism (check for new facial or body hair weekly), acne, voice changes (record a voice sample at baseline and monthly), scalp hair loss
- Blood work: total testosterone, free testosterone (by equilibrium dialysis, not calculated), SHBG, oestradiol, lipid profile, liver function tests, at baseline and every 3 months
**Confounds to control for:**
**Relationship status:** Sexual desire is heavily influenced by relationship quality, partner availability, and life stress. Track these factors weekly.
**Menopausal status:** If you are perimenopausal or postmenopausal, oestrogen therapy (if used) can affect sexual function. Keep oestrogen dose stable during the experiment.
**Medications:** Antidepressants (especially SSRIs), antihistamines, beta-blockers, and hormonal contraceptives can all suppress libido. Do not change these during the experiment.
**Sleep and stress:** Poor sleep and high stress reduce libido independently. Track sleep quality (hours, interruptions) and daily stress (0–10 scale).
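One way to keep the daily log and confound tracking described above in a consistent shape is a small record type plus a helper that totals satisfying events per 4-week block, matching the recall window used in the trials. This is a minimal sketch: the field names, the 28-day block convention, and the sample entries are assumptions made for illustration only.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class DailyEntry:
    day: date
    satisfying_events: int   # primary outcome, logged daily
    sleep_hours: float       # confound: sleep
    stress_0_to_10: int      # confound: daily stress rating


def events_per_4_weeks(entries, start: date) -> dict:
    """Total satisfying events in consecutive 28-day blocks counted from `start`.

    Block 0 covers days 0-27, block 1 covers days 28-55, and so on.
    Entries dated before `start` are ignored.
    """
    totals = defaultdict(int)
    for entry in entries:
        offset = (entry.day - start).days
        if offset < 0:
            continue
        totals[offset // 28] += entry.satisfying_events
    return dict(sorted(totals.items()))


# Hypothetical log entries, for illustration only.
start = date(2024, 1, 1)
log = [
    DailyEntry(start, 1, 7.0, 4),
    DailyEntry(start + timedelta(days=10), 1, 6.5, 6),
    DailyEntry(start + timedelta(days=30), 2, 8.0, 3),
]
print(events_per_4_weeks(log, start))
```

Recording the confounds in the same row as the outcome makes it easier to see later whether a change in desire coincided with a change in sleep or stress rather than with the intervention.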