The Split: Why 'Functional Hypogonadism' Is Two Diseases Sharing One Number

Two men sit in the same clinic. Both are 42, obese, fatigued. Both have a serum testosterone of 245 ng/dL — below every guideline's threshold. Both are diagnosed with functional hypogonadism. One receives testosterone replacement therapy. The other receives a GLP-1 receptor agonist for weight management. Six months later, both have a testosterone of 480 ng/dL.

Same starting number. Same ending number. Same diagnosis. But underneath that shared measurement, two entirely different biological stories unfolded — and proving they're different is what retires the diagnosis.

The Instrument That Conflates

Functional hypogonadism — also called obesity-associated, late-onset, or metabolic hypogonadism — accounts for the majority of low testosterone diagnoses in men under 65. It is defined by a low serum testosterone in the presence of metabolic disease, without structural or genetic cause. The definition is negative: we don't know what's wrong with the pituitary or hypothalamus, but something is suppressing the axis.

This negative definition creates a population united by a single measurement. Everyone in the tent has a low number. The assumption: they share a condition. The treatments offered — TRT or, increasingly, GLP-1 receptor agonists — both raise that number. The assumption: they're treating the same thing.

Three independent biological dimensions say otherwise.

Axis I: Fertility — Preservation vs. Suppression

GLP-1 Receptor Agonist: preserves and improves spermatogenesis (LH ↑, FSH ↑, sperm morphology improved)
Testosterone Replacement: suppresses spermatogenesis toward zero (LH ↓↓, FSH ↓↓, 65% azoospermia (IM))

The first head-to-head evidence arrived in 2025. Gregorič et al. (Diabetes & Metabolic Syndrome, n=25) randomized obese hypogonadal men to semaglutide vs. transdermal testosterone. Both groups normalized T. Semaglutide improved sperm morphology and raised gonadotropins. Testosterone suppressed gonadotropins toward zero — acting as a contraceptive in men seeking fertility.

Cannarella et al. (Reproductive Biology and Endocrinology 2025, n=83) compared tirzepatide with lifestyle intervention and with transdermal TRT: LH and FSH were significantly higher in the tirzepatide arm (P < .00001). One hundred percent of tirzepatide patients reversed their hypogonadism. The TRT group normalized T while their pituitary went silent.

Both treatments fixed the number. One preserved the patient's reproductive future. The other ended it. If these are the "same condition," why do the treatments produce biologically opposite downstream effects?

Axis II: Gonadotropin Signaling — Restoration vs. Replacement

This dimension is not merely about fertility — it is about what kind of hormonal state the patient inhabits after treatment.

GLP-1 receptor agonists restore the hypothalamic-pituitary-gonadal axis. They remove the metabolic suppression, and the brain resumes signaling. LH rises. FSH rises. The testis responds. The entire cascade runs from top to bottom. The patient's own axis is functional again.

TRT replaces the output while shutting down the control system. Exogenous testosterone feeds back to the hypothalamus and pituitary, suppressing GnRH, LH, and FSH. The testis atrophies without stimulation. The patient has a normal testosterone level but an iatrogenically non-functional axis.

The distinction matters beyond fertility. Gonadotropins have extragonadal roles: FSH in bone metabolism, LH receptor signaling in the CNS. Suppressing them has systemic consequences, not only reproductive ones. The gonadotropin hypothesis of cognitive decline (Zaidi lab, JCI 2025) suggests that elevated FSH and LH drive neurodegeneration. If so, TRT's gonadotropin suppression may be neuroprotective, while the "restorative" GLP-1 approach paradoxically raises the very signals implicated in dementia. The axis distinction propagates into every downstream system.

Three meta-analyses confirm the GLP-1 axis restoration signal: Salvio (n=680, SMD +1.39 ng/mL T with LH/FSH increase), Deameh (n=639), and Orra (n=219). All three show that GLP-1 RAs raise gonadotropins — the opposite of what TRT does.

Axis III: Epigenetic Aging — Deceleration vs. Acceleration

This is where the construct fractures most cleanly.

Horvath et al. (PubMed-indexed 2025, n=84 RCT): semaglutide decelerated epigenetic aging by multiple clock measurements — PCGrimAge −3.1 years, GrimAge V1 −1.4 years, V2 −2.3 years, PhenoAge −4.9 years, DunedinPACE −9%. The biological clock slowed.

Sugrue et al. (PNAS 2025): testosterone accelerates androgen receptor-associated CpG methylation — the "androgen clock." Higher lifetime androgen exposure ages AR-associated epigenetic sites faster. Korean eunuch data (Min 2012): castrated men lived 14-19 years longer than intact controls.

A 2026 study in npj Aging using sex-divergent deep neural network clocks found a sharp aging bifurcation at ages 45-49 — aligning precisely with andropause. The biological clock detects the androgen transition.

[Figure: The Same Testosterone. Opposite Aging Trajectories. GLP-1 RA: PCGrimAge −3.1 yr (Horvath RCT, n=84). TRT: acceleration on the androgen clock (Sugrue, PNAS 2025). Both treatments produce serum T ~480 ng/dL. The epigenetic clocks diverge.]

A critical caveat: no RCT has directly measured epigenetic clock changes under TRT. The androgen clock is observational — higher lifetime androgens correlate with accelerated AR-CpG aging. But the directional evidence is consistent: GLP-1 RAs slow clocks; androgens speed them. Same testosterone number, opposite biological age trajectories.

The Construct Validity Matrix

In psychometrics, the multitrait-multimethod matrix (Campbell & Fiske, 1959) tests whether a measurement actually captures what it claims to capture. A valid construct shows convergent validity (different methods measuring the same thing agree) and discriminant validity (different constructs measured by the same method diverge).

Functional hypogonadism fails the discriminant validity test completely.

Dimension | GLP-1 RA Effect | TRT Effect | Direction
Serum testosterone | ↑ Normalized | ↑ Normalized | Same
Gonadotropins (LH/FSH) | ↑ Restored | ↓↓ Suppressed | Opposite
Spermatogenesis | Preserved / improved | Suppressed → azoospermia | Opposite
Epigenetic aging | Decelerated (−3.1 yr) | Likely accelerated (androgen clock) | Opposite
HPG axis status | Restored (autonomous) | Dependent (exogenous) | Opposite
Sexual function (ED) | Possibly worsened (5-HT2C) | Improved | Suggestive

In a valid MTMM matrix, two treatments for the same condition should produce correlated downstream effects — positive off-diagonal elements. What we observe instead are negative off-diagonal elements: the treatments produce systematically opposite biological outcomes on every measured dimension except the diagnostic instrument itself.

This is not a treatment comparison. It is a construct validity test — and the construct fails.
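One rough way to make the matrix's logic concrete is to code each effect as a sign and ask how often the two treatments agree. The sketch below is an illustration only, not a formal MTMM analysis. The +1/−1 coding, the dictionary `EFFECTS`, and the function `concordance` are assumptions introduced here; the TRT entry for epigenetic aging rests on observational evidence, and the preliminary sexual-function row is left out.

```python
# Sign coding (an assumption of this sketch): +1 = raised, restored, or
# favourable; -1 = lowered, suppressed, or unfavourable.
# Each tuple is (GLP-1 RA effect, TRT effect), read off the matrix above.
EFFECTS = {
    "serum_testosterone": (+1, +1),
    "gonadotropins": (+1, -1),
    "spermatogenesis": (+1, -1),
    "epigenetic_aging": (+1, -1),  # TRT sign inferred from observational data
    "hpg_axis_status": (+1, -1),
}


def concordance(effects, exclude=()):
    """Mean product of the two signs: +1.0 = always same direction, -1.0 = always opposite."""
    products = [
        glp1 * trt
        for name, (glp1, trt) in effects.items()
        if name not in exclude
    ]
    return sum(products) / len(products)


overall = concordance(EFFECTS)
downstream = concordance(EFFECTS, exclude=("serum_testosterone",))
print(f"all dimensions: {overall:+.2f}")
print(f"excluding the diagnostic measurement: {downstream:+.2f}")
```

Under this coding the only agreement comes from serum testosterone itself. Drop the diagnostic measurement and the score falls to −1.00: every remaining dimension moves in opposite directions.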

What the Split Means

If "functional hypogonadism" were a coherent entity, both treatments would move the biology in the same direction. They would differ in degree, in side effects, in convenience — but the downstream signature would converge. Instead, the downstream signature diverges on every axis we have instruments to measure.

The parsimonious explanation: the single measurement (serum testosterone below threshold X) is grouping at least two distinct clinical populations:

Population A: Metabolic Suppression

The axis works but is being suppressed by metabolic disease. Remove the suppression → axis recovers. T rises because the brain signals again. Fertility preserved. Aging slowed. The patient was never "hypogonadal" — they were metabolically imprisoned.

Population B: Axis Insufficiency

The axis cannot produce adequate testosterone regardless of metabolic state — possibly due to oligogenic vulnerability, aging Leydig cells, or prior damage (AAS, opioids, TBI). Replacement is appropriate because restoration is impossible. The patient needs exogenous hormone indefinitely.

Both present with T = 245 ng/dL. Both satisfy the diagnostic criteria. Both receive a diagnosis. But they don't have the same disease. The lab slip says they do because the lab slip measures one dimension of a multi-dimensional problem.

The Void Where Discrimination Should Be

No validated clinical tool distinguishes Population A from Population B at the point of diagnosis. There is no biomarker panel, no algorithm, no guideline recommendation for determining whether a given patient's low testosterone will respond to metabolic intervention alone.

The closest proxies:

ACHIEVE-4 enrolled 2,749 patients on orforglipron for 104 weeks. Measured cardiovascular outcomes, weight, A1C, liver enzymes. Measured all-cause death (HR 0.43, p=0.002). Did not measure testosterone. Did not measure gonadotropins. Did not measure fertility parameters. The largest oral GLP-1 RA trial in history — zero reproductive data collected.

This void is not accidental. It is diagnostic of the construct problem itself. If "functional hypogonadism" and "obesity" were recognized as overlapping clinical territories, trial designers would measure both. They don't — because the construct boundary makes the question invisible. Obesity trials measure obesity outcomes. Hypogonadism trials measure hypogonadism outcomes. The patients sitting in both categories receive neither investigation fully.

Construct Fission

The term is precise: fission, not fusion. We are not combining two things — we are recognizing that one apparent thing has always been two. The measurement (serum T) provided a unifying surface. The biology underneath was diverging all along.

The clinical implication is immediate: prescribing testosterone to a man whose axis would have recovered with metabolic intervention doesn't just fail to address root cause. It actively harms — suppressing the gonadotropins that were about to recover, accelerating epigenetic aging that was about to decelerate, eliminating fertility that was never at risk from the underlying condition.

And prescribing a GLP-1 RA to a man whose axis is constitutively insufficient — the man with the fragile axis who has crossed into irreversible territory — wastes months of declining quality of life waiting for a restoration that the biology cannot deliver.

Both errors flow from the same source: treating a measurement as a diagnosis, when the measurement cannot discriminate between the populations it contains.

The Emerging Fourth Axis (Preliminary)

A 2026 target trial emulation (EClinicalMedicine) found GLP-1 RA use associated with a 26% higher risk of erectile dysfunction than DPP4 inhibitor use (HR 1.26, CI 1.08-1.46), likely mediated by 5-HT2C serotonergic activation. TRT consistently improves ED. However, the association was attenuated after negative control outcome calibration in the same paper's sensitivity analysis. This fourth axis (same T, opposite sexual function) has biological plausibility via the serotonergic triangle, but the epidemiological evidence does not yet meet the standard set by the first three dimensions. It remains suggestive, not confirmed.

What Retires a Diagnosis

Diagnostic categories die when their internal heterogeneity exceeds their external distinctiveness — when the variation within the category is larger than the variation between categories. "Dropsy" became congestive heart failure, nephrotic syndrome, and hepatic cirrhosis. "Consumption" became tuberculosis and lung cancer. "Hysteria" became dozens of neurological and psychiatric conditions.
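The within-versus-between criterion can be written down as a small calculation. The sketch below is a minimal illustration, assuming a one-way ANOVA-style split into mean squares. The function name `variance_split` and every number in `example` are made-up placeholders chosen to show the pattern, not data from any study.

```python
from statistics import mean


def variance_split(groups):
    """Return (within, between) mean squares for a mapping of label -> list of values."""
    values = [x for xs in groups.values() for x in xs]
    grand_mean = mean(values)
    n, k = len(values), len(groups)
    # Between: how far each category's mean sits from the grand mean.
    ss_between = sum(len(xs) * (mean(xs) - grand_mean) ** 2 for xs in groups.values())
    # Within: how far each value sits from its own category's mean.
    ss_within = sum((x - mean(xs)) ** 2 for xs in groups.values() for x in xs)
    return ss_within / (n - k), ss_between / (k - 1)


# Hypothetical post-treatment change in some biomarker (placeholder numbers).
# Category "A" hides two subgroups that move in opposite directions.
example = {
    "A": [3.0, 2.5, -4.0, -3.5],
    "B": [-0.5, 0.0, -1.0, -0.5],
}

within, between = variance_split(example)
print(f"within-category mean square:  {within:.2f}")
print(f"between-category mean square: {between:.2f}")
print("within exceeds between:", within > between)
```

In this toy case the two category means coincide, so all of the variation sits inside the categories. The labels separate nothing, which is the pattern the historical examples describe.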

"Functional hypogonadism" is showing the same signature. The within-category biological variation (opposite fertility, opposite gonadotropin signaling, opposite aging trajectories) now exceeds the between-category variation (functional vs. organic hypogonadism share more biology than the two populations within functional hypogonadism share with each other).

The name hasn't changed yet. The 13 guidelines still use it. The clinical infrastructure still pivots on the single measurement. But the biology has already split. The construct is fissioning whether or not the nomenclature follows.

The question is no longer whether functional hypogonadism is one disease or two. The evidence has answered that. The question is how long clinical practice will continue treating a number instead of the patient underneath it.