Effect of testosterone use on bone mineral density in men : a systematic review and meta-analysis of randomized controlled trials

Author: Adela Fowler
Title: Effect of testosterone use on bone mineral density in men : a systematic review and meta-analysis of randomized controlled trials

Author(s): Scarborough, Olivia Mary

Citation: Scarborough, O. M. (2015). Effect of testosterone use on bone mineral density in men : a systematic review and meta-analysis of randomized controlled trials. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5662758

Issued Date: 2015

URL: http://hdl.handle.net/10722/221791

Rights: The author retains all proprietary rights (such as patent rights) and the right to use in future works.

Abstract of dissertation entitled “Effect of Testosterone Use on Bone Mineral Density in Men: A Systematic Review and Meta-Analysis of Randomized Controlled Trials”

Submitted by

Olivia Scarborough

For the degree of Master of Public Health at the University of Hong Kong in August 2015

Background

The use of testosterone replacement therapy as treatment for clinical androgen deficiency has been growing in popularity. Low androgen levels are related to lower bone density in men, so testosterone might protect bone health. Bone mineral density is a biomarker for osteoporosis, which is associated with fractures. One systematic review and two previous meta-analyses, one in 2005 and the other in 2006, have assessed the effect of testosterone replacement on bone health. This study provides an up-to-date meta-analysis of randomized controlled trials of the effect of testosterone replacement on bone mineral density, focusing on the group increasingly using testosterone: middle-aged to older men.

Methods

PubMed was searched through May 2015 for randomized placebo-controlled trials, using the search term “(androgen OR testosterone) AND bone AND men AND trial” restricted to studies in English. A bibliographic search yielded no additional trials. Studies were selected using pre-specified inclusion and exclusion criteria. Trials were included if they were randomized, lasted at least 6 months, were published before May 2015, included men over 40 years with androgen deficiency, measured bone mineral density in g/cm3, and included at least one group given testosterone alone by oral, transdermal, or intramuscular administration. Trials were excluded if they were not published in English or did not report adequate data for analysis. Quality of the studies was assessed using the Jadad scale. Sensitivity analysis was done excluding low-quality studies. Meta-regression was performed on pre-determined subgroups based on testosterone dosage, duration of trials, baseline testosterone status, method of testosterone administration, and glucocorticoid use.

Results

The search of PubMed yielded 390 studies, of which 13 met the inclusion criteria. In an analysis of all studies, bone mineral density of the lumbar spine increased by 0.023 g/cm3 (95% confidence interval (CI) 0.009 to 0.037) on testosterone, but bone mineral density of the femoral neck did not (0.006 g/cm3 (95% CI -0.001 to 0.013)). Heterogeneity across studies was high. Sensitivity analysis based on trials with a higher Jadad score gave a smaller effect of testosterone on lumbar spine bone density (0.014, 95% CI 0.0004 to 0.030), but the estimate for the femoral neck (0.008, 95% CI 0.0003 to 0.016) was slightly larger. Meta-regression showed that method of testosterone administration explained a large amount of heterogeneity, and that intramuscular injections were associated with the largest increase in bone mineral density when compared to oral or transdermal testosterone.

Conclusions

This updated meta-analysis indicates that testosterone treatment has little effect on bone mineral density, especially after taking study quality into account. The conclusions of this study are greatly limited by the lack of fracture outcomes. Given the very recent warnings from regulators about adverse cardiovascular events associated with testosterone replacement, testosterone does not appear to be a good option for the prevention or treatment of osteoporosis in men.
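The pooled estimates and the heterogeneity reported in this abstract can be illustrated with a minimal sketch of inverse-variance random-effects pooling. The DerSimonian-Laird estimator used here is an assumption (the abstract does not name the estimator), the function name `pool_random_effects` is hypothetical, and the three trial values are invented for illustration; they are not data from the included trials.

```python
import math


def pool_random_effects(effects, std_errors):
    """Pool per-trial mean differences with a DerSimonian-Laird random-effects model.

    effects     -- mean difference in BMD (treatment minus placebo) for each trial
    std_errors  -- standard error of each mean difference
    Returns the pooled estimate, its 95% CI, Cochran's Q, tau^2 and I^2 (percent).
    """
    if len(effects) != len(std_errors) or len(effects) < 2:
        raise ValueError("need at least two trials with matching standard errors")

    # Fixed-effect (inverse-variance) weights and pooled estimate
    w = [1.0 / se ** 2 for se in std_errors]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)

    # Cochran's Q and the between-trial variance tau^2
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)

    # Random-effects weights add tau^2 to each trial's variance
    w_re = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))

    # I^2: share of total variation attributed to heterogeneity
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    return {
        "pooled": pooled,
        "ci_low": pooled - 1.96 * se_pooled,
        "ci_high": pooled + 1.96 * se_pooled,
        "q": q,
        "tau2": tau2,
        "i2": i2,
    }


# Hypothetical mean differences (g/cm3) and standard errors, for illustration only
example = pool_random_effects([0.010, 0.030, 0.025], [0.008, 0.010, 0.012])
print(example)
```

A confidence interval that excludes zero corresponds to the lumbar spine result above, and one that includes zero corresponds to the femoral neck result in the analysis of all studies.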

“Effect of Testosterone Use on Bone Mineral Density in Men: A Systematic Review and Meta-Analysis of Randomized Controlled Trials”

By

Olivia Scarborough B.A., University of Pittsburgh

A dissertation submitted in partial fulfillment of the requirements for the degree of Master of Public Health at the University of Hong Kong August 2015

Declaration

I declare that the dissertation and the research work thereof represents my own work, except where due acknowledgement is made, and that it has not been previously included in a thesis, dissertation or report submitted to this University or to any other institution for a degree, diploma or other qualification.

Signed _______________________________________________________ Olivia Scarborough


Acknowledgements

I want to thank my family for their unending support throughout the writing of this dissertation. They kept me sane and motivated through the entire process. I am extremely thankful for the guidance provided by my supervisor, Dr. Catherine Mary Schooling. Many thanks also go to Dr. Cindy Lin for her patient support.


Table of Contents

Declaration
Acknowledgments
Table of Contents
List of Figures
Abbreviations

1 Introduction
  1.1 Osteoporosis in men
    1.1.1 Epidemiology
    1.1.2 Characteristics and symptoms
    1.1.3 Diagnosis
    1.1.4 Treatment options
  1.2 Hypogonadism
    1.2.1 Testosterone definition
    1.2.2 Characteristics and symptoms
    1.2.3 Diagnosis
  1.3 Testosterone replacement therapy
    1.3.1 Overview of current use
    1.3.2 Types of testosterone administration
      1.3.2.1 Intramuscular injections
      1.3.2.2 Oral agents
      1.3.2.3 Transdermal gel or patch
    1.3.3 Adverse effects
  1.4 Importance of osteoporosis and TRT to public health
    1.4.1 The burden of fractures
    1.4.2 Lack of research on male osteoporosis prevention
    1.4.3 TRT in men versus HRT in women
  1.5 Literature review
    1.5.1 Systematic reviews and meta-analyses
2 Objectives
3 Methods
  3.1 Data sources and searches
  3.2 Study selection
    3.2.1 Inclusion criteria
    3.2.2 Exclusion criteria
  3.3 Data extraction and quality assessment
    3.3.1 Extraction
    3.3.2 Quality assessment
  3.4 Data synthesis and analysis
    3.4.1 Effect measures
      3.4.1.1 Estimating standard deviations of mean change from baseline
    3.4.2 Assessment of publication bias
    3.4.3 Assessment of heterogeneity
      3.4.3.1 Sensitivity analysis
      3.4.3.2 Meta-regression
  3.5 Ethics approval
4 Results
  4.1 Trial selection
    4.1.1 Elimination
    4.1.2 Flowchart
  4.2 Characteristics of included trials
    4.2.1 Publication year
    4.2.2 Setting
    4.2.3 Duration
    4.2.4 Age of participants
    4.2.5 Intervention group
    4.2.6 Control group
    4.2.7 Testosterone administration
    4.2.8 Health status
    4.2.9 Loss to follow-up
    4.2.10 Funding and affiliation
    4.2.11 Trial registration
  4.3 Data extraction and quality assessment
    4.3.1 Missing values calculation
    4.3.2 Jadad appraisal
  4.4 Data synthesis and analysis
    4.4.1 Total pooled effect of testosterone on BMD of the lumbar spine
    4.4.2 Total pooled effect of testosterone on BMD of the femoral neck
    4.4.3 Funnel plots
    4.4.4 Sensitivity analysis
      4.4.4.1 Low quality trials
      4.4.4.2 Trials that excluded data
    4.4.5 Meta-regression
      4.4.5.1 Random-effects model with one covariate
        4.4.5.1.1 Dose
        4.4.5.1.2 Duration
        4.4.5.1.3 Method of testosterone administration
        4.4.5.1.4 Serum testosterone level at baseline
        4.4.5.1.5 Glucocorticoid use
5 Discussion
  5.1 Statement of findings
  5.2 Past research
  5.3 Future research
  5.4 Limitations and strengths
  5.5 Implication for public health
6 Conclusion
7 Appendix
  7.1 Data extraction table
  7.2 Jadad scores
  7.3 Differing effects of three different methods of testosterone treatment on BMD
8 References

Figures

Figure 1 T-value calculation for exposure (E) and control (C) groups
Figure 2 Standard deviation calculation from 95% confidence interval
Figure 3 Standard deviation calculation from standard error of the mean
Figure 4 Correlation coefficient of exposure (E)
Figure 5 Correlation coefficient of control (C)
Figure 6 Standard deviation of the mean change-from-baseline for exposure (E)
Figure 7 Standard deviation of the mean change-from-baseline for control (C)
Figure 8 Calculation of I²
Figure 9 Data selection process
Figure 10 Pooled effect of testosterone on lumbar spine BMD
Figure 11 Pooled effect of testosterone on femoral neck BMD
Figure 12 Funnel plot of the lumbar spine and femoral neck BMD measurements
Figure 13 Sensitivity analysis for pooled testosterone effect on lumbar spine BMD
Figure 14 Sensitivity analysis for pooled testosterone effect on femoral neck BMD
Figure 15 Sensitivity analysis on trials that did not report lumbar spine BMD outcome
Figure 16 Sensitivity analysis on trials that did not report femoral neck BMD outcome

Abbreviations

AD       androgen deficiency
ADAM     androgen deficiency in the aging male
ADT      androgen deprivation therapy
AIDS     acquired immunodeficiency syndrome
ASA      American Society of Andrology
BMD      bone mineral density
CI       confidence interval
Corr     correlation coefficient
CV       cardiovascular
CVD      cardiovascular disease
DXA      dual-energy X-ray absorptiometry
E2       estradiol
EAA      European Association for Andrology
EAU      European Association of Urology
ES       Endocrine Society
FDA      United States Food and Drug Administration
FRAX     fracture risk assessment tool
FSH      follicle-stimulating hormone
g/cm3    grams per centimeter cubed
GH       growth hormone
GnRH     gonadotropin-releasing hormone
HC       Health Canada
HIV      human immunodeficiency virus
HRT      hormone replacement therapy
ISA      International Society of Andrology
ISCD     International Society for Clinical Densitometry
ISSA     International Society for Study of the Aging Male
L        lumbar
LH       luteinizing hormone
LOH      late-onset hypogonadism
ng/dL    nanograms per deciliter
NIH      National Institutes of Health
NNT      number needed to treat
PRISMA   preferred reporting items for systematic reviews and meta-analyses
PSA      prostate-specific antigen
RCT      randomized controlled trial
SEM      standard error of the mean
SD       standard deviation
SHBG     sex hormone-binding globulin
TC       testosterone cypionate
TE       testosterone enanthate
TRT      testosterone replacement therapy
TU       testosterone undecanoate
USPSTF   United States Preventive Services Task Force
WHO      World Health Organization

1 Introduction

1.1 Osteoporosis in Men

1.1.1 Epidemiology

Osteoporosis is a skeletal disease that presents with low bone density and bone deterioration that leads to frailty [1]. Onset generally occurs with advanced age, and the disease usually stays asymptomatic until fractures occur, making it difficult to diagnose early [1]. Osteoporosis affects more than 75 million people globally [1]. The largest burden of osteoporotic fractures for men and women aged over 50 years lies in Europe and the Western Pacific (China, Japan, the Republic of Korea, Australia, and New Zealand), at 34.8% and 28.6% respectively. The Eastern Mediterranean (2.9%) and Africa (0.8%) have the lowest prevalence of osteoporotic fractures [2]. In Europe and the Americas, the majority of people with osteoporosis are women, due to their lower peak bone mass, hormonal changes during menopause, and higher life expectancy [1]. This sex gap is not seen, however, in Asia and Africa [1].

Osteoporosis in men is also a concern despite the lower prevalence because there are certain sex-specific risks. Death from hip fracture is more common in men than women [3]. Osteoporosis also presents later in life among men, so increasing life span may increase the prevalence of osteoporosis in men [3].

Age-adjusted osteoporotic hip fracture incidence is much higher in Caucasian men and women than in any other ethnic/racial group [1]. With economic development in Asia and parts of Africa, however, rates of osteoporotic hip fractures are growing in those areas, possibly because of an association with increased life span [1]. By the year 2050, it is estimated that 50% of all osteoporotic fractures will occur in Asia [4]. Reasons for these differences and changes are largely unknown. Genetics and environment are thought to play a role in fracture risk, but alcohol use, exercise levels, migration status, smoking, and obesity do not explain these trends [4].
Vitamin D deficiency due to a lack of winter sunlight may explain the increased risk in northern countries [4]. There is a lack of epidemiological studies of osteoporosis and fracture rates in India and other parts of Asia [4]. As such, a large proportion of the global population is not covered in the study of osteoporosis and associated fractures.

1.1.2 Characteristics and Symptoms

There are two categories of primary osteoporosis. Type 1 in men may present with vertebral fractures in middle age [3]. In the more common type 2, hip fractures and loss of trabecular bone (the weaker, “cancellous” bone located in the ends of long bones, within vertebrae and near joints) typically occur in men aged over 70 years [3]. The cause of primary osteoporosis is usually age related (labeled as senile osteoporosis) or unknown (idiopathic osteoporosis) [5].

The majority of all osteoporosis in men is secondary [5]. Many risk factors may contribute to secondary osteoporosis, such as lifestyle, other diseases, or medications [5]. Smoking, alcohol use, lack of calcium intake, and lack of exercise are all lifestyle habits that are associated with osteoporosis [5]. Conditions that are associated with secondary osteoporosis include hypogonadism, gastrointestinal disorders, hypercalciuria, and any disorder that leads to reduced mobility [5]. Low concentrations of growth hormone, estradiol and insulin-like growth factor 1 may also contribute to decreased bone mass [6]. Glucocorticoid use is the biggest medication-related risk factor [5]. Glucocorticoids are prescribed as treatment for various autoimmune and inflammatory diseases. The Canadian Multicenter Osteoporosis Study, a 10-year prospective cohort study, found that glucocorticoid use is associated with fragility fractures in both men and women [7].

Because osteoporosis is mostly asymptomatic, the disease is not truly evident until it is at a dangerous state [5]. Fragility of the bone can eventually lead to fractures of the wrist, spine and, most seriously, the hip [5]. Fractures in any of these locations can be disabling or even fatal (usually for hip fracture) [5]. Low bone density in the femoral neck, trochanter, Ward’s triangle, and proximal femur are all predictors of hip fracture risk, and better predictors than density of the lumbar spine and forearm [8].

1.1.3 Diagnosis

The World Health Organization (WHO) officially defines osteoporosis in women as a bone mineral density (BMD) T-score 2.5 standard deviations or more below the average value for a young, healthy individual [1].
BMD T-scores between 1 and 2.5 standard deviations below this average are commonly diagnosed as osteopenia [1]. There are no official WHO criteria for diagnosing osteoporosis in men based on their BMD measurement [5]. The International Society for Clinical Densitometry (ISCD) recommends using a non-race adjusted database to determine the BMD average of a young healthy male [9]. This method is controversial, however, so there is no general consensus on how to diagnose osteoporosis in men based on their BMD measurements.

A physician may use a range of methods to diagnose osteoporosis, including x-rays, blood and urine tests, or a dual-energy x-ray absorptiometry (DXA) test [5]. Total bone mass is a combination of the density and volume of a specific bone [1]. BMD is the mean volume of the measured bony tissue on the surface of the bone [1]. This estimated mean is reported as grams of hydroxyapatite (a mineral form of calcium apatite) per centimeter cubed (g/cm3) [1]. DXA is the most widely accepted measurement tool for BMD [2]. BMD results are usually expressed as a T-score [2].

BMD is the primary variable used in diagnosing osteoporosis [5]. Women are more likely to receive screening DXA tests for BMD, whereas men generally receive the test only after seeking relief for back pain [5]. These DXA tests have high specificity but generally low sensitivity for detecting the risk of fracture [2]. Therefore, the WHO recommends assessing fracture risk not from BMD results alone, but from a combination of multiple risk factors [2]. It recommends that fracture risk be analyzed with the WHO fracture risk assessment tool (FRAX) [10]. The WHO determined that, along with BMD, fracture risk should be assessed based on body mass index, family history, prior fragility fractures, alcohol use, smoking, use of glucocorticoids, and presence of rheumatoid arthritis [10]. FRAX is a particularly useful tool in low-income areas that do not have ready access to DXA equipment [10].

Despite these detailed definitions, determining whether a fracture actually resulted from osteoporosis can be difficult. Low-energy fractures (i.e. not resulting from high impact) are generally more associated with low bone mineral density than high-energy fractures [1]. Fractures of the spine, wrist, or hip that occur after age 50 are commonly attributed to osteoporosis [1].

1.1.4 Treatment Options

Several treatment options exist, developed with osteoporosis in women as the primary target. Few trials have analyzed the prevention of osteoporotic fractures in men, but the United States Food and Drug Administration (FDA) has approved bisphosphonates, studied primarily for their effectiveness in treating osteoporosis in women, for use in men [3].
Calcium and vitamin D supplements have been researched as a possible treatment option because calcium absorption decreases with age. A large (n=1,471) 5-year randomized controlled trial with 10-year follow-up of post-menopausal women in New Zealand [11] found no effect of calcium supplements on total osteoporotic fractures. Vitamin D was shown to prevent hip fractures in elderly women living in care homes, but not in women who lived in the community [11]. Cardiovascular (CV) risk increased during calcium supplementation, but the increase did not continue after treatment was discontinued at the five-year point [11].

Bisphosphonates are the most commonly used and clinically supported treatment for osteoporosis [12]. Alendronate was the first of these drugs approved by the FDA, in 1995, for treating osteoporosis in postmenopausal women [13]. These drugs inhibit bone resorption by suppressing osteoclast function [12]. Bisphosphonates are attracted to hydroxyapatite, which allows them to act on the active osteoclasts on the bone’s surface [13]. Newer bisphosphonates contain nitrogen, making them more effective than the original drugs [13]. Bisphosphonates are taken orally or intravenously. The former method of administration results in quicker suppression of bone resorption, and can remain effective for an extended period [14]. Risedronate, zoledronic acid, alendronate, and teriparatide are all approved for use in men by the FDA [3]. Observational studies on bisphosphonates have shown associations with increased bone density in men undergoing androgen deprivation therapy (ADT) [3]. A randomized, placebo-controlled trial found that zoledronic acid increases BMD in men undergoing ADT [15]. Prolonged use of bisphosphonates, however, may lead to microdamage of bone and atypical femoral fractures [16], along with possible gastrointestinal events, hypocalcemia, and osteonecrosis of the jaw [13].

Debate exists on how much medical attention should be given to the early, nonsymptomatic stage of osteoporosis. There has been a recent increase in treating this condition with the stated goal of preventing future fractures [17]. Increased strategic marketing from pharmaceutical companies makes bisphosphonates look attractive for early fracture prevention. This is done by reporting reductions in relative risk while omitting the number needed to treat (NNT). The United States Preventive Services Task Force (USPSTF) found that the NNT to prevent one vertebral fracture in patients prescribed the bisphosphonate alendronate for four years is 60 [18]. In other trials, the NNT was as high as 133 [17].
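The gap between a relative risk reduction and the NNT can be made concrete with a short sketch. It uses the standard relation NNT = 1 / absolute risk reduction; the function name and the baseline risks are hypothetical and are not taken from the cited trials.

```python
def number_needed_to_treat(control_risk, relative_risk_reduction):
    """Return the NNT implied by a baseline (control-group) risk and a relative risk reduction.

    control_risk             -- probability of the event without treatment (0 to 1)
    relative_risk_reduction  -- proportional reduction in risk on treatment (0 to 1)
    """
    if not 0 < control_risk <= 1:
        raise ValueError("control_risk must be in (0, 1]")
    if not 0 < relative_risk_reduction <= 1:
        raise ValueError("relative_risk_reduction must be in (0, 1]")
    # Absolute risk reduction: control risk minus treated risk
    absolute_risk_reduction = control_risk * relative_risk_reduction
    return 1.0 / absolute_risk_reduction


# Hypothetical populations: the same 50% relative risk reduction in both
high_risk_nnt = number_needed_to_treat(control_risk=0.10, relative_risk_reduction=0.5)
low_risk_nnt = number_needed_to_treat(control_risk=0.01, relative_risk_reduction=0.5)
print(round(high_risk_nnt), round(low_risk_nnt))
```

The same relative risk reduction implies a tenfold larger NNT when the baseline fracture risk is ten times lower, which is why a relative figure reported alone can overstate the benefit of treating early, low-risk disease.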
Incomplete reporting of adverse effects may also occur. Meta-analyses of randomized controlled trials found a series of adverse events resulting from bisphosphonates, including atrial fibrillation, osteonecrosis of the jaw, musculoskeletal pain, arthritis, and gastrointestinal events [18]. Information about the harms of bisphosphonates may also be incomplete because most of the original drug trials were funded by pharmaceutical companies [17]. Careful examination of whether the benefits outweigh the risks is therefore particularly important in the early stages of osteoporosis.


1.2 Hypogonadism

1.2.1 Testosterone definition

Testosterone is the most prevalent androgen in men, produced in the testes, adrenal glands or locally [19]. The hormone acts on target tissues via the androgen receptor, either directly or after conversion to dihydrotestosterone. Testosterone may also be converted to estradiol (E2) in fat cells by the aromatase enzyme [19]. Concentrations of testosterone are highest in the morning, and vary throughout the day. Most serum testosterone is bound to either sex hormone-binding globulin (SHBG) or albumin. The 1% to 4% that is unbound is called “free” testosterone. Free and albumin-bound testosterone combined are labeled “bioavailable” testosterone [19].

1.2.2 Characteristics and Symptoms

Hypogonadism is clinically classified by a combination of low serum testosterone and a variety of symptoms, including decreased bone mineral density, sexual function and lean body mass, along with depression and anemia [19]. Primary hypogonadism occurs as a result of testicular failure and is marked by a decrease in sperm production and testosterone. Established causes include Leydig cell hypoplasia, Klinefelter syndrome, anorchia, and disruption of testicular function [19]. Secondary hypogonadism usually results from reduced production of pituitary gonadotropins [19]. Lowered gonadotropin-releasing hormone (GnRH) secretion and pituitary failure are associated with secondary hypogonadism, which usually presents without abnormalities in testicular function [19].

An age-related reduction in serum testosterone and its associated symptoms is commonly called late-onset hypogonadism (LOH) in the current clinical literature [20]. LOH is not a disease, but rather a label given to an observed phenomenon. Observational human and animal (elderly brown Norway rats) studies found that Leydig cells are less likely to respond to gonadotropin stimulation and that GnRH secretion decreases with increasing age [20].
An “andropause hypothesis” has resulted from these and similar observational studies, stating that there may be similarity between LOH and menopause21. However, it is unclear whether androgen deficiency is an inevitable age-related condition or the result of obesity and ill health. Although no causal relation has been established, observational studies have found comorbidities that are strongly associated with low testosterone levels. These conditions include sexual dysfunction, musculoskeletal disorders, diabetes, obesity, neuropsychological conditions,
muscle wasting in men with human immunodeficiency virus/acquired immunodeficiency syndrome (HIV/AIDS), and cardiovascular disease (CVD), among others19. However, these conditions often overlap with normal ageing processes, so clear delineation of symptoms in androgen deficiency is difficult to establish22.
1.2.2 Diagnosis
Normal levels of serum testosterone are defined as the range that contains 95% of healthy males in the relevant population19. A serum testosterone level at or below 300 nanograms per deciliter (ng/dL) or a bioavailable testosterone level below 70 ng/dL is often used to diagnose hypogonadism19. General consensus from the American Society of Andrology (ASA), the International Society of Andrology (ISA), the European Academy of Andrology (EAA), the European Association of Urology (EAU), and the International Society for the Study of the Aging Male (ISSAM) recommends that treatment is not required for men with total testosterone levels over 350 ng/dL, but that men should receive testosterone replacement therapy (TRT) if their levels are below 230 ng/dL23. The FDA defines low serum testosterone as levels less than 300 ng/dL confirmed on at least two separate occasions19. Diagnosis based on serum testosterone levels alone, however, may not have high sensitivity24. Questionnaires have been developed to improve the sensitivity of diagnosis. The Saint Louis University Androgen Deficiency in the Aging Male (ADAM) questionnaire was developed by Morley et al.24 in 2000 to assess the presence of hypogonadism in men, and has been widely used since its introduction. The questionnaire is a 10-item set of yes-or-no questions about the respondent’s libido, energy level, strength, height, mood, erections, sports-playing ability, fatigue, and work performance24. The Endocrine Society’s clinical practice guideline of 2010 recommends prescribing testosterone therapy only to men with consistent symptoms of androgen deficiency and consistently low serum testosterone25.
Luteinizing hormone (LH) concentrations and follicle-stimulating hormone (FSH) should be used to distinguish between primary and secondary hypogonadism. Use of testosterone therapy based on age is not recommended, and should be carefully evaluated on a case-by-case basis. The Endocrine Society recommends consideration of testosterone therapy in men undergoing extended glucocorticoid treatment and men with HIV to improve lean body mass, bone density, and muscle strength25.

In men undergoing androgen deprivation therapy (ADT) with GnRH agonists for conditions such as metastatic prostate cancer, there is an associated drop in bone mineral density in the lateral spine, forearm, and hip26. However, this observation has not been confirmed in randomized controlled trials. Men with Klinefelter syndrome are at high risk for hypogonadism and generally have lower BMD27. Androgens, however, do not appear to play a major role in the decreased bone mass, which is instead attributable to other factors of the disease27.
1.3 Testosterone Replacement Therapy
1.3.1 Overview of Current Use
Testosterone is culturally seen as the essential source of masculinity, strength and virility. This viewpoint has led testosterone use among men in the West to rise faster than the clinical research supporting its safety6. From 2000 to 2011, testosterone sales rose steadily worldwide, especially for transdermal products28. This rise does not correspond with the prevalence of pathological androgen deficiency, even if underdiagnosis is considered, indicating that testosterone may be prescribed unnecessarily in many cases28. For the past 50 years, the FDA has approved testosterone as a means to raise serum levels to the eugonadal range19. However, recent recommendations and restrictions have been made more specific based on evidence of adverse effects. Men must have certain medical conditions and confirmed low testosterone to be considered for testosterone treatment29. These medical conditions must affect testosterone levels through a disorder of the testicles, brain, or pituitary gland29. The FDA now requires labeling that clearly states the approved uses of testosterone and the associated risks of heart attack and stroke29. Patients taking testosterone are advised by the FDA to seek medical attention if they experience trouble breathing, chest pain, weakness on one side of the body, and/or slurred speech29.
A meta-analysis of 31 placebo-controlled trials by Borst et al.30 examined the difference in dihydrotestosterone and testosterone levels by method of administration. They found a difference between the pooled effects of intramuscular and transdermal administration on both concentrations, but were unable to draw robust conclusions about oral administration due to lack of data. This indicates that outcomes may vary by route of administration, so investigating route may help explain differences across studies.

Basurto et al.31 found that both testosterone and estradiol concentrations increase after testosterone treatment for 12 months. Although the role of estrogens in bone composition in women is fairly well established, it is mostly unknown how much estrogen contributes to bone health in men32. Observational studies show a correlation between E2 and BMD33,34. Without a randomized controlled trial of estrogen therapy or a Mendelian randomization study, the contribution of E2 to bone metabolism and structure is largely unknown. The evidence supports a need to assess whether the transformation of testosterone to E2 is partially or largely responsible for the associated changes in body composition.
1.3.2 Types of Testosterone Administration
1.3.2.1 Intramuscular Injections
Injections of esters such as cypionate and enanthate usually range from 100 to 250 mg every two to three weeks35. These injections usually result in a high peak of testosterone soon after the treatment, and then a drop to low levels before the next injection. This fluctuation can lead to overstimulation of erythropoiesis, mood swings, and change in sexual function35. Testosterone undecanoate (TU), previously only available in oral form, can now be injected with 12 to 14 weeks between each treatment, which lowers the fluctuations seen in the former treatments35. Patients overly sensitive to side effects should avoid injections due to the possible severity of associated complications. The clinical dosage of intramuscular testosterone enanthate (TE) is 100-250 mg every 2-3 weeks, and for testosterone cypionate (TC), 200 mg every 2 weeks. TU is given as 1000 mg injections every 10-12 weeks. Testosterone implants are administered in sets of 4 pellets containing 200 mg of testosterone each every 5-6 months35.
1.3.2.2 Oral Agents
TU is the only safe oral form available for testosterone treatment. The agent is absorbed directly from the intestine and incorporated into chylomicrons after consuming a meal.
High doses are required to reach the desired level of testosterone. Other forms of oral treatment have resulted in metabolic issues and severe liver toxicity35. TU 40 mg capsules are clinically advised to be taken 2-4 times per day35. A randomized controlled trial by Bouloux et al.36 demonstrated a dose-specific effect of testosterone on BMD of the lumbar spine, total hip, and trochanter when taking oral TU at 80, 160, or 240 mg per day. Change in BMD in the testosterone group versus the
placebo group was only detectable when participants took oral TU at 240 mg per day, with the exception of the trochanter.
1.3.2.3 Transdermal Gel or Patch
This form of testosterone (T) supplementation is delivered by skin patch or gel. The temporary and easily removable nature of this treatment allows easy cessation if adverse effects appear. The biggest concerns with this form of treatment are the need for daily application and the possible effects on anyone who comes into physical contact with the user35. One T patch per day is clinically advised to be placed on the scrotum. If applied to non-genital areas, 1-2 patches per day can be used. 5-10 g of gel (50-100 mg of T) per day is the advised amount35. T gel is not used on genital skin35. Transdermal gels and patches are currently the most prescribed form of testosterone administration19.
1.3.3 Adverse Effects
TRT’s effect on prostate cancer is currently debated because ADT slows the progression of prostate cancer37. Logically, this may mean that TRT could speed development of prostate cancer. Current evidence on the effect of TRT on prostate cancer comes from retrospective studies. A study by Kaplan and Hu38 published in 2013 reported results from a 15-year retrospective cohort study of 150,000 American men. They found no correlation of TRT with prostate cancer, but the overall use of TRT was low. A randomized controlled trial or Mendelian randomization study would be needed to determine whether TRT has an adverse effect on prostate cancer. Two randomized controlled trials39,40 have shown that finasteride (a drug used for treatment of benign prostatic hyperplasia) paired with testosterone prevents the increase in prostate volume seen with testosterone treatment alone. The association between testosterone and CVD is also not definitively established41. Observational studies show high testosterone associated with CVD, but Mendelian randomization studies and meta-analyses of RCTs show an opposite result41.
No large randomized controlled trial has been conducted to provide a definitive answer on the connection between CVD and TRT41. Another recent meta-analysis of RCTs by Borst et al.30 found that oral TRT specifically increased CVD risk. Transdermal and intramuscular administration were found to have no significant effect on CVD risk. This suggests that physiological changes may depend on the method of testosterone administration.

Health Canada and the U.S. Food and Drug Administration have recently issued warnings about the use of testosterone because of the associated cardiovascular risk29,42. These warnings were issued because testosterone is known to increase blood pressure and fluid retention, and is associated with heart attack, stroke, and blood clots. Product labels on testosterone supplements in Canada and the United States are now required to identify these risks. In addition, testosterone prescription is only recommended in hypogonadal men that have low serum testosterone because of disorders of the brain, pituitary gland, or testicles and have corresponding pathological symptoms29,42. 1.4 Importance of Osteoporosis and TRT to Public Health 1.4.1 The Burden of Fractures About one third of the elderly population experience a fall per year, 5% of these falls result in some type of fracture, and 1% in a hip fracture1. Hip fractures are the most dangerous, and although they usually result from a fall, they can also happen spontaneously5. Lifetime risk of hip fracture is around 14%-20% among women in the US and Europe5. In the US, the lifetime risk for Caucasian men aged over 50 years is 13%1. In most of the Western world there are considerably more women who experience osteoporotic fractures, but the ratio is much smaller in other parts of the world5. Hip fractures are often painful, and can require long stays in hospital5. Full recovery is uncommon, so morbidity and mortality post-fracture is considerably high, often exacerbated by comorbidities5. The resulting disabilities can lead to a need for extended care. Vertebral and wrist fractures are less severe, and generally do not lead to long periods of hospitalization5. Hip fractures account for the majority of the monetary costs of osteoporosis5. 1.4.2 Lack of Research on Male Osteoporosis Prevention In the US, treatment and prevention research is mainly focused on women, leading to lack of attention on osteoporosis in men3. 
Medicare reimbursement for DXA tests is not always available for men, clinical trials are generally focused on women first, and screening programs for men are uncommon3.
1.4.3 TRT in Men versus HRT in Women
When considering TRT as a possible option for prevention and treatment of osteoporosis, it is important to look at how hormone replacement therapy (HRT) has affected women. The Women’s Health Initiative trials have given extensive insight into the benefits and adverse events
associated with HRT. Oral administration of estrogen plus progestin has been shown to be associated with fewer fractures, less colorectal cancer, and less diabetes, but also with more CVD events and breast cancer6. Overall, the effect of HRT on mortality is fairly neutral.
1.5 Literature review
1.5.1 Systematic reviews and meta-analyses
A 2014 briefing document from the U.S. Food and Drug Administration19 compiled evidence from the available literature on the effect of testosterone on bone density, lean and fat mass, sexual and cognitive function, and CVD and other disease risks. The long-term impact of testosterone treatment could not be determined because of a lack of long-duration trials. The pooled data showed a benefit of TRT on bone health and body composition, but less so for sexual function and mood. The conclusions on CVD risk indicated a need for caution in considering TRT for men with pre-existing CVD symptoms. Two previous meta-analyses of randomized controlled trials, by Isidori et al.43 and Tracz et al.44, have assessed the effect of testosterone on bone health. These studies examined slightly different research questions. Isidori et al. focused on a wider variety of outcomes in older men. Tracz et al. focused exclusively on BMD in men of all ages. Isidori et al.’s 2005 study was a large, comprehensive study of fat mass, muscle strength, bone metabolism, and serum lipid profiles. The primary focus of this meta-analysis was on middle-aged men. When studying the mean difference in change in BMD of the lumbar spine in five trials, femoral neck in four trials, and trochanter and Ward’s triangle in three trials, Isidori et al. found only a small effect of testosterone on the lumbar spine. A larger effect was found when the researchers pooled the percent change from baseline in the lumbar spine. They found that the effect of testosterone on lumbar spine BMD increased favorably only after at least 12 months of TRT.
The small number of trials assessed gives less power to detect the true pooled effect of testosterone on BMD. The 2006 meta-analysis published by Tracz et al. analyzed a total of eight trials that assessed the effect of testosterone treatment on BMD. The analysis focused strongly on men particularly at risk for osteoporotic fractures. Their study found a small increase in lumbar spine bone density and a smaller increase in BMD in the femoral neck. Tracz et al. conducted subgroup analysis by glucocorticoid use, testosterone level, patient age, duration, and method of testosterone administration. Tests for interaction were run for each group, and the p-value was greater than 0.05 in all subgroups except for method of administration. They found that intramuscular testosterone administration had a larger effect on lumbar spine BMD than transdermal treatment (P=0.009).
There have since been a considerable number of new RCTs examining the effect of testosterone on bone health. Some of these trials assessed oral administration of testosterone, something that had not been studied in an RCT when Tracz et al. and Isidori et al. conducted their meta-analyses. These additional trials also allow more meta-regression to better assess reasons for heterogeneity.

2 Objectives
The a priori hypothesis of this systematic review and meta-analysis is that testosterone therapy increases bone mineral density of only the lumbar spine in older men. This is an update of two previous meta-analyses with similar research questions, published in 2005 by Isidori et al. and in 2006 by Tracz et al. The update was undertaken because the large number of recent trials may change the conclusions made by Tracz et al. and Isidori et al. Sensitivity analysis and meta-regression may also shed new light on sources of heterogeneity, bias and the effect of testosterone among different groups of men or methods of administration.

3 Methods
This meta-analysis was conducted using the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines65.
3.1 Data sources and searches
A search of PubMed was conducted in May 2015, using the search term “(testosterone OR androgen) AND bone AND men AND trial”. Article titles that were clearly irrelevant and studies listed as “reviews” were immediately excluded. For the other studies with titles that appeared possibly relevant, abstracts were examined. If the abstract indicated that the trial did not study the desired exposure or outcome, was not randomized, did not contain a control group, or was not written in English, the study was excluded. The remaining studies were examined in their entirety, and were subsequently excluded if they did not meet the inclusion and exclusion criteria listed below.
3.2 Study selection
3.2.1 Inclusion criteria
a. Randomized trials with a control group
b. Study duration of at least 6 months
c. Published before 2015
d. The baseline and outcome measures were reported as mean BMD in g/cm3
e. The study measured BMD of at least one vertebra of the lumbar spine
f. The study had at least one treatment group given testosterone only
g. Men with a mean age over 40 with androgen deficiency, because testosterone should be restricted to men with pathological testosterone deficiency
h. Testosterone administered transdermally, orally, or through injections using clinically suggested dosage
3.2.2 Exclusion criteria
a. Articles not published in English
b. Amount of data reported not sufficient for meta-analysis
3.3 Data extraction and quality assessment
3.3.1 Extraction
Extracted data from each study included year of publication, setting, duration, age of participants, characteristics of the intervention and control group, testosterone administration
method and dosage, BMD assessment method, health status of the participants, loss to follow-up rates, author funding and affiliations, and trial registration. Some trials studied multiple treatment groups. Data from these trials were extracted from the testosterone only, testosterone and placebo only, and placebo groups for the meta-analysis. The effects of the other study groups were compared to effect of testosterone alone. The power to perform statistical analysis of these groups was low however, due to the limited trials studying alternate groups. 3.3.2 Quality assessment In order to assess the quality of the trials, the Jadad scale for reporting randomized control trials45 was used. A score was calculated for each trial on a scale from zero to five, with a score of zero denoting a very poor quality study and five representing a strong quality study. There is little consensus on what number score denotes a poor quality study. All trials were included in the total analysis despite their Jadad score. Low scoring trials were excluded in the sensitivity analysis. The Jadad scale for reporting randomized control trials is calculated by answering yes or no questions related to randomization, blinding, and fate of the participants. One point is awarded if the trial mentions some form of the word randomization, one point if double-blinding is mentioned, and one point if there is a description of the participants that withdrew from the trial. If there were no withdrawals, the author should have included that in the report. One additional point is added if the randomization method is regarded as adequate, and one more additional point if the method of double-blinding was adequate. If a study reported inadequate randomization or blinding, then a point is deducted for each section. If randomization and blinding is mentioned in the article, but the method is not, the trial is given one point for each. 
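The scoring rules described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the procedure actually used for this review: the function name, argument names, and the "adequate"/"inadequate" labels are assumptions introduced here, and the cut-off of four is the one used for the quality-based sensitivity analysis.

```python
from typing import Optional


def jadad_score(
    randomization_mentioned: bool,
    randomization_method: Optional[str],  # "adequate", "inadequate", or None if not described
    double_blinding_mentioned: bool,
    blinding_method: Optional[str],       # "adequate", "inadequate", or None if not described
    withdrawals_described: bool,
) -> int:
    """Return a Jadad score between 0 and 5 (hypothetical helper)."""
    score = 0
    if randomization_mentioned:
        score += 1                         # trial is described as randomized
        if randomization_method == "adequate":
            score += 1                     # bonus point for an adequate method
        elif randomization_method == "inadequate":
            score -= 1                     # deduction for an inadequate method
    if double_blinding_mentioned:
        score += 1                         # trial is described as double-blind
        if blinding_method == "adequate":
            score += 1
        elif blinding_method == "inadequate":
            score -= 1
    if withdrawals_described:
        score += 1                         # withdrawals and dropouts are accounted for
    return score


def is_high_quality(score: int, threshold: int = 4) -> bool:
    """Trials scoring at or above the threshold enter the quality-based subgroup."""
    return score >= threshold
```

A trial that mentions randomization and double-blinding without describing either method, and that reports withdrawals, would score three under this sketch.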
3.4 Data synthesis and analysis
3.4.1 Effect measures
The “meta” and “metafor” packages of R 3.1.3 (R Development Core Team, Vienna, Austria) were used for this analysis. Inverse variance weighting and a DerSimonian-Laird random effects model were used to synthesize the mean differences of each trial. A random effects model assumes that effect estimates differ across studies because of a variety of factors46. This model also accounts for the sampling error present in each study46. All of the heterogeneity among trials may not be fully accounted for, so
the random effects model assumes that studies are not the same. The pooled mean change and confidence interval of BMD after exposure to testosterone versus placebo represents the treatment effect. The pooled effect sizes of testosterone treatment on BMD of the lumbar spine and femoral neck were compiled, summarized and then presented using a forest plot. Bone mineral density of the lumbar spine (any combination of L1-L4 vertebrae) and femoral neck were chosen as the outcomes because they were the most commonly measured outcomes in the chosen trials.
3.4.1.1 Estimating standard deviations of mean change from baseline
Only one study47 reported the standard deviation (SD) of the change from baseline. For the other trials, the SD of the mean change was estimated based on the guidelines in the Cochrane Handbook for Systematic Reviews of Interventions48. One trial40 only reported the baseline and final means with 95% confidence intervals. The standard deviation was calculated by first calculating the t-value in Microsoft Excel for the exposure and control group using the “tinv” function and multiplying by two (figure 1). The t-value was used instead of the standard value of 3.92 (obtained by multiplying 1.96 by 2) because the trial had a small population. “N” represents the number of participants in the group of interest. The resulting t-value was then used to find the SD (figure 2).
$$t\text{-value}_{E/C} = 2 \times \operatorname{tinv}(1 - 0.95,\; N - 1)$$
Figure 1 - t-value calculation for exposure (E) and control (C) groups

$$SD = \sqrt{N} \times \frac{\text{upper CI limit} - \text{lower CI limit}}{t\text{-value}_{E/C}}$$
Figure 2 - standard deviation calculation from 95% confidence interval

Where standard error of the mean (SEM) was given, SD was calculated as follows, with “N” being the number of participants:
$$SD = SEM \times \sqrt{N}$$
Figure 3 - standard deviation calculation from standard error of the mean
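The two conversions above follow the Cochrane Handbook: SD = √N × (upper limit − lower limit) / (2 × t) for a confidence interval, and SD = SEM × √N for a standard error. A minimal Python sketch is given below for illustration. The function names are assumptions introduced here; the critical value `t_crit` is the one-sided multiplier that is doubled in figure 1 (for example, the value returned by Excel's TINV for a two-tailed probability of 0.05), and the normal quantile is used as a large-sample fallback only.

```python
import math
from statistics import NormalDist
from typing import Optional


def sd_from_sem(sem: float, n: int) -> float:
    """SD recovered from the standard error of the mean: SD = SEM * sqrt(N)."""
    return sem * math.sqrt(n)


def sd_from_ci(lower: float, upper: float, n: int,
               t_crit: Optional[float] = None) -> float:
    """SD recovered from a 95% confidence interval of a group mean.

    t_crit is the two-tailed 5% critical value of the t distribution with
    N - 1 degrees of freedom, supplied by the caller. If omitted, the normal
    quantile (about 1.96) is used, which assumes a large sample.
    """
    if t_crit is None:
        t_crit = NormalDist().inv_cdf(0.975)
    # The interval width equals 2 * t_crit * SD / sqrt(N).
    return math.sqrt(n) * (upper - lower) / (2.0 * t_crit)
```

For a small trial the caller would pass the t-based critical value explicitly, which yields a smaller SD than the 1.96 approximation for the same interval.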

The correlation coefficient is usually not reported in trials. It is a value that describes the similarity between baseline and final outcomes across participants. Knowing the correlation coefficient is essential to calculating the change-from-baseline standard deviation. The exposure and control correlation coefficients for the lumbar spine and femoral neck measurements were determined separately using the change-from-baseline standard deviation values reported in Brockenbrough et al.’s trial (figures 4 and 5). $Corr_{E/C}$ is the correlation coefficient for the exposure and control that was generally applied to all trials in this meta-analysis. $SD_{E,baseline}$ is the baseline standard deviation of the mean BMD of the group exposed to testosterone reported in the Brockenbrough et al. trial. $SD_{E,final}$ is the standard deviation of the mean BMD measured at the end of the trial. $SD_{E,change}$ is the standard deviation of the mean BMD change-from-baseline. The same process was then used to determine the correlation coefficient for the control group.
$$Corr_E = \frac{SD_{E,baseline}^2 + SD_{E,final}^2 - SD_{E,change}^2}{2 \times SD_{E,baseline} \times SD_{E,final}}$$
Figure 4 - Correlation coefficient of exposure (E)

$$Corr_C = \frac{SD_{C,baseline}^2 + SD_{C,final}^2 - SD_{C,change}^2}{2 \times SD_{C,baseline} \times SD_{C,final}}$$
Figure 5 - Correlation coefficient of control (C)

The correlation value used for calculating change-from-baseline standard deviation was the average of the correlation coefficients of the exposure and control (i.e. $Corr_{E/C} = (Corr_E + Corr_C)/2$).

Once the coefficient was calculated, the following equations were used to estimate the standard deviation of the change from baseline for both the exposure and control groups where it was missing in the other trials (figures 6 and 7). The change-from-baseline SD was calculated for exposure and control groups separately. From each trial, the SD of mean BMD was extracted at baseline and end of trial.
$$SD_{E,change} = \sqrt{SD_{E,baseline}^2 + SD_{E,final}^2 - 2 \times Corr_{E/C} \times SD_{E,baseline} \times SD_{E,final}}$$
Figure 6 - Standard deviation of the mean change-from-baseline for exposure (E)

$$SD_{C,change} = \sqrt{SD_{C,baseline}^2 + SD_{C,final}^2 - 2 \times Corr_{E/C} \times SD_{C,baseline} \times SD_{C,final}}$$
Figure 7 - Standard deviation of the mean change-from-baseline for control (C)
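The imputation steps above follow the Cochrane Handbook approach: a correlation coefficient is derived from a trial that reports all three standard deviations, Corr = (SD_baseline² + SD_final² − SD_change²) / (2 × SD_baseline × SD_final), and is then used to impute SD_change = √(SD_baseline² + SD_final² − 2 × Corr × SD_baseline × SD_final) elsewhere. The Python sketch below is illustrative only; the function names and the numbers in the usage note are assumptions, not values from the included trials.

```python
import math


def correlation_from_sds(sd_baseline: float, sd_final: float,
                         sd_change: float) -> float:
    """Correlation between baseline and final values, derived from a trial
    that reports the baseline, final, and change-from-baseline SDs."""
    return (sd_baseline ** 2 + sd_final ** 2 - sd_change ** 2) / (
        2.0 * sd_baseline * sd_final
    )


def average_correlation(corr_exposure: float, corr_control: float) -> float:
    """Average of the exposure and control correlation coefficients."""
    return (corr_exposure + corr_control) / 2.0


def impute_sd_change(sd_baseline: float, sd_final: float, corr: float) -> float:
    """Impute the change-from-baseline SD for a trial that reports only the
    baseline and final SDs, using a correlation borrowed from another trial."""
    variance = (sd_baseline ** 2 + sd_final ** 2
                - 2.0 * corr * sd_baseline * sd_final)
    return math.sqrt(variance)
```

As a consistency check, feeding a trial's own derived correlation back into `impute_sd_change` returns the change SD that trial reported. Made-up SDs of 0.15, 0.16 and 0.05 g/cm3, for example, give a correlation of about 0.95 and an imputed change SD of about 0.05.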

This process was repeated for all trials missing standard deviations of change-from-baseline, for the femoral neck and lumbar spine BMD separately, using their respective correlation coefficients.
3.4.2 Assessment of publication bias
Publication bias is a threat to the validity of systematic reviews and meta-analyses. It is a bias that occurs when a trial is or is not published based on its results49. The term can also be applied when trials are published multiple times or selectively once because of non-significant results in the other studies. Checking trial registers for unpublished studies can assist in finding
the former type of publication bias for trials started since trial registration became mandatory. Funnel plots are the most widely used method to detect overall publication bias. The Cochrane Handbook recommends using a funnel plot to test for publication bias only if there are more than 10 trials. Proper power to discriminate between actual asymmetry and chance is only possible in larger meta-analyses. There were 14 trials that studied change in the lumbar spine, and 12 that studied change in the femoral neck. Funnel plots were made for both the lumbar spine and femoral neck mean change to look for publication bias. Because the outcome is continuous, Egger’s test50 was used to examine possible asymmetry. Evidence of asymmetry was based on a p-value <0.1, because determining asymmetry based on the common p-value threshold of 0.05 is deemed not appropriate.
3.4.3 Assessment of heterogeneity
Heterogeneity is to be expected in meta-analyses because multiple studies conducted under different circumstances are being assessed51. Knowledge of the amount and source of heterogeneity is important, however, when addressing the generalizability of a meta-analysis51. Cochran’s Q is a test statistic often used to assess the level of heterogeneity, but it is not very useful for determining whether heterogeneity is statistically significant when few studies are included, although it has high power when many large studies are included51. An alternate approach (I2) more applicable to smaller studies was developed by Higgins et al., using the following formula (“Q” is Cochran’s Q and “df” is degrees of freedom):
$$I^2 = 100\% \times \frac{Q - df}{Q}$$
Figure 8 - Calculation of I2
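For illustration, Cochran's Q and the I2 statistic of Higgins et al., I2 = 100% × (Q − df) / Q, can be computed from per-trial effect estimates and their variances as sketched below. The function names are assumptions introduced here, and negative values of I2 are set to zero, which is the usual convention rather than something stated in the text.

```python
from typing import Sequence


def cochran_q(effects: Sequence[float], variances: Sequence[float]) -> float:
    """Cochran's Q: weighted sum of squared deviations of each trial's
    effect from the inverse-variance weighted mean effect."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    return sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))


def i_squared(q: float, df: int) -> float:
    """I2 as a percentage; values below zero are truncated to zero."""
    if q <= 0.0:
        return 0.0
    return max(0.0, 100.0 * (q - df) / q)


# Example with two hypothetical trials of equal precision.
q = cochran_q([0.0, 4.0], [1.0, 1.0])
print(q, i_squared(q, df=1))
```

Here df is the number of trials minus one, so a Q close to df indicates that the observed variation is about what chance alone would produce.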

I2 is presented as a percentage ranging from 0-100%. The statistic gives an idea of how much of the variation across studies is due to heterogeneity rather than chance. Low, medium, and high I2 values of heterogeneity are defined as 25%, 50%, and 75% respectively51. Interpretation of these values depends on the study, so there is no general consensus on what is an acceptable level of heterogeneity based on the I2 statistic. I2 was the chosen method to assess heterogeneity for this analysis.
3.4.3.1 Sensitivity analysis
At the beginning of this study, it was assumed that some bias may arise from the quality of trials. A group based on Jadad score was made up of studies that received a score of four or higher to see if any heterogeneity was attributable to bias within studies due to poor study design
and/or methods. A meta-analysis of these studies was done to estimate the pooled effect of testosterone on the mean change in lumbar spine and femoral neck BMD. Two trials by Frederiksen et al.52 and Christmas et al.53 did not publish outcome data for BMD. Both trials cited the lack of significant findings as their reason for not giving the results. Without including the data from these trials in this meta-analysis, the effect of testosterone on BMD of the lumbar spine and femoral neck may be overestimated. Sensitivity analysis was conducted by including these two trials in the analysis with an effect size of 0. Frederiksen et al. reported measuring hip, whole body, and spine BMD. It is assumed that the spine measurements were of the lumbar spine, but without further information, it cannot be assumed which bones were measured in the hip. Christmas et al. measured both the lumbar spine and femoral neck. Standard deviations for these measurements were chosen arbitrarily from similar trials.
3.4.3.2 Meta-regression
To investigate further reasons for heterogeneity in this meta-analysis, it was hypothesized a priori that dose, duration, glucocorticoid use, baseline serum testosterone levels, and method of testosterone administration could all influence the level of heterogeneity, based on the current evidence outlined in the introduction. Meta-regression was used to analyze these covariates. As a combination of meta-analysis and linear regression techniques, meta-regression investigates whether there is a linear association between the covariates and the effect estimate. This method gives more robust answers than sub-group analysis alone, but does not allow causal conclusions54. Because it is essentially an observational analysis, a random effects model was used for the meta-regression. Expert consensus suggests using one covariate for every 10 trials in order to address power concerns54. Six different models were used in this analysis.
One included all covariates (dose, method, duration, glucocorticoid use, and androgen deficiency), the others examined each covariate separately. The outcome measure was the mean difference in BMD (lumbar spine and femoral neck were examined separately), obtained by subtracting the final mean change in the placebo group from the mean change in the testosterone group for each trial. The I2 and R2 for the fitted models were calculated in order to assess the remaining and explained heterogeneity respectively.
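As an illustration of how the per-trial outcome measure and the random effects pooling described in section 3.4.1 fit together, the sketch below computes a mean difference in change with its variance and pools trials with the DerSimonian-Laird estimator. It is not the code used for this analysis, which relied on the “meta” and “metafor” packages in R; the function names, the dictionary keys, and the 1.96 multiplier for the 95% confidence interval are assumptions made for the example.

```python
import math
from typing import Dict, Sequence, Tuple


def mean_difference(change_t: float, sd_t: float, n_t: int,
                    change_p: float, sd_p: float, n_p: int) -> Tuple[float, float]:
    """Mean change in the testosterone group minus mean change in the
    placebo group, with the variance of that difference."""
    md = change_t - change_p
    var = sd_t ** 2 / n_t + sd_p ** 2 / n_p
    return md, var


def dersimonian_laird(effects: Sequence[float],
                      variances: Sequence[float],
                      z: float = 1.96) -> Dict[str, object]:
    """Random effects pooled estimate using the DerSimonian-Laird tau^2."""
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two trials with matching variances")
    w = [1.0 / v for v in variances]                 # inverse variance weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0  # between-trial variance
    w_star = [1.0 / (v + tau2) for v in variances]   # random effects weights
    sws = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sws
    se = math.sqrt(1.0 / sws)
    return {"pooled": pooled, "se": se, "tau2": tau2, "q": q,
            "ci": (pooled - z * se, pooled + z * se)}
```

When tau^2 is estimated as zero, the random effects weights reduce to the inverse variance weights and the result matches a fixed effect analysis.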

To examine whether the dose of testosterone administered affected the heterogeneity, daily dose was included as a covariate. When trials administered testosterone doses per week or month, the dosage was divided by 7 or 28, respectively, to get a standardized daily dose. Analysis on duration of studies was conducted on trials that lasted for a year or less, to determine if studies with a particularly long duration affected the heterogeneity among trials. Trials were separated into two groups, with trials that lasted less than a year as the reference group. Trials that included participants using glucocorticoids were considered to be contributors to heterogeneity because corticosteroid use is associated with decreased bone mass and fractures7. Trials that included participants who were not taking glucocorticoids were used as the reference group. Total serum testosterone at baseline was also considered in two groups. One group comprised trials studying men with androgen deficiency, defined as being below the normal range for healthy men. The other group included trials with participants above these levels on average at baseline. This cut-off was chosen based on the commonly used diagnostic serum testosterone level for hypogonadism and/or symptomatic diagnosis based on ADAM scores. The reference group was trials that included eugonadal men. Trials were also split into three groups based on method of testosterone administration: one that received testosterone orally, a group that was administered testosterone transdermally, and a final group of participants that received testosterone intramuscularly.
3.5 Ethical Approval
Analysis of the data in this study does not require ethics committee approval, because all of the data has been previously published.

4 Results
4.1 Trial selection
4.1.1 Elimination
The search of PubMed yielded 390 studies. Of these, 343 articles were discarded based on irrelevant title. Of the remaining 47 studies, one article was excluded because it was written in Polish, one study was a preliminary study design report, five studies were case-control and eight were cohort studies with no control. One trial studied participants randomized to receiving either testosterone alone or testosterone and growth hormone (GH) combined, and so did not give the effect of testosterone. Two trials randomized participants to different testosterone treatment doses or methods, but included no placebo group. One study investigated non-randomized treatment options for osteoporosis, and did not include a placebo group. Another study randomly assigned participants to different methods and doses of testosterone, but also did not include a placebo. One study included three treatment groups (GnRH analog, goserelin plus testosterone, or goserelin plus testosterone plus an aromatase inhibitor), and no placebo. In another study, participants were randomized to receive testosterone, recombinant growth hormone, or combined hormones. One study compared the effect of testosterone and estrogen, with no placebo. Five studies did not measure bone mineral density. One trial did not provide enough outcome data to accurately estimate the measurements of BMD for the testosterone and placebo groups. One study did not give testosterone as treatment. One article only studied women. For these reasons, another 33 were excluded. Two trials analyzed BMD at baseline for a testosterone and placebo group, but did not report their BMD outcome data due to non-significant results. The main goal of the Frederiksen et al. trial was to study the effect of testosterone on osteoprotegerin, as a biomarker for CV risk, in androgen-deficient men. BMD was measured as a secondary outcome. Christmas et al.
studied the effects of growth hormone, sex steroid, growth hormone plus sex steroids, or placebo on bone health in elderly men and women. The effect of hormone replacement therapy on BMD of the female subjects was reported as beneficial in the distal radius, lumbar spine and femoral neck. The results of the BMD outcome in men were not reported because testosterone showed no beneficial effect. These trials were included in a sensitivity analysis with an effect size of zero. Overall 14 trials were included. The trials selected did not encompass all in the previous meta-analyses for the following reasons. 21

One trial included in Tracz et al.’s previous meta-analysis, in which Fairfield et al. studied young eugonadal men with AIDS wasting syndrome, was excluded because it did not fit the research question of this meta-analysis. That trial also lacked adequate reporting of baseline and outcome measures of BMD. Six new trials have been added in this paper, contributing a total of 504 new participants to the analysis. The previous meta-analysis by Tracz et al. included 365 participants.

4.1.2 Flowchart

Figure 9 - Data selection process

4.2 Characteristics of the included trials

4.2.1 Publication year

Articles included in the meta-analysis were published between 1996 and 2014. Three of these trials were published before 2000 (two in 1996 [55,56] and one in 1999 [57]). Two were published in 2001 [58,59], one in 2003 [60], one in 2004 [40], two in 2006 [47,61], two in 2008 [31,62], one in 2010 [63], one in 2013 [36], and the last in 2014 [39].


4.2.2 Setting

The majority of the trials were conducted in the US [39,40,47,57,58,63] or Europe [36,55,59,61,62]. The trial done by Bouloux et al. spanned seven European countries. One trial was done in Mexico [31], one in Australia [60], and one in New Zealand [56].

4.2.3 Duration

Most of the trials [36,56,58,59,60,61,62,63] investigated the effect of testosterone over a one-year period. One study [47] lasted six months, and one [55] was carried out over nine months. Two studies [40,57] ran for three years.

4.2.4 Age of participants

The mean ages of the studied participants ranged from 40 to 76 years. Howell et al. studied the youngest group (mean age 40.9; all participants were younger than 55), and the 2001 trial by Kenny et al. studied the oldest group (mean age 75.5; all participants were older than 65).

4.2.5 Intervention group

A total of 551 participants were randomly assigned to receive some form of testosterone treatment across all the trials. Four trials [36,55,60,61] did not specify their method of randomization. The method of blinding was unknown in four studies [36,39,60,61], and one study [59] was single-blinded. Crawford et al., Borst et al. and Amory et al. included additional treatment groups. Crawford et al. assessed the importance of aromatization in androgen therapy by including a group taking only nandrolone, a testosterone analog that is minimally aromatizable. Nandrolone had no beneficial effect on BMD at any of the measured sites (femoral neck, lumbar spine and total body), suggesting that aromatization of testosterone may be important for improving bone health. Along with testosterone alone, Borst et al. assessed the combined effect of testosterone and finasteride, and finasteride alone. Amory et al. also included a group that received finasteride combined with testosterone.
Addition of finasteride did not change the BMD outcomes in either study, but finasteride was found to prevent an increase in prostate volume when compared with the testosterone-only groups.

4.2.6 Control group

A total of 426 individuals were randomly assigned to the control group. In all of the trials except one [56], the control group was given a placebo that matched the testosterone treatment in appearance. Reid et al. did not administer sham injections to the control group because of the associated pain and discomfort.

4.2.7 Testosterone agent, type and dosage

TRT was given intramuscularly by injection, orally as pills, or transdermally with patches or gel. Four trials administered intramuscular injections of TE in doses of 125 mg per week [39], 200 mg every two weeks [47], or 250 mg per month [31,55]. One study injected mixed esters at a dose of 200 mg every two weeks [60]. Two trials studied the effect of oral TU dosed at 80 mg per day [36,62], 160 mg per day [36], or 240 mg per day [36]. The maximum-dose treatment group (240 mg/day) from Bouloux et al.’s trial was used in the statistical analysis. Four trials examined the effect of transdermal patches delivering a dose of 2.5 mg per day [59], 5 mg per day [58,59,61], or 6 mg per day [57]. Two trials used testosterone gel dosed at 5 mg per day [63] or 100 mg per day [47].

4.2.8 Method of BMD measurement

All 14 trials used DXA to measure BMD. Every study measured BMD of at least one vertebra of the lumbar spine. Twelve trials measured BMD of the femoral neck [31,36,40,47,55,56,57,58,59,60,61,63]. Six trials [31,36,40,59,60,63] measured all lumbar vertebrae (L1-L4). Two trials [39,57] measured BMD of L2-L4. Kenny et al. (2001) measured only the L1 vertebra. The rest of the trials [55,61,62] did not specify which vertebrae they measured.

4.2.9 Health status

Participants in nine trials [31,36,40,47,56,58,61,62,63] were diagnosed with testosterone deficiency, usually defined as serum testosterone lower than 300 ng/dl (10.4 nmol/L) or a high score on the ADAM survey. Some trials [40,57,62] used a slightly higher cut-off for serum testosterone. Three trials [55,56,60] included participants on long-term glucocorticoid therapy. Brockenbrough et al. specifically studied the effect of testosterone in dialysis patients with chronic renal disease. The participants in Howell et al.’s trial all had mild Leydig cell dysfunction. All of the participants in Hall et al.’s trial had rheumatoid arthritis; the trial also included eight participants who had previously had at least one vertebral fracture.
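The two cut-off values quoted in section 4.2.9 (300 ng/dl and 10.4 nmol/L) are the same threshold expressed in different units. As a quick check, here is a minimal Python sketch of the conversion. It assumes a molar mass of about 288.4 g/mol for testosterone, a value that is not stated in this text, and the function name is illustrative.

```python
# Approximate molar mass of testosterone (C19H28O2) in g/mol.
# Assumption introduced for illustration; it is not given in the text.
TESTOSTERONE_MOLAR_MASS = 288.4

def ng_per_dl_to_nmol_per_l(ng_per_dl: float) -> float:
    """Convert a serum testosterone concentration from ng/dl to nmol/L."""
    ng_per_l = ng_per_dl * 10.0                 # 1 L = 10 dl
    return ng_per_l / TESTOSTERONE_MOLAR_MASS   # ng / (g/mol) = nmol

print(round(ng_per_dl_to_nmol_per_l(300), 1))  # 10.4
```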

4.2.10 Loss to follow-up

The average loss to follow-up across all studies was 23.5%. Eight trials had loss to follow-up rates above 20%. The trials conducted by Kenny et al. in 2010 and Brockenbrough et al. lost 53% and 45% of participants, respectively, at follow-up. Kenny et al. did not attribute the high loss to follow-up to adverse effects, citing the similar rates and reasons in the placebo and testosterone groups.

4.2.11 Funding and affiliations

All trials except two [40,62] reported funding sources and/or conflicts of interest. Much of the funding was received from research institutes, government agencies, or universities. Seven trials [36,39,47,55,59,60,61] reported at least partial funding from pharmaceutical companies.

4.2.12 Registration

Only four trials [36,39,62,63] reported the trial registration number in the article. Of the nine trials that did not report trial registration, three [31,47,61] were published after the year registration became mandatory for publication (2005).

4.3 Data extraction and quality assessment

4.3.1 Missing values calculation

In the trial conducted by Crawford et al., the SEM was given only for the baseline. The SEM was converted to an SD using the equation in figure 3. Amory et al. gave the baseline and final measurements with 95% confidence intervals. Using the equations in figures 1 and 2, SDs were calculated for the baseline and final means. These estimated SDs were then used as the values in the formulas shown in figures 6 and 7. The t-values for the exposure and control groups were estimated to be 4.24 and 4.22, respectively. The correlation coefficients for the lumbar spine results were calculated for the exposure (0.97) and control (0.94) groups using the fully reported results from the Brockenbrough et al. study as the values SD_{E/C,baseline} and SD_{E/C,final}. The mean of these two values (0.955) was chosen as the coefficient. This coefficient was then used to calculate the SD of the mean change for 9 studies by inserting the baseline and final SDs into the formulas in figures 6 and 7 for estimating the SD of change from baseline. The same process was used for the missing change-from-baseline SDs of the femoral neck measurements, for which the averaged correlation coefficient calculated from the Brockenbrough et al. study was 0.98.
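The imputation steps described in section 4.3.1 can be sketched in Python. The equations in figures 1 to 7 are not reproduced in this section, so the sketch assumes they are the standard formulas for converting a standard error or a confidence interval to a standard deviation, and for imputing the SD of a change from baseline from a correlation coefficient. The function names and the numbers in the example are illustrative and are not trial data; only the correlation of 0.955 comes from the text.

```python
import math

def sd_from_sem(sem: float, n: int) -> float:
    """SD from the standard error of the mean: SD = SEM * sqrt(n)."""
    return sem * math.sqrt(n)

def sd_from_ci(lower: float, upper: float, n: int, divisor: float = 3.92) -> float:
    """SD from a confidence interval for a mean.

    divisor is the CI width in standard-error units: 3.92 (2 * 1.96) for a
    large-sample 95% CI; a t-based value replaces it for small samples.
    """
    return math.sqrt(n) * (upper - lower) / divisor

def correlation_from_sds(sd_base: float, sd_final: float, sd_change: float) -> float:
    """Baseline-to-final correlation implied by a trial that reports all three SDs."""
    return (sd_base**2 + sd_final**2 - sd_change**2) / (2 * sd_base * sd_final)

def impute_sd_change(sd_base: float, sd_final: float, corr: float) -> float:
    """Imputed SD of the change from baseline for a trial that does not report it."""
    return math.sqrt(sd_base**2 + sd_final**2 - 2 * corr * sd_base * sd_final)

# Illustrative baseline and final SDs (hypothetical, not from any included trial).
corr = 0.955  # averaged lumbar spine coefficient reported in the text
print(impute_sd_change(0.15, 0.16, corr))
```

Because the correlation is close to 1, the imputed SD of the change is much smaller than either the baseline or the final SD, which is why the choice of coefficient matters for the weight each trial receives.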

Borst et al., Howell et al., Crawford et al., and Bouloux et al. were all missing outcome data for the final SD. Studies with similar effect sizes and populations were used for proxy values. For Bouloux et al., the SD from Snyder et al. was adopted. Howell et al. and Crawford et al. were given the SDs from the Hall et al. study.

4.3.2 Jadad appraisal

Five studies [36,56,59,60,61] received a score of three or lower. The majority of these studies were given a score of three simply because they did not report their method of randomization or blinding. This does not, however, indicate that their methods were inadequate, so they were included in the study. Howell et al. received a score of three because their trial was only single-blinded: the patients received a matching placebo, but the staff knew who would receive each patch. The authors did report an adequate method of randomization and the fate of the withdrawals, and the dropout rate was low (6%). Despite the single-blind design, the quality of the remainder of the trial was judged adequate for inclusion in this meta-analysis. Crawford et al. received a low score of two because they failed to report their method of randomization, their method of double-blinding, and the reasons for participant withdrawal. The trial did at least state the method of single-blinding (matching placebo injection), but did not report whether the staff or physicians administering the injections were blinded. The authors gave the number of withdrawals but not the reasons, leaving the reader unsure whether adverse effects were the cause. The methods of blinding and randomization were assumed to be adequate for inclusion in this meta-analysis, but the trial was deemed a poor-quality study in the subgroup analysis. The crossover trial conducted by Reid et al. was examined more closely because of its low score of two. Because of the crossover design, the control group and the testosterone group were the same participants, and the participants were told when they were receiving testosterone. Placebo injections were not used because of the associated pain. The testosterone treatment also took place in one half of the group before they became the control, so it is difficult to determine whether there were any lingering effects in the control phase attributable to the previous treatment. The randomization method was also not mentioned. This trial was included in the overall analysis, but removed in the sensitivity analysis of high-quality trials.


4.4 Data synthesis and analysis

4.4.1 Total pooled effect of testosterone on BMD of the lumbar spine

The forest plot for the effect of testosterone treatment on lumbar spine BMD for all trials is shown below (Figure 10).

Figure 10 - Pooled effect of testosterone on lumbar spine BMD

Baseline and outcome BMD data for the testosterone and control groups are provided on the left-hand side of Figure 10. The right-hand side gives the mean difference, 95% confidence interval, and weight for each trial. The squares indicate the mean difference, and the black lines show the corresponding confidence interval. The solid line at zero marks where testosterone has no effect. The diamond and dotted line show the pooled effect size. A square to the left of the solid line [58,59,62] indicates that the placebo had a more beneficial effect than testosterone; a square to the right [31,36,39,40,47,55,56,57,60,63] indicates the opposite. A black line that crosses the solid center line at zero indicates a non-significant effect of testosterone treatment on BMD. The pooled mean difference in the random-effects model for lumbar spine BMD was 0.023 (95% CI 0.009 to 0.037). This indicates that testosterone is associated with increased BMD when compared with placebo. The heterogeneity was high (I² = 67.9%) and the chi-squared test was significant, indicating substantial differences between studies that are not due to chance.
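To make the pooled estimate and the I² statistic concrete, the following Python sketch pools mean differences with an inverse-variance random-effects model. The text does not say which between-study variance estimator was used; the DerSimonian-Laird estimator here is an assumption, and the three input trials are invented for illustration rather than taken from the forest plot.

```python
import math

def random_effects_pool(effects, std_errors):
    """Pool mean differences with a DerSimonian-Laird random-effects model.

    Returns (pooled effect, 95% CI lower, 95% CI upper, I^2 in percent).
    """
    k = len(effects)
    w = [1.0 / se**2 for se in std_errors]                 # fixed-effect weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)                          # between-study variance
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0    # share of variation beyond chance
    w_star = [1.0 / (se**2 + tau2) for se in std_errors]   # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se_pooled = math.sqrt(1.0 / sum(w_star))
    return pooled, pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled, i2

# Hypothetical mean differences in BMD and their standard errors (not trial data).
pooled, lo, hi, i2 = random_effects_pool([0.04, 0.00, 0.02], [0.01, 0.01, 0.01])
print(f"pooled = {pooled:.3f}, 95% CI {lo:.3f} to {hi:.3f}, I2 = {i2:.0f}%")
```

In this toy example the between-study variance widens the confidence interval until it crosses zero, which is the same pattern a forest plot shows when the pooled diamond overlaps the line of no effect.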


4.4.2 Total pooled effect of testosterone on BMD of the femoral neck

In the same manner as for the lumbar spine measurements, the pooled effect of testosterone treatment on femoral neck BMD was compiled in a forest plot (figure 11). Two mean difference squares [55,59] lie to the left of the center zero line, three lie directly on the line [31,61,63], and the rest [36,40,47,56,57,58,60] favor testosterone on the right.

Figure 11 - Pooled effect of testosterone on femoral neck BMD

The pooled mean difference for the effect of testosterone was 0.006 (95% CI -0.001 to 0.013). Because the confidence interval includes zero, the pooled effect on femoral neck BMD was not statistically significant. The heterogeneity for measures of the femoral neck was also high, at 56.5%.

4.4.3 Funnel Plots

The funnel plots showed little evidence of publication bias for trials studying the effect of testosterone on BMD of the lumbar spine or femoral neck, although both plots (figure 12) showed some possible asymmetry.

Figure 12 - Funnel plot of the lumbar spine and femoral neck BMD measurements


Egger’s test was applied to test whether any asymmetry was indeed present. Both groups had a non-significant result, with p-values greater than 0.1 (0.22 for lumbar, 0.94 for femoral), indicating no statistically detectable asymmetry among trials. This does not, however, definitively indicate that no publication bias is present. Visually, there appears to be a lack of small trials showing a negative effect of testosterone on BMD of the lumbar spine.

4.4.4 Sensitivity analysis

4.4.4.1 Low quality trials

A sensitivity analysis excluding lower-quality studies was conducted in order to investigate the source of the heterogeneity. Studies with a Jadad score below 4 were omitted, and the analysis was then run as before (figures 13 and 14).

Figure 13 - Sensitivity analysis for pooled testosterone effect on lumbar spine BMD

Figure 14 - Sensitivity analysis for pooled testosterone effect on femoral neck BMD

This lowered the I² of the lumbar measurements to 58.8% (with a significant chi-squared p-value) and of the femoral neck measurements to 48.2% (non-significant chi-squared p-value). With the low Jadad-scored trials omitted, testosterone had a small pooled effect on lumbar spine BMD, but no effect on femoral BMD when compared with placebo. This analysis assumes that the Jadad score for each trial was accurate, but trials that did not report their randomization or blinding methods are difficult to score accurately.

4.4.4.2 Articles that excluded data

For the two trials [52,53] that measured BMD but did not report results, a sensitivity analysis was done including these trials with an effect size of zero, because both trials reported non-significant results. The standard deviations were estimated from other trials with similar characteristics. If the effect of testosterone was close to zero in both trials, the effect of testosterone in the original analysis would be overestimated, and this sensitivity analysis confirms that.

Figure 15 - Sensitivity analysis on trials that did not report lumbar spine BMD outcome

Adding Frederiksen et al. and Christmas et al. with an effect size of zero decreased the pooled effect of testosterone on lumbar spine BMD by 0.003 g/cm3. Heterogeneity remained equally high.


Figure 16 - Sensitivity analysis on trials that did not report femoral neck BMD outcome

Frederiksen et al. did not report measuring femoral neck BMD, so only Christmas et al. was included in this part of the sensitivity analysis. The pooled effect of testosterone on femoral neck BMD decreased by 0.001 g/cm3. The I² remained large.

4.4.5 Meta-regression

4.4.5.2 Random-effects model with one covariate

Because only 14 trials were included in this meta-analysis, meta-regression models with one covariate have the best power to detect true associations. A model with more covariates would yield more false positives [54]. Each covariate (dose, duration, testosterone administration method, androgen deficiency and glucocorticoid use) was examined separately to identify which individual variables had the largest influence on heterogeneity. I² was calculated to assess the heterogeneity not explained by the covariate. R² was used to assess how much of the heterogeneity the covariate explained. The estimate for the covariate and its 95% confidence interval were extracted from the model results. This estimate indicates the effect of the covariate on mean change in BMD relative to the reference group.

4.4.5.2.1 Dose

There was considerable variation in dosage across trials. In many cases, dose appeared to be connected to the chosen method of administration (i.e. oral, patch, injection). Dosage in five trials [31,40,55,56,60] required standardization because testosterone was administered weekly or monthly.
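The dose standardization amounts to dividing the administered dose by the number of days in the dosing interval. A minimal Python sketch follows, using the divisors stated in the methods (7 for weekly, 28 for monthly). The 14-day divisor for fortnightly dosing and the function name are assumptions added for illustration.

```python
# Days per dosing interval. The weekly and monthly divisors follow the methods;
# the fortnightly value is an assumed extension of the same rule.
DAYS_PER_INTERVAL = {"day": 1, "week": 7, "two_weeks": 14, "month": 28}

def daily_dose_mg(dose_mg: float, interval: str) -> float:
    """Return the standardized daily dose in mg for a dose given once per interval."""
    return dose_mg / DAYS_PER_INTERVAL[interval]

# Intramuscular regimens listed in section 4.2.7
for dose, interval in [(125, "week"), (200, "two_weeks"), (250, "month")]:
    print(f"{dose} mg per {interval}: {daily_dose_mg(dose, interval):.2f} mg/day")
```

On this scale the injected regimens work out to roughly 9 to 18 mg/day, compared with 80 to 240 mg/day for the oral regimens, which illustrates how closely dose is tied to the route of administration.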

The estimate for daily dose was 0 (95% CI 0 to 0, I² = 69%, R² = 0%) for the lumbar model and 0 (95% CI 0 to 0, I² = 59%, R² = 0%) for the femoral neck model. Dosage appears not to explain any of the heterogeneity. A larger daily dose does not seem to be associated with an increase in BMD. This runs contrary to expectations if testosterone is assumed to have an effect on BMD. These results indicate a lack of a dose-response relationship between testosterone and BMD of both the lumbar spine and femoral neck.

4.4.5.2.2 Duration

The trials conducted by Amory et al. and Snyder et al. lasted longer than one year. These two trials were compared with the other 12 trials, which lasted for a year or less. The estimate was 0.029 (95% CI -0.006 to 0.064, I² = 62%, R² = 2%) in the lumbar model and 0.011 (95% CI -0.002 to 0.023, I² = 38%, R² = 39%) in the femoral model. From these estimates, it appears that a duration of more than one year has a small influence on mean change in BMD of both the lumbar spine and femoral neck, although both confidence intervals include zero. The R² indicates that duration explains some heterogeneity in both sets of measurements, much more so in the femoral neck. Because only two trials studied the effect of testosterone for more than one year, the power to explain heterogeneity or find a robust estimate is low.

4.4.5.2.3 Testosterone administration method

Two trials [36,62] studied the effect of oral testosterone, six [47,57,58,59,61,63] used a transdermal (patch or gel) method, and the final six [31,39,40,55,56,60] used intramuscular injections. Transdermal testosterone was used as the reference group. Oral compared with transdermal had an estimate of -0.011 (95% CI -0.040 to 0.018, I² = 46%, R² = 54%) in the lumbar model and -0.003 (95% CI -0.027 to 0.021, I² = 63%, R² = 0%) in the femoral model. When comparing intramuscular injections with transdermal application, the estimate was 0.032 (95% CI 0.006 to 0.058) in the lumbar model and 0.001 (95% CI -0.016 to 0.017) in the femoral neck model. Intramuscular injections are associated with a higher mean lumbar spine BMD at the end of the trial in this univariate model. Method of testosterone administration also seems to explain much of the heterogeneity for lumbar spine BMD, but none for the femoral neck.

4.4.5.2.4 Serum testosterone levels at baseline

Nine trials [31,36,39,40,47,58,61,62,63] studied the change in lumbar BMD in participants that had clinically low total serum (≤300 ng/dl) or bioavailable (
