Submitted in Partial Fulfillment of the Requirements for the Degree of Master of Science in Mathematics with Specialization in Statistics and Operations Research

New Mexico Institute of Mining and Technology Socorro, New Mexico December, 2013

To my mother Anne, my father Mark, my sister Emily, Aunt Pat, Aunt Mary, Uncle Tony, Mikenzie, and Jaislinn. Your constant love, continuous support, and astounding belief in me make everything possible.

Margaret A. Snell New Mexico Institute of Mining and Technology December, 2013

ABSTRACT

Modern advances in industrial and agricultural chemistry have resulted in increasing numbers of novel chemical compounds in the environment, compounds whose full effects on human health are yet to be determined. Environmental toxins have been implicated in the development of obesity, as well as other chronic diseases such as diabetes and cancer. However, the exact relationships between these toxins and chronic illness are still largely unknown. In addition, with weight loss, persistent pollutants stored in adipose tissue may be released into body fluids and lead to health problems. In this study, National Health and Nutrition Examination Survey (NHANES) data collected by the CDC from 2001-2004 and 2007-2010 were analyzed using canonical correlation analysis, factor analysis, and principal component logistic regression. Relationships between toxins and obesity, diabetes, and cancer were examined, as well as associations between weight loss and persistent organic pollutant serum concentrations. An unexpected inverse relationship was found between the four parabens involved in the study and both obesity and diabetes; this finding would be of particular interest for future research.

Keywords: obesity; environmental toxins; canonical correlation; factor analysis; principal component analysis logistic regression; NHANES

ACKNOWLEDGMENTS

I would like to thank Dr. Hossain, Dr. Makhnin, and Dr. Rogelj for their direction, continuing support, and extreme patience. Also, I would like to thank Dr. Schaffer for all the help and guidance.

This thesis was typeset with LaTeX¹ by the author.

¹ The LaTeX document preparation system was developed by Leslie Lamport as a special version of Donald Knuth's TeX program for computer typesetting. TeX is a trademark of the American Mathematical Society. The LaTeX macro package for the New Mexico Institute of Mining and Technology thesis format was written for the Tech Computer Center by John W. Shipman.


CONTENTS

LIST OF TABLES
LIST OF FIGURES
1. INTRODUCTION
2. BACKGROUND
    2.1 TOXINS AND OBESITY
    2.2 TOXINS AND CANCER/DIABETES
    2.3 WEIGHT LOSS, TOXIN RELEASE, AND PERSISTENT ORGANIC POLLUTANTS
3. METHODS
    3.1 CANONICAL CORRELATION ANALYSIS
    3.2 INTERBATTERY FACTOR ANALYSIS
    3.3 LINEAR REGRESSION
    3.4 DISTANCE CORRELATION
    3.5 PRINCIPAL COMPONENT LOGISTIC REGRESSION
4. DATA
5. RESULTS
    5.1 ENVIRONMENTAL TOXINS AND OBESITY
        5.1.1 Environmental pesticides
        5.1.2 Environmental phenols
        5.1.3 Polyfluorinated compounds
        5.1.4 Urinary phthalates
        5.1.5 Heavy metals
        5.1.6 Linear regression
    5.2 TOXINS AND HEALTH OUTCOMES
    5.3 WEIGHT LOSS AND PERSISTENT ORGANIC POLLUTANTS
        5.3.1 Linear Regression
6. DISCUSSION
    6.1 PART I
        6.1.1 Environmental pesticides and obesity
        6.1.2 Environmental phenols and obesity
        6.1.3 Polyfluorinated compounds and obesity
        6.1.4 Urinary phthalates and obesity
        6.1.5 Heavy metals and obesity
        6.1.6 Linear regression
    6.2 PART II
        6.2.1 Pesticides, phenols, phthalates, and chronic illness
    6.3 PART III
        6.3.1 Persistent organic pollutants and weight loss
7. CONCLUSION AND FUTURE WORK
A. SAS CODE
    A.1 Canonical correlation analysis with obesity and toxins
    A.2 Canonical correlation analysis with weight loss and POP concentration
B. R CODE
    B.1 Distance correlation
    B.2 PCA and logistic regression
    B.3 Linear regression
    B.4 Correlations between POPs and weight loss/gain
REFERENCES

LIST OF TABLES

4.1 Data from 2007-2010 for obesity and toxin portion of the study
4.2 Data from 2007-2010 for chronic illness and toxin portion of the study
4.3 Data from 2001-2004 for persistent organic pollutants and weight change
5.1 Correlations between obesity indicators and environmental pesticides
5.2 Canonical correlation results for environmental pesticides and obesity indicators
5.3 Rotated Correlation Loadings for Environmental Pesticides
5.4 Canonical Redundancy Analysis for Environmental Pesticides
5.5 Rotated Interbattery Factor Loadings for Environmental Pesticides and Obesity Indicators
5.6 Correlations between obesity indicators and environmental phenols
5.7 Canonical correlation results for environmental phenols and obesity indicators
5.8 Rotated Correlation Loadings for environmental phenols
5.9 Canonical correlation results for environmental pesticides and obesity indicators
5.10 Rotated Interbattery Factor Loadings for environmental phenols and obesity indicators
5.11 Correlations between obesity indicators and polyfluorinated compounds
5.12 Canonical correlation results for polyfluorinated compounds and obesity indicators
5.13 Rotated Correlation Loadings for polyfluorinated compounds
5.14 Canonical redundancy analysis for polyfluorinated compounds
5.15 Rotated Interbattery Factor Loadings for polyfluorinated compounds and body measurements
5.16 Correlations between obesity indicators and urinary phthalates
5.17 Canonical correlation results for urinary phthalates and obesity indicators
5.18 Rotated Correlation Loadings for urinary phthalates
5.19 Canonical Redundancy analysis for urinary phthalates
5.20 Rotated Interbattery Factor Loadings for urinary phthalates and obesity indicators
5.21 Correlations between obesity indicators and heavy metals
5.22 Canonical correlation results for heavy metals and obesity indicators
5.23 Rotated correlation loadings for heavy metals
5.24 Canonical redundancy analysis for heavy metals
5.25 Rotated correlation loadings for heavy metals
5.26 Linear regression results
5.27 Cumulative variance accounted for by principal components
5.28 Principal component loadings
5.29 Correlations between persistent organic pollutants and weight loss/gain
5.30 Correlations between persistent organic pollutants and weight change
5.31 Canonical correlation results for persistent organic pollutants and weight change
5.32 Rotated Correlation Loadings for persistent organic pollutants and weight change
5.33 Canonical redundancy analysis for persistent organic pollutants and weight change
5.34 Linear regression results

LIST OF FIGURES

2.1 The Obesity Paradox
5.1 Correlations between pesticides, phenols, and phthalates
5.2 Variance plot for principal components
5.3 Diabetes logistic regression results, with and without controlling for age
5.4 Cancer logistic regression results, with and without controlling for age

This thesis is accepted on behalf of the faculty of the Institute by the following committee:

Anwar Hossain, Advisor

I release this document to the New Mexico Institute of Mining and Technology.

Margaret A. Snell

Date

CHAPTER 1 INTRODUCTION

Obesity is currently a major health concern in the United States. As of April 2012, 35.7% of the adult population in the United States was considered obese [1]. Based on current trends, it is predicted that 86% of Americans will be overweight and over half will be obese by 2030 [2]. Obesity is a risk factor for a multitude of health problems, including heart disease, high blood pressure, stroke, Type 2 diabetes, cancer, and reproductive problems [3]. Not only does obesity predispose individuals to a multitude of chronic diseases, but it is also expensive. In 2008 alone, obesity-related medical costs were estimated to be $147 billion [1]. Since the prevalence, and therefore the costs, of obesity are predicted to continue to increase, it is clear that obesity is a condition that merits study.

Obesity is not only a concern in the United States. In addition to being a growing concern in other developed nations, the rates of obesity are also increasing in developing countries. It is estimated that the number of overweight people in the world now exceeds the number who are undernourished [4]. Due to the increasing severity of this epidemic and the large costs associated with it, obesity prevention has become a major public health priority in many countries, including the United States.

Obesity is defined as excessive body fat, meaning greater than 25% body fat for men and greater than 30% body fat for women [4]. However, the factors involved in the development of obesity are complex and varied. Obesity is caused by a complex interaction of behavioral, environmental, and genetic factors [4]. Data have demonstrated that the current increase in obesity cannot be explained by changes in food intake and decreases in physical activity alone and, although there is a genetic component to obesity, genetics cannot have changed enough in the last decade to explain this epidemic [5]. This indicates that changes in the environment have played a part in the increase in obesity. Indeed, the increase in obesity coincides with increases in the levels of chemicals in the environment [5]. The first aim of this study is to look at the relationship between various types of environmental toxins and obesity.

In addition to being implicated in increasing rates of obesity, many environmental toxins have been observed to be associated with other chronic diseases. In the second part of this study, we look at how environmental toxins relate to two important health outcomes, diabetes and cancer.

While obesity has been a major concern, there may also be detrimental health outcomes resulting from weight loss. Since certain toxins are stored in fat, when individuals lose weight these toxins may be released into the body, where they may damage organs and body systems. Thus, in the third part of this study, we look at the relationship between levels of persistent organic pollutants and both short- and long-term weight loss.

This study serves in part as a continuation of previous work. A similar study used canonical correlation analysis with NHANES 1999-2000 data to look at associations between environmental toxins, obesity, blood pressure, and height [6]. We add to these previous results by using more recent data sets that span a greater number of years while examining many of the same relationships. In addition, this study uses a greater number of methods to look at the relationships between variables and examines additional relationships that have become relevant in obesity research since the last study, such as the relationship between weight loss and increased toxin levels in body fluids.

CHAPTER 2 BACKGROUND

2.1 TOXINS AND OBESITY

Body mass index (BMI) is defined as mass divided by height squared, yielding units of kilograms per square meter. It is a commonly used measure of body composition, as it is easy and inexpensive to obtain [1]. BMI measurements greater than 25 are considered overweight/obese. Although BMI correlates with body fat, it is not a good measure of body composition for all individuals, such as athletes, and waist circumference can also serve as a good predictor of obesity [1]. Thus, both BMI and waist circumference were used in this analysis as indicators of obesity.

Environmental pesticides were the first group of toxins examined. Relationships between pesticides and obesity have been seen in previous studies [7] [8]. Two main pesticides, 2,4-dichlorophenol and 2,5-dichlorophenol, were part of the selection of pesticides examined in this study. 2,4-dichlorophenol results from the environmental degradation of 2,4-dichlorophenoxyacetic acid (an herbicide) and is primarily used to manufacture herbicides, as well as being used in water chlorination and wood bleaching [9]. Primary routes of exposure are ingestion of contaminated water, skin contact, and inhalation [10]. 2,5-dichlorophenol is a metabolite of 1,4-dichlorobenzene and is involved in dye and chemical synthesis, as well as in the production of resins and mothballs [9]. The primary exposure routes are inhalation and skin contact [10]. A significant relationship between childhood obesity and high levels of urinary 2,5-dichlorophenol has been observed, although the relationship between childhood obesity and levels of urinary 2,4-dichlorophenol was not significant [8].

Phenols were the second type of environmental toxin analyzed. They rank among the top fifty chemicals by volume manufactured in the U.S. and are absorbed via inhalation, ingestion, or skin contact [10]. Once these compounds are absorbed, they become widely distributed throughout the body. The top three uses of phenols are in resins used in the construction, automotive, and appliance industries, in polycarbonate plastics (primarily bisphenol A), and in the manufacture of nylon 6 and other synthetic fibers, although they can also be found in herbicides, mothballs, and detergents [10]. Possible routes of exposure include ingestion of contaminated water, inhalation of cigarette smoke and wood smoke, and ingestion or skin contact through consumer products, since these compounds are used in sunscreens, lotions, hand soap, toothpaste, and dental sealants [10]. A major phenol of interest in obesity research is bisphenol A (BPA), which is widely found in polycarbonate plastics, epoxy linings of metal food and drink cans, and dental sealants, and which is a known endocrine disruptor [10]. Research has indicated that exposure to bisphenol A has developmental effects on adipocytes (fat cells) and their functioning, suggesting that this compound may have an influence on adult obesity [11].

Phthalates were also studied. These compounds are used primarily in soft plastics and are found in children's toys, pacifiers, plastic food packaging, vinyl flooring, soap, shampoo, nail polish, and numerous household products. Phthalates are additives and, because they are not covalently bound to the plastics they are added to, they can easily leach and transfer to food and air [12]. Phthalates are known endocrine disruptors that decrease testosterone biosynthesis [12]. Research has shown that obese children have higher blood concentrations of phthalates (implying greater exposure to these compounds) than non-obese children [13]. Research has also indicated that phthalates may change gene expression associated with the metabolism of fat [13]. Several phthalates have been found to bind to and activate receptors that are involved in promoting adipocyte maturation [12]. In addition, studies have found positive relationships between waist circumference, total fat mass, and mono-isobutyl phthalate in older women, as well as between several phthalates and waist circumference in men [12] [14]. In a study involving NHANES 1999-2002 data, relationships between phthalates, BMI, and waist circumference were observed [15]. Thus, past research has indicated that phthalates may play a role in the development of obesity.

Finally, heavy metals were examined. Exposure to heavy metals can occur through water, food, air, and commercial products, as well as through work in industries such as mining. A previous study involving 1999-2002 NHANES data indicated an inverse relationship between cadmium/lead and BMI/waist circumference [16].
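The BMI definition given at the start of this section can be written out as a short Python sketch. The function names and the example participant are hypothetical and are not taken from the thesis; the cutoff of 25 is the one quoted in the text.

```python
def body_mass_index(mass_kg: float, height_m: float) -> float:
    """Return BMI in kg/m^2: mass divided by height squared."""
    if mass_kg <= 0 or height_m <= 0:
        raise ValueError("mass and height must be positive")
    return mass_kg / height_m ** 2


def exceeds_bmi_cutoff(bmi: float, cutoff: float = 25.0) -> bool:
    """Flag BMI values above the cutoff of 25 quoted in the text."""
    return bmi > cutoff


# Hypothetical participant (illustration only): 80 kg, 1.75 m tall.
bmi = body_mass_index(80.0, 1.75)
print(round(bmi, 1), exceeds_bmi_cutoff(bmi))
```

Waist circumference needs no such transformation, which is one reason it is convenient to carry alongside BMI as a second obesity indicator.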

2.2 TOXINS AND CANCER/DIABETES

In the second part of this study, the relationship between toxins and both cancer and diabetes was examined. Diabetes is a chronic disease in which the body has a decreased ability to use insulin, a decreased supply of insulin, or both. Over time, if this disease is not controlled, vital organs are damaged as a result of a buildup of glucose and fats in the blood [17]. Exposure to certain pesticides has been shown to increase the odds of developing diabetes [18] [19], and a positive correlation between several phthalate metabolites, waist circumference, and insulin resistance in adult human males has also been observed [14].

Cancer is a group of diseases in which abnormal cells divide rapidly and invade various tissues [20]. In animal studies, carcinogenic effects have been observed after phthalate exposure [15]. Phenols, however, are not currently classified as carcinogenic.

2.3 WEIGHT LOSS, TOXIN RELEASE, AND PERSISTENT ORGANIC POLLUTANTS

In the third part of this study, the relationship between weight loss and the concentrations of persistent organic pollutants in the body was explored. Polychlorinated biphenyls (PCBs) are a main type of persistent organic pollutant. They were used in industry in the 20th century and, while they were banned in the 1970s in North America and Europe, they are still found widely in the population today because they are stored in the fat of living organisms and thus persist [21]. Since these compounds are lipid soluble and bioaccumulate in animal tissues, people with higher body mass indexes are more likely to store high levels of these pollutants in their body tissues [18].

Although it is generally accepted that weight loss reduces the risk of a myriad of diseases, these benefits may be offset by possible deleterious health effects of weight loss. Adipose tissue can be thought of as protecting the body by storing lipophilic toxins away from organs [21]. Weight loss has been shown to result in the release of stored toxins and an increase in the concentrations of persistent pollutants in plasma and adipose tissue [22] [23]. This may cause the internal organs to come into contact with these pollutants as they are released into the bloodstream [22]. In previous research, plasma concentrations of organochlorine pesticides increased with weight loss in a manner related to the magnitude of the weight loss in individuals undergoing bariatric surgery [21]. Interestingly, for a given amount of weight loss, less obese individuals exhibit a greater increase of pollutants in the bloodstream than those who are more obese [23].

This increase in pollutants in body fluids following weight loss may play some part in what is known as the obesity paradox. The obesity paradox (see Figure 2.1) refers to the phenomenon that, as age increases, the lowest mortality rate shifts towards heavier BMIs, implying that moderate (but not excessive) extra weight may actually be protective and beneficial at certain ages [24]. Since a majority of the population of the United States is not dealing with starvation, energy conservation concerns cannot account for this apparent protective property of extra weight.

Figure 2.1: The Obesity Paradox. Source: V. Hughes, "The Big Fat Truth", Nature, pp. 428-430, 23 May 2013.

Approximately 90% of the intake of PCBs and other persistent organic pollutants is estimated to come from diet alone [22]. These pollutants are involved in reproductive disorders, breast cancer, immune and thyroid function impairments, and neurobehavioral problems [21].

CHAPTER 3 METHODS

3.1 CANONICAL CORRELATION ANALYSIS

Canonical correlation analysis is a multivariate regression method which assumes multivariate normality and a linear relationship between canonical variates and each set of variables [25]. In this method, we seek to replace the two sets of variables $X = (X_1, \dots, X_r)^T$ and $Y = (Y_1, \dots, Y_s)^T$ by $t$ pairs of new variables

\[ (\xi_j, \omega_j), \quad j = 1, 2, \dots, t; \quad t \le \min(r, s), \]

where, for $j = 1, 2, \dots, t$,

\[ \xi_j = g_j^T X = g_{1j} X_1 + g_{2j} X_2 + \dots + g_{rj} X_r, \]
\[ \omega_j = h_j^T Y = h_{1j} Y_1 + h_{2j} Y_2 + \dots + h_{sj} Y_s. \]

Assuming unit variances for both $\xi$ and $\omega$, we seek to maximize the correlation between $\xi$ and $\omega$, given by

\[ \mathrm{corr}(\xi, \omega) = g^T \Sigma_{XY} h. \]

This yields the first canonical variable pair $(\xi_1, \omega_1)$. Given this first pair, we seek a second variable pair $(\xi_2, \omega_2)$ that maximizes

\[ \mathrm{corr}(\xi_2, \omega_2) = g_2^T \Sigma_{XY} h_2 \]

such that $\xi_2$ is uncorrelated with $\xi_1$ and $\omega_2$ is uncorrelated with $\omega_1$. Thus,

\[ g_2^T \Sigma_{XX} g_1 = h_2^T \Sigma_{YY} h_1 = g_2^T \Sigma_{XY} h_1 = h_2^T \Sigma_{YX} g_1 = 0. \]

This process is repeated to generate $(\xi_j, \omega_j)$, where the pairs are ranked in terms of their correlations, $\xi_j$ is uncorrelated with all previously derived $\xi_k$, and $\omega_j$ is uncorrelated with all previously derived $\omega_k$. The canonical correlations are given by

\[ \rho_j = \mathrm{corr}(\xi_j, \omega_j) = \frac{g_j^T \Sigma_{XY} h_j}{(g_j^T \Sigma_{XX} g_j)^{1/2} \, (h_j^T \Sigma_{YY} h_j)^{1/2}}. \]

This type of analysis is used to identify and measure the associations between two sets of variables, especially when there are multiple intercorrelated outcome variables [25]. It produces a set of canonical variates which are orthogonal linear combinations of the variables in each set and which best explain the variability within and between the sets. Canonical loadings are the correlations between the observed variables and the canonical variables. To improve the interpretability of the loadings, they were orthogonally rotated, that is, multiplied by an orthogonal matrix chosen to satisfy a certain criterion. The varimax orthogonal rotation was used, which is less likely to produce a general factor than the other main orthogonal rotation, quartimax [26]. Thus, we obtain a set of loadings with maximum variability [27]. When interpreting the results of canonical correlation analysis, there is an assumption that one set of variables is causing the response in the other set of variables [28].
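As a rough numerical illustration of the definitions above, the following Python sketch computes sample canonical correlations by whitening each block of variables and taking a singular value decomposition of the whitened cross-covariance matrix. This is one standard way to solve the maximization problem and is not the SAS procedure listed in the appendix of this thesis; the function names and the synthetic data are assumptions made for illustration only, and the varimax rotation step is omitted.

```python
import numpy as np


def inv_sqrt(S):
    """Inverse symmetric square root of a positive-definite matrix."""
    vals, vecs = np.linalg.eigh(S)
    return vecs @ np.diag(vals ** -0.5) @ vecs.T


def canonical_correlations(X, Y):
    """Return (rho, G, H): canonical correlations and weight matrices.

    Columns of G and H play the role of the weight vectors g_j and h_j,
    scaled so that each canonical variate has unit sample variance.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Sxx = Xc.T @ Xc / (n - 1)
    Syy = Yc.T @ Yc / (n - 1)
    Sxy = Xc.T @ Yc / (n - 1)
    Sxx_is, Syy_is = inv_sqrt(Sxx), inv_sqrt(Syy)
    # Singular values of the whitened cross-covariance are the rho_j,
    # returned in decreasing order.
    U, rho, Vt = np.linalg.svd(Sxx_is @ Sxy @ Syy_is, full_matrices=False)
    return rho, Sxx_is @ U, Syy_is @ Vt.T


# Synthetic data (not NHANES): three "toxin" columns and two "body"
# columns, where the first column of each block shares a latent signal z.
rng = np.random.default_rng(0)
z = rng.normal(size=500)
X = np.column_stack([z + rng.normal(size=500),
                     rng.normal(size=500),
                     rng.normal(size=500)])
Y = np.column_stack([z + rng.normal(size=500),
                     rng.normal(size=500)])

rho, G, H = canonical_correlations(X, Y)
xi = (X - X.mean(axis=0)) @ G      # canonical variates for the X block
omega = (Y - Y.mean(axis=0)) @ H   # canonical variates for the Y block
print(rho)
```

In this sketch the number of pairs is t = min(r, s) = 2, and the sample correlation between the first column of `xi` and the first column of `omega` reproduces the first canonical correlation.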

3.2 INTERBATTERY FACTOR ANALYSIS

Interbattery factor analysis is a means of determining factors that are common to two sets of variables [28]. The tests for the number of significant canonical correlations have been found to also be tests for the number of significant interbattery factors [28]. Canonical correlation analysis looks at how weighted sums of the original variables in the two sets relate to each other, while interbattery factor analysis assumes that the same factors influence variables in the two domains [28]. Thus, this approach assumes that there is an underlying factor that causes changes in both sets of variables, and results hint at the two sets of variables having a common cause [28].

As described by Huba et al. [28], let $E$ be the $(r + s) \times t$ matrix of canonical variate loadings and let $U$ hold the $t$ canonical correlations, arranged as a $t \times t$ diagonal matrix so that the product below is defined. The matrix of interbattery factor loadings $F$ is obtained by rescaling the canonical loadings by the square roots of the canonical correlations:

\[ F = E U^{1/2}. \]

Interbattery factors were rotated using a common transformation matrix $T$. By rotating with a common transformation matrix after rescaling the loadings, it is possible to direct attention to factors that are equally related to both sets of original variables [29].

Various studies have taken factor loadings with a magnitude greater than or equal to 0.30 to be significant [30] [31], while others have focused on loadings above 0.20 [28]. In this study, loadings of 0.25 and higher were considered.
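The rescaling step and the 0.25 cutoff can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the loadings, canonical correlations, and variable names below are made up, the helper names are hypothetical, and the rotation by the common transformation matrix T is omitted.

```python
import math


def interbattery_loadings(E, rho):
    """Rescale canonical loadings column-wise: F = E diag(sqrt(rho))."""
    return [[e * math.sqrt(r) for e, r in zip(row, rho)] for row in E]


def salient(F, names, cutoff=0.25):
    """For each factor (column), list variables with |loading| >= cutoff."""
    n_factors = len(F[0])
    return [[name for name, row in zip(names, F) if abs(row[j]) >= cutoff]
            for j in range(n_factors)]


# Hypothetical canonical loadings for 2 toxins and 2 body measurements
# on t = 2 canonical variates, with hypothetical canonical correlations.
names = ["toxin_a", "toxin_b", "bmi", "waist"]
E = [[0.90, 0.10],
     [0.20, 0.80],
     [0.85, 0.05],
     [0.30, 0.70]]
rho = [0.16, 0.04]

F = interbattery_loadings(E, rho)
print(salient(F, names))
```

Because each column is multiplied by the square root of its canonical correlation, loadings on weakly correlated canonical pairs shrink the most, so a factor tied to a negligible canonical correlation may have no variables above the cutoff at all.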

3.3 LINEAR REGRESSION

Canonical correlation analysis and interbattery factor analysis were used as variable selection models. To quantify the quality of the variables we obtained, multiple linear regressions were run using the variables highlighted in the analyses. $R^2$ values were obtained, where

\[ Y = a_0 + a_1 x_1 + a_2 x_2 + \dots + a_n x_n, \]
\[ SS_{Total} = \sum_i (Y_i - \bar{Y})^2, \]
\[ SS_{Resid} = \sum_i (Y_i - f(x_i))^2, \]
\[ R^2 = \frac{SS_{Total} - SS_{Resid}}{SS_{Total}}, \]

and $f(x_i)$ is the fitted value for observation $i$. The $R^2$ value is the proportion of the total response variation explained by the regression model.
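The R² computation above can be sketched in Python with an ordinary least-squares fit. The regressions in this thesis were run in R; the function name, the use of NumPy, and the synthetic data here are assumptions for illustration only.

```python
import numpy as np


def r_squared(X, y):
    """Fit y = a0 + a1*x1 + ... + an*xn by least squares.

    Returns (coef, r2), where coef[0] is the intercept a0.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), X])  # column of 1s for a0
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    fitted = design @ coef
    ss_total = np.sum((y - y.mean()) ** 2)
    ss_resid = np.sum((y - fitted) ** 2)
    return coef, (ss_total - ss_resid) / ss_total


# Synthetic example (not NHANES): two predictors plus noise.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = 1.0 + 2.0 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.5, size=200)
coef, r2 = r_squared(X, y)
print(coef, r2)
```

With an intercept in the model, the residual sum of squares can never exceed the total sum of squares, so the value returned lies between 0 and 1.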

3.4 DISTANCE CORRELATION

Distance correlation provides a non-parametric test of multivariate independence which is sensitive to all types of dependence structures in data and was developed by Maria Rizzo [32]. Because it is non-parametric, it does not assume any particular distribution or structure for the data. Thus, unlike canonical correlation analysis, it is not limited to linear relationships. The distance correlation satisfies $0 \le R \le 1$, and $R = 0$ if and only if $X$ and $Y$ are independent.

In this study, a permutation test was used to test for independence. In this test, the rows of either the $X$ or the $Y$ data matrix are permuted, and the distance correlation statistic is computed for each permutation. The statistic is computed by first using the Euclidean norm to calculate all pairwise distances:

\[ a_{j,k} = \| X_j - X_k \|, \quad j, k = 1, 2, \dots, n, \]
\[ b_{j,k} = \| Y_j - Y_k \|, \quad j, k = 1, 2, \dots, n. \]

Then the row and column means are subtracted from, and the grand mean is added to, each $a_{j,k}$ and $b_{j,k}$:

\[ A_{j,k} = a_{j,k} - \bar{a}_{j\cdot} - \bar{a}_{\cdot k} + \bar{a}_{\cdot\cdot}, \]
\[ B_{j,k} = b_{j,k} - \bar{b}_{j\cdot} - \bar{b}_{\cdot k} + \bar{b}_{\cdot\cdot}. \]

The test statistic is given by

\[ T_n = \sum_{j,k=1}^{n} A_{j,k} B_{j,k}. \]
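The steps above can be sketched in Python as follows. This is a minimal illustration, not the R code used in this thesis: the function names, the number of permutations, and the synthetic data are assumptions. The sample distance correlation is computed as the square root of the sum of A·B divided by the square root of the product of the sums of A·A and B·B, and the permutation p-value compares the observed sum of A·B with its values under row permutations of Y.

```python
import numpy as np


def centered_distances(Z):
    """Pairwise Euclidean distances with row/column means removed."""
    Z = np.asarray(Z, dtype=float)
    if Z.ndim == 1:
        Z = Z[:, None]
    d = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=2)
    return (d - d.mean(axis=0, keepdims=True)
              - d.mean(axis=1, keepdims=True) + d.mean())


def dcor_test(X, Y, n_perm=199, seed=0):
    """Return (distance correlation, permutation p-value)."""
    A, B = centered_distances(X), centered_distances(Y)
    n = A.shape[0]
    t_obs = (A * B).sum()
    dcor = np.sqrt(max(t_obs, 0.0) / np.sqrt((A * A).sum() * (B * B).sum()))
    rng = np.random.default_rng(seed)
    count = 0
    for _ in range(n_perm):
        p = rng.permutation(n)
        # Permuting the rows of Y permutes rows and columns of B together.
        if (A * B[np.ix_(p, p)]).sum() >= t_obs:
            count += 1
    return dcor, (count + 1) / (n_perm + 1)


# Synthetic example (not NHANES): y depends on x, but not linearly.
rng = np.random.default_rng(3)
x = rng.normal(size=200)
y = x ** 2
dcor, p_value = dcor_test(x, y)
print(dcor, p_value)
```

The example uses a quadratic relationship because it is the kind of dependence that a linear correlation coefficient can miss while the distance-based statistic still detects it.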

3.5 PRINCIPAL COMPONENT LOGISTIC REGRESSION

Principal component analysis is a technique in which a set of orthogonal linear projections $\xi = g^T X$ of a collection of correlated variables is derived. This is useful not only as a dimensionality-reduction technique, but also as a way to look at features of the data based on the loadings of the components. The first principal component $\xi_1 = g_1^T X$ is derived by maximizing

\[ \mathrm{var}(\xi_1) \quad \text{subject to} \quad g_1^T g_1 = 1. \]

Then a second principal component is derived by maximizing the variance under the same normalization constraint, with the additional requirement that this component is orthogonal to the first. Thus, we are maximizing

\[ \mathrm{var}(\xi_2) \quad \text{subject to} \quad g_2^T g_2 = 1 \quad \text{and} \quad g_2^T \Sigma_{XX} g_1 = 0. \]

This process is repeated to obtain the set of principal components, ordered in terms of the amount of variance they explain.

Once the principal components were obtained, they were used in lieu of the original variables in logistic regression. Using these components as variables is useful because we can reduce the dimension and select only the best components to be used in the regression, and because it helps with collinearity issues [33]. Logistic regression models are suitable for binary response variables and involve a generalized linear model with a log-odds link function:

\[ y = \varphi(b_0 + b_1 x_1 + b_2 x_2 + \dots), \quad \text{where} \quad \varphi(x) = \frac{e^x}{1 + e^x}. \]
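The two-stage procedure described in this section can be sketched in Python as follows. This is an illustration under stated assumptions rather than the R code used in the thesis: the function names are hypothetical, the data are synthetic, and the logistic model is fitted with a plain Newton-Raphson iteration.

```python
import numpy as np


def pca_scores(X, k):
    """Return (scores, loadings, explained) for the first k components."""
    Xc = X - X.mean(axis=0)
    vals, vecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    order = np.argsort(vals)[::-1][:k]   # largest variance first
    G = vecs[:, order]                   # unit-norm, mutually orthogonal
    return Xc @ G, G, vals[order] / vals.sum()


def sigmoid(x):
    """phi(x) = e^x / (1 + e^x), written in an equivalent form."""
    return 1.0 / (1.0 + np.exp(-x))


def fit_logistic(Z, y, n_iter=25):
    """Newton-Raphson fit of P(y = 1) = phi(b0 + b1*z1 + b2*z2 + ...)."""
    D = np.column_stack([np.ones(len(y)), Z])
    b = np.zeros(D.shape[1])
    for _ in range(n_iter):
        p = sigmoid(D @ b)
        W = p * (1.0 - p)
        # Newton step: solve (D^T W D) delta = D^T (y - p).
        b += np.linalg.solve(D.T @ (D * W[:, None]), D.T @ (y - p))
    return b


# Synthetic example (not NHANES): four collinear "toxin" columns driven
# by one latent exposure, and a binary outcome that depends on it.
rng = np.random.default_rng(2)
latent = rng.normal(size=400)
X = latent[:, None] + 0.3 * rng.normal(size=(400, 4))
y = (rng.random(400) < sigmoid(1.5 * latent)).astype(float)

scores, G, explained = pca_scores(X, k=2)
b = fit_logistic(scores, y)
print(explained, b)
```

Because the component scores are uncorrelated by construction, the collinearity among the four original columns does not carry over into the logistic fit. The sign of each component is arbitrary, so only the magnitude of a component's coefficient is meaningful without reference to its loadings.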

CHAPTER 4 DATA NHANES is the National Health and Nutrition Examination Survey, an ongoing survey conducted by the Centers for Disease Control and Prevention (CDC). The study was conducted using a multistage, probability sampling design in which sample segments of national counties in the U.S. were selected, then households and individuals from those households were selected randomly [34]. Institutionalized individuals, those in nursing homes, and members of the armed forces were excluded from the NHANES study. Laboratory data, which includes toxin concentration data, were from specimens obtained at Mobile Examination Centers. BMI and waist circumference data were obtained from the examination portion of the study. Information about diabetes, cancer, and weight loss was self-reported in the questionnaire portion of the NHANES study. In the first part of this study, data from the NHANES study from the years 2007-2010 was used [35] [36]. Data sets were modified to include only individuals 18 years old and older. The resulting age range was 18-80 years of age. Contents and sample sizes for each sub-group of data are given in the table below. Toxin data was obtained from the laboratory portion of NHANES, and body measurement data was obtained from the examination portion. In the second part of this study, we looked at the relationship between environmental toxins and the health outcomes of diabetes and cancer. Data from NHANES from the years 2007-2010 was used [35] [36]. Due to the manner in which data was obtained, the number of toxins used in this part of the study was limited to environmental pesticides, phenols, and phthalates. The age range for the participants was 18-80 years of age and there were N = 3765 observations. Toxin data was obtained from the laboratory portion of NHANES, and health outcome data was self-reported data from the questionnaire portion. 
Finally, in the third part of this study, we looked at the relationship between weight loss and persistent organic pollutants (POPs) found in the body. Data from the NHANES study from the years 2001-2004 was used [37] [38]. The years of data used in this study were limited by availability of data and the manner in which data was linked. For 2005-2008, persistent organic pollutant data was only obtained for a sub-sample of the participants and data is not linked to 11

Table 4.1: Data from 2007-2010 for obesity and toxin portion of the study Environmental Pesticides N = 3765

Environmental phenols N = 3765

Polyfluorinated compounds N = 3614

Urinary phthalates N = 3765

Heavy metals N = 3858

Obesity indicators

2,5-dichlorophenol O-Phenyl phenol 2,4-dichlorophenol 2,4,5-trichlorophenol 2,4,6-trichlorophenol Urinary 4-tert-octylphenol Urinary benzophenone-3 Urinary bisphenol A Urinary triclosan Butyl paraben Ethyl paraben Methyl paraben Propyl paraben 2-(N-ethyl-PFOSA) acetate Perfluorodecanoic acid Perfluorooctanoic acid Perfluorooctane sulfonic acid Perfluorohexane sulfonic acid 2-(N-methyl-PFOSA) acetate Perfluorobutane sulfonic acid Perfluoroheptanoic acid Perfluorononanoic acid Perfluorooctane sulfonamide Perfluoroundecanoic acid Perfluorododecanoic acid Mono(carboxynonyl) phthalate Mono(carboxyoctyl) phthalate Mono-2-ethyl-5-carboxypentyl phthalate Mono-n-butyl phthalate Mono-(3-carboxypropyl) phthalate Mono-cyclohexyl phthalate Mono-ethyl phthalate Mono-(2-ethyl-5-hydroxyhexyl) phthalate Mono-(2-ethyl)-hexyl phthalate Mono-n-methyl phthalate Mono-isononyl phthalate Mono-(2-ethyl-5-oxohexyl) phthalate Mono-n-octyl phthalate Mono-benzyl phthalate Mono-isobutyl phthalate Urinary barium Urinary beryllium Urinary cadmium Cobalt Cesium Molybdenum Lead Platinum Antimony Thallium Tungsten Uranium Body mass index Waist circumference


Table 4.2: Data from 2007-2010 for chronic illness and toxin portion of the study

Environmental pesticides (N = 3765): 2,5-dichlorophenol; O-Phenyl phenol; 2,4-dichlorophenol; 2,4,5-trichlorophenol; 2,4,6-trichlorophenol

Environmental phenols (N = 3765): Urinary 4-tert-octylphenol; Urinary benzophenone-3; Urinary bisphenol A; Urinary triclosan; Butyl paraben; Ethyl paraben; Methyl paraben; Propyl paraben

Urinary phthalates (N = 3765): Mono(carboxynonyl) phthalate; Mono(carboxyoctyl) phthalate; Mono-2-ethyl-5-carboxypentyl phthalate; Mono-n-butyl phthalate; Mono-(3-carboxypropyl) phthalate; Mono-cyclohexyl phthalate; Mono-ethyl phthalate; Mono-(2-ethyl-5-hydroxyhexyl) phthalate; Mono-(2-ethyl)-hexyl phthalate; Mono-n-methyl phthalate; Mono-isononyl phthalate; Mono-(2-ethyl-5-oxohexyl) phthalate; Mono-n-octyl phthalate; Mono-benzyl phthalate; Mono-isobutyl phthalate

Health outcomes: Doctor told you have diabetes; Ever told you have cancer or malignancy


weight-change data. For 2009 onward, persistent organic pollutant data have not been made publicly available. The data set covers N = 785 participants ranging from 36 to 85 years old. Contents of this data set are given in Table 4.3. Persistent organic pollutant data were obtained from the laboratory portion of NHANES, and weight change data were self-reported in the questionnaire portion.

Table 4.3: Data from 2001-2004 for persistent organic pollutants and weight change

Persistent organic pollutants (N = 785): 1,2,3,4,6,7,8-heptachlorodibenzofuran (1,2,3,4,6,7,8-HpCDF); trans-nonachlor; p,p-dichlorodiphenyldichloroethylene (p,p-DDE); β-hexachlorocyclohexane (β-HCH); PCB 180; PCB 169; 1,2,3,4,6,7,8-heptachlorodibenzo-p-dioxin (1,2,3,4,6,7,8-HpCDD)

Weight change: Weight change over the last year (last year's weight minus current weight); Weight change over the last 10 years (weight 10 years ago minus current weight)


CHAPTER 5 RESULTS

5.1 ENVIRONMENTAL TOXINS AND OBESITY

5.1.1 Environmental pesticides

Logs were taken of the environmental pesticide concentration data and initial correlations were computed (Table 5.1). A noticeable correlation (above 0.6) was found between 2,5-dichlorophenol and 2,4-dichlorophenol. 2,4-dichlorophenol is used in wood pulp bleaching and 2,5-dichlorophenol is in resins used in construction and as plywood adhesive [9]. Thus, exposure to processed plywood and construction materials may be a potential source of this high correlation. Also, understandably, BMI and waist circumference were highly correlated. There were no noticeably high correlations between body measurements and environmental pesticides.

Canonical correlation analysis was performed to determine the cumulative relationship between environmental pesticides and BMI/waist circumference. Canonical variable coefficients and correlation loadings were obtained. From Table 5.2, two factors were obtained and found to be significant using a likelihood ratio test. Although both canonical correlations are small, the correlation for the second set of canonical variables is negligible (0.072). Thus, it is likely that only the first factor has a meaningful interpretation. Loadings were orthogonally rotated to improve interpretability (Table 5.3) and large correlation loadings are presented in bold-type. The first pesticide canonical factor is very highly correlated with both 2,5-dichlorophenol and 2,4-dichlorophenol. The corresponding first body factor has a high correlation with BMI. The first canonical correlation from Table 5.2 is 0.169, indicating a small correlation between the two dichlorophenols and BMI. The second pesticide canonical factor is very highly correlated with 2,4,5-trichlorophenol and the second body factor is highly correlated with waist circumference. However, the canonical correlation between these two factors is 0.072, indicating a fairly negligible correlation between waist circumference and 2,4,5-trichlorophenol.
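The canonical correlations discussed above can be computed in several equivalent ways. The following is a minimal sketch in Python with NumPy, not the software used for this thesis. It assumes complete cases and data matrices of full column rank, and the synthetic arrays `log_toxins` and `body` are hypothetical stand-ins for the logged pesticide concentrations and the two obesity indicators.

```python
import numpy as np

def canonical_correlations(X, Y):
    """Return the canonical correlations between the columns of X and Y."""
    # Center each block so that cross-products are proportional to covariances.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Orthonormal bases for the column spaces of the two centered blocks.
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    # The singular values of Qx'Qy are the cosines of the principal angles
    # between the two column spaces, i.e. the canonical correlations,
    # returned largest first.
    rho = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(rho, 0.0, 1.0)

# Hypothetical synthetic data: 5 "toxins" and 2 "body measurements".
rng = np.random.default_rng(0)
n = 500
log_toxins = rng.normal(size=(n, 5))
body = np.column_stack([
    0.2 * log_toxins[:, 0] + rng.normal(size=n),  # weakly related to toxin 1
    rng.normal(size=n),                           # unrelated to all toxins
])
rho = canonical_correlations(log_toxins, body)
print(rho)
```

With two body measurements there are at most two canonical correlations, which is why each toxin group in this section yields exactly two pairs of canonical factors.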

Table 5.1: Correlations between obesity indicators and environmental pesticides

Table 5.2: Canonical correlation results for environmental pesticides and obesity indicators


Table 5.3: Rotated Correlation Loadings for Environmental Pesticides

Rotated Correlation Loadings for log(environmental pesticides)
                               logpest1   logpest2
log(2,5-dichlorophenol)          0.9750     0.1469
log(O-phenyl phenol)             0.0626     0.1261
log(2,4-dichlorophenol)          0.8029     0.1122
log(2,4,5-trichlorophenol)       0.0430     0.9583
log(2,4,6-trichlorophenol)       0.2121     0.1395

Rotated Correlation Loadings for body measurements
                                  body1      body2
BMI                              0.8415     0.5402
Waist circumference              0.5245     0.8514

From the canonical redundancy analysis (Table 5.4), there is very little relationship between the two domains of pesticide concentrations and body measurements. The pesticide factors only explain 1.3% of the variance in the body measurement factors and the body measurement factors explain 1.9% of the variance in the pesticide factors.

The initial (un-rotated) canonical correlation loadings were then converted into interbattery factor loadings and orthogonally rotated using a common transformation matrix. As discussed previously, these modifications incorporate a relevant assumption that the same underlying factors influence the variables in both domains and allow for ease of interpretation [28]. The resulting interbattery factor loadings are given in Table 5.5. Factor loadings of 0.25 and above are presented in bold-type. The first factor has noticeable loadings for 2,5-dichlorophenol, 2,4-dichlorophenol, BMI and waist circumference. Thus, the first factor appears to indicate there is a common cause behind the two indicators of obesity and 2,4- and 2,5-dichlorophenol. The second factor only has one factor loading above 0.25, making it not useful for interpretation.

Distance correlation, a non-parametric method that is sensitive to all types of dependence structures, was also used with the original (un-logged) data. A permutation test of independence was used with 699 replicates. A p-value of 0.006 was obtained from this test, so the null hypothesis of independence is rejected at α = 0.01. This indicates that there is a relationship between environmental pesticides and obesity.
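The permutation test of independence described above can be sketched as follows. This is an illustrative NumPy implementation under stated assumptions, not the routine used for the thesis: it uses the sample (V-statistic) distance covariance and computes the p-value as (1 + k) / (1 + R), where k is the number of permuted statistics at least as large as the observed one and R is the number of replicates. The arrays `X` and `Y` are hypothetical synthetic data. The code builds full n × n distance matrices, so it is only practical for modest n.

```python
import numpy as np

def _centered_distances(Z):
    """Pairwise Euclidean distances between rows of Z, double-centered."""
    D = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=-1)
    return D - D.mean(axis=0, keepdims=True) - D.mean(axis=1, keepdims=True) + D.mean()

def distance_correlation(X, Y):
    """Sample distance correlation between the rows of X and the rows of Y."""
    A, B = _centered_distances(X), _centered_distances(Y)
    dcov2 = (A * B).mean()
    dvar2 = np.sqrt((A * A).mean() * (B * B).mean())
    return float(np.sqrt(max(dcov2, 0.0) / dvar2)) if dvar2 > 0 else 0.0

def permutation_pvalue(X, Y, replicates=699, seed=0):
    """Permutation test of independence based on the distance covariance."""
    rng = np.random.default_rng(seed)
    A, B = _centered_distances(X), _centered_distances(Y)
    observed = (A * B).mean()
    exceed = 0
    for _ in range(replicates):
        perm = rng.permutation(len(Y))
        # Permuting the rows of Y permutes rows and columns of B together.
        if (A * B[np.ix_(perm, perm)]).mean() >= observed:
            exceed += 1
    return (1 + exceed) / (1 + replicates)

# Hypothetical data with a purely nonlinear dependence between X and Y.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 2))
Y = X[:, :1] ** 2 + 0.1 * rng.normal(size=(200, 1))
dcor = distance_correlation(X, Y)
p_value = permutation_pvalue(X, Y)
print(dcor, p_value)
```

Under this p-value convention, the smallest attainable p-value with 699 replicates is 1/700 ≈ 0.0014, so a reported value of 0.0014 would correspond to no permuted statistic reaching the observed one.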

Table 5.4: Canonical Redundancy Analysis for Environmental Pesticides

Table 5.5: Rotated Interbattery Factor Loadings for Environmental Pesticides and Obesity Indicators

                                factor1    factor2
log(2,5-dichlorophenol)          0.3817     0.0078
log(O-phenyl phenol)             0.0191     0.0333
log(2,4-dichlorophenol)          0.3147     0.0040
log(2,4,5-trichlorophenol)      -0.0275     0.2676
log(2,4,6-trichlorophenol)       0.0780     0.0319
BMI                              0.3730     0.0392
Waist circumference              0.2723     0.1567

Table 5.6: Correlations between obesity indicators and environmental phenols

5.1.2 Environmental phenols

Logs were taken of the environmental phenol concentration data and initial correlations were computed (Table 5.6). There was a noticeable correlation between methyl paraben and propyl paraben, both of which are used as preservatives in personal care products and pharmaceuticals [39]. There were no noticeably high correlations between body measurements and environmental phenols.

Canonical correlation analysis was performed to determine the cumulative relationship between environmental phenols and BMI/waist circumference. Canonical variable coefficients and correlation loadings were obtained. From Table 5.7, two factors were obtained and found to be significant using a likelihood ratio test, but both canonical correlations are small. Loadings were orthogonally

Table 5.7: Canonical correlation results for environmental phenols and obesity indicators

Table 5.8: Rotated Correlation Loadings for environmental phenols

Rotated correlation loadings for log(environmental phenols)
                                  logphen1   logphen2
log(Urinary 4-tert-octylphenol)    -0.0204     0.1688
log(Urinary benzophenone-3)         0.4544     0.0361
log(Urinary bisphenol A)            0.0310     0.6961
log(Urinary triclosan)              0.1298     0.3186
log(Butyl paraben)                  0.7017    -0.0635
log(Ethyl paraben)                  0.6176    -0.3657
log(Methyl paraben)                 0.7445     0.3383
log(Propyl paraben)                 0.9176     0.2674

Rotated Correlation Loadings for body measurements
                                     body1      body2
BMI                                -0.1824     0.9832
Waist circumference                -0.5903     0.8072

rotated to improve interpretability (Table 5.8) and large correlation loadings are presented in bold-type. The first phenol canonical factor is highly correlated with all four parabens. The corresponding first body factor is highly negatively correlated with waist circumference. The first canonical correlation from Table 5.7 is 0.279, indicating a moderate negative correlation between parabens and waist circumference. This is interesting, as it appears to imply that higher levels of parabens are associated with smaller waist circumferences and lower body fat levels. The second phenol canonical factor is highly correlated with urinary bisphenol A and the second body factor is highly correlated with both BMI and waist circumference. The canonical correlation between these two factors is 0.128, indicating a small correlation between the two obesity indicators and bisphenol A.

From the canonical redundancy analysis (Table 5.9), there is very little relationship between the two domains of phenol concentrations and body measurements. The phenol factors only explain 3.5% of the variance in the body measurement factors and the body measurement factors explain 3.4% of the variance in the phenol factors.

Table 5.9: Canonical redundancy analysis for environmental phenols

The initial (un-rotated) canonical correlation loadings were then converted into interbattery factor loadings and orthogonally rotated using a common transformation matrix. The resulting interbattery factor loadings are given in Table 5.10. Factor loadings of 0.25 and above are presented in bold-type. The first factor has noticeable loadings for the four parabens and waist circumference. Thus, the first factor appears to indicate there is a common cause behind the four parabens and waist circumference. The second factor has noticeable loadings for urinary bisphenol A and both BMI and waist circumference. Thus, there is a common factor indicated behind both the presence of bisphenol A and the indicators of obesity.

Distance correlation, a non-parametric method that is sensitive to all types of dependence structures, was also used with the original (un-logged) data. A permutation test of independence was used with 699 replicates. A p-value of 0.0014 was obtained from this test, so the null hypothesis of independence is rejected at α = 0.01. This indicates that there is a relationship between environmental phenols and obesity.


Table 5.10: Rotated Interbattery Factor Loadings for environmental phenols and obesity indicators

                                   factor1    factor2
log(Urinary 4-tert-octylphenol)    -0.0032     0.0607
log(Urinary benzophenone-3)         0.2403     0.0103
log(Urinary bisphenol A)            0.0474     0.2497
log(Urinary triclosan)              0.0824     0.1136
log(Butyl paraben)                  0.3657    -0.0269
log(Ethyl paraben)                  0.3081    -0.1349
log(Methyl paraben)                 0.4062     0.1171
log(Propyl paraben)                 0.4940     0.0906
BMI                                -0.1262     0.3419
Waist circumference                -0.3355     0.2607

5.1.3 Polyfluorinated compounds

Logs were taken of the polyfluorinated compound concentration data and initial correlations were computed (Table 5.11). There are a number of noticeable correlations between these compounds, as shown in Table 5.11. Polyfluorinated compounds have multiple commercial applications and it is not initially clear what the cause of these particularly high correlations might be. However, there are no noticeably high correlations between BMI/waist circumference and polyfluorinated compounds.

Canonical correlation analysis was performed to determine the cumulative relationship between polyfluorinated compounds and BMI/waist circumference. Canonical variable coefficients and correlation loadings were obtained. From Table 5.12, two factors were obtained and found to be significant using a likelihood ratio test, but both canonical correlations are small. Loadings were orthogonally rotated to improve interpretability (Table 5.13) and large correlation loadings are presented in bold-type. The first polyfluorinated compound canonical factor is highly correlated with perfluorooctane sulfonic acid and perfluorohexane sulfonic acid. The corresponding first body factor is only slightly correlated with waist circumference. The first canonical correlation from Table 5.12 is 0.256, indicating a moderate correlation between the two acids and waist circumference. The second polyfluorinated compound canonical factor is highly negatively correlated with perfluoroundecanoic acid and the second body factor is highly correlated with both BMI and waist circumference. The canonical correlation between these two factors

Table 5.11: Correlations between obesity indicators and polyfluorinated compounds


Table 5.12: Canonical correlation results for polyfluorinated compounds and obesity indicators

Table 5.13: Rotated Correlation Loadings for polyfluorinated compounds

Rotated Correlation Loadings for log(polyfluorinated compounds)
                                     logpolyfluor1   logpolyfluor2
log(2-(N-ethyl-PFOSA) acetate)              0.1459         -0.1965
log(Perfluorodecanoic acid)                 0.0785         -0.3842
log(Perfluorooctanoic acid)                 0.5359         -0.0997
log(Perfluorooctane sulfonic acid)          0.8066         -0.1486
log(Perfluorohexane sulfonic acid)          0.7687         -0.2294
log(2-(N-methyl-PFOSA) acetate)             0.4218         -0.2064
log(Perfluorobutane sulfonic acid)          0.0288         -0.1693
log(Perfluoroheptanoic acid)                0.1380         -0.0767
log(Perfluorononanoic acid)                 0.3213          0.0291
log(Perfluorooctane sulfonamide)            0.0657         -0.1319
log(Perfluoroundecanoic acid)               0.0335         -0.6092
log(Perfluorododecanoic acid)               0.0307         -0.2011

Rotated Correlation Loadings for body measurements
                                             body1           body2
BMI                                        -0.1131          0.9936
Waist circumference                         0.3324          0.9431


Table 5.14: Canonical redundancy analysis for polyfluorinated compounds

is 0.161, indicating a small correlation between the two obesity indicators and perfluoroundecanoic acid.

From the canonical redundancy analysis (Table 5.14), there is very little relationship between the two domains of polyfluorinated compound concentrations and body measurements. The polyfluorinated compound factors only explain 3.0% of the variance in the body measurement factors and the body measurement factors explain 1.9% of the variance in the polyfluorinated compound factors.

The initial (un-rotated) canonical correlation loadings were then converted into interbattery factor loadings and orthogonally rotated using a common transformation matrix. The resulting interbattery factor loadings are given in Table 5.15. Factor loadings of 0.25 and above are presented in bold-type. The first factor has noticeable loadings for perfluorooctanoic acid, perfluorooctane sulfonic acid, and perfluorohexane sulfonic acid. Although this factor does not include any sizeable loadings for any of the body measurements, it hints at a common underlying cause for high levels of these three compounds in individuals. The second factor has noticeable loadings for both body measurements, but no noticeable loadings for any of the polyfluorinated compounds. Thus, this hints that there is a common cause behind both BMI and waist circumference, but other than this rather unremarkable indication, this factor does not provide any other information.

Distance correlation, a non-parametric method that is sensitive to all types of dependence structures, was also used with the original (un-logged) data. A

Table 5.15: Rotated Interbattery Factor Loadings for polyfluorinated compounds and body measurements

                                      factor1    factor2
log(2-(N-ethyl-PFOSA) acetate)         0.0790    -0.0767
log(Perfluorodecanoic acid)            0.0500    -0.1531
log(Perfluorooctanoic acid)            0.2736    -0.0321
log(Perfluorooctane sulfonic acid)     0.4118    -0.0477
log(Perfluorohexane sulfonic acid)     0.3948    -0.0807
log(2-(N-methyl-PFOSA) acetate)        0.2188    -0.0766
log(Perfluorobutane sulfonic acid)     0.0191    -0.0676
log(Perfluoroheptanoic acid)           0.0718    -0.0288
log(Perfluorononanoic acid)            0.1617     0.0165
log(Perfluorooctane sulfonamide)       0.0367    -0.0520
log(Perfluoroundecanoic acid)          0.0332    -0.2442
log(Perfluorododecanoic acid)          0.0209    -0.0803
BMI                                   -0.0437     0.4009
Waist circumference                    0.1811     0.3730

permutation test of independence was used with 699 replicates. A p-value of 0.0014 was obtained from this test, so the null hypothesis of independence is rejected at α = 0.01. This indicates that there is a relationship between polyfluorinated compounds and obesity.

5.1.4 Urinary phthalates

Logs were taken of the urinary phthalate concentration data and initial correlations were computed (Table 5.16). There are a number of noticeable correlations between the urinary phthalates. These compounds are used widely as plasticizers and it is not initially clear what the cause of these high correlations might be. However, there are no noticeably high correlations between BMI/waist circumference and urinary phthalates.

Canonical correlation analysis was performed to determine the cumulative relationship between urinary phthalates and BMI/waist circumference. Canonical variable coefficients and correlation loadings were obtained. From Table 5.17, two factors were obtained and found to be significant using a likelihood ratio test, but both canonical correlations are small. Loadings were orthogonally rotated to

Table 5.16: Correlations between obesity indicators and urinary phthalates


Table 5.17: Canonical correlation results for urinary phthalates and obesity indicators

improve interpretability (Table 5.18) and large correlation loadings are presented in bold-type. The first phthalate canonical factor is highly correlated with both mono-ethyl phthalate and mono-isobutyl phthalate. The corresponding first body factor is very highly correlated with both obesity indicators. The first canonical correlation from Table 5.17 is 0.227, indicating a moderate correlation between these two phthalates and both BMI and waist circumference. The second phthalate canonical factor has only one moderately large correlation loading corresponding to mono-isobutyl phthalate and the second body factor is moderately correlated with BMI. The canonical correlation between these two factors is 0.185, indicating a small correlation between BMI and mono-isobutyl phthalate.

From the canonical redundancy analysis (Table 5.19), there is very little relationship between the two domains of urinary phthalate concentrations and body measurements. The phthalate factors only explain 5.1% of the variance in the body measurement factors and the body measurement factors explain 1.3% of the variance in the phthalate factors.

The initial (un-rotated) canonical correlation loadings were then converted into interbattery factor loadings and orthogonally rotated using a common transformation matrix. The resulting interbattery factor loadings are given in Table 5.20. Factor loadings of 0.25 and above are presented in bold-type. The first factor has noticeable loadings for mono-ethyl phthalate, BMI and waist circumference. Thus, the first factor appears to indicate there is a common cause behind mono-ethyl phthalate and the two obesity indicators. The second factor only has a noticeable loading for mono-isobutyl phthalate, making this factor un-interpretable.

Distance correlation, a non-parametric method that is sensitive to all types of dependence structures, was also used with the original (un-logged) data. A permutation test of independence was used with 699 replicates. A p-value of 0.0014 was obtained from this test, so the null hypothesis of independence is rejected at α = 0.01. This indicates that there is a relationship between phthalates and obesity.


Table 5.18: Rotated Correlation Loadings for urinary phthalates

Rotated Correlation Loadings for log(urinary phthalates)
                                                logphthal1   logphthal2
log(Mono(carboxynonyl) phthalate)                   0.4901       0.1699
log(Mono(carboxyoctyl) phthalate)                   0.5086       0.2301
log(Mono-2-ethyl-5-carboxypentyl phthalate)         0.5209       0.1864
log(Mono-n-butyl phthalate)                         0.4661       0.4056
log(Mono-(3-carboxypropyl) phthalate)               0.4389       0.0432
log(Mono-cyclohexyl phthalate)                      0.0908       0.0978
log(Mono-ethyl phthalate)                           0.6077       0.3607
log(Mono-(2-ethyl-5-hydroxyhexyl) phthalate)        0.4882       0.1745
log(Mono-(2-ethyl)-hexyl phthalate)                 0.0353       0.5095
log(Mono-n-methyl phthalate)                        0.1395       0.2033
log(Mono-isononyl phthalate)                        0.0910       0.2956
log(Mono-(2-ethyl-5-oxohexyl) phthalate)            0.5012       0.1967
log(Mono-n-octyl phthalate)                         0.0732       0.0214
log(Mono-benzyl phthalate)                          0.4704       0.2576
log(Mono-isobutyl phthalate)                        0.5527       0.5930

Rotated Correlation Loadings for body measurements
                                                     body1        body2
BMI                                                 0.9051       0.4251
Waist circumference                                 1.0000      -0.0089


Table 5.19: Canonical Redundancy analysis for urinary phthalates

Table 5.20: Rotated Interbattery Factor Loadings for urinary phthalates and obesity indicators

                                                 factor1    factor2
log(Mono(carboxynonyl) phthalate)                 0.2169     0.0954
log(Mono(carboxyoctyl) phthalate)                 0.2211     0.1221
log(Mono-2-ethyl-5-carboxypentyl phthalate)       0.2301     0.1039
log(Mono-n-butyl phthalate)                       0.1885     0.1960
log(Mono-(3-carboxypropyl) phthalate)             0.2022     0.0384
log(Mono-cyclohexyl phthalate)                    0.0353     0.0463
log(Mono-ethyl phthalate)                         0.2580     0.1830
log(Mono-(2-ethyl-5-hydroxyhexyl) phthalate)      0.2157     0.0973
log(Mono-(2-ethyl)-hexyl phthalate)              -0.0207     0.2213
log(Mono-n-methyl phthalate)                      0.0504     0.0940
log(Mono-isononyl phthalate)                      0.0210     0.1316
log(Mono-(2-ethyl-5-oxohexyl) phthalate)          0.2201     0.1074
log(Mono-n-octyl phthalate)                       0.0327     0.0125
log(Mono-benzyl phthalate)                        0.2013     0.1323
log(Mono-isobutyl phthalate)                      0.2153     0.2807
BMI                                               0.4638     0.0655
Waist circumference                               0.4592    -0.1266

5.1.5 Heavy metals

Logs were taken of the heavy metal concentration data and initial correlations were computed (Table 5.21). There is a noticeable correlation between cesium and both thallium and molybdenum. Both cesium and thallium are used in medical applications (for cancer radiation treatment and radio imaging, respectively) and molybdenum is used in steel manufacturing. Thus, the reasons for these correlations are not immediately apparent. However, there are no noticeably high correlations between BMI/waist circumference and heavy metals. Canonical correlation analysis was performed to determine the cumulative relationship between heavy metals and BMI/waist circumference. Canonical variable coefficients and correlation loadings were obtained. From Table 5.22, two factors were obtained and found to be significant using a likelihood ratio test, but both canonical correlations are small. Loadings were orthogonally rotated to improve interpretability (Table 5.23) and large correlation loadings are presented in bold-type. The first heavy metal canonical factor is highly correlated with urinary cadmium and lead. The corresponding first body factor is moderately correlated with waist circumference. The first canonical correlation from Table 5.22 is 0.256, indicating a moderate correlation between cadmium, lead and waist circumference. The second heavy metal canonical factor is highly correlated with thallium and the second body factor is highly correlated with both BMI and waist circumference. The canonical correlation between these two factors is 0.115, indicating a small correlation between the two obesity indicators and thallium. From the canonical redundancy analysis (Table 5.24), there is very little relationship between the two domains of heavy metal concentrations and body measurements. The heavy metal factors only explain 1.7% of the variance in the body measurement factors and the body measurement factors explain 0.7% of the variance in the heavy metal factors. 
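The small redundancy values reported in this section follow directly from the small canonical correlations. As a sketch, assuming the usual Stewart-Love redundancy index (the software behind Table 5.24 may use its own conventions), the redundancy contributed by the k-th pair of canonical variates is the mean squared loading of a domain's variables on its own k-th canonical variate multiplied by the squared k-th canonical correlation. Since a mean squared loading cannot exceed 1, a first canonical correlation of 0.256 caps that contribution at 0.256² ≈ 0.066. The loadings and correlations below are hypothetical illustration values, not thesis results.

```python
import numpy as np

def redundancy(loadings, canonical_corrs):
    """Stewart-Love redundancy of one domain given the other domain's variates.

    loadings: (variables x variates) correlations between each variable in a
        domain and that domain's own (un-rotated) canonical variates.
    canonical_corrs: canonical correlations, one per variate pair.
    Returns the per-variate redundancies and their total.
    """
    loadings = np.asarray(loadings, dtype=float)
    rho2 = np.asarray(canonical_corrs, dtype=float) ** 2
    # Share of the domain's variance captured by each of its own variates.
    variance_extracted = (loadings ** 2).mean(axis=0)
    per_variate = variance_extracted * rho2
    return per_variate, float(per_variate.sum())

# Hypothetical loadings for two body measurements on two canonical variates.
example_loadings = [[0.9, 0.3],
                    [0.8, -0.5]]
example_corrs = [0.25, 0.10]
per_variate, total = redundancy(example_loadings, example_corrs)
print(per_variate, total)
```

Even with large loadings on the first variate, the total redundancy in this example stays below 5%, because both squared canonical correlations are small.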
The initial (un-rotated) canonical correlation loadings were then converted into interbattery factor loadings and orthogonally rotated using a common transformation matrix. The resulting interbattery factor loadings are given in Table 5.25. Factor loadings of 0.25 and above are presented in bold-type. The first factor has noticeable loadings for cadmium, lead and waist circumference. Thus, the first factor appears to indicate there is a common cause behind the presence of cadmium and lead and increased waist circumference. The second factor has noticeable loadings for thallium and BMI. Thus, there is a common factor behind both the presence of thallium and elevated BMI.

Distance correlation, a non-parametric method that is sensitive to all types of dependence structures, was also used with the original (un-logged) data. A

Table 5.21: Correlations between obesity indicators and heavy metals

Table 5.22: Canonical correlation results for heavy metals and obesity indicators


Table 5.23: Rotated correlation loadings for heavy metals

Rotated Correlation loadings for log(heavy metals)
                          logmetals1   logmetals2
log(Urinary barium)           0.2954       0.3964
log(Urinary beryllium)        0.0678       0.0383
log(Urinary cadmium)          0.8053       0.0825
log(Cobalt)                   0.0514       0.4292
log(Cesium)                   0.4746       0.3804
log(Molybdenum)               0.2300       0.4137
log(Lead)                     0.5708      -0.0293
log(Platinum)                 0.0478      -0.0992
log(Antimony)                 0.1141       0.2801
log(Thallium)                 0.3709       0.7426
log(Tungsten)                 0.0120       0.0458
log(Uranium)                  0.2165       0.1440

Rotated Correlation Loadings for body measurements
                               body1        body2
BMI                          -0.1912       0.9815
Waist circumference           0.2600       0.9656

Table 5.24: Canonical redundancy analysis for heavy metals


Table 5.25: Rotated Interbattery Factor Loadings for heavy metals and obesity indicators

                           factor1    factor2
log(Urinary barium)         0.0923     0.1421
log(Urinary beryllium)      0.0257     0.0105
log(Urinary cadmium)        0.3368    -0.0307
log(Cobalt)                -0.0147     0.1756
log(Cesium)                 0.1702     0.1208
log(Molybdenum)             0.0628     0.1547
log(Lead)                   0.2463    -0.0586
log(Platinum)               0.0289    -0.0454
log(Antimony)               0.0248     0.1081
log(Thallium)               0.0949     0.2810
log(Tungsten)               0.0012     0.0182
log(Uranium)                0.0802     0.0428
BMI                         0.1590     0.3085
Waist circumference         0.3211     0.1475

permutation test of independence was used with 699 replicates. A p-value of 0.07 was obtained, so the null hypothesis of independence is not rejected at α = 0.01. Thus, this non-parametric method does not indicate a dependence relationship between heavy metals and obesity.

5.1.6 Linear regression

Using the results of both the canonical correlation analysis and the interbattery factor analysis, linear regressions were run in which each obesity indicator was regressed on selected toxin variables. The R² values for each regression are given in Table 5.26.
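For reference, the R² of such a regression can be computed as sketched below. This is an illustrative ordinary least-squares fit with NumPy on hypothetical synthetic data; the variable names, the effect size, and the use of logged predictors are assumptions for the example and are not taken from the thesis.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on the columns of X."""
    design = np.column_stack([np.ones(len(y)), X])  # add an intercept column
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    centered = y - y.mean()
    return 1.0 - (residuals @ residuals) / (centered @ centered)

# Hypothetical data: an obesity indicator weakly related to two logged toxins.
rng = np.random.default_rng(3)
n = 3000
log_toxins = rng.normal(size=(n, 2))
bmi = 27.0 + 0.6 * log_toxins[:, 0] + 0.3 * log_toxins[:, 1] + 6.0 * rng.normal(size=n)
r2 = r_squared(log_toxins, bmi)
print(r2)
```

With a large sample, a regression can be highly significant while still explaining only a small share of the variance in the response, so significance markers and R² values should be read together.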

5.2 TOXINS AND HEALTH OUTCOMES

In the second part of this study, we look at relationships between the presence of environmental toxins in participants' body fluids and the health outcomes of diabetes and cancer using principal component logistic regression. Principal component analysis was used on the pesticide, phenol, and phthalate data. It was desirable but impossible to conduct principal component analysis on all

Table 5.26: Linear regression results

Regression variables                                                                         R²

Pesticides
  BMI = f(2,4-dichlorophenol, 2,5-dichlorophenol)                                            0.021 ***
  waist circumference = f(2,4-dichlorophenol, 2,5-dichlorophenol)                            0.011 ***

Phenols
  waist circumference = f(parabens)                                                          0.030 ***
  BMI = f(BPA)                                                                               0.006 ***
  waist circumference = f(BPA)                                                               0.002 **

Polyfluorinated Compounds
  waist circumference = f(perfluorooctane sulfonic acid, perfluorohexane sulfonic acid)      0.003 **
  BMI = f(perfluoroundecanoic acid)                                                          0.010 ***
  waist circumference = f(perfluoroundecanoic acid)                                          0.007 ***

Phthalates
  BMI = f(mono-ethyl phthalate, mono-isobutyl phthalate)                                     0.022 ***
  waist circumference = f(mono-ethyl phthalate, mono-isobutyl phthalate)                     0.010 ***

Heavy metals
  waist circumference = f(cadmium, lead)                                                     0.010 ***
  BMI = f(thallium)                                                                          0.010 ***
  waist circumference = f(thallium)                                                          0.005 ***

Table 5.27: Cumulative variance accounted for by principal components

                    Comp.1  Comp.2  Comp.3  Comp.4  Comp.5  Comp.6  Comp.7  Comp.8
Prop. of Variance   0.2800  0.0939  0.0815  0.0662  0.0453  0.0432  0.0366  0.0341
Cumulative Prop.    0.2800  0.3739  0.4554  0.5216  0.5669  0.6101  0.6467  0.6808
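The cumulative row of Table 5.27 is simply the running sum of the proportions of variance. The short check below uses the proportions reported in the table; the variable names are illustrative only.

```python
from itertools import accumulate

# Proportions of variance for the first 8 principal components (Table 5.27).
proportions = [0.2800, 0.0939, 0.0815, 0.0662, 0.0453, 0.0432, 0.0366, 0.0341]

# Running sum, rounded to the four decimals used in the table.
cumulative = [round(total, 4) for total in accumulate(proportions)]
print(cumulative)

# Share of the total variance retained by the first six components.
first_six = cumulative[5]
print(first_six)  # 0.6101
```

The first six components therefore retain about 61% of the total variance in the pesticide, phenol, and phthalate data.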

the toxins combined, as there was no overlap between the subjects who had pesticides, phenols, and phthalates measured and those who had either heavy metals or polyfluorinated compounds measured. A plot of the correlations between the toxins is given in Figure 5.1. The cumulative variance accounted for by the first 8 principal components is given in Table 5.27, and a plot of the proportions of variance accounted for by the first 10 principal components is given in Figure 5.2.

Since there is no sudden drop-off in the proportion of variance accounted for by each component, the number of components to include in the model cannot be determined from the individual proportions of variance. However, since between 5 and 10 principal components are typically used in a principal component regression, and since the proportions of variance accounted for are so small for components after the first component, we select the first 6 components, which together account for about 61% of the total variance (Table 5.27), to include in the logistic regression. The loadings for the first 6 principal components are shown in Table 5.28. It appears that the first two components have a clear interpretation. The first component appears to be an average of all toxins, while the second principal

Figure 5.1: Correlations between pesticides, phenols, and phthalates


Figure 5.2: Variance plot for principal components

Table 5.28: Principal component loadings

Pesticides: 2,5-dichlorophenol; O-Phenyl phenol; 2,4-dichlorophenol; 2,4,5-trichlorophenol; 2,4,6-trichlorophenol

Phenols: Urinary 4-tert-octylphenol; Urinary benzophenone-3; Urinary bisphenol A; Urinary triclosan; Butyl paraben; Ethyl paraben; Methyl paraben; Propyl paraben

Phthalates: Mono(carboxynonyl) phthalate; Mono(carboxyoctyl) phthalate; Mono-2-ethyl-5-carboxypentyl phthalate; Mono-n-butyl phthalate; Mono-(3-carboxypropyl) phthalate; Mono-cyclohexyl phthalate; Mono-ethyl phthalate; Mono-(2-ethyl-5-hydroxyhexyl) phthalate; Mono-(2-ethyl)-hexyl phthalate; Mono-n-methyl phthalate; Mono-isononyl phthalate; Mono-(2-ethyl-5-oxohexyl) phthalate; Mono-n-octyl phthalate; Mono-benzyl phthalate; Mono-isobutyl phthalate

Component 1 loadings 0.156

Component 2 loadings

Component 3 loadings 0.166 -0.438 0.143

Component 4 loadings -0.447 -0.109 -0.454 -0.266 -0.216

-0.227

-0.125

0.137

0.175 0.117 0.130 0.107 0.217 0.108

0.130 0.167 0.133

0.180 0.297 0.251 0.134 0.147 0.302

-0.151 0.172 0.157 0.128 0.169

0.181 0.173 0.139 -0.203 -0.306


0.167 0.287 0.180 -0.110

-0.176 0.467

0.113 -0.140 0.536

0.247 0.271

Component 6 loadings 0.250 0.224 0.227 -0.334 -0.265 -0.115

-0.168 0.266

-0.102 -0.399 -0.392 -0.454 -0.460

0.102 0.149 0.128 0.219 0.228 0.294 0.274 0.265

Component 5 loadings 0.350 -0.101 0.378 -0.163 -0.168

0.110 -0.140 0.184 0.244 -0.142 0.226 0.170 0.143 -0.129 -0.142

-0.241 0.150

-0.102 -0.113 0.439 -0.106 -0.274 -0.209

-0.255 -0.197 0.217 -0.243 -0.235 0.248 0.234 -0.248 -0.264 0.245 -0.215

Figure 5.3: Diabetes logistic regression results, with and without controlling for age

component can be interpreted as phenols versus phthalates. However, after the second component, the interpretation becomes unclear.

Logistic regression was performed using the first six principal components alone and then again controlling for age. The first health outcome looked at was diabetes. The results of the logistic regression are given in Figure 5.3. We can see from the results that the second and fourth principal components were statistically significant in both models and that age was found to be a significant variable. The second principal component appeared to represent phenols versus phthalates, indicating that an absence of phenols (including all four parabens) paired with the presence of phthalates may be related to the presence of diabetes. This echoes the previous result involving phenols, in which higher paraben levels were associated with smaller waist circumference. The fourth component is harder to interpret, but features negative loadings for all of the pesticides, positive loadings for four of the phenols, and a mix of positive and negative loadings for the phthalates.

Next, logistic regression was performed on cancer using the first six principal components alone and then again controlling for age. The results are given in Figure 5.4. In this analysis, the only significant variable was age. Thus, the compounds we looked at do not appear to have a strong association with cancer.
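The two-step procedure used here, principal component scores followed by logistic regression with and without age, can be sketched as follows. This is a minimal NumPy illustration on hypothetical synthetic data, not the analysis code behind Figures 5.3 and 5.4. It assumes that the components are extracted from standardized columns and fits the logistic model by Newton-Raphson; the helper names `pc_scores` and `fit_logistic` are invented for this example.

```python
import numpy as np

def pc_scores(X, n_components):
    """Scores on the leading principal components of the standardized columns of X."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Rows of Vt are the principal component loading vectors.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[:n_components].T

def fit_logistic(predictors, y, iterations=25):
    """Maximum-likelihood logistic regression by Newton-Raphson.

    Returns the coefficient vector [intercept, slopes...].
    """
    design = np.column_stack([np.ones(len(y)), predictors])
    beta = np.zeros(design.shape[1])
    for _ in range(iterations):
        p = 1.0 / (1.0 + np.exp(-(design @ beta)))
        weights = p * (1.0 - p)
        gradient = design.T @ (y - p)
        hessian = design.T @ (design * weights[:, None])
        beta = beta + np.linalg.solve(hessian, gradient)
    return beta

# Hypothetical data: 28 correlated "toxins", age, and a binary outcome that
# depends on age only.
rng = np.random.default_rng(1)
n = 1000
shared = rng.normal(size=(n, 1))
log_toxins = 0.5 * shared + rng.normal(size=(n, 28))
age = rng.uniform(18, 80, size=n)
prob = 1.0 / (1.0 + np.exp(-(-4.0 + 0.06 * age)))
outcome = (rng.random(n) < prob).astype(float)

scores = pc_scores(log_toxins, 6)
beta_pcs = fit_logistic(scores, outcome)                          # components alone
beta_age = fit_logistic(np.column_stack([scores, age]), outcome)  # controlling for age
print(beta_age[-1])  # estimated age coefficient
```

Significance tests for individual coefficients, such as Wald tests, would additionally require standard errors, which can be taken from the inverse of the final Hessian; they are omitted here to keep the sketch short.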


Figure 5.4: Cancer logistic regression results, with and without controlling for age

5.3 WEIGHT LOSS AND PERSISTENT ORGANIC POLLUTANTS

In the third portion of this study, we looked at the relationship between weight loss and persistent organic pollutants (POPs). Self-reported weight change data for both the past year and the past ten years were used. Initially, the data were split into participants who had lost weight over the past 1 year or 10 years and participants whose weight had not changed or had increased over the same period. POP concentrations were log-transformed and outliers were removed. For each POP, the correlation between pollutant concentration and weight lost or gained was computed within each group, and the resulting correlations were compared. The results are given in Table 5.29.

The correlations for those who lost weight and those who gained weight differed significantly for the second and the fourth through the sixth POP (trans-nonachlor, β-HCH, PCB 180, and PCB 169), but only for weight change over 10 years. The differences between those who gained or lost weight over the last year were not significant. In all four significant cases, among those who lost weight over the past ten years, the concentration of pollutants in their system was higher the more weight they had lost. However, those who gained weight or had no weight change had less of the four POPs in their system the more weight they had gained over the last ten years. This agrees with Lim et al., who found stronger differences between weight loss categories over 10 years than over 1 year [40].

Table 5.29: Correlations between persistent organic pollutants and weight loss/gain

Period     POP                    Correlation for loss   Correlation for gain   p-value
1 year     1,2,3,4,6,7,8-HpCDF     0.0901                -0.0465                0.09
1 year     trans-nonachlor        -0.0315                -0.0734                0.6
1 year     p,p'-DDE                0.0593                 0.0068                0.51
1 year     β-HCH                   0.0210                -0.0588                0.32
1 year     PCB 180                -0.0947                -0.1898                0.22
1 year     PCB 169                -0.1125                -0.1899                0.32
1 year     1,2,3,4,6,7,8-HpCDD     0.0922                -0.0871                0.02
10 years   1,2,3,4,6,7,8-HpCDF     0.1331                 0.0328                0.23
10 years   trans-nonachlor         0.3443                -0.0927                8.9 x 10^-8 **
10 years   p,p'-DDE                0.2409                -0.0185                0.0018
10 years   β-HCH                   0.2855                -0.0296                0.00013 **
10 years   PCB 180                 0.0522                -0.2707                9.5 x 10^-5 **
10 years   PCB 169                -0.0036                -0.2924                0.00043 **
10 years   1,2,3,4,6,7,8-HpCDD     0.1475                -0.0363                0.03

** p ≤ 0.0014 (α = 0.01 significance level with Bonferroni correction)

Canonical correlation analysis was then conducted on these data. Initial correlations are given in Table 5.30. There is a noticeable correlation between PCB 180 and PCB 169. These two compounds are congeners, and the main way that the U.S. population is exposed to them is through food, as these chemicals accumulate in the fatty tissues of animals [41]. There were also noticeable correlations between PCB 180 and trans-nonachlor and between p,p'-DDE and β-HCH. Both trans-nonachlor and various PCBs are found in lakes and waterways such as the Great Lakes, and thus occur at significant levels in predatory fish from these bodies of water [42]. Consumption of contaminated fish might therefore be a cause of this correlation. The source of the correlation between p,p'-DDE and β-HCH is not apparent. There were no noticeably high correlations between weight change over 10 years and weight change over the last year, nor between the weight change variables and the POPs.

Canonical correlation analysis was performed to determine the cumulative relationship between weight change and POPs. Canonical variable coefficients and correlation loadings were obtained. From Table 5.31, two factors were obtained and were found to be significant using a likelihood ratio test. The first canonical correlation is moderate in size, while the second is small. Loadings were orthogonally rotated to improve interpretability (Table 5.32), and large correlation loadings are presented in bold type. The first POP canonical factor is highly correlated with the two PCBs and trans-nonachlor, and the first weight change factor is highly correlated with weight change over the past 10 years. The first canonical correlation from Table 5.31 is 0.348, indicating a moderate correlation between the levels of PCB 180, PCB 169, and trans-nonachlor and weight change over the past 10 years.
The second POP canonical factor is highly correlated with 1,2,3,4,6,7,8-heptachlorodibenzofuran (1,2,3,4,6,7,8-HpCDF), and the second weight change factor is highly correlated with weight change over the past year. The second canonical correlation from Table 5.31 is 0.147, indicating only a small correlation between 1,2,3,4,6,7,8-HpCDF and weight change over the past year. From the canonical redundancy analysis (Table 5.33), there is very little relationship between the two domains of POPs and weight change: the weight change factors explain only about 4.2% of the variance in the POP factors, and the POP factors explain 9.4% of the variance in the weight change factors.

Distance correlation, a non-parametric method that is sensitive to all types of dependence structures, was also used with the original (un-logged) data. A permutation test of independence was used with 699 replicates. A p-value of 0.001 was obtained, so the null hypothesis of independence is rejected at α = 0.01. Thus, this non-parametric method does indicate a dependence relationship between weight change and the levels of persistent organic pollutants in individuals' body fluids.
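The permutation test of independence based on distance correlation can be sketched as follows. This is an illustrative pure-Python version run on synthetic data, not the R code used in this study; the data, the seed, and the number of replicates in the example are assumptions chosen only to keep the example small. The p-value uses the common convention (1 + number of permuted statistics at least as large as the observed one) / (1 + number of replicates).

```python
import math
import random


def centered_distances(rows):
    """Double-centered matrix of pairwise Euclidean distances."""
    n = len(rows)
    dist = [[math.dist(rows[i], rows[j]) for j in range(n)] for i in range(n)]
    row_mean = [sum(r) / n for r in dist]
    grand_mean = sum(row_mean) / n
    # The distance matrix is symmetric, so column means equal row means.
    return [[dist[i][j] - row_mean[i] - row_mean[j] + grand_mean
             for j in range(n)] for i in range(n)]


def mean_product(a, b):
    """Mean of the elementwise product of two square matrices."""
    n = len(a)
    return sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / (n * n)


def distance_correlation(a, b):
    """Sample distance correlation from two centered distance matrices."""
    denom = math.sqrt(mean_product(a, a) * mean_product(b, b))
    if denom == 0.0:
        return 0.0
    return math.sqrt(max(mean_product(a, b), 0.0) / denom)


def permutation_test(x_rows, y_rows, replicates, seed=0):
    """Return (observed distance correlation, permutation p-value)."""
    rng = random.Random(seed)
    a = centered_distances(x_rows)
    observed = distance_correlation(a, centered_distances(y_rows))
    shuffled = list(y_rows)
    exceed = 0
    for _ in range(replicates):
        rng.shuffle(shuffled)  # break any pairing between x and y
        if distance_correlation(a, centered_distances(shuffled)) >= observed:
            exceed += 1
    return observed, (1 + exceed) / (replicates + 1)


# Synthetic data with a purely nonlinear (quadratic) relationship.
x = [(i / 10.0,) for i in range(-20, 21)]
y = [(v[0] ** 2,) for v in x]
observed, p_value = permutation_test(x, y, replicates=199)
print(observed, p_value)
```

Under this convention, the smallest p-value attainable with 699 replicates is 1/700, or about 0.0014. Each row is a tuple so that the same functions also accept multivariate observations, such as a vector of POP concentrations paired with a vector of weight change variables.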

Table 5.30: Correlations between persistent organic pollutants and weight change

Table 5.31: Canonical correlation results for persistent organic pollutants and weight change


Table 5.32: Rotated Correlation Loadings for persistent organic pollutants and weight change

Rotated Correlation Loadings for log(persistent organic pollutants)
                                    logpop1    logpop2
log(1,2,3,4,6,7,8-HpCDF)             0.0508     0.8069
log(trans-nonachlor)                 0.7085     0.2466
log(p,p'-DDE)                        0.4083     0.2754
log(β-HCH)                           0.4580     0.2104
log(PCB 180)                         0.8713     0.2658
log(PCB 169)                         0.8871    -0.0341
log(1,2,3,4,6,7,8-HpCDD)             0.2702     0.4701

Rotated Correlation Loadings for weight change
                                    change1    change2
Weight change over past year         0.0952     0.9955
Weight change over past 10 years     0.9642     0.2652

Table 5.33: Canonical redundancy analysis for persistent organic pollutants and weight change
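For readers unfamiliar with canonical redundancy, the following Python sketch shows one standard definition (the Stewart-Love index): for each canonical variate, the mean squared loading of a set's variables is multiplied by the squared canonical correlation, and the products are summed. The loadings in the example are hypothetical; it is an assumption, not something stated in this thesis, that Table 5.33 was computed in exactly this way. Only the canonical correlations 0.348 and 0.147 are taken from Table 5.31.

```python
def redundancy(loadings_by_variate, canonical_correlations):
    """Stewart-Love redundancy of one variable set given the opposite set.

    loadings_by_variate[k] holds the correlations between the set's
    variables and its own k-th canonical variate.
    """
    total = 0.0
    for loadings, r_c in zip(loadings_by_variate, canonical_correlations):
        mean_squared_loading = sum(w * w for w in loadings) / len(loadings)
        total += mean_squared_loading * r_c * r_c
    return total


# Hypothetical example: two variables, two canonical variates.
example = redundancy([[0.6, 0.8], [0.8, -0.6]], [0.5, 0.2])

# Each mean squared loading is at most 1, so the total redundancy cannot
# exceed the sum of the squared canonical correlations.
upper_bound = sum(r * r for r in [0.348, 0.147])
print(example, upper_bound)
```

The bound computed from the reported canonical correlations is roughly 14%, so redundancy values of 4.2% and 9.4% are within the range this definition allows.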


5.3.1 Linear Regression

Using the results of the canonical correlation analysis, multivariate linear regressions were run. Weight change variables were regressed on selected persistent organic pollutant variables. The R² values for each regression are given in Table 5.34.

Table 5.34: Linear regression results

Linear regression variables                                        R²
10 yr weight change = f(trans-nonachlor, PCB 180, PCB 169)         0.103 ***
1 yr weight change = f(1,2,3,4,6,7,8-heptachlorodibenzofuran)      0.007 *


CHAPTER 6 DISCUSSION

6.1 PART I

Here we focus on the results for the interbattery factor analysis, as the assumption of a common causal factor seems more reasonable than a causal relationship between toxins and obesity. All distance correlation results were significant except for heavy metals. Throughout the results, it is interesting to note that, although both waist circumference and BMI are taken to be indicators of obesity and are highly correlated with each other, often only one or the other was highlighted in association with a given group of toxins. This may be because the fat cells in the thighs and other parts of the body differ from the fat cells around the waist [43]. Thus, these toxins may be affecting fat cells in the waist differently from fat in the rest of the body, so that only waist circumference or only BMI, but not both, surfaces in many of the analyses.

6.1.1 Environmental pesticides and obesity

Both canonical correlation analysis and interbattery factor analysis highlighted 2,4-dichlorophenol and 2,5-dichlorophenol as being positively related to the obesity indicators. In the interbattery factor analysis, the results indicated that there is a common factor which is causing changes in 2,4-dichlorophenol concentrations, 2,5-dichlorophenol concentrations, BMI, and waist circumference. The previous study using NHANES 1999-2000 data also found a positive relationship between 2,4-dichlorophenol and BMI [6]. Thus, this result supports prior research.


6.1.2 Environmental phenols and obesity

Both methods found the largest relationship to be between parabens and waist circumference and the second-largest relationship to be between BPA, BMI, and waist circumference. Parabens are inversely related to waist circumference, which appears to imply a protective effect of exposure on waist circumference and hence obesity. BPA, on the other hand, was found to be positively related to both obesity indicators. The BPA result supports a growing body of research indicating that higher urinary BPA concentrations are associated with obesity [44].

6.1.3 Polyfluorinated compounds and obesity

The results of the canonical correlation analysis indicated a moderate positive correlation between perfluorooctane sulfonic acid, perfluorohexane sulfonic acid, and waist circumference. Also, a smaller inverse relationship was seen between perfluoroundecanoic acid and the two obesity indicators. The results of the factor analysis, however, were not indicative of any relationship between polyfluorinated compounds and obesity. The first factor simply indicated a common cause for three of the compounds and the second factor indicated a common cause for the two obesity variables. Thus, no new information was revealed in the factor analysis.

6.1.4 Urinary phthalates and obesity

The results of canonical correlation analysis indicated a positive relationship between mono-ethyl phthalate, mono-isobutyl phthalate, BMI, and waist circumference. Factor analysis indicated a common factor influencing the levels of mono-ethyl phthalate, BMI, and waist circumference. This supports previous research, which found a positive relationship between mono-ethyl phthalate and BMI for adults [15]. The results regarding mono-isobutyl phthalate support past findings that mono-isobutyl phthalate has a positive relationship with waist circumference [12]. As with a majority of the other compound groups, distance correlation indicated a non-linear dependence relationship between the urinary phthalates and obesity. Non-linear relationships between waist circumference and phthalates have also been seen in other research [12].


6.1.5 Heavy metals and obesity

Similar results were obtained with both canonical correlation analysis and factor analysis. A positive relationship between cadmium, lead, and waist circumference was found, along with a weaker positive relationship between thallium and BMI. Therefore, there appears to be a common factor affecting urinary cadmium and lead levels and waist circumference. This result differs from previous research using 1999-2002 NHANES data, which indicated an inverse relationship between cadmium/lead and BMI/waist circumference [16]. In addition, the results indicate a common factor influencing thallium levels and BMI.

6.1.6 Linear regression

Although the R² values obtained from the linear regressions do not look strong at first glance (ideally, values would be close to 1), they are all statistically significant. Many other factors are involved in the development of obesity, the sample sizes are large, and the effects of these toxins on both BMI and waist circumference are small, so R² values ranging from 0.2% to 3% are reasonable.
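The interplay between a small R² and a large sample can be illustrated with the overall F statistic of a linear regression, F = (R²/k) / ((1 - R²)/(n - k - 1)), where k is the number of predictors and n is the number of observations. The Python sketch below uses an R² of 0.2%, the lower end of the range reported above; the sample sizes and the single predictor are hypothetical and are not the values from this study.

```python
def f_statistic(r_squared, n_obs, n_predictors):
    """Overall F statistic of a linear regression computed from its R^2."""
    df_resid = n_obs - n_predictors - 1
    if df_resid <= 0:
        raise ValueError("need more observations than parameters")
    return (r_squared / n_predictors) / ((1.0 - r_squared) / df_resid)


# Same R^2, two hypothetical sample sizes, one hypothetical predictor.
f_small_sample = f_statistic(0.002, n_obs=100, n_predictors=1)
f_large_sample = f_statistic(0.002, n_obs=5000, n_predictors=1)
print(f_small_sample, f_large_sample)
```

With one predictor and a large residual degrees of freedom, the 5% critical value of F is approximately 3.84. Under these assumed sample sizes, the same R² falls far below that value at n = 100 and well above it at n = 5000, which is why very small R² values can still be statistically significant in a survey of this size.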

6.2 PART II

6.2.1 Pesticides, phenols, phthalates, and chronic illness

From the principal component analysis, we see that the first two components (which account for the two highest proportions of variance) appear to represent, respectively, an average of all the toxins and a contrast of phenols versus phthalates. In the logistic regression involving diabetes, the second principal component (absence of phenols and presence of phthalates) and the fourth component were significant. This result supports previous research, which found a positive correlation between several phthalate metabolites and insulin resistance in adult human males [14]. Of particular interest is the reappearance of an inverse relationship, and thus an implied protective effect of paraben exposure, this time with diabetes. In the logistic regression involving cancer, age was the only significant variable, indicating that there was no significant association between the toxins examined and cancer.

6.3 PART III

6.3.1 Persistent organic pollutants and weight loss

In both the correlation and canonical correlation analyses, only the analyses involving weight change over the past 10 years (long-term weight change) yielded large or significant correlations. The greater strength of associations between POP levels and long-term (10 year) weight change, as opposed to short-term (1 year) weight change, was also seen by Lim et al. [40]. We also see that increasing weight loss is accompanied by a significant increase in the serum levels of four of the seven persistent organic pollutants. On the other hand, the more weight individuals gained over the past ten years, the lower the concentrations of persistent organic pollutants found in their body fluids. This supports the findings of several studies [22] [40]. This trend may be contributing in some way to the obesity paradox: by storing these toxins away from the organs, fat may be contributing to longevity.

From the canonical correlation analysis, we see that the two PCBs and trans-nonachlor correlated with the 10 year weight change and that 1,2,3,4,6,7,8-HpCDF correlated with 1 year weight change, although the latter relationship was small. The correlation between PCBs and 10 year weight change supports previous findings [40]. The distance correlation method indicates that there may be a nonlinear relationship between weight change and the levels of persistent organic pollutants in serum.

Although the R² values obtained from the linear regressions do not look strong at first glance, they are all statistically significant. Many more factors are involved in both weight loss and the presence of pollutants in the blood than we explored in this portion of the study. Thus, R² values of 0.7% and 10.3% are reasonable.
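The comparison of the loss-group and gain-group correlations can be illustrated with a Fisher z-test for two independent correlations, together with the Bonferroni threshold used in Table 5.29. This Python sketch is an assumption about how such a comparison could be made; the test actually used in this study is not restated here, and the group sizes below are hypothetical, so the resulting p-value is not expected to reproduce the value in Table 5.29. Only the two correlations (trans-nonachlor, 10 years) and the count of seven POPs are taken from the table.

```python
import math


def fisher_z_compare(r1, n1, r2, n2):
    """Two-sided z-test for the difference of two independent correlations."""
    z1 = math.atanh(r1)  # Fisher transformation of each correlation
    z2 = math.atanh(r2)
    std_err = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return z, p_value


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level under a Bonferroni correction."""
    return alpha / n_tests


# Correlations from Table 5.29 (trans-nonachlor, 10 years);
# the group sizes 300 and 500 are hypothetical.
z, p_value = fisher_z_compare(0.3443, 300, -0.0927, 500)

# Seven POPs tested at an overall alpha of 0.01.
threshold = bonferroni_threshold(0.01, 7)
print(z, p_value, threshold)
```

The threshold 0.01/7 rounds to 0.0014, matching the cutoff in the footnote of Table 5.29.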


CHAPTER 7 CONCLUSION AND FUTURE WORK

In this study, we have found moderate associations between various toxins and obesity, between various toxins and diabetes (though not cancer), and between long-term weight change and persistent organic pollutants. In particular, we saw positive relationships between obesity indicators and the following toxins: 2,4-dichlorophenol, 2,5-dichlorophenol, bisphenol A, perfluorooctane sulfonic acid, perfluorohexane sulfonic acid, mono-isobutyl phthalate, cadmium, lead, and thallium. Inverse relationships were seen between obesity indicators and parabens and also between diabetes and parabens. This suggests that paraben exposure may be protective against both obesity and diabetes. In addition, we have found a number of results that support previous findings regarding associations between these factors.

A major drawback of this work is that, since it is purely observational, the direction of causality cannot be determined. In addition, we cannot be sure that we are measuring what we mean to measure, as BMI and waist circumference are not perfect predictors of obesity. We have merely been able to explore some of the relationships between toxins and obesity, weight loss, cancer, and diabetes. However, the relationships hinted at in this work may be helpful in directing continued biological and epidemiological research. An additional drawback of this study is that the only weight loss, cancer, and diabetes data available were obtained by self-report. Thus, the quality of these data depends entirely on the quality of individuals' knowledge about their past and current health.

As discussed previously, the causes of obesity are complex. An emerging hypothesis is that exposure to toxins, especially endocrine disruptors, in the fetal stages of development can contribute to obesity later in life [4]. Fetuses and newborns are much more vulnerable to such chemical attacks and are sensitive to chemicals that behave like hormones. Studies have shown that brief exposure to environmental endocrine-disrupting compounds increases body weight in mice as they age [4]. Neonatal male rats exposed to the organophosphate pesticide parathion exhibited disrupted fat homeostasis later in life [45]. In addition, there are data supporting the hypothesis that exposure to chemicals during development may increase the risk of obesity by altering the differentiation of adipocytes and the neural circuits that regulate feeding behavior [46]. Thus, focusing on levels of toxins in pregnant women and the obesity outcomes of their children later in life would be a useful direction for future work.

In addition, based on current research involving the obesity paradox, it is clear that BMI alone cannot always predict health in an individual. Some individuals can have a normal BMI but be metabolically unhealthy, while others may have a high BMI but be metabolically healthy and have reduced mortality [47]. Thus, it may be helpful to determine better measures of metabolic health than BMI for use in future analyses.

Adding more contributing factors, such as socioeconomic status and cardiovascular health, to the analysis would also be a relevant follow-up to this study. As discussed previously, many factors contribute to obesity, and including more of them in the analysis would lead to a richer picture of the relationships between obesity and health and lifestyle factors. Additionally, the analysis in this study was restricted to adults; it would be valuable to look at relationships between these variables in children to identify any similarities and differences. Finally, adding gender and race to the analysis to look for any gender or racial dependence would be an informative future focus.


APPENDIX A SAS CODE

A.1 Canonical correlation analysis with obesity and toxins

A.2 Canonical correlation analysis with weight loss and POP concentration

APPENDIX B R CODE

B.1 Distance correlation

B.2 PCA and logistic regression

B.3 Linear regression

B.4 Correlations between POPs and weight loss/gain

REFERENCES

[1] CDC, "Overweight and Obesity," 2012. [Online]. Available: http://www.cdc.gov/obesity/data/facts.html.
[2] Y. Wang, M. A. Beydoun, L. Liang, B. Caballero and S. K. Kumanyika, "Will All Americans Become Overweight or Obese? Estimating the Progression and Cost of the US Obesity Epidemic," Obesity, vol. 16, no. 10, pp. 2323-2330, 2008.
[3] National Institutes of Health, "What are the Health Risks of Overweight and Obesity?," 2012. [Online]. Available: http://www.nhlbi.nih.gov/health/health-topics/topics/obe/risks.html.
[4] R. R. Newbold, E. Padilla-Banks, R. J. Snyder, T. M. Phillips and W. N. Jefferson, "Developmental Exposure to Endocrine Disruptors and the Obesity Epidemic," Reproductive Toxicology, vol. 23, no. 3, pp. 290-296, 2007.
[5] J. J. Heindel, "Endocrine Disruptors and the Obesity Epidemic," Toxicological Sciences, vol. 76, pp. 247-249, 2003.
[6] S. S. Vaddadhi, Effects of Xenobiotic Body Burden on Human Health, Master's Thesis, Socorro, New Mexico: New Mexico Institute of Mining and Technology, 2005.
[7] D.-H. Lee, M. W. Steffes, A. Sjodin, R. S. Jones, L. L. Needham and D. R. Jacobs, "Low Dose Organochlorine Pesticides and Polychlorinated Biphenyls Predict Obesity, Dyslipidemia, and Insulin Resistance among People Free of Diabetes," PLoS ONE, vol. 6, no. 1, 2011.
[8] C. Twum and Y. Wei, "The association between urinary concentrations of dichlorophenol pesticides and obesity in children," Reviews on Environmental Health, vol. 26, no. 3, pp. 215-219, 2011.
[9] CDC, "Environmental Pesticides (2,4-dichlorophenol, 2,5-dichlorophenol, ortho-phenylphenol, 2,4,5-trichlorophenol, and 2,4,6-trichlorophenol)," March 2010. [Online]. Available: http://www.cdc.gov/nchs/nhanes/nhanes2003-2004/L24PP_C.htm.
[10] J. Barlow and J. A. P. Johnson, "Fact Sheet on Phenols," Breast Cancer and the Environment Research Centers, 7 November 2007. [Online]. Available: http://www.bcerc.org/COTCpubs/BCERC.FactSheet_Phenols.pdf.
[11] F. Vom Saal, S. Nagel, B. Coe, B. Angle and J. Taylor, "The estrogenic endocrine disrupting chemical bisphenol A (BPA) and obesity," Molecular and Cellular Endocrinology, vol. 354, no. 1-2, pp. 74-84, 2012.
[12] P. M. Lind, V. Roos, M. Ronn, L. Johansson, H. Ahlstrom, J. Kullberg and L. Lind, "Serum concentrations of phthalate metabolites are related to abdominal fat distribution two years later in elderly women," Environmental Health, vol. 11, no. 21, 2012.
[13] Endocrine Society, "Phthalate, environmental chemical is linked to higher rates of childhood obesity," ScienceDaily, 26 June 2012. [Online]. Available: http://www.sciencedaily.com/releases/2012/06/120626113915.htm.
[14] R. Stahlhut, E. van Wijngaarden, T. Dye, S. Cook and S. Swan, "Concentrations of Urinary Phthalate Metabolites are Associated with Increased Waist Circumference and Insulin Resistance in Adult US Males," Environmental Health Perspectives, vol. 115, pp. 876-882, 2007.
[15] E. E. Hatch, J. W. Nelson, M. M. Qureshi, J. Weinberg, L. L. Moore, M. Singer and T. F. Webster, "Association of urinary phthalate metabolite concentrations with body mass index and waist circumference: a cross-sectional study of NHANES data, 1999-2002," Environmental Health, vol. 7, no. 27, 2008.
[16] M. A. Padilla, M. Elobeid, D. M. Ruden and D. B. Allison, "An Examination of the Association of Selected Toxic Metals with Total and Central Obesity Indices: NHANES 99-02," International Journal of Environmental Research and Public Health, vol. 7, pp. 3332-3347, 2010.
[17] CDC, "Diabetes," 2011. [Online]. Available: http://www.cdc.gov/chronicdisease/resources/publications/AAG/ddt.htm.
[18] M. P. Montgomery, F. Kamel, T. M. Saldana, M. C. R. Alavanja and D. P. Sandler, "Incident Diabetes and Pesticide Exposure among Licensed Pesticide Applicators: Agricultural Health Study, 1993-2003," American Journal of Epidemiology, vol. 167, no. 10, pp. 1235-1246, 2008.
[19] D.-H. Lee, P. M. Lind, D. R. Jacobs Jr., S. Salihovic, B. van Bavel and L. Lind, "Polychlorinated Biphenyls and Organochlorine Pesticides in Plasma Predict Development of Type 2 Diabetes in the Elderly," Diabetes Care, vol. 34, pp. 1778-1784, 2011.
[20] National Cancer Institute at the National Institutes of Health, "What is Cancer?," 8 February 2013. [Online]. Available: http://www.cancer.gov/cancertopics/cancerlibrary/what-is-cancer.
[21] O. Hue, J. Marcotte, F. Berrigan, M. Simoneau, J. Dore, P. Marceau, S. Marceau, A. Tremblay and N. Teasdale, "Increased Plasma Levels of Toxic Pollutants Accompanying Weight Loss Induced by Hypocaloric Diet or by Bariatric Surgery," Obesity Surgery, vol. 16, pp. 1145-1154, 2006.
[22] C. La Rocca, S. Alivernini, M. Badiali, A. Cornoldi, N. Iacovella, L. Silverstroni, G. Spera and L. Turrio-Baldassarri, "TEQs and body burden for PCDDs, PCDFs, and dioxin-like PCBs in the human adipose tissue," Chemosphere, vol. 73, pp. 92-96, 2008.
[23] J. Chevrier, E. Dewailly, P. Ayotte, P. Mauriege, J.-P. Despres and A. Tremblay, "Body weight loss increases plasma and adipose tissue concentrations of potentially toxic pollutants in obese individuals," International Journal of Obesity, vol. 24, pp. 1272-1278, 2000.
[24] V. Hughes, "The Big Fat Truth," Nature, pp. 428-430, 23 May 2013.
[25] UCLA: Statistical Consulting Group, "SAS Data Analysis Examples: Canonical Correlation Analysis," [Online]. Available: http://www.ats.ucla.edu/stat/sas/dae/canonical.htm. [Accessed October 2013].
[26] M. W. Browne, "An Overview of Analytic Rotation in Exploratory Factor Analysis," Multivariate Behavioral Research, vol. 36, no. 1, pp. 111-150, 2001.
[27] R. Maitra, "Factor Analysis - Introduction," May 2013. [Online]. Available: http://www.public.iastate.edu/~maitra/stat501/lectures/FactorAnalysis.pdf.
[28] G. Huba, M. Newcomb and P. Bentler, "Comparison of Canonical Correlation and Interbattery Factor Analysis on Sensation Seeking and Drug Use Domains," Applied Psychological Measurement, vol. 5, no. 3, pp. 291-306, 1981.
[29] M. W. Browne, "The maximum-likelihood solution in interbattery factor analysis," British Journal of Mathematical and Statistical Psychology, vol. 32, pp. 75-86, 1979.
[30] G. J. Boyle, "Re-examination of the major personality-type factors in the Cattell, Comrey, and Eysenck scales: Were the factor solutions by Noller et al. optimal?," Personality and Individual Differences, vol. 10, no. 12, pp. 1289-1299, 1989.
[31] G. P. de Bruin, "An Inter-battery Factor Analysis of the Comrey Personality Scales and the 16 Personality Factor Questionnaire," Journal of Industrial Psychology, vol. 26, no. 3, pp. 4-7, 2000.
[32] M. L. Rizzo and G. J. Szekely, "Package 'energy'," 2013. [Online]. Available: http://cran.r-project.org/web/packages/energy/energy.pdf.
[33] M. Clark, "Canonical Correlation," University of North Texas, 2009. [Online]. Available: http://www.unt.edu/rss/class/mike/6810/Cancorr.pdf.
[34] CDC, "Overview of NHANES Survey Design and Weights," [Online]. Available: http://www.cdc.gov/nchs/tutorials/environmental/orientation/sample_design/index.htm. [Accessed 2013].
[35] CDC, "National Health and Nutrition Examination Survey Data 2007-2008," U.S. Department of Health and Human Services, Centers for Disease Control and Prevention, [Online]. Available: http://wwwn.cdc.gov/nchs/nhanes/search/nhanes07_08.aspx. [Accessed 2013].
[36] CDC, "National Health and Nutrition Examination Survey Data 2009-2010," U.S. Department of Health and Human Services, Centers for Disease Control and Prevention, [Online]. Available: http://wwwn.cdc.gov/nchs/nhanes/search/nhanes09_10.aspx. [Accessed 2013].
[37] CDC, "National Health and Nutrition Examination Survey Data 2001-2002," U.S. Department of Health and Human Services, Centers for Disease Control and Prevention, [Online]. Available: http://wwwn.cdc.gov/nchs/nhanes/search/nhanes01_02.aspx. [Accessed 2013].
[38] CDC, "National Health and Nutrition Examination Survey Data 2003-2004," U.S. Department of Health and Human Services, Centers for Disease Control and Prevention, [Online]. Available: http://wwwn.cdc.gov/nchs/nhanes/search/nhanes03_04.aspx. [Accessed 2013].
[39] CDC, "Environmental Phenols (EPH_E)," October 2010. [Online]. Available: http://www.cdc.gov/nchs/nhanes/nhanes2007-2008/EPH_E.htm.
[40] J. Lim, H.-K. Son, S. Park, D. Jacobs Jr and D.-H. Lee, "Inverse associations between long-term weight change and serum concentrations of persistent organic pollutants," International Journal of Obesity, vol. 35, pp. 744-747, 2011.
[41] CDC, "Non-dioxin-like Polychlorinated Biphenyls," 2008. [Online]. Available: http://www.cdc.gov/nchs/nhanes/nhanes2003-2004/L28NPB_C.htm.
[42] H. B. McCarty, J. Schofield, K. Miller, R. N. Brent, P. Van Hoof and B. Eadie, "Results of the Lake Michigan Mass Balance Study: Polychlorinated Biphenyls and trans-Nonachlor Data Report," U.S. Environmental Protection Agency, Chicago, IL, 2004.
[43] K. Karastergiou, S. K. Fried, H. Xie, M.-J. Lee, A. Divoux, A. Rosencrantz, R. J. Chang and S. Smith, "Distinct developmental signatures of human abdominal and gluteal subcutaneous adipose tissue depots," Journal of Clinical Endocrinology & Metabolism, vol. 98, no. 1, 2012.
[44] J. L. Carwile and K. B. Michels, "Urinary bisphenol A and obesity: NHANES 2003-2006," Environmental Research, 2011.
[45] T. L. Lassiter, I. T. Ryde, E. A. MacKillop, K. K. Brown, E. D. Levin, F. J. Seidler and T. A. Slotkin, "Exposure of Neonatal Rats to Parathion Elicits Sex-Selective Reprogramming of Metabolism and Alters the Response to a High-Fat Diet in Adulthood," Environmental Health Perspectives, vol. 116, no. 11, pp. 1456-1462, 2008.
[46] K. Thayer, J. Heindel, J. Bucher and M. Gallo, "Role of Environmental Chemicals in Diabetes and Obesity: A National Toxicology Program Workshop Report," Environmental Health Perspectives, vol. 120, no. 6, pp. 779-789, 2012.
[47] R. S. Ahima and M. A. Lazar, "The Health Risk of Obesity - Better Metrics Imperative," Science, vol. 341, no. 856, pp. 856-858, 2013.
[48] U. Lorenzo-Seva and P. J. Ferrando, "FACTOR: A computer program to fit the exploratory factor analysis model," Behavior Research Methods, vol. 38, no. 1, pp. 88-91, 2006.
[49] UCLA Statistical Consulting Group, "SAS Annotated Output: Canonical Correlation Analysis," [Online]. Available: http://www.ats.ucla.edu/stat/sas/output/sas_CCA.htm. [Accessed September 2013].
[50] A. Mascarelli, "Growing up with Pesticides," Science, pp. 740-741, 16 August 2013.
[51] E. Stokstad and G. Grullon, "Pesticide Planet," Science, p. 730, 16 August 2013.
[52] J. Gomez-Ambrosi, C. Silva, V. Catalan, A. Rodriguez, J. C. Galofre, J. Escalada, V. Valenti, F. Rotellar, S. Romero, B. Ramirez, J. Salvador and G. Fruhbeck, "Clinical Usefulness of a New Equation for Estimating Body Fat," Diabetes Care, vol. 35, pp. 383-388, 2012.
[53] P. Deurenberg, J. A. Weststrate and J. C. Seidell, "Body mass index as a measure of body fatness: age- and sex-specific prediction formulas," British Journal of Nutrition, vol. 65, pp. 105-114, 1991.

ASSOCIATIONS BETWEEN ENVIRONMENTAL TOXINS, OBESITY, WEIGHT LOSS, AND CHRONIC DISEASE USING CANONICAL CORRELATION ANALYSIS, FACTOR ANALYSIS, AND PCA LOGISTIC REGRESSION by Margaret A. Snell

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the last page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and may require a fee.
