Assessment and Modelling of Groundwater Quality Data by Environmetric Methods in the Context of Public Health

Water Resour Manage DOI 10.1007/s11269-010-9605-0

Agelos Papaioannou · Eleni Dovriki · Nikolaos Rigas · Panagiotis Plageras · Ioannis Rigas · Maria Kokkora · Panagiotis Papastergiou

Received: 27 June 2009 / Accepted: 26 January 2010 © Springer Science+Business Media B.V. 2010

Abstract Various chemometric methods were used to analyze and model potable water quality data. Twenty water quality parameters were measured at 164 sites in three representative areas (lowland, semi-mountainous, and coastal) of the Thessaly region (Greece) over a 3-month period (September to November 2006). Hierarchical cluster analysis (CA) grouped the 164 sampling sites into two clusters (CA-group 1 and CA-group 2) based on similarities in potable water quality characteristics. Discriminant analysis correctly assigned about 94.5% of the cases grouped by CA. Factor analysis (FA) was applied to standardized log-transformed data sets to examine the differences between the two clusters and identify their latent factors. For CA-group 1 and CA-group 2, FA yielded six latent factors that explain 68.7% and 73.4% of the total variance, respectively. FA also identified the latent factors that characterize each cluster. The identification was obtained using (a) descriptive statistics, (b) t test for equality of cluster means, (c) box plots, (d) error bars, (e) factor score plots, (f) matrix scatter score means plot, and (g) scatter plot of the six significant latent factors from the factor set of the all samples group. The classification scheme obtained through cluster analysis was confirmed by discriminant analysis and explained by factor analysis.

Keywords Potable water quality · Cluster analysis · Factor analysis · Discriminant analysis · Groundwater pollution sources · Public health

A. Papaioannou (B) · P. Plageras
Clinical Chemistry–Biochemistry Section, Department of Medical Laboratories, Education & Technological Institute of Larissa, 41110, Larissa, Greece
e-mail: [email protected]

E. Dovriki · N. Rigas · I. Rigas
Informatics Section, Department of Animal Production, Education & Technological Institute of Larissa, 41110, Larissa, Greece

M. Kokkora
Technological Research Center of Thessaly, T.E.I. of Larissa, 41110, Larissa, Greece

P. Papastergiou
Department of Hygiene and Epidemiology, Medical Faculty, University of Thessaly, Larissa, Greece

1 Introduction

Water is essential to sustain life, and improving access to safe drinking water can result in tangible benefits to health (WHO 2004; EU Directive 1998/83/EC, 2000/60/EC, 2006/118/EU). At a very basic level, a minimum amount of water is required for daily consumption, and therefore access to some form of water is essential for survival. However, water has much broader influences on health and wellbeing, and issues such as the quantity and quality of the water supplied are important in determining the health of individuals and whole communities (Fattal et al. 1988; Medema et al. 2003; Shearer et al. 1972; Shuval and Gruener 1972).

Water demand is continuously increasing, mainly due to population growth and rising needs in agriculture, industry, and domestic services. Integrated water management has a strong impact on long-term protection and sustainability (Mendiguchia et al. 2004; Singh et al. 2005; Strebel et al. 1989; Thornburn et al. 2003; Zang and Rasmussen 2005; Zhou et al. 2007a, b, c). The water resources of Greece are considered to have the greatest potential among the Mediterranean countries. However, variations in the hydrological conditions, the rainfall intensity and frequency (Nastos et al. 2007; Nastos and Zerefos 2008), the population density, and the industrial activities affect water discharge rates in many parts of the country (González Vázquez et al. 2005; Koklu et al. 2009; Kotti et al. 2005; Lambrakis et al. 2004; Modarres 2009; Papaioannou et al. 2007; Papatheodorou et al. 2006; Simeonov et al. 2003; Zeng et al. 2008).

In recent decades, methods such as cluster analysis (CA), discriminant analysis (DA), and factor analysis (FA) have become accepted for identifying variations and sources of pollution in river water, groundwater, wastewater, marine sediments, seawater, and rainwater (Boyacioglu and Boyacioglu 2007; Brodnjak-Vončina et al. 2002; Hanrahan et al. 2008; Kowalkowski et al. 2006, 2007; Shrestha and Kazama 2007; Simeonov et al. 2002; Singh et al. 2004; Nastos and Zerefos 2009).

The aim of this work was to find the correlations between the sampling sites and the variables obtained by physical and chemical measurements, which can be used to construct a fast decision model for separating different water quality samples. These results could help official authorities optimize the potable water monitoring plan and enhance their pollution control actions.

2 Study Area and Sampling

Water quality monitoring data were collected between September and November 2006 as part of the potable water quality control program in the region of Thessaly, located north of Athens in central Greece (Fig. 1).

Fig. 1 Map of Thessaly (central Greece) with the studied areas, labelled LL (lowland), SM (semi-mountainous), and C (coastal); x-axis: longitude (21.5–23.5 degrees), y-axis: latitude (39.0–40.0 degrees)

The study area covers a large region of central Greece. Land uses in the region include agriculture, livestock farming, forestry, industry, urban and semi-urban growth, marine activities, road networks, disposal areas for treated effluent from major public sewage treatment works, reclamation sites, and tourism. Thus, the area is readily affected by crop cultivation, industrial activities, the presence of substantial forest cover, and urban and semi-urban activities.

3 Materials and Methods

3.1 Parameters and Analytical Methods

The potable water quality data set consisted of 20 water-quality parameters monitored from September to November 2006. Three areas (lowland (LL), semi-mountainous (SM), and coastal (C)) and 164 monitoring sites (Fig. 1) in the region of Thessaly were selected for the collection of water samples (groundwater samples from central water supplies after disinfection). The following parameters were determined according to the methods of European Directive 1998/83/EC and APHA (2005): physical (odor, taste, color, electric conductivity (EC, 25 °C, μS·cm⁻¹), pH, and total hardness (TH, mg·L⁻¹ CaCO₃)) and chemical (mg·L⁻¹) (chloride (Cl⁻), nitrate (NO₃⁻), nitrite (NO₂⁻), bicarbonate (HCO₃⁻), carbonate (CO₃²⁻), potassium (K), sodium (Na), ammonium (NH₄⁺), calcium (Ca), magnesium (Mg), zinc (Zn), iron (Fe), manganese (Mn), and lead (Pb)). However, only 17 physical and chemical parameters were analyzed further, since color, odor, and taste did not show any variability in the three studied areas.


3.2 Chemometric Methods

Chemometric methods, such as cluster analysis (CA), discriminant analysis (DA), and factor analysis (FA, using principal component analysis, PCA), were used (Massart and Kaufman 1983; Vanderginste et al. 1998).

CA is an unsupervised pattern detection method that partitions all cases into smaller groups or clusters of relatively similar cases that are dissimilar to other groups. We applied hierarchical CA on standardized log-transformed data using Ward's method, with squared Euclidean distances.

DA is a method for analyzing dependence and is a special case of canonical correlation. One of its objectives is to discriminate between two or more groups in terms of the means of the discriminating variables. DA was performed on log-transformed data, without affecting the results or the comparability with the other chemometric methods, and constructs a discriminant function for each group as follows:

$$f(G_i) = k_i + \sum_{j=1}^{n} w_{ij}\, p_{ij}$$

where $i$ denotes the group ($G$); $k_i$ is a constant inherent to each group; $n$ is the number of parameters used to classify a set of data into a given group; and $w_{ij}$ is the weight coefficient assigned by DA to a given parameter ($p_{ij}$). We performed DA on the data set in two different modes, standard and stepwise, to construct the best discriminant functions (DFs), confirm the clusters determined by means of CA, and evaluate the spatial variations in potable water quality. The monitoring sites (spatial) were the grouping variable, and the measured parameters were the independent variables.

PCA, as one method of factor extraction in FA, extracts eigenvalues and eigenvectors (a list of loadings) from the covariance matrix of the original variables. Varimax rotation of the resulting components produces new orthogonal variables, known as varifactors (VFs), which are linear combinations of the original variables. VFs provide information on the most meaningful parameters that describe a whole data set, allowing data reduction with minimum loss of original information. Moreover, VFs include unobservable, hypothetical, and latent variables.
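The discriminant function above can be illustrated with a minimal Python sketch. The group names, constants, weights, and sample values below are hypothetical and are not taken from this study; the rule of assigning a sample to the group with the highest score is the usual convention for classification functions and is assumed here.

```python
from typing import Dict, Sequence, Tuple


def discriminant_score(constant: float,
                       weights: Sequence[float],
                       values: Sequence[float]) -> float:
    """Evaluate f(G_i) = k_i + sum_j w_ij * p_j for one group."""
    if len(weights) != len(values):
        raise ValueError("weights and values must have the same length")
    return constant + sum(w * p for w, p in zip(weights, values))


def assign_group(functions: Dict[str, Tuple[float, Sequence[float]]],
                 values: Sequence[float]) -> Tuple[str, Dict[str, float]]:
    """Score a sample with every group's function and pick the highest score."""
    scores = {group: discriminant_score(k, w, values)
              for group, (k, w) in functions.items()}
    return max(scores, key=scores.get), scores


# Hypothetical constants k_i and weights w_ij for two groups and two parameters.
functions = {
    "group_1": (-10.0, [4.0, 1.0]),
    "group_2": (-4.0, [1.5, 2.0]),
}
sample = [3.0, 1.0]  # hypothetical (log-transformed) parameter values

best_group, scores = assign_group(functions, sample)
print(best_group, scores)
```

Each group has its own constant and weight vector, so adding a group or a parameter only changes the contents of `functions`, not the scoring logic.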

4 Results and Discussion

4.1 Data Treatment

Methods such as CA and FA require variables to conform to a normal distribution; thus, the normality of the distribution of each variable was checked using the Kolmogorov–Smirnov statistic, box plots, stem-and-leaf plots, histograms, normality plots, and spread-versus-level plots with Levene tests and transformations. The original data demonstrated that EC, pH, and Ca were almost normally distributed, whereas the other parameters (except Pb) were positively skewed with kurtosis coefficients significantly greater than zero. After log transformation of the parameters, skewness and kurtosis were significantly reduced in all cases (Fig. 2).

Fig. 2 Skewness and kurtosis coefficients of log-transformed (black down-pointing triangles) and original (white squares) data (x-axis: kurtosis; y-axis: skewness)
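The effect of the log transformation on skewness and kurtosis can be sketched as follows. The concentrations are hypothetical, and the moment-based (population) formulas are an assumption; statistical packages often apply small-sample corrections, so their coefficients may differ slightly.

```python
import math
from typing import Sequence, Tuple


def skewness_kurtosis(values: Sequence[float]) -> Tuple[float, float]:
    """Return moment-based skewness and excess kurtosis (population form)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


# Hypothetical, positively skewed concentrations (mg/L), not data from the study.
raw = [0.1, 0.2, 0.2, 0.3, 0.4, 0.5, 0.8, 1.5, 4.0, 12.0]
logged = [math.log10(v) for v in raw]

raw_skew, raw_kurt = skewness_kurtosis(raw)
log_skew, log_kurt = skewness_kurtosis(logged)
print(f"original:        skewness={raw_skew:.2f}, kurtosis={raw_kurt:.2f}")
print(f"log-transformed: skewness={log_skew:.2f}, kurtosis={log_kurt:.2f}")
```

For a long right tail such as this one, both coefficients move toward zero after the transformation, which is the behaviour summarized in Fig. 2.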

For CA and FA, all parameters were also z-scale standardized (mean = 0; variance = 1) to minimize the effects of differences in measurement units and variance and to render the data dimensionless. This autoscaling of individual variables is also called "column standardization": the mean of the column elements is subtracted from each element, and the result is divided by the column standard deviation. Consequently, each column had zero mean and unit variance.

4.2 Spatial Similarities and Grouping of Sample Sites (Cluster Analysis)

An initial exploratory approach involved the use of hierarchical CA on the log-transformed data set of 164 sampling sites. Spatial CA identified similar monitoring sites and, in this case, produced a dendrogram grouping the 164 sampling sites into two main groups. The sampling sites were classified in this way because the sites within each group had similar features and natural background and were affected by similar sources. The first CA-group (CA-group 1) consisted of samples from the LL-area (63.9%), the SM-area (5.2%), and the C-area (30.9%). The second CA-group (CA-group 2) consisted of samples from the LL-area (9.0%), the SM-area (35.8%), and the C-area (55.2%). Since the number of samples differed among the sample areas, the percentage of each CA-group within each area gives a clearer picture (Table 1).

Table 1 Percentages of CA-groups in each studied area

| Areas | CA-group 1 samples | CA-group 1 percentage | CA-group 2 samples | CA-group 2 percentage |
|---|---|---|---|---|
| LL-area | 62 | 91.2 | 6 | 8.8 |
| SM-area | 5 | 17.2 | 24 | 82.8 |
| C-area | 30 | 44.8 | 37 | 55.2 |

4.3 Statistical Screening of Water Quality Data

Three of the studied parameters (color, odor, and taste) did not present any variability among the studied areas, and therefore they were excluded from further statistical analysis. The mean values of these parameters were within the allowable limits set by the directive (EU Directive 1998/83/EC). Tables 2, 3, and 4 provide descriptive statistics of the 17 determined physical parameters, ions, and trace elements, respectively, per studied area and per CA-group.

EC, pH, and TH mean values were within the acceptable limits of the EU Directive (1998/83/EC). Mean EC values ranged from 690.7 μS·cm⁻¹ in CA-group 1 to 232.4 μS·cm⁻¹ in CA-group 2, with an overall mean for all sites of 503.5 μS·cm⁻¹. Mean pH values were between 7.15 and 7.73, showing the slightly alkaline character of the water, and the mean values of TH are characteristic of moderately hard water. Based on the results of cluster analysis, which grouped the studied sites into two clusters, the higher mean values of EC, pH, and TH occur in CA-group 1 and are similar to those of the LL-area.

The mean values of the studied ions were within the allowable limits set by the directive (EU Directive 1998/83/EC), except nitrite anions in the LL-area, with a mean concentration slightly higher than the corresponding allowable limit. The study of the trace elements (Table 4) reveals that Pb presents slightly high mean concentrations in the LL and SM areas. The other trace elements present mean concentrations below the allowable limits.
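The three descriptive statistics reported in Tables 2, 3, and 4 can be sketched in a few lines of Python. The sketch assumes that the standard error is the sample standard deviation divided by the square root of the number of sites; the helper names are hypothetical. The two checks use the EC standard deviations reported for the LL-area (68 sites, from Table 1) and for all 164 sites.

```python
import math
import statistics
from typing import Sequence, Tuple


def describe(values: Sequence[float]) -> Tuple[float, float, float]:
    """Return (mean, standard error, sample standard deviation)."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample SD, n - 1 in the denominator
    return mean, sd / math.sqrt(len(values)), sd


def standard_error(sd: float, n: int) -> float:
    """Standard error of the mean, assuming SE = SD / sqrt(n)."""
    return sd / math.sqrt(n)


# Consistency checks against reported EC values (SD in uS/cm, n = number of sites).
se_ec_ll = standard_error(261.2, 68)    # LL-area: 62 + 6 sites
se_ec_all = standard_error(325.8, 164)  # all sites
print(round(se_ec_ll, 1), round(se_ec_all, 1))

# Hypothetical raw measurements, only to show describe() on sample data.
mean, se, sd = describe([410.0, 520.0, 630.0, 705.0, 980.0])
print(round(mean, 1), round(se, 1), round(sd, 1))
```

Under this assumption the reported SE values for EC in the LL-area (31.7) and for all sites (25.4) are reproduced from the reported SD values.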

Table 2 Descriptive statistics (mean, standard error (SE), and standard deviation (SD)) of physical parameters by studied area and CA-group

| Parameter | Statistic | LL | SM | C | CA-group 1 | CA-group 2 | All sites |
|---|---|---|---|---|---|---|---|
| EC | Mean | 636.4 | 316.9 | 449.3 | 690.7 | 232.4 | 503.5 |
| | SE | 31.7 | 32.3 | 32.3 | 28.8 | 16.9 | 25.4 |
| | SD | 261.2 | 173.8 | 378.9 | 283.6 | 138.3 | 325.8 |
| pH | Mean | 7.73 | 7.50 | 7.15 | 7.56 | 7.30 | 7.45 |
| | SE | 0.05 | 0.09 | 0.06 | 0.05 | 0.07 | 0.04 |
| | SD | 0.45 | 0.51 | 0.48 | 0.50 | 0.56 | 0.54 |
| TH | Mean | 306.9 | 181.0 | 270.0 | 358.2 | 141.2 | 269.6 |
| | SE | 12.2 | 16.8 | 32.0 | 19.2 | 10.4 | 14.7 |
| | SD | 100.7 | 90.7 | 262.0 | 188.7 | 85.3 | 188.1 |

Allowable limits: EC (2,500 μS·cm⁻¹), pH (6.50–9.50), TH (mg·L⁻¹ CaCO₃)

Table 3 Descriptive statistics (mean, standard error (SE), and standard deviation (SD)) of chemical ions by studied area and CA-group

| Parameter | Statistic | LL | SM | C | CA-group 1 | CA-group 2 | All sites |
|---|---|---|---|---|---|---|---|
| Cl⁻ | Mean | 33.2 | 26.4 | 42.8 | 40.0 | 29.9 | 35.9 |
| | SE | 1.1 | 2.2 | 3.2 | 2.3 | 1.4 | 1.5 |
| | SD | 9.0 | 11.8 | 26.3 | 22.5 | 11.4 | 19.4 |
| NO₃⁻ | Mean | 15.9 | 4.2 | 2.0 | 12.0 | 2.6 | 8.2 |
| | SE | 2.5 | 1.2 | 0.4 | 1.9 | 0.5 | 1.2 |
| | SD | 20.8 | 6.7 | 3.0 | 1.6 | 4.1 | 15.2 |
| NO₂⁻ | Mean | 0.021 | 0.005 | 0.001 | 0.016 | 0.001 | 0.010 |
| | SE | 0.008 | 0.003 | 0.001 | 0.005 | 0.001 | 0.003 |
| | SD | 0.062 | 0.019 | 0.002 | 0.053 | 0.002 | 0.042 |
| HCO₃⁻ | Mean | 314.8 | 171.0 | 305.9 | 390.4 | 134.1 | 285.7 |
| | SE | 14.9 | 19.9 | 39.4 | 24.1 | 12.0 | 18.0 |
| | SD | 122.8 | 107.0 | 322.1 | 237.3 | 98.1 | 230.3 |
| CO₃²⁻ | Mean | 5.9 | 2.8 | 1.0 | 4.2 | 2.1 | 3.4 |
| | SE | 1.1 | 0.6 | 0.3 | 0.8 | 0.4 | 0.5 |
| | SD | 9.4 | 3.4 | 2.5 | 8.3 | 3.3 | 6.8 |
| Ca | Mean | 63.0 | 38.0 | 64.2 | 75.1 | 35.9 | 59.1 |
| | SE | 2.9 | 4.0 | 9.6 | 6.3 | 3.2 | 4.2 |
| | SD | 23.6 | 21.4 | 78.4 | 61.8 | 25.9 | 23.8 |
| Mg | Mean | 26.7 | 11.3 | 18.6 | 28.6 | 9.2 | 20.7 |
| | SE | 2.5 | 2.1 | 2.1 | 2.0 | 1.3 | 1.5 |
| | SD | 20.8 | 11.5 | 17.6 | 19.3 | 10.9 | 18.9 |
| Na | Mean | 42.7 | 11.9 | 5.6 | 31.9 | 7.9 | 22.1 |
| | SE | 4.9 | 1.9 | 0.6 | 3.8 | 1.4 | 2.5 |
| | SD | 40.3 | 10.0 | 4.9 | 37.0 | 11.4 | 31.6 |
| K | Mean | 1.63 | 0.99 | 0.49 | 1.29 | 0.70 | 1.05 |
| | SE | 0.25 | 0.26 | 0.06 | 0.18 | 0.15 | 0.12 |
| | SD | 2.06 | 1.38 | 0.45 | 1.72 | 1.20 | 1.55 |
| NH₄⁺ | Mean | 0.119 | 0.068 | 0.109 | 0.155 | 0.035 | 0.106 |
| | SE | 0.022 | 0.019 | 0.046 | 0.036 | 0.006 | 0.021 |
| | SD | 0.178 | 0.103 | 0.376 | 0.340 | 0.045 | 0.269 |

Allowable limits (mg·L⁻¹): Cl⁻ (250), NO₃⁻ (50), NO₂⁻ (0.10), HCO₃⁻ (. . .), CO₃²⁻ (. . .), Ca (. . .), Mg (50), Na (200), K (12), NH₄⁺ (0.50)

The results of the correlation analysis on the whole water data set are presented below (correlation coefficients of r ≥ 0.40 at p < 0.001 were considered significant). Total hardness showed strong positive correlations with HCO₃⁻ (0.956), EC (0.826), and Ca (0.782), and to a lesser extent with Mg (0.561). EC also showed strong positive correlation with HCO₃⁻ (0.756) and moderate positive correlations with Mg (0.715) and Ca (0.619); Ca showed strong positive correlation with HCO₃⁻ (0.767); Na showed moderate correlation with CO₃²⁻ (0.623); and lastly, K showed weak correlation with NO₃⁻ (0.440). All these correlations are summarized in Fig. 3, which presents a correlation map giving an overview of the correlations per CA-group among the nine parameters above. The matrix scatter plot of Fig. 3 is a useful tool, as it shows the structure and strength of the correlations among these nine parameters. For example, the pair TH–EC presented a strong linear correlation in both groups (CA-group 1 and CA-group 2), while the pair TH–HCO₃⁻ had a strong linear correlation in CA-group 1 but not in CA-group 2.

Table 4 Descriptive statistics (mean, standard error (SE), and standard deviation (SD)) of trace elements by studied area and CA-group

| Parameter | Statistic | LL | SM | C | CA-group 1 | CA-group 2 | All sites |
|---|---|---|---|---|---|---|---|
| Zn | Mean | 0.542 | 0.193 | 0.253 | 0.449 | 0.236 | 0.362 |
| | SE | 0.047 | 0.067 | 0.049 | 0.041 | 0.049 | 0.033 |
| | SD | 0.389 | 0.361 | 0.404 | 0.407 | 0.403 | 0.418 |
| Fe | Mean | 0.187 | 0.124 | 0.128 | 0.179 | 0.112 | 0.152 |
| | SE | 0.033 | 0.040 | 0.023 | 0.030 | 0.013 | 0.018 |
| | SD | 0.418 | 0.331 | 0.125 | 0.291 | 0.110 | 0.236 |
| Mn | Mean | 0.018 | 0.024 | 0.025 | 0.026 | 0.016 | 0.022 |
| | SE | 0.003 | 0.007 | 0.004 | 0.003 | 0.004 | 0.002 |
| | SD | 0.023 | 0.038 | 0.035 | 0.031 | 0.030 | 0.031 |
| Pb | Mean | 0.012 | 0.012 | 0.009 | 0.011 | 0.011 | 0.011 |
| | SE | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| | SD | 0.009 | 0.005 | 0.001 | 0.004 | 0.003 | 0.004 |

Allowable limits (mg·L⁻¹): Zn (. . .), Fe (0.200), Mn (0.050), and Pb (0.010)

4.4 Spatial Variations in Potable Water Quality (Discriminant Analysis (DA))

Spatial variation in potable water quality was evaluated using DA, with clusters based on spatial CA. The objective of the DA was to test the significance of the discriminant functions and determine the most significant variables associated with the differences among the CA-groups. The spatial DA was performed using the log-transformed data set of 17 parameters, after classification into the two major CA-groups (CA-group 1 and CA-group 2) obtained from the spatial CA. The sites, grouped by CA, were the dependent variable, and the measured parameters were the independent variables. Wilks' lambda and the chi-square for each discriminant function varied from 0.227 to 0.247 and from 227.686 to 222.234, respectively, at p < 0.001, suggesting that the spatial DA was credible and effective. The discriminant functions (DFs) and classification matrices (CMs) obtained from the standard and stepwise modes of DA are shown in Tables 5 and 6, respectively. In the stepwise mode, the variables were added step by step, beginning with the most significant, until no significant changes were obtained; the standard DA mode constructed DFs containing all parameters (Table 5). The standard mode DFs, using 17 discriminant variables, yielded a CM correctly assigning 97.56% of the cases. However, in the stepwise mode, the DA produced a CM with 94.51% correct assignments using only six discriminant parameters (Tables 5 and 6). These results were similar to those obtained using the standard mode, but with significantly fewer parameters.

Fig. 3 Matrix scatter plot showing correlations in CA-group 1 (cluster 1) and CA-group 2 (cluster 2) among the parameters TH, HCO₃⁻, EC, Ca, Mg, Na, K, CO₃²⁻, and NO₃⁻

Table 5 Classification function coefficients for DA of spatial variation

| Parameter | Standard mode, CA-group 1 | Standard mode, CA-group 2 | Stepwise mode, CA-group 1 | Stepwise mode, CA-group 2 |
|---|---|---|---|---|
| EC | 17.322 | 10.201 | 32.136 | 25.045 |
| pH | 1,368.813 | 1,400.658 | | |
| Cl⁻ | 36.655 | 26.835 | 56.037 | 47.419 |
| NO₃⁻ | −4.897 | −5.432 | | |
| NO₂⁻ | 1.668 | 3.778 | −5.745 | −3.852 |
| HCO₃⁻ | 78.454 | 75.464 | 21.124 | 15.586 |
| CO₃²⁻ | −20.47 | −21.39 | | |
| Ca | −64.582 | −67.798 | | |
| Mg | −9.315 | −9.874 | | |
| Na | −40.206 | −44.47 | 1,111.465 | 7.569 |
| K | 20.033 | 19.873 | | |
| NH₄⁺ | −14.429 | −13.511 | | |
| Zn | −23.47 | −24.27 | | |
| Fe | 1.597 | 0.906 | | |
| Mn | −6.275 | −5.85 | | |
| Pb | −1.55 | −4.745 | 1.019 | −1.732 |
| TH | −9.51 | −10.508 | | |
| Constant | −656.307 | −637.014 | −125.086 | −82.481 |

Table 6 Classification matrix for DA of spatial variation

| Monitoring sites | Percentage of correct classification | Assigned by DA to CA-group 1 | Assigned by DA to CA-group 2 | All samples |
|---|---|---|---|---|
| Standard mode | | | | |
| CA-group 1 | 97.0 | 96 | 1 | 97 |
| CA-group 2 | 94.7 | 3 | 64 | 67 |
| All samples | 97.6 | 99 | 65 | 164 |
| Stepwise mode | | | | |
| CA-group 1 | 96.9 | 94 | 3 | 97 |
| CA-group 2 | 91.0 | 6 | 61 | 67 |
| All samples | 94.5 | 100 | 64 | 164 |

Thus, the spatial DA suggests that EC, Cl⁻, NO₂⁻, HCO₃⁻, Na, and Pb were the most significant parameters for discriminating between the two CA-groups and accounted for most of the expected spatial variation in potable water quality.

Based on the above results, hierarchical CA provided a classification of potable water quality that aided in designing an optimal spatial monitoring plan with a sharply reduced number of monitoring sites and corresponding costs. Concretely, the number of water quality parameters might be decreased as well, and the monitoring sites could be reduced to a selection from CA-groups 1 and 2. Moreover, stepwise DA proved to be a useful tool for recognizing the discriminant parameters in spatial variations of potable water quality. It is essential to strengthen the monitoring accuracy of EC, Cl⁻, NO₂⁻, HCO₃⁻, Na, and Pb to clearly identify variations in the future.
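The percentages of correct classification can be recomputed from the counts of a classification matrix. The following minimal sketch uses the stepwise-mode counts as read from Table 6; the function name and the row/column orientation (rows: group from CA, columns: group assigned by DA) are assumptions made for illustration.

```python
from typing import Dict, List


def classification_summary(matrix: List[List[int]]) -> Dict[str, object]:
    """Summarize a square classification matrix.

    matrix[i][j] is the number of cases from group i that were assigned
    to group j, so the diagonal holds the correct assignments.
    """
    per_group = [100.0 * row[i] / sum(row) for i, row in enumerate(matrix)]
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return {"per_group": per_group, "overall": 100.0 * correct / total}


# Stepwise-mode counts: CA-group 1 -> [94, 3], CA-group 2 -> [6, 61].
stepwise = [[94, 3], [6, 61]]
summary = classification_summary(stepwise)

print([round(p, 1) for p in summary["per_group"]])
print(round(summary["overall"], 2))
```

With these counts the sketch reproduces the stepwise-mode percentages of 96.9% and 91.0% per group and the overall rate of 94.51% quoted in the text.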

5 Cluster and Factor Analysis

5.1 Cluster Analysis

Figure 4 shows the dendrogram for the 17 studied parameters of the all samples data set. The dendrogram shows two main clusters, which are divided into subclusters as follows:

Cluster 1: Mn, Zn, NO₂⁻, Pb, NH₄⁺, and Cl⁻.
Subcluster 1.1: Mn, Zn, and NO₂⁻.
Subcluster 1.2: Pb, NH₄⁺, and Cl⁻.

Cluster 2: K, Na, NO₃⁻, Fe, CO₃²⁻, pH, Ca, HCO₃⁻, Mg, TH, and EC.
Subcluster 2.1: K, Na, and NO₃⁻.
Subcluster 2.2: CO₃²⁻ and pH.
Subcluster 2.3: Fe.
Subcluster 2.4: Ca, HCO₃⁻, Mg, TH, and EC.

Fig. 4 Hierarchical dendrogram for the 17 parameters of the all samples group (y-axis: (Dlink/Dmax)×100; leaf order: Mn, Zn, NO₂⁻, Pb, NH₄⁺, Cl⁻, K, Na, NO₃⁻, Fe, CO₃²⁻, pH, Ca, HCO₃⁻, Mg, TH, EC)

This data analysis gives an idea of how the individual water quality parameters compare and relate to one another when all parameter values of a sample are treated simultaneously rather than separately. For instance, within the all samples group, the physical parameters EC and TH were more strongly related to Mg, HCO₃⁻, and Ca than to pH and the other studied quality parameters. The same cluster analysis was performed on data sets consisting of only CA-group 1 and only CA-group 2 water samples. The respective hierarchical dendrograms are shown in Fig. 5a, b.
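The kind of variable clustering shown in the dendrograms can be sketched in plain Python: each parameter is standardized across samples and the parameters are then merged with Ward's criterion on squared Euclidean distances. The four parameters and their values are hypothetical, the function names are illustrative, and the rescaling to (Dlink/Dmax)×100 is assumed to use the largest merge cost; the software used in the study may define the linkage distance differently.

```python
import math
from typing import Dict, List, Sequence, Tuple

Merge = Tuple[Tuple[str, ...], Tuple[str, ...], float]


def zscore(column: Sequence[float]) -> List[float]:
    """Column standardization: zero mean and unit (sample) variance."""
    n = len(column)
    mean = sum(column) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in column) / (n - 1))
    return [(v - mean) / sd for v in column]


def ward_merges(points: Dict[str, Sequence[float]]) -> List[Merge]:
    """Agglomerate points with Ward's criterion.

    The cost of merging clusters A and B is
    |A||B| / (|A| + |B|) * squared Euclidean distance between centroids,
    i.e. the increase in the within-cluster sum of squares.
    """
    clusters = {(name,): (list(vec), 1) for name, vec in points.items()}
    merges: List[Merge] = []
    while len(clusters) > 1:
        best = None
        keys = list(clusters)
        for idx, a in enumerate(keys):
            for b in keys[idx + 1:]:
                (ca, na), (cb, nb) = clusters[a], clusters[b]
                d2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
                cost = na * nb / (na + nb) * d2
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        cost, a, b = best
        (ca, na), (cb, nb) = clusters.pop(a), clusters.pop(b)
        centroid = [(na * x + nb * y) / (na + nb) for x, y in zip(ca, cb)]
        clusters[a + b] = (centroid, na + nb)
        merges.append((a, b, cost))
    return merges


# Hypothetical measurements of four parameters at five sites.
data = {
    "EC": [1.0, 2.0, 3.0, 4.0, 5.0],
    "TH": [1.1, 2.0, 3.2, 3.9, 5.1],
    "Pb": [5.0, 1.0, 4.0, 2.0, 3.0],
    "Zn": [5.2, 0.9, 4.1, 2.2, 2.8],
}
standardized = {name: zscore(col) for name, col in data.items()}
merges = ward_merges(standardized)

dmax = max(cost for _, _, cost in merges)
scaled = [100.0 * (cost / dmax) for _, _, cost in merges]
for (left, right, _), height in zip(merges, scaled):
    print(left, "+", right, f"at {height:.1f}")
```

In this toy data set EC pairs with TH and Pb pairs with Zn before the two pairs are joined at the maximum rescaled height.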

Fig. 5 Hierarchical dendrograms for the 17 parameters of (a) CA-group 1 and (b) CA-group 2 (y-axis: (Dlink/Dmax)×100)

When the dendrogram of CA-group 1 is considered, several clusters are also − + − formed, like (EC, HCO− 3 , and TH) group; (K, Ca, and NO3 ) group; (NH4 , NO2 , −2 − Cl , Mg, and Pb) group; (Na, CO3 , and pH) group; and (Fe, Mn, and Zn) group. This seems to be a CA-group 1 specific water quality parameters classification rule. The careful observation in Fig. 5b indicates the presence of (TH, pH, Na, K, and + − Mg) group; (Cl− , Zn, and NO− 2 ) group; (NH4 , NO3 , Mn, and Pb) group; and (EC, − −2 Ca, HCO3 , CO3 , and Fe) group. It seems obvious that the possible introduction of a more general potable water quality indicator has to be group specific. This is a confirmation of the finding about differences between the average values of the single potable water quality parameters between CA-group 1 and CA-group 2. 5.2 Source Identification in Monitoring Sites (Factor Analysis (FA)) Clustering does not imply dependency, only similarity. There is, typically, no cause and effect relationship among the clusters of the data points. Therefore, PCA was performed after Varimax rotation to gain a more reliable display method and better understanding regarding the relationships within the dataset. Varimax searches for an orthogonal rotation (i.e., a linear combination) of the original factors such that the variance of the loadings is maximized. Before conducting the FA, the Kaiser–Meyer–Olkin (KMO) and Bartlett’s sphericity tests were performed on the parameter correlation matrix to examine the validity of the FA. FA was conducted for each spatial cluster (CA-groups 1 and 2) and for all samples. The KMO results for CA-group 1 and CA-group 2, as well as for all samples were 0.541, 0.574, and 0.685, respectively. The results for Bartlett’s sphericity were 595.855, 572.509, 1,072.545 ( p < 0.005), indicating that FA may be useful in providing significant reductions in dimensionality. 
FA was applied to the standardized log-transformed datasets to examine differences between CA-group 1 and CA-group 2 and identify the latent factors. FA was also applied to the group of all samples to identify the latent factors that characterize each CA-group. Based on the screeplot for the FA and the eigenvalues—one criterion only the VFs with eigenvalues >1 were considered essential. The screeplot for the FA and Kaiser criterion (eigenvalues >1) showed that only the first six eigenvalues could be taken into consideration for CA-group 1, CA-group 2, and all samples data set. The FA of the above three data sets yielded six VFs for each of the three groups, explaining 68.78%, 73.31%, and 67.29% of the total variance in the respective waterquality datasets (CA-group 1, CA-group 2, and the all samples group). Table 7 summarizes the FA results comprising the loadings, eigenvalues, and variance (%). The parameters with loadings (L), whose absolute value is more than 0.5, are considered significant. The classification of the VFs loadings as “strong”, “moderate” and “weak”, corresponds to absolute values of (>0.75), (0.75–0.50) and (0.50–0.30), respectively (Liu et al. 2003; Singh et al. 2005). More specific, for the all samples group, the factor VF1, which explained 27.3% of the total variance, contains EC, TH, Ca, Mg, and HCO− 3 . VF1 was considered to be a “hardness–salinity” factor. It is well known that the TH is connected to Ca content and is a function of the salinity of water. EC and Ca loadings on VF1 can be explained by the dissolution of soils and minerals in sediments such as calcite (CaCO3 ). Similar loadings of Ca and EC on a VF have been reported by other researchers and were

EC (0.830) Mg (0.634) HCO− 3 (0.710) TH (0.838) 3.610 21.2 pH (0.818) NO− 3 (0.627) Mg (0.694) Na (0.634) TH (0.681) K (0.303) 3.754 22.1 EC (0.742) HCO− 3 (0.904) Ca (0.771) Mg (0.685) TH (0.867) 4.647 27.3

CA-group 1

Eigenvalue Variance (%)

Eigenvalue Variance (%) All samples

Eigenvalue Variance (%) CA-group 2

VF1

Group

2.006 11.8

3.127 18.4 NO− 3 (0.695) Na (0.519) K (0.716) NO− 2 (−0.763)

pH (0.513) CO−2 3 (0.854) Na (0.749) Ca (−0.606) 2.471 14.5 HCO− 3 (0.878) CO−2 3 (0.632) Ca (0.837)

VF2

1.448 8.5

1.783 10.5 pH (0.753) CO−2 3 (0.768)

1.404 8.3 Cl− (0.514) NH+ 4 (0.620) Mn (0.697) Zn (0.199) Na (−0.502) 1.261 7.4

(0.657) NH+ 4 (0.767) Mg (0.274) Ca (−0.600) 1.450 8.5 NH+ 4 (0.687) Mn (0.804) NO− 3 (0.472) Pb (0.179)

Cl−

NO− 2

(0.741) Pb (0.629) NO− 3 (−0.518) K (−0.702) 1.813 10.7 Cl− (0.810) NO− 2 (0.526) Na (−0.523) K (−0.858)

VF4

VF3

1.073 6.3

1.322 7.8 Pb (0.671) Zn (−0.613)

1.233 7.2 NO− 2 (0.604) Zn (0.821)

Fe (0.677) Mn (0.692) Zn (0.124)

VF5

1.004 5.9

1.073 6.3 Fe (0.929)

NO− 3 (0.420) Ca (0.192) K (0.309) Zn (−0.805) 1.115 6.6 Fe (0.618) EC (0.770)

VF6

Table 7 Loadings (L) of the 17 measured parameters on significance VFs (L > 0.5) for CA-group 1, CA-group 2, and all samples data set (parameter (L) in italics indicate weak correlation)

Assessment and Modelling of Groundwater Quality

A. Papaioannou et al.

interpreted as a “mineral weathering” signal (Wayland et al. 2003). Other researchers (Simeonov et al. 2001) used factor analysis in a study of Danube river data and interpreted the first VF (with high loadings on Ca, Mg, HCO− 3 , EC and TH) as a “water hardness–salinity” factor, too. According to the above study, the VF depends mainly on the origin of natural water. Of the multiple parameters within this all samples group, EC can be readily measured and used as a surrogate for the presence of the remaining parameters. Therefore, reducing the number of analytes reduces the need for laboratory analysis of these parameters and allows resources to be freed up for additional measurements elsewhere. The additional analytes could still be sampled, especially during periods when elevated EC is observed. The second factor, accounting for 11.8% of the total variance, incorporated those water quality variables are characteristic of non-point pollution sources, such as agricultural runoff and domestic wastewater, including K, Na, and NO− 3 . In addition, variable NO− loaded negatively on the second factor, which is consistent with 2 discharges from domestic wastewater facilities, too. The third factor (8.5% of the total variance) contained two parameters (pH and CO−2 3 ), with strong positive correlations that characteristic of natural pollution, which includes soil weathering and subsequent runoff. The latent factor VF4 (7.4% of the total variance) had − moderate positive association with NH+ 4 , Mn, and Cl , which are related to urban land use and agricultural activities (NH+ represents influence of agricultural runoff 4 from the soil as nitrogenous fertilisers are extensively used in this region; Mn and Cl− representing influences from point sources such as municipal and industrial effluents). In addition, the variable Zn loaded slight positively, while Na loaded negatively on the fourth factor. 
The latent factor VF5 (6.3% of the total variance) had moderate loadings with the positive value 0.7 being for Pb and the negative value -0.6 found for Zn and was considered as representing “anthropogenic–toxic” pollution from industrial effluents. Lastly, factor VF6 (5.9 of the total variance) had strong positive loading on Fe and was mainly associated with natural mineral pollution including soil weathering (Mendiguchia et al. 2004; Shrestha and Kazama 2007; Simeonov et al. 2003; Singh et al. 2005; Papatheodorou et al. 2006; Helena et al. 2000). For CA-group 1, factor VF1, which explained 21.2% of the total variance, had strong positive loadings (L > 0.8) on EC and TH and positive loadings on HCO−2 3 and Mg. Next factor, VF2 (14.5% of the total variance), had strong positive loading on CO−2 3 , positive loadings on Na and pH, and negative loading on Ca. VF3 (10.7% of the total variance) had positive loadings on NO− 2 and Pb and negative loadings on K and NO− . VF4 (8.5% of the total variance) had positive association with NH+ 3 4 − and Cl , weak positive with Mg, and negative with Ca. VF5 (7.2% of the total variance) included Fe and Mn with positive a correlation to each other. Additionally, the variable Zn loaded slight positively on the fifth factor. Lastly, factor VF6 (6.6% of the total variance) had strong negative loading on Zn; weak positive loadings of NO− 3 , Ca, and K. The pollution structure of CA-group 2 was different from that of CA-group 1, but almost similar to the degree of pollution (depending on VFs scores). The first factor VF1 (22.1% of the total variance) had strong positive loading on pH; positive loadings on NO− 3 , Mg, Na,and TH; weak positive loading on K. VF2 (18.4% of the total variance) had strong positive loadings on HCO− 3 and Ca and positive loading on CO−2 . VF3 (10.5% of the total variance) had strong positive loading on 3

Cl⁻, a positive loading on NO₂⁻, a strong negative loading on K, and a negative loading on Na. VF4 (8.3% of the total variance) had a strong positive association with Mn, a positive association with NH₄⁺, and weak positive loadings on NO₃⁻ and Pb. VF5 (7.8% of the total variance) had a strong positive loading on Zn and a positive loading on NO₂⁻. VF6 (6.3% of the total variance) included EC and Fe, which were positively correlated. It is readily seen that the major groups of potable water quality parameters interpreted by cluster analysis were also involved in the factor loadings presented in Table 7. More specifically, when the all-samples group was taken into consideration, VF1 corresponded to subcluster (2.4), VF2 to subcluster (2.1), VF3 to subcluster (2.2), VF4 and VF5 together to the combined subclusters (1.1 and 1.2), and VF6 to subcluster (2.3). It also appears that each of the six factors had a physical basis of observation. The primary contributors to potable water quality in the all-samples data set were (1) groundwater original quality (VF1, 27.3%); (2) agricultural runoff and domestic wastewater (VF2, 11.8%); (3) natural sources (soil leaching processes) (VF3 and VF6, 14.4%); and (4) urbanization (municipal effluents), agricultural and industrial activities (VF4 and VF5, 13.7%). The remaining variability (32.7%) may result from localized processes or unique watershed contributions. Consequently, the anthropogenic activities, which differ between the studied groups, are one of the main causes of the higher (but still small) degree of pollution in CA-group 1 than in CA-group 2. More specifically, in the LL-area (densely populated area) the potable water quality is affected mainly by intensive anthropogenic activities such as urbanization, industry, and agriculture, while in the SM-area (populated area) the common activities are animal production and tree cultivation.
Moreover, in the LL-area the intensive agricultural activities, which use large amounts of mineral fertilizers, are considered to be the primary cause of high nitrate concentrations in groundwater and soil.
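The variance bookkeeping behind this source apportionment can be checked with a few lines of arithmetic. The Python sketch below is only an illustration: the percentages are those quoted in the text for the all-samples data set, while the dictionary layout and the names `vf_variance` and `sources` are assumptions introduced here.

```python
# Percentage of total variance explained by each varifactor
# (all-samples data set, values as quoted in the text).
vf_variance = {"VF1": 27.3, "VF2": 11.8, "VF3": 8.5,
               "VF4": 7.4, "VF5": 6.3, "VF6": 5.9}

# Grouping of varifactors by interpreted source (labels are shorthand
# chosen for this sketch, not terms defined by the authors).
sources = {
    "groundwater original quality": ["VF1"],
    "agricultural runoff and domestic wastewater": ["VF2"],
    "natural sources": ["VF3", "VF6"],
    "urban, agricultural and industrial activities": ["VF4", "VF5"],
}

# Sum the contributions of the varifactors assigned to each source.
source_share = {name: round(sum(vf_variance[vf] for vf in vfs), 1)
                for name, vfs in sources.items()}
explained = round(sum(vf_variance.values()), 1)
unexplained = round(100.0 - explained, 1)

for name, share in source_share.items():
    print(f"{name}: {share}%")
print(f"explained: {explained}%, unexplained: {unexplained}%")
```

The grouped shares reproduce the 14.4% and 13.7% quoted above, and the six factors together account for 67.2% of the variance. The remainder computed this way is 32.8%, close to the 32.7% given in the text; the small difference presumably reflects rounding of the individual percentages.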

Table 8 Statistical descriptives of the six VFs and t test for equality of the CA-group means

Varifactor   CA-group 1                 CA-group 2                 t test for equality of group means
             Mean     SE      SD        Mean     SE      SD        Statistic   p level
VF1           0.576   0.061   0.604     −0.834   0.105   0.863     11.561      0.000
VF2           0.289   0.101   0.991     −0.419   0.105   0.859      4.872      0.000
VF3           0.112   0.104   1.020     −0.161   0.117   0.955      1.731      0.085
VF4          −0.080   0.094   0.924      0.116   0.134   1.097     −1.240      0.217
VF5           0.163   0.101   0.996     −0.236   0.118   0.964      2.557      0.011
VF6           0.034   0.113   1.108     −0.050   0.101   0.824      0.530      0.597
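The t statistics in Table 8 can be approximately recomputed from the tabulated means and standard deviations. The sketch below is a hedged illustration rather than the authors' procedure: the group sizes (97 and 67) are not stated in the table and are inferred here from n = (SD / SE)², and both the pooled (Student) and the Welch forms are shown because the text does not say which was used.

```python
import math

# (mean, SD) of the factor scores in CA-group 1 and CA-group 2, from Table 8.
TABLE8 = {
    "VF1": ((0.576, 0.604), (-0.834, 0.863)),
    "VF2": ((0.289, 0.991), (-0.419, 0.859)),
    "VF3": ((0.112, 1.020), (-0.161, 0.955)),
    "VF4": ((-0.080, 0.924), (0.116, 1.097)),
    "VF5": ((0.163, 0.996), (-0.236, 0.964)),
    "VF6": ((0.034, 1.108), (-0.050, 0.824)),
}

# ASSUMPTION: group sizes are not given in Table 8; 97 and 67 are inferred
# from n = (SD / SE)^2 and are consistent with the 164 sampling sites.
N1, N2 = 97, 67
T_CRIT = 1.975  # approximate two-sided 5% critical value for about 160 df


def pooled_t(m1, s1, m2, s2, n1=N1, n2=N2):
    """Student's t statistic assuming equal group variances."""
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(sp2 * (1 / n1 + 1 / n2))


def welch_t(m1, s1, m2, s2, n1=N1, n2=N2):
    """Welch's t statistic, which does not assume equal variances."""
    return (m1 - m2) / math.sqrt(s1**2 / n1 + s2**2 / n2)


# Store (pooled t, Welch t) for every varifactor.
results = {}
for vf, ((m1, s1), (m2, s2)) in TABLE8.items():
    results[vf] = (pooled_t(m1, s1, m2, s2), welch_t(m1, s1, m2, s2))

# Varifactors whose pooled |t| exceeds the approximate critical value.
significant = [vf for vf, (tp, _) in results.items() if abs(tp) > T_CRIT]

for vf, (tp, tw) in results.items():
    print(f"{vf}: pooled t = {tp:.3f}, Welch t = {tw:.3f}")
print("significant at the 5% level:", significant)
```

Under these assumptions the Welch form gives about 11.56 for VF1 and the pooled form about 2.55 for VF5, close to the tabulated 11.561 and 2.557. Either way, only VF1, VF2, and VF5 exceed the critical value, in line with the p levels in Table 8.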

Furthermore, it is evident that the six VFs have an effect on the structure of CA-group 1 and CA-group 2. This was established using descriptive statistics of the six VFs and the t test for equality of the CA-group means (Table 8), box plots (Fig. 6a) and error bars (Fig. 6b) of the FA scores for the six VFs obtained for the CA-groups, a matrix scatter plot of the FA score means for the six VFs obtained for the CA-groups (Fig. 7), and PCA scores plots for the six VFs (Fig. 8a–c). When a VF had a positive mean value for a specific group and the mean values of this VF differed significantly between the studied groups, this VF was taken to characterize the group. The data in Table 8 show that VF1, VF2, and VF5

Fig. 6 a Box plots and b error bars (95% CI) of the FA scores for the six VFs according to CA-group 1 (cluster 1) and CA-group 2 (cluster 2)

Fig. 7 Matrix scatter plot of the FA score means for the six VFs obtained for CA-group 1 (cluster 1) and CA-group 2 (cluster 2)

characterize CA-group 1, since the mean values of VF1, VF2, and VF5 were positive in CA-group 1 and negative in CA-group 2, and there was a significant difference (p < 0.05) between the two CA-groups in these three factors. The mean values of VF3 and VF6 were also positive in CA-group 1 and negative in CA-group 2, but the difference between the two CA-groups was not significant (p = 0.085 and p = 0.597, respectively). Moreover, the mean value of VF4 was positive in CA-group 2 and negative in CA-group 1, but this difference was also not significant (p = 0.217). Thus the mean degree of pollution in the monitoring sites of CA-group 1 and CA-group 2 differed significantly for VF1, VF2, and VF5 and not significantly for VF3, VF4, and VF6 (Table 8). For a clearer interpretation of the pollution patterns in each CA-group, a clustered box plot of the scores (Fig. 6a) for the six VFs was used to

Fig. 8 Spatial variation in potable water quality using the FA scores of the six VFs (a VF1 against VF2, b VF3 against VF4, and c VF5 against VF6), obtained for CA-group 1 (cluster 1) and CA-group 2 (cluster 2)

visualize differences among the CA-groups. The distribution of the factor scores of VF1 followed the sequence CA-group 1 > CA-group 2. For VF2, the sequence was CA-group 1 > CA-group 2; for VF3, there was no significant difference (CA-group 1 ∼ CA-group 2); for VF4, there was no significant difference (CA-group 1 ∼ CA-group 2); for VF5, the sequence was CA-group 1 > CA-group 2; and for VF6, there was no significant difference (CA-group 1 ∼ CA-group 2). This means that CA-group 1 was influenced by VF1, VF2, and VF5, while CA-group 2 was influenced by VF3, VF4, and VF6. Additionally, the error bars of the six VFs support this characterization of the CA-groups: the same results were found when the error bars of the VFs in each studied group were considered (Fig. 6b). It is clear that VF1, VF2, and VF5 have larger mean values in CA-group 1 than in CA-group 2. An overview of the influence of the VFs on the structure of each CA-group is presented in Fig. 7. This matrix scatter plot shows the factors that characterize CA-group 1 or CA-group 2, and it may also be used to analyse and explain the group formation. The matrix scatter plot of score means for the six VFs (Fig. 7) shows how the VF score means for CA-group 1 and CA-group 2 were discriminated in each scatter plot. The larger the mean value of a VF in a group, the stronger the influence of this VF on that group. Consequently, Fig. 7 is a useful graphical tool for understanding which VF is characteristic of CA-group 1 and which of CA-group 2. For example, it can be seen that VF1 and VF2 presented large values in CA-group 1 and small values in CA-group 2, so these VFs characterize CA-group 1. Moreover, the FA scores plot of VF1 and VF2 (Fig. 8a) showed that the samples are well separated according to the two CA-groups and that the separation was obtained along the first varifactor (VF1). It is evident that the samples distributed in the region of larger values of VF1 were almost all collected from CA-group 1.
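The characterization rule stated earlier (a VF characterizes a group when its mean score is positive in that group and the group means differ significantly) can be written down directly. The following Python sketch applies that rule to the means and p levels of Table 8; the function name `characterizing_vfs` and the 0.05 threshold as a default argument are choices made for this illustration.

```python
# Group means of the factor scores and t-test p levels, from Table 8:
# (mean in CA-group 1, mean in CA-group 2, p level).
TABLE8 = {
    "VF1": (0.576, -0.834, 0.000),
    "VF2": (0.289, -0.419, 0.000),
    "VF3": (0.112, -0.161, 0.085),
    "VF4": (-0.080, 0.116, 0.217),
    "VF5": (0.163, -0.236, 0.011),
    "VF6": (0.034, -0.050, 0.597),
}


def characterizing_vfs(table, alpha=0.05):
    """Apply the stated rule: a VF characterizes a group when its mean score
    is positive in that group and the group means differ significantly."""
    out = {"CA-group 1": [], "CA-group 2": []}
    for vf, (mean1, mean2, p) in table.items():
        if p >= alpha:
            continue  # group means not significantly different
        if mean1 > 0:
            out["CA-group 1"].append(vf)
        if mean2 > 0:
            out["CA-group 2"].append(vf)
    return out


result = characterizing_vfs(TABLE8)
print(result)
```

Read strictly, the rule selects VF1, VF2, and VF5 for CA-group 1 and no VF for CA-group 2, since VF4, the only factor with a positive mean there, does not differ significantly between the groups (p = 0.217).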

Fig. 9 Scatter plot of loadings for the first two factors (VF1 against VF2) obtained for all samples data set

The scatter plots of the remaining four factors (Fig. 8b, c) showed that the separation between the CA-groups was obtained only in the first varifactor. It can be seen from the scatter plot of VF3 (no significant difference (p = 0.085 > 0.05) between the two CA-groups) against VF4 (no significant difference (p = 0.217 > 0.05) between the two CA-groups) that the two CA-groups were not well separated from each other (Fig. 8b). Similarly, the scatter plot of VF5 (positive mean value in CA-group 1, negative in CA-group 2, and a significant difference (p = 0.011 < 0.05) between the two CA-groups) against VF6 (no significant difference (p = 0.597 > 0.05) between the two CA-groups) shows that the two CA-groups were not well separated from each other (Fig. 8c). The corresponding factor loading plot for the first two VFs is shown in Fig. 9. It must be stressed that potential groups of labeled parameters in the loadings plot do not influence the groups of samples in the score plots. The only important message from the loadings plot is the magnitude of the factors associated with individual parameters. The square of the loading of an individual parameter on a VF gives the fraction of the variance of that parameter that is explained by that VF. If the absolute value of the loading is close to zero, the parameter has little influence, as in the case of Fe, Pb, NH₄⁺, and CO₃²⁻. To determine the distribution of the monitoring sites in the three studied areas, the FA scores of the six VFs were plotted for CA-group 1 and CA-group 2 (Fig. 10).
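The relation between a loading and the variance it explains can be illustrated with the two VF5 loadings quoted earlier for the all-samples data set (0.7 for Pb and −0.6 for Zn). The Python sketch below assumes standardized variables and orthogonal (for example varimax-rotated) factors; the helper name `variance_share` is introduced only for this illustration.

```python
# Loadings on VF5 quoted in the text for the all-samples data set.
vf5_loadings = {"Pb": 0.7, "Zn": -0.6}


def variance_share(loading):
    """Fraction of a standardized parameter's variance explained by one
    orthogonal factor, i.e. the squared loading."""
    return loading ** 2


shares = {param: variance_share(value) for param, value in vf5_loadings.items()}
for param, share in shares.items():
    print(f"{param}: {share:.0%} of its variance is explained by VF5")
```

A loading of 0.7 thus corresponds to roughly half of the variance of Pb, and the sign of the loading does not matter: the loading of −0.6 on Zn corresponds to 36%.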

Fig. 10 Scatter plot of scores for the six VFs obtained for CA-group 1 (a–c) and CA-group 2 (d–f), according to studied areas (M semi-mountainous, LL low land and C coastal)

The scores plots of the VFs showed the distribution of the monitoring sites of each area within CA-group 1 (A) and CA-group 2 (B) and the degree of pollution at each monitoring site. A greater VF score indicated a greater effect. As can be seen in Fig. 10, the score plots of the VFs did not show clear associations between the pollution in the CA-groups and the monitoring sites and could not further discriminate among the monitoring sites. This happened because the degree of pollution of the monitoring sites was small.

6 Conclusions

The objective of this study was to find the correlations between sampling sites and the parameters obtained by physical and chemical–biological measurements. The quality of potable water was analysed by different chemometric methods. Hierarchical CA classified the 164 sampling sites into two clusters (CA-group 1 and CA-group 2) based on the similarity of potable water quality characteristics. Based on this outcome, DA gave the best results, with good discriminatory ability according to significance validation tests, and identified significant parameters for discrimination among the spatial CA-groups. For the analysis of spatial variation, the stepwise DA used only six parameters (EC, Cl⁻, NO₂⁻, HCO₃⁻, Na, and Pb) and correctly assigned about 94.5% of the cases. Therefore, the spatial similarities and differences may allow optimization of a monitoring program in the future by decreasing the number of sampling stations, the number of parameters monitored, and thus the subsequent costs. FA allowed the reduction of the 17 parameters to six significant VFs that explain 67.2% of the variance of the original data set. In addition, FA identified the VFs/sources responsible for variations in groundwater quality at 164 different sampling sites. The VFs obtained from factor analysis indicated that the parameters responsible for groundwater quality variations are mainly related to non-point sources (groundwater original quality, soil weathering and leaching, and agricultural runoff) and point sources (municipal and industrial pollution). The two CA-groups were well separated on the scores plot of VF1 and VF2 (Fig. 8a), and the separation was obtained along VF1 (which includes the parameters EC, TH, Ca, Mg and HCO₃⁻).
The influence of the VFs on the structure of each CA-group was identified using: (1) descriptive statistics of the six significant VFs and the t test for equality of the CA-group means, (2) box plots and error bars of the FA scores for the six VFs obtained for the CA-groups, (3) a matrix scatter plot of the FA score means for the six VFs obtained for the CA-groups and (4) FA scores plots for the six VFs. The results show that CA-group 1 was influenced by VF1, VF2, and VF5, while CA-group 2 was influenced by VF3, VF4, and VF6. Moreover, the FA scores plot of VF1 and VF2 showed that the water samples were well separated according to the two CA-groups and that the separation was obtained along VF1. However, the score plots of the VFs did not show clear associations between the pollution in the CA-groups and the monitoring sites and could not further discriminate among the monitoring sites. This happened because the degree of pollution of the monitoring sites was small. The present study gives the opportunity to begin following the quality of potable water at different sampling sites in the Thessaly region within a defined time period. The

monitoring of the groundwater’s general pollution, together with tracking of the measured parameters that are above the permitted concentration level, can be used to search for the source of pollution, to plan prevention measures, and to prevent pollution. The benefit of applying chemometric methods is not only the possibility of visualizing large amounts of multivariate data, but also the possibility of quickly classifying potentially polluted unknown water samples. Specifically, chemometric methods were successfully applied to evaluate spatial variations in potable water quality and to identify sources at the monitoring sites in Thessaly, indicating that the different methods are effective, consistent with each other, and useful for potable water quality management.

References

APHA (2005) Standard methods for the examination of water and wastewater, 21st edn. American Public Health Association (APHA), New York
Boyacioglu H, Boyacioglu H (2007) Surface water quality assessment by environmetric methods. Environ Monit Assess 131:371–376
Brodnjak-Vončina D, Dobčnik D, Novič M, Zupan J (2002) Chemometrics characterisation of the quality of river water. Anal Chim Acta 462:87–100
EU Directive (1998/83/EC) Official journal of the European communities. No. L 31/1
EU Directive (2000/60/EC) Official journal of the European communities. No. L 297/1
EU Directive (2006/118/EU) Official journal of the European communities. No. L 372/19
Fattal B, Gutman-Bass N, Agursky T, Shuval HI (1988) Evaluation of health risk associated with drinking water quality in agricultural communities. Water Sci Technol 20:409–415
González Vázquez JC, Grande JA, Barragán FJ, Ocaña JA, De La Torre ML (2005) Nitrate accumulation and other components of the groundwater in relation to cropping system in an aquifer in Southwestern Spain. Water Resour Manage 19:1–22
Hanrahan G, Gibani S, Miller K (2008) Multivariate chemometrical classification and assessment of Lake Tuendae: a Mojave desert aquatic environment housing the endangered Mojave Tui Chub. Ecol Informat 3:334–342
Helena B, Pardo R, Vega M, Barrodo E, Fernandez JM, Fernandez L (2000) Temporal evolution of groundwater composition in an alluvial aquifer (Pisuerga river, Spain) by principal component analysis. Water Res 34(3):807–816
Koklu R, Sengorur B, Topal B (2009) Water quality assessment using multivariate statistical methods—a case study: Melen River System (Turkey). Water Resour Manage. doi:10.1007/s11269-009-9481-7
Kotti ME, Vlessidis AG, Thanasoulias NC, Evmiridis NP (2005) Assessment of river water quality in Northwestern Greece. Water Resour Manage 19:77–94
Kowalkowski T, Zbytniewski R, Szpejna J, Buszewski B (2006) Application of chemometrics in river water classification. Water Res 40:744–752
Kowalkowski T, Cukrowska EM, Mkhatshwa BH, Buszewski B (2007) Statistical characterisation of water quality in Great Usuthu River (Swaziland). J Environ Sci Health Part A 42:1065–1072
Lambrakis N, Antonakos A, Panagopoulos G (2004) The use of multicomponent statistical analysis in hydrogeological environmental research. Water Res 38:1862–1872
Liu CW, Lin KH, Kuo YM (2003) Application of factor analysis in the assessment of groundwater quality in a black foot disease area in Taiwan. Sci Total Environ 313:77–89
Massart DL, Kaufman L (1983) The interpretation of chemical data by the use of cluster analysis. Wiley, New York
Medema GJ, Payment P, Dufour A, Robertson W, Waite M, Hunter P et al (2003) Safe drinking water: an ongoing challenge. In: Dufour A et al (eds) Assessing microbial safety of drinking water. Improving approaches and methods. World Health Organization, Geneva
Mendiguchia C, Moreno C, Galindo-Riaño MD, Garcia-Vargas M (2004) Using chemometric tools to assess anthropogenic effects in river water. A case study: Guadalquivir River (Spain). Anal Chim Acta 515:143–149
Modarres R (2009) Regional dry spells frequency analysis by L-Moment and multivariate analysis. Water Resour Manage. doi:10.1007/s11269-009-9556-5
Nastos PT, Zerefos CS (2008) Decadal changes in extreme daily precipitation in Greece. ADGEO 16:55–62
Nastos PT, Zerefos CS (2009) Spatial and temporal variability of consecutive dry and wet days in Greece. Atmos Res 94:616–628
Nastos PT, Alexakis D, Kanellopoulou HA, Kelepertsis AE (2007) Chemical composition of wet deposition in a Mediterranean site Athens, Greece related to the origin of air masses. J Atmos Chem 58:167–179
Papaioannou A, Plageras P, Dovriki E, Minas A, Krikelis V, Nastos PT, Kakavas K, Paliatsos AG (2007) Groundwaters’ quality and location of productive activities in the region of Thessaly (Greece). Desalination 213:209–217
Papatheodorou G, Demopoulou G, Lambrakis N (2006) A long-term study of temporal hydrochemical data in a shallow lake using multivariate statistical techniques. Ecol Model 193:759–776
Shearer LA, Goldsmith JR, Young C (1972) Methemoglobin levels in infants, in an area with high nitrate water supply. Am J Public Health 62:1174–1180
Shrestha S, Kazama F (2007) Assessment of surface water quality using multivariate statistical techniques: a case study of the Fuji river basin, Japan. Environ Model Softw 22:464–475
Shuval HI, Gruener N (1972) Epidemiological and toxicological aspects of nitrates and nitrites in the environment. Am J Public Health 62:1045–1051
Simeonov V, Sarbu C, Massart DC, Tsakovski S (2001) Danube River water data modelling by multivariate data analysis. Microchimica Acta 137:243–248
Simeonov V, Einax JW, Stanimirova I, Kraft J (2002) Environmetric modelling and interpretation of river water monitoring data. Anal Bioanal Chem 374:898–905
Simeonov V, Stratis JA, Samara C, Zachariadis G, Voutsa D, Anthemidis A, Sofoniou M, Kouimtzis Th (2003) Assessment of the surface water quality in Northern Greece. Water Res 37:4119–4124
Singh KP, Malik A, Mohan D, Sinha S (2004) Multivariate statistical techniques for the evaluation of spatial and temporal variations in water quality of Gomti River (India)—a case study. Water Res 38:3980–3992
Singh KP, Malik A, Sinha S (2005) Water quality assessment and apportionment of pollution sources of Gomti river (India) using multivariate statistical techniques—a case study. Anal Chim Acta 538:355–374
Strebel O, Duynisveld WHM, Bottcher J (1989) Nitrate pollution of groundwater in Western Europe. Agric Ecosyst Environ 26:189–214
Thornburn PJ, Biggs JS, Weier K, Keating BA (2003) Nitrate in groundwaters of intensive agricultural areas in coastal Northeastern Australia. Agric Ecosyst Environ 94:49–58
Vanderginste B, Massart DL, Buydens L, De Jong S, Lewi P, Smeyers-Verbeke J (1998) Handbook of chemometrics and qualimetrics. Elsevier, Amsterdam
Wayland K, Long D, Hyndman D, Pijanowski B, Woodhams S, Haak K (2003) Identifying relationships between baseflow geochemistry and land use with synoptic sampling and R-Mode factor analysis. J Environ Qual 32:180–190
WHO (2004) Guidelines for drinking-water quality, vol 1. World Health Organization, Geneva
Zang X, Rasmussen TC (2005) Multivariate statistical characterization of water quality in Lake Lanier, Georgia, USA. J Environ Qual 34:1980–1991
Zeng S, Chen J, Fu P (2008) Strategic zoning for urban wastewater reuse in China. Water Resour Manage 22:1297–1309
Zhou F, Guo H, Liu Y, Hao Z (2007a) Identification and spatial patterns of coastal water pollution sources based on GIS and chemometric approach. J Environ Sci 19:805–810
Zhou F, Liu Y, Guo H (2007b) Application of multivariate statistical methods to water quality assessment of the watercourses in Northwestern New Territories, Hong Kong. Environ Monit Assess 132:1–13
Zhou F, Guo H, Liu Y, Jiang Y (2007c) Chemometrics data analysis of marine water quality and source identification in Southern Hong Kong. Mar Pollut Bull 54:745–756
