Different chemometric approaches were used to determine generic patterns in the temporal and spatial variations in the coastal water quality of the northern Yellow Sea off Yantai, China. Hierarchical cluster analysis grouped the 16 months into two periods (i.e. March-October and November-February), reflecting strong seasonality in the data, and grouped the 12 sampling sites into two clusters (i.e. outside Yantai Bay and inside Yantai Bay), based on similarities in water quality characteristics. Discriminant analysis gave the best results for data complexity reduction during temporal analysis, but not during spatial analysis. It identified five significant parameters (water temperature, salinity, and concentrations of dissolved inorganic nitrogen, dissolved inorganic phosphate and dissolved silicate), affording about 97.9% correct assignments in temporal analysis. In addition, principal component analysis identified three varifactors that explained 71% of temporal changes in the coastal water quality data set. Overall, the present study showed that these multivariate statistical methods were effective for evaluating temporal and spatial variations in the coastal water quality of Yantai. Water temperature and nutrient inputs may be major driving factors for the trophic status of these coastal waters. The low spatial variance in the water quality parameters of Yantai was mostly related to the unrestricted water exchange between Sishili Bay and the Yellow Sea.
Coastal waters are more susceptible to pollution than open ocean waters due to the high level of human activity in the coastal zone. Anthropogenic influences, including urban, industrial, and agricultural activities (e.g. sewage disposal, dredge disposal, accidental releases of pollutants, aquaculture, and overfishing), together with climate change, have already negatively influenced coastal water quality and coastal ecosystem function (Lindeboom, 2002). As a result, frequent occurrences of red tides, green tides, and jellyfish blooms, with considerable economic impacts, have been observed in Chinese coastal waters (Liu et al., 2009; Dong et al., 2010b; Lin et al., 2010). Therefore, it is necessary to monitor the coastal marine environment and to understand changes in coastal water quality. However, spatial and temporal patterns in coastal waters are difficult to recognize due to the complexity of biological, chemical and physical water quality characteristics in the coastal marine environment. In addition, the large amount of data collected across a range of parameters presents challenges for the extraction of meaningful conclusions about water quality (Bierman et al., 2011).
In recent years, multivariate statistical methods have been widely used to evaluate spatial and temporal variations in riverine and coastal water quality and to enable meaningful interpretation (Vega et al., 1998; Simeonov et al., 2003; Singh et al., 2004; Zhou et al., 2006, 2007; Panda et al., 2006; Wu et al., 2009, 2010; Hennemann and Petrucio, 2011; Ruggieri et al., 2011; Dong et al., 2015). Techniques such as cluster analysis (CA) and discriminant analysis (DA) can be used to group objects according to their water quality characteristics and to determine the parameters controlling variations in water quality. Techniques such as principal component analysis (PCA) can reduce the dimensionality of a data set, revealing the underlying factors behind the variability and identifying the components responsible for the majority of it.
In the present study, CA and DA were employed to analyze the similarities between monitoring sites and times, and to determine the major parameters responsible for the spatial and temporal patterns in coastal water quality. PCA was used to identify the underlying spatial and temporal patterns in the trophic status of the coastal waters off Yantai city. The aim of our study was to evaluate the patterns of coastal water quality in the northern Yellow Sea off Yantai during a 16-month survey between 2008 and 2010, using several multivariate statistical methods (CA, DA, and PCA), in support of better management of this coastal ecosystem.
Materials and methods
Yantai city is located in the northeast of the Shandong Peninsula, facing the northern Yellow Sea to the northeast (Figure 1). It is currently the second largest industrial city in Shandong province. Both aquaculture and shipping have developed significantly in Yantai city over the last two decades. As a result, Yantai Port, located in Zhifu Bay on the northern shore of Yantai city, is one of China's most important coastal ports. Sishili Bay, which is located on the eastern shore of Yantai city, is one of the most intensive scallop aquaculture areas in northern China.
Yantai has a temperate monsoon climate. According to the data recorded by a weather observation center near Yantai city from 1971 to 2000 (http://www.weather.com.cn), the mean annual temperature and precipitation were 12.6 °C and 672.5 mm, respectively (Figure 2). The average monthly precipitation was highest in August (161.6 mm) and lowest in February (9.7 mm). The average monthly air temperature was highest in August (24.8 °C) and lowest in January (-1.2 °C). The coastal waters off Yantai city are generally less than 15 m deep.
Sampling and analytical methods
Figure 1 shows the location of 12 sampling stations in the coastal waters around Yantai city in northeast Shandong province. Stations were sampled monthly from December 2008 to March 2010. Seven parameters (water temperature, salinity, pH, and concentrations of chlorophyll a, dissolved inorganic nitrogen (DIN), dissolved inorganic phosphate (DIP) and dissolved silicate (DSi)) were measured at each of the 12 sampling stations over the 16-month period.
Water samples for the determination of chlorophyll a and nutrient concentrations were taken from the surface with a 5 L Niskin bottle. The sea surface temperature and salinity were measured in situ using a YSI 6920 multi-parameter water quality monitor. Chlorophyll a concentrations were measured using a UV-VIS spectrophotometer (TU-1810, Beijing Purkinje General Instrument Co., Ltd, China) after filtration on GF/F membranes (Whatman) (Lorenzen, 1967). Nutrient concentrations, including NO₃⁻, NO₂⁻, NH₄⁺, PO₄³⁻, and SiO₄⁴⁻, were analyzed using Flow Injection Analysis (AA3, Bran + Luebbe, Germany). The data quality was ensured through careful standardization, procedural blank measurements, appropriate sample sizes and analytical replicates (Keith et al., 1983). The data set consisted of 1,344 observations of the coastal water quality of Yantai.
Multivariate statistical analysis
CA was applied to classify the objects into clusters based on their nearness or similarity (Vega et al., 1998). Since CA requires variables to conform to a normal distribution, the skewness and kurtosis values were analyzed before the cluster analysis to check the normality of the distribution of each variable (Zhou et al., 2006; Lattin et al., 2003). In addition, all parameters were standardized through z-scale transformation (mean = 0; variance = 1) to minimize the effects of differences in measurement units and variance and to render the data dimensionless (Zhou et al., 2007). In this study, hierarchical agglomerative CA was performed on the standardized data set using Ward's method, with Euclidean distances as the measure of similarity (Singh et al., 2004; Wu et al., 2010). This method uses an analysis of variance approach to evaluate the distances between clusters, attempting to minimize the sum of squares of any two clusters that can be formed at each step (Simeonov et al., 2003; Wunderlin et al., 2001).
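The clustering procedure described above can be sketched in a few lines. This is a minimal illustration, assuming SciPy is available; the function name `ward_clusters` and the synthetic three-parameter data are hypothetical stand-ins, not the survey data or the software used in this study.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.stats import zscore


def ward_clusters(data, n_clusters=2):
    """Z-score each column, build a Ward dendrogram, and cut it into groups."""
    standardized = zscore(data, axis=0, ddof=1)  # mean 0, variance 1 per parameter
    tree = linkage(standardized, method="ward", metric="euclidean")
    labels = fcluster(tree, t=n_clusters, criterion="maxclust")
    return tree, labels


# Synthetic stand-in: 16 "months" x 3 parameters (temperature, salinity, DIN).
# The two group centres are loosely based on the seasonal means quoted in the text;
# the small uniform noise is invented for illustration only.
rng = np.random.default_rng(0)
cold = np.array([3.0, 31.7, 12.0]) + rng.uniform(-0.1, 0.1, size=(6, 3))
warm = np.array([15.0, 30.8, 8.0]) + rng.uniform(-0.1, 0.1, size=(10, 3))
months = np.vstack([cold, warm])

tree, labels = ward_clusters(months, n_clusters=2)
print(labels)  # one cluster label per month
```

The `tree` array can be passed to `scipy.cluster.hierarchy.dendrogram` to draw a dendrogram of the kind shown in Figures 3 and 4.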
DA was applied to evaluate the rationality of the temporal and spatial variations in coastal water quality determined by CA (Singh et al., 2004; Zhou et al., 2007). Backward stepwise DA was used to confirm the groups found by CA and to evaluate the spatial and temporal variations in terms of discriminant variables. In this case study, the monitoring periods and regions were the grouping variables, while all the measured parameters in the original dataset were the independent variables.
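The backward stepwise idea can be illustrated with a simplified sketch: start from all variables and repeatedly drop the one whose partial F statistic (derived from Wilks' lambda) is smallest, until every remaining variable exceeds a removal threshold. The function names, the threshold `f_remove=2.71`, and the synthetic data are assumptions made for illustration; the exact stepping rules of the statistical package used in this study are not stated in the text.

```python
import numpy as np


def wilks_lambda(X, y):
    """Wilks' lambda = det(within-group SSCP) / det(total SSCP)."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    total = centered.T @ centered
    within = np.zeros_like(total)
    for group in np.unique(y):
        dev = X[y == group] - X[y == group].mean(axis=0)
        within += dev.T @ dev
    return np.linalg.det(within) / np.linalg.det(total)


def backward_stepwise(X, y, f_remove=2.71):
    """Return the column indices retained after backward elimination."""
    n, g = X.shape[0], len(np.unique(y))
    keep = list(range(X.shape[1]))
    while len(keep) > 1:
        p = len(keep)
        lam_full = wilks_lambda(X[:, keep], y)
        f_values = []
        for j in keep:
            rest = [k for k in keep if k != j]
            partial = lam_full / wilks_lambda(X[:, rest], y)  # partial lambda of j
            f_values.append((n - g - p + 1) / (g - 1) * (1 - partial) / partial)
        weakest = int(np.argmin(f_values))
        if f_values[weakest] >= f_remove:
            break  # every remaining variable still discriminates
        keep.pop(weakest)
    return keep


# Synthetic example: column 0 separates the two groups, column 1 is pure noise.
rng = np.random.default_rng(1)
y = np.repeat([0, 1], 20)
X = np.column_stack([10.0 * y + rng.normal(size=40), rng.normal(size=40)])
kept = backward_stepwise(X, y)
lam = wilks_lambda(X, y)
```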
Before applying PCA, the relationships among the environmental variables, including water temperature, salinity, chlorophyll a concentrations and nutrient concentrations, were determined using Pearson's correlation coefficient. PCA was used to reduce the dimensionality of the data set by explaining the variance of a large set of inter-correlated variables in terms of a small set of underlying factors (principal components, PCs). The Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy and Bartlett's test of sphericity were performed prior to PCA to evaluate the suitability of the data. PCA of the coastal water quality data set was performed to extract significant PCs, and the Kaiser criterion (eigenvalue > 1) was used to determine the number of PCs retained. Factor analysis further reduces the contribution of the less significant variables obtained from PCA, and a new group of variables, known as varifactors, is extracted by rotating the axes defined by PCA. In the present study, the retained PCs were subjected to quartimax rotation to generate the varifactors.
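A compact sketch of this workflow is given below: PCA on the correlation matrix, retention of components by the Kaiser criterion, and an orthogonal quartimax rotation of the retained loadings. It assumes only NumPy; the function names and the synthetic data are hypothetical, and the rotation is a generic iterative implementation (orthomax with gamma = 0) rather than the routine used in this study.

```python
import numpy as np


def pca_kaiser(data):
    """PCA on the correlation matrix, keeping components with eigenvalue > 1."""
    corr = np.corrcoef(data, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]  # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1.0  # Kaiser criterion
    loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])
    explained = eigvals[keep] / eigvals.sum()
    return eigvals, loadings, explained


def quartimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal quartimax rotation of a loading matrix."""
    rotation = np.eye(loadings.shape[1])
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        u, s, vt = np.linalg.svd(loadings.T @ rotated**3)
        rotation = u @ vt  # nearest orthogonal matrix to the gradient
        if s.sum() < criterion * (1 + tol):
            break
        criterion = s.sum()
    return loadings @ rotation


# Synthetic stand-in: 192 samples x 7 variables driven by two latent factors.
rng = np.random.default_rng(2)
latent = rng.normal(size=(192, 2))
data = np.hstack([
    latent[:, [0]] + 0.5 * rng.normal(size=(192, 3)),
    latent[:, [1]] + 0.5 * rng.normal(size=(192, 3)),
    rng.normal(size=(192, 1)),
])
eigvals, loadings, explained = pca_kaiser(data)
varifactors = quartimax(loadings)
```

Because the rotation is orthogonal, the communality of each variable (the row sum of squared loadings) is unchanged; only the distribution of loadings across factors becomes simpler to interpret.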
Results and discussion
Temporal variation of water quality parameters
During the course of this study, the water temperature varied seasonally. The minimum water temperature was recorded in January 2010 (-0.1 °C), while the maximum water temperature was reached in August 2009 (24.4 °C). On average, salinity fluctuated between 29.6 psu and 33.0 psu. During the period of this study, the minimum salinity was measured in July 2009 after a period of rainfall, while the maximum salinity was in December 2009. Chlorophyll a concentrations exhibited two peaks during the 16-month period of this study. From December 2008 to February 2009, chlorophyll a concentration was low, around 0.3-1.0 μg l−1. Chlorophyll a concentration reached the first peak value of 4 μg l−1 in March 2009, and then decreased from April to June 2009. From July to September 2009, chlorophyll a concentration increased, with a maximum value of 7.9 μg l−1 in September 2009, and then decreased from October to November 2009. DIN concentration ranged between 1.1 μM and 19.8 μM (mean = 10.8 μM). Values higher than 5 μM were common during the study period. The highest mean monthly DIN value was 19.8 μM in September 2009. From December 2008 to February 2009, DIN concentration increased and reached a peak in February 2009 (18.5 μM). DIN concentration dropped to 1.1 μM in June 2009 and increased to 19.8 μM in September 2009. DIP concentration ranged from 0.1 μM to 0.5 μM (mean = 0.3 μM). Relatively high DIP concentrations were observed in both the winters of 2008/2009 and 2009/2010. DSi concentration ranged between 0.3 μM and 6.4 μM (mean = 1.6 μM). Similarly, relatively high concentrations of DSi were observed during the winters, while relatively low concentrations of DSi were observed in the spring of 2009.
Temporal variations in the coastal water quality of Yantai were evaluated through CA and DA. CA was used to group the months in the entire sampling period by similar water quality parameters. After log transformation of the original data set, the skewness and kurtosis values were significantly reduced, with ranges of -1.146 to 1.668 and -1.924 to 1.984, respectively, which were less than the critical values. Temporal cluster analysis rendered a dendrogram clustering the 16-month period into two seasonal groups (dry/cold season – Group A; wet/warm season – Group B) (Figure 3). Group A comprised the samples collected in winter (December 2008 to February 2009 and November 2009 to February 2010), while Group B included the samples collected in spring, summer, and autumn (March to October 2009 and March 2010). To further evaluate the clusters determined by temporal CA, a backward stepwise DA was conducted on the raw data after dividing the whole data set into two (seasonal) groups. Wilks' lambda and the Chi-square for the discriminant function were 0.250 and 259.663, respectively, at p < 0.001 (Table 1), suggesting that the temporal DA was credible and effective. The discriminant functions and classification matrices obtained from the backward stepwise mode of DA are shown in Tables 2 and 3, respectively. The backward stepwise DA mode constructed discriminant functions using five discriminant parameters (Table 3). For the two-cluster result, DA produced classification matrices with about 97.9% correct assignments (Table 2). Therefore, the temporal DA suggests that water temperature, salinity, DIN, DIP, and DSi concentrations were the most significant parameters for discriminating between the two seasonal periods. Box and whisker plots of these variables are shown in Figure 5. The statistical description of temporal variation in coastal water quality near Yantai, northern Yellow Sea, China, is shown in Table 4.
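As a rough plausibility check on the reported statistics, Wilks' lambda can be converted to a chi-square value with Bartlett's approximation, chi2 = -(n - 1 - (p + g) / 2) ln(lambda), with p(g - 1) degrees of freedom. The sketch below assumes n = 192 samples (12 stations × 16 months, i.e. 1,344 observations divided by 7 parameters), p = 5 retained variables and g = 2 groups; these inputs are inferred from the text rather than taken from Table 1.

```python
import math


def wilks_to_chi_square(wilks_lambda, n_samples, n_vars, n_groups):
    """Bartlett's chi-square approximation for Wilks' lambda."""
    chi2 = -(n_samples - 1 - (n_vars + n_groups) / 2) * math.log(wilks_lambda)
    dof = n_vars * (n_groups - 1)
    return chi2, dof


# Assumed inputs: 12 stations x 16 months, five discriminant variables, two seasons.
chi2, dof = wilks_to_chi_square(0.250, n_samples=12 * 16, n_vars=5, n_groups=2)
print(round(chi2, 1), dof)
```

Under these assumptions the approximation gives about 259.9, close to the reported 259.663; a difference of this size is what rounding lambda to three decimals would produce.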
Two periods (the dry-cold season and the wet-warm season) were identified in the present study. The mean water temperatures in the dry-cold season and the wet-warm season were 3.1 °C and 15.1 °C, respectively (Table 4; Figure 5). The temporal variation of water temperature was significant (P < 0.05) as determined by one-way analysis of variance. The chlorophyll a concentration, as an indicator of phytoplankton biomass and eutrophication (Ruiz-Ruiz et al., 2016), was higher in the wet-warm season (3.8 ± 3.7 μg l−1) than in the dry-cold season (0.6 ± 0.3 μg l−1) (Table 4; Figure 5). In addition, the Pearson correlation coefficient between chlorophyll a concentration and water temperature was 0.575 (P < 0.0001), suggesting that temporal variations in phytoplankton biomass were related to water temperature. Rainfall was relatively high from March to October and lower between November and February (Figure 2). Accordingly, the surface water salinity was higher in the dry-cold season (31.7 psu) than in the wet-warm season (30.8 psu). In general, higher rainfall and more land-based water input may wash soluble nutrients from the soil towards the coast and discharge them into coastal waters. However, in the coastal waters of Sishili Bay, the nutrient concentrations (DIN, DIP, and DSi) were higher in the dry-cold season than in the wet-warm season, although no significant temporal variations were detected (Table 4; Figure 5). A possible explanation for this inconsistency is that sewage disposal and winter snow (January and February) might be important sources of nutrient supply to the coastal waters of Yantai. Additionally, phytoplankton might consume more nutrients in the wet-warm season than in the dry-cold season, causing a decrease in the amount of dissolved nitrogen and phosphorus in the water.
The Pearson correlation matrix for the seven variables (water temperature, salinity, pH, chlorophyll a, and DIN, DIP and DSi concentrations) used in the PCA is shown in Table 5. A significant positive correlation existed between chlorophyll a concentration and water temperature. In this study, the KMO value was 0.563 and Bartlett's test was significant (Chi-square 320.738 with 21 degrees of freedom, significant at the 0.01 level). Therefore, PCA was considered an appropriate technique for reducing the dimensionality of the original data. In this study, PCA did not result in much data reduction; therefore, a quartimax rotation was performed, which achieved a simpler and more meaningful representation of the underlying factors by decreasing the contributions to PCs of variables with minor significance and increasing the influence of the more significant ones (Razmkhah et al., 2010). PCA of the entire original data set identified three significant PCs with eigenvalue >1 that together explained 71% of the total variance in the coastal water quality data set (Table 6). The first PC, accounting for 28.05% of the total variance, was positively correlated with water temperature and chlorophyll a concentration and negatively correlated with DIP concentration. When the chlorophyll a concentration was higher, the DIP concentration was lower, suggesting that the phytoplankton biomass was limited by DIP concentration. The second PC, accounting for 24.72% of the total variance, had strong positive loadings (>0.70) for DIN and DSi concentrations. This factor represented nutrient pollution from anthropogenic sources such as wastewater and agricultural activities (Singh et al., 2005; Zhou et al., 2007). The third PC, accounting for 17.93% of the total variance, had a strong loading (>0.70) for pH.
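The two suitability checks reported above can be sketched as follows, assuming NumPy and SciPy. The function names and the synthetic data are hypothetical; the formulas are the standard ones for Bartlett's test of sphericity, chi2 = -(n - 1 - (2p + 5) / 6) ln|R| with p(p - 1)/2 degrees of freedom, and for the overall KMO measure based on partial correlations.

```python
import numpy as np
from scipy.stats import chi2


def bartlett_sphericity(data):
    """Test whether the correlation matrix differs from the identity matrix."""
    n, p = data.shape
    corr = np.corrcoef(data, rowvar=False)
    statistic = -(n - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(corr))
    dof = p * (p - 1) // 2
    return statistic, dof, chi2.sf(statistic, dof)


def kmo(data):
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy."""
    corr = np.corrcoef(data, rowvar=False)
    inv = np.linalg.inv(corr)
    partial = -inv / np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    off_diagonal = ~np.eye(corr.shape[0], dtype=bool)
    r2 = (corr[off_diagonal] ** 2).sum()
    q2 = (partial[off_diagonal] ** 2).sum()
    return r2 / (r2 + q2)


# Synthetic stand-in: 192 samples x 7 partly correlated variables.
rng = np.random.default_rng(3)
shared = rng.normal(size=(192, 1))
data = 0.6 * shared + rng.normal(size=(192, 7))
statistic, dof, p_value = bartlett_sphericity(data)
adequacy = kmo(data)
print(dof)
```

For seven variables the degrees of freedom are 7 × 6 / 2 = 21, which matches the value reported with the chi-square statistic above.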
The results of the PCA allowed us to identify three varifactors, accounting for 71% of the temporal variability in the coastal water quality of Yantai. Based on the PCA and DA, water temperature and nutrient inputs may be major driving factors of the trophic status of the coastal waters of Yantai. The temporal variations in water quality were primarily due to the natural effects of seasonal change. Water temperature was a significant indicator of temporal variations in the coastal waters of Yantai city. The dry-cold season, from November to the following February, and the wet-warm season, from March to October, were distinguished. This two-period pattern was similar to that in Sanya Bay, South China Sea (Dong et al., 2010a). However, the trophic status of the coastal waters of Yantai did not correspond to the four seasons (spring, summer, autumn, and winter), because it was also related to pollution sources, such as aquaculture and domestic sewage effluent discharge (Wu et al., 2010).
Spatial variation of water quality parameters
CA was used to group sampling stations based on similarities in water quality characteristics throughout the entire sampling period. Spatial cluster analysis of all the samples produced a dendrogram with two groups (Figure 4). All 12 stations in the coastal waters of Yantai clustered into two groups, with Group A comprising five stations (A1-A3 and D1-D2) and Group B containing seven stations (B1-B4 and C1-C3). The stations of Group A were located outside Sishili Bay, near the sewage disposal (A1-A3) and dredge disposal (D1-D2) areas. The stations of Group B were located within Zhifu Bay (B1-B4), which hosts one of China's most important coastal ports, and Sishili Bay (C1-C3), which is one of the most intensive scallop culture areas in China.
The statistical descriptions of spatial variation in water quality parameters in Sishili Bay, northern Yellow Sea, China, are shown in Table 4. For the two clusters determined by spatial CA, backward stepwise DA was also performed on the raw data after dividing the whole data set into two spatial groups. There was no statistically significant difference in the seven parameters between the two groups, indicating low variances in spatial patterns of water quality parameters. The spatial variations in water temperature, salinity, chlorophyll a concentration and nutrient concentrations were not significant by one-way analysis of variance (Table 4).
The hydrodynamic conditions, including water flow and tides in Sishili Bay, may explain this. Sishili Bay is permanently open to the Yellow Sea, although there are some small islands in the mouth of the bay (Figure 1). The water exchange between Sishili Bay and the Yellow Sea is relatively unrestricted. The astronomical tide along the coast of Yantai is semi-diurnal, with a tidal range of 1.66 m. In addition, the Yellow Sea coastal current is considered a driving force of nutrient transport and mixing in Sishili Bay (Pang et al., 2005).
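The one-way analysis of variance used to compare the two spatial groups can be sketched with SciPy's `f_oneway`. The helper name `groups_differ` and the two small samples are invented for illustration and are not measurements from this study.

```python
from scipy.stats import f_oneway


def groups_differ(group_a, group_b, alpha=0.05):
    """One-way ANOVA for two groups; returns F, p, and a significance flag."""
    f_stat, p_value = f_oneway(group_a, group_b)
    return float(f_stat), float(p_value), bool(p_value < alpha)


# Hypothetical values of one parameter at stations outside and inside the bay.
outside_bay = [1.0, 2.0, 3.0]
inside_bay = [2.0, 3.0, 4.0]
f_stat, p_value, significant = groups_differ(outside_bay, inside_bay)
print(f_stat, p_value, significant)
```

With overlapping samples such as these, the F statistic is small and the difference between group means is not significant at the 0.05 level, which is the pattern reported for the spatial comparison.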
In summary, the present study showed that multivariate statistical methods (CA, DA, and PCA) were effective for evaluating coastal water quality. Hierarchical CA grouped the 16-month study period into two groups (dry-cold and wet-warm) based on similarities in coastal water quality characteristics. Based on these results, backward stepwise DA gave the best result, with good discriminatory ability according to significance validation tests, and identified five significant parameters for discrimination among temporal groups (water temperature, salinity, DIN, DIP, and DSi concentrations), correctly assigning about 97.9% of cases. PCA identified three varifactors that contributed to the temporal variations in the coastal water quality of Yantai. Based on the PCA and DA, water temperature and nutrients may be the major driving factors of the trophic status of the coastal waters of Yantai. However, backward stepwise DA failed to distinguish the spatial variations in the coastal water quality of Yantai, suggesting low variance in the spatial patterns of coastal water quality.
Blooms of the harmful phytoplankton Chattonella marina and the moon jellyfish Aurelia aurita have been reported in Sishili Bay, causing considerable economic losses (Jiang et al., 2011; Dong et al., 2012). Excess nutrient inputs were suggested as possible causes of the blooms of phytoplankton and jellyfish (Dong et al., 2010b; Lin et al., 2010). Our study suggested that water temperature and nutrient inputs may be major driving factors of the trophic status of the coastal waters of Yantai. Therefore, the government needs to be mindful of nutrient loadings in Sishili Bay from the harbor, sewage discharge, and dredge disposal. In addition, the low variance in the spatial patterns of coastal water quality of Yantai suggests that nutrient inputs from the sewage disposal and dredge disposal areas located outside Sishili Bay should be carefully monitored in future.
We thank Prof. John Keesing, Dr. Yajun Shi, Dr. Qingxi Han, Dr. Zhang Yong and Mr. Xin Li for their assistance in this study. This work was supported by grants from the Knowledge Innovation Research Project of Chinese Academy of Sciences (KZCX2-YW-Q07-04) and the National Natural Science Foundation of China (No. 41576152).