MTH241 (Probability and Statistics) Spring 2013

Statistical Study of the Geochemical Properties of Kampoosa Bog

Group D - Section 2 Three Anonymous Smith Students

  1. Abstract: Kampoosa Bog, home to many endangered species, is polluted each year by road salt (sodium chloride), which alters its sensitive ecosystem. The purpose of this study is to predict calcium levels in Kampoosa Bog from sodium concentration (from pollution), well depth, and season. We used a mixed effects model to account for the lack of independence among geochemical measurements from the same site. Our model indicates a positive relationship between Ca2+ and Na+ concentrations. This relationship is strongest at shallower well depths and at sites closest to the road, likely due to their proximity to the pollution source. Our summary shows a statistically significant relationship between Ca2+ and Na+ when controlling for site and well depth.

  2. Introduction: Kampoosa Bog is the largest and most ecologically diverse calcareous fen in Massachusetts [1]. The bog, technically a fen, is home to seven rare and endangered plants and animals; consequently, the site is an area of critical environmental concern (ACEC). The northern part of Kampoosa Bog is cut by the Massachusetts Turnpike. Each year, many hundreds of tons of road salt (NaCl) are used to maintain safe driving conditions on the turnpike. Unfortunately, melting snow and precipitation transport road salt from the highway to the surrounding environment, where it harms sensitive flora and fauna by altering ecosystem geochemistry. Fen geochemistry is characterized by three major ecological components: groundwater, surface water, and the peat chemistry of the site. Similar to soil, peat is a compositionally complex material, containing many metals, ions, and organics. Water inputs and distribution can cause physical and chemical changes to the landscape, making road salt a major concern. Sodium (one component of road salt) is known to undergo a process called cation exchange with metals like calcium and magnesium on peat surfaces [2]. In that process, sodium is adsorbed onto the peat and calcium (or magnesium) is desorbed and added to the surrounding water. Chloride, the other component of road salt, in contrast, does not adsorb onto peat as it travels through a wetland. Changing levels of calcium and magnesium as a result of road salt flux alter fen geochemistry. This study focuses on exploring this process as applied to Kampoosa Bog. We use data collected by Smith College Professor of Geology Amy Rhodes, who monitored levels of cations like calcium and magnesium at five well sites over several years.
Only these five well sites exist in the 70-hectare fen, but it is of scientific interest to characterize ion distribution throughout the entire bog because of potential danger to the rare flora and fauna of this natural heritage site. Changes in the aqueous environment have been suggested as a cause of the spread of invasive halophytes like Phragmites, which shifts the species composition of the fen [2]. In this study, we test whether measured sodium concentrations from water samples at the five well sites can predict calcium levels in sampled areas, conditioning on water sample depth (surface water vs. three wells of varying depths), well site distance from the turnpike, and time of year (seasonal changes in road salt use). Samples from a site uphill of the turnpike (MB-100) serve as the control and establish “background” levels of sodium and calcium, which are naturally present within peat and water.

Model Assumptions: Observations are not independent, because each site contributes multiple observations. Because multiple regression assumes independent observations, we used a linear mixed effects model with a random intercept for site [3].
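
One way to write a random-intercept model of this kind is sketched below. The notation is introduced here for illustration only: i indexes the site, j indexes the sample within a site, D_k are indicator variables for Wells 1 to 3 (surface water as the baseline), and S_m are indicator variables for three of the four seasons (the fourth as the baseline).

```latex
\[
\mathrm{Ca}_{ij} = \beta_0 + \beta_1\,\mathrm{Na}_{ij}
  + \sum_{k=1}^{3} \gamma_k\, D_{k,ij}
  + \sum_{m=1}^{3} \delta_m\, S_{m,ij}
  + b_i + \varepsilon_{ij},
\qquad b_i \sim N(0, \sigma_b^2), \quad \varepsilon_{ij} \sim N(0, \sigma^2)
\]
```

Here b_i is the site-level shift shared by all samples from site i, which is what lets observations from the same site be correlated, and the residual term covers the remaining sample-to-sample variation.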

  3. Data:

These data were collected by Smith College Professor of Geology Amy Rhodes, who monitored levels of metals like calcium and magnesium at five well sites and a control (MB-100) over several years [1]. Samples from MB-100, uphill of the turnpike, serve as the control and establish “background” levels of sodium and calcium. Ion concentrations are measured in mg/L. The conditioning variables that we consider are well site distance from the turnpike (in meters), time of year (seasonal changes in road salt use; variable levels: Spring, Summer, Fall, and Winter), and water sample depth (surface water vs. three wells of varying depths; variable levels: Well 0, Well 1, Well 2, and Well 3).

require(mosaic)
require(nlme)  # provides lme() for the mixed effects model
options(digits = 3)
options(show.signif.stars = FALSE)
trellis.par.set(theme = col.mosaic())  # get a better color scheme for lattice

# read in the data (stringsAsFactors = TRUE so that levels() works below in R >= 4.0)
Sites = read.csv("GroupD_masterbog.csv", stringsAsFactors = TRUE)

# reorder sites by distance from the road
Sites$Site = factor(Sites$Site, levels(Sites$Site)[c(6, 3, 2, 4, 5, 1)])

# reorder seasons by chronology
Sites$Season = factor(Sites$Season, levels(Sites$Season)[c(4, 2, 3, 1)])


# create no control subset
noControl = subset(Sites, Sites$Site != "MB-100")

This code shows how we organized our data sets. Sites holds all observations, and noControl is the subset that excludes the control observations (MB-100). We also ordered the sites by increasing distance from the road: Site E, B, A, C, D, with the control MB-100 last.
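
Reordering by index position depends on the alphabetical order of the labels, so the same ordering can also be spelled out by name. The sketch below uses a small hypothetical vector of site labels in place of the data file; site, road_order, site_by_name, and site_by_index are illustrative names.

```r
# Hypothetical stand-in for the Site column (labels as they appear in the tallies)
site <- factor(c("MB-100", "Site A", "Site B", "Site C", "Site D", "Site E"))

# Desired order: increasing distance from the road, control last
road_order <- c("Site E", "Site B", "Site A", "Site C", "Site D", "MB-100")

# Reorder by name (explicit) and by index position (as in the code above)
site_by_name  <- factor(site, levels = road_order)
site_by_index <- factor(site, levels(site)[c(6, 3, 2, 4, 5, 1)])

# Both approaches should give the same level order
identical(levels(site_by_name), levels(site_by_index))
```

Naming the levels makes the intended order visible in the code and avoids surprises if a label is added or renamed.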

# salt
histogram(~Na., data = Sites)

[Plot: histogram of sodium concentration, Na. (mg/L)]

favstats(~Na., data = Sites)
##  min   Q1 median  Q3 max mean   sd   n missing
##  2.3 32.5     75 103 202 74.3 47.4 295       1

The distribution of sodium concentration is roughly unimodal and right-skewed, with a mean of 74.3 mg/L and a standard deviation of 47.4 mg/L.

# metal
histogram(~Ca2., data = Sites)

[Plot: histogram of calcium concentration, Ca2. (mg/L)]

favstats(~Ca2., data = Sites)
##   min   Q1 median   Q3 max mean   sd   n missing
##  21.5 49.3     67 90.4 161 71.8 30.3 295       1

Calcium, the response variable, also appears to be unimodal and right-skewed, but its distribution looks closer to normal. The mean is 71.8 mg/L and the standard deviation is 30.3 mg/L.

histogram(~Season, data = Sites)

[Plot: distribution of samples by season]

tally(~Season, format = "percent", data = Sites)
## 
## Winter Spring Summer   Fall  Total 
##   12.8   34.8   34.1   18.2  100.0

The majority of the samples were taken in Spring and Summer (34.8% and 34.1% respectively). 18.2% of the samples were taken in the Fall and 12.8% were taken in the Winter.

histogram(~WellDepth, data = Sites)

[Plot: distribution of samples by well depth]

tally(~WellDepth, format = "percent", data = Sites)
## 
## Well 0 Well 1 Well 2 Well 3  Total 
##   34.1   34.1   16.9   14.9  100.0

About two thirds of the samples were taken at the surface (Well 0, 34.1%) or at Well 1 (depth ~1 m, 34.1%). 16.9% of the samples were taken at Well 2 (depth ~2 m), and 14.9% were taken at Well 3 (depth ~3 m).

histogram(~Site, data = Sites)

[Plot: distribution of samples by site]

tally(~Site, format = "percent", data = Sites)
## 
## Site E Site B Site A Site C Site D MB-100  Total 
##   4.39  29.39  27.70  12.50  10.47  15.54 100.00

Most of the samples were taken at Site B (29.4%) and Site A (27.7%). Sites C, D, and MB-100 each have about 10-16% of the samples. Site E has the smallest share of the samples (4.39%).

bwplot(Na. ~ Site, data = Sites)

[Plot: box plots of sodium concentration (mg/L) by site]

As expected, the salt concentration goes down with increasing distance from the road. The control site has salt levels near 0 mg/L. This plot also illustrates the variation within each site and between sites, which is what the mixed effects model is built to account for.
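
To illustrate how a random intercept separates between-site from within-site variation, here is a minimal sketch on simulated data (not the bog measurements). Every name and number in it (toy, site_shift, the slope of 0.3, the standard deviations of 15 and 10) is an assumption chosen for illustration.

```r
library(nlme)  # lme(); nlme is a recommended package distributed with R

set.seed(241)

# Simulated stand-in data (NOT the bog measurements): 5 sites, 50 samples each
n_sites <- 5
n_per_site <- 50
site <- factor(rep(paste("Site", LETTERS[1:n_sites]), each = n_per_site))

# Each site gets its own shift (between-site variation, SD 15) ...
site_shift <- rnorm(n_sites, mean = 0, sd = 15)
sodium <- runif(n_sites * n_per_site, min = 0, max = 200)

# ... and each sample gets independent noise (within-site variation, SD 10)
calcium <- 30 + 0.3 * sodium + site_shift[as.integer(site)] +
  rnorm(n_sites * n_per_site, mean = 0, sd = 10)
toy <- data.frame(site, sodium, calcium)

# Random intercept for site: one shared shift per site
toy_fit <- lme(fixed = calcium ~ sodium, random = ~1 | site, data = toy)

VarCorr(toy_fit)                          # between-site and residual SDs
toy_slope <- fixef(toy_fit)[["sodium"]]   # estimated sodium slope
toy_slope
```

With only five sites, the between-site standard deviation is estimated from five shifts and can land far from the simulated value, while the slope and the residual standard deviation draw on all 250 samples.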

  4. Results:

We see a strong positive linear relationship between calcium levels and sodium levels at the surface and at shallow depth (Wells 0 and 1). For the deeper Wells 2 and 3, we do not observe a relationship between calcium and sodium. The control site (MB-100) was removed for this plot because it was categorized as surface water (i.e. as Well 0). MB-100 is also not affected by road salt, because it lies uphill of the turnpike.

xyplot(Ca2. ~ Na., data = noControl, groups = WellDepth, ylab = "Concentration of Calcium (mg/L)", 
    xlab = "Concentration of Sodium (mg/L)", pch = c(15, 16, 17, 18), col = c(1, 
        2, 3, 4), type = c("p", "r"), key = list(points = list(pch = c(15, 16, 
        17, 18)), col = c(1, 2, 3, 4), columns = 2, text = list(c("Well 0 (Surface Water)", 
        "Well 1 (~1 m)", "Well 2 (~2 m)", "Well 3 (~ 3 m)"))))

[Plot: calcium vs. sodium concentration (mg/L) by well depth, with fitted lines; control site excluded]

The next plot shows the relationship between Ca2+ concentration (mg/L) and Na+ concentration (mg/L) grouped by site (distance from road). MB-100 (pink), the control site, shows no relationship between Ca2+ and Na+ concentrations because there is no salt pollution at that site. Sites D (light blue) and C (dark blue) also show little relationship between Ca2+ and Na+ concentrations, possibly because these two sites are farther from the road. The closer sites, E (black) and B (red), show a more positive relationship between the two concentrations. Measurements from the deeper Wells 2 and 3 (groups in the upper left corner of the plot) show less of a relationship, as discussed previously.

xyplot(Ca2. ~ Na., data = Sites, groups = Site, ylab = "Concentration of Calcium (mg/L)", 
    xlab = "Concentration of Sodium (mg/L)", pch = c(1, 2, 15, 16, 17, 18), 
    col = c(1, 2, 3, 4, 5, 6), key = list(points = list(pch = c(1, 2, 15, 16, 
        17, 18)), col = c(1, 2, 3, 4, 5, 6), columns = 2, text = list(c("Site E", 
        "Site B", "Site A", "Site C", "Site D", "MB-100"))))

[Plot: calcium vs. sodium concentration (mg/L) by site]

We plotted all the surface water observations to investigate the effect of season. As the plot below shows, we did not observe clear differences in calcium and sodium levels by season.

Depth0 = subset(noControl, WellDepth == "Well 0")
xyplot(Ca2. ~ Na., data = Depth0, groups = Season, auto.key = TRUE)

[Plot: calcium vs. sodium concentration for surface water (Well 0) by season]

The coefficients for sodium level and well depth are statistically significant (all p < 0.01). Among the season categories, Spring and Summer do not differ significantly from Winter (p = 0.33 and p = 0.34), while Fall shows a modest difference (8.1 mg/L, p = 0.026). The estimated standard deviation between sites is 16.6 mg/L, and the residual standard deviation within each site is 15.8 mg/L.

Holding well depth and season constant, each 1 mg/L increase in sodium concentration corresponds to an increase of 0.3 mg/L in calcium. Sodium and calcium can replace one another on peat surfaces via cation exchange. An influx of sodium from road salt pollution can cause more calcium ions to be displaced from the peat (desorbed), thus increasing the levels of calcium in measured samples. Our coefficient of 0.3 supports the claim that there is a positive association between sodium and calcium levels. Cation exchange reactions are likely occurring to some degree within the fen.

Relative to surface water (Well 0) and holding season and sodium levels constant, there is an additional 7.9 mg/L of calcium for Well 1 (approximately 1 meter deep). For Well 2 the relative increase in calcium levels is 32.6 mg/L, and for Well 3 it is 61.2 mg/L. The increase in calcium levels with depth is likely explained by the bedrock geology of the fen. Calcium levels rise with depth because the calcitic bedrock weathers to produce calcium ions in minerotrophic waters, thus enriching the deepest peat layers first.
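
As a worked example of how these coefficients combine, the sketch below hard-codes the rounded fixed-effect estimates from the model summary and computes a population-level prediction (site random effect set to zero). The names fixed and predict_calcium are hypothetical helpers, and because the inputs are rounded, the results are approximate.

```r
# Rounded fixed-effect estimates copied from the summary table (mg/L)
fixed <- c("(Intercept)" = 32.5, "Na." = 0.3,
           "Well 1" = 7.9, "Well 2" = 32.6, "Well 3" = 61.2,
           "Spring" = 3.3, "Summer" = 3.1, "Fall" = 8.1)

# Population-level prediction; Well 0 and Winter are the baseline levels
predict_calcium <- function(sodium, depth = "Well 0", season = "Winter") {
  depth_effect  <- if (depth == "Well 0") 0 else fixed[[depth]]
  season_effect <- if (season == "Winter") 0 else fixed[[season]]
  fixed[["(Intercept)"]] + fixed[["Na."]] * sodium + depth_effect + season_effect
}

# Median sodium (75 mg/L), Well 1, Spring: 32.5 + 0.3 * 75 + 7.9 + 3.3
predict_calcium(75, depth = "Well 1", season = "Spring")

# Difference between Well 3 and surface water at the same sodium and season
predict_calcium(75, "Well 3", "Spring") - predict_calcium(75, "Well 0", "Spring")
```

The first call gives about 66.2 mg/L, and the second recovers the Well 3 coefficient of 61.2 mg/L, since sodium and season cancel out of the difference.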

lmeint = lme(fixed = Ca2. ~ Na. + WellDepth + Season, random = ~1 | Site, na.action = na.omit, 
    data = noControl)

summary(lmeint)
## Linear mixed-effects model fit by REML
##  Data: noControl 
##    AIC  BIC logLik
##   2087 2122  -1033
## 
## Random effects:
##  Formula: ~1 | Site
##         (Intercept) Residual
## StdDev:        16.6     15.8
## 
## Fixed effects: Ca2. ~ Na. + WellDepth + Season 
##                 Value Std.Error  DF t-value p-value
## (Intercept)      32.5      8.84 237    3.68  0.0003
## Na.               0.3      0.04 237    7.87  0.0000
## WellDepthWell 1   7.9      2.69 237    2.94  0.0036
## WellDepthWell 2  32.6      3.14 237   10.38  0.0000
## WellDepthWell 3  61.2      3.56 237   17.19  0.0000
## SeasonSpring      3.3      3.37 237    0.97  0.3333
## SeasonSummer      3.1      3.29 237    0.95  0.3442
## SeasonFall        8.1      3.60 237    2.24  0.0263
##  Correlation: 
##                 (Intr) Na.    WllDW1 WllDW2 WllDW3 SsnSpr SsnSmm
## Na.             -0.383                                          
## WellDepthWell 1 -0.145 -0.103                                   
## WellDepthWell 2 -0.205  0.112  0.507                            
## WellDepthWell 3 -0.285  0.395  0.448  0.438                     
## SeasonSpring    -0.299  0.115 -0.064  0.003  0.030              
## SeasonSummer    -0.197 -0.182  0.035 -0.064 -0.075  0.669       
## SeasonFall      -0.233 -0.007  0.005 -0.022 -0.006  0.622  0.639
## 
## Standardized Within-Group Residuals:
##    Min     Q1    Med     Q3    Max 
## -3.135 -0.780  0.022  0.550  3.313 
## 
## Number of Observations: 249
## Number of Groups: 5

We do not analyze residuals for this model (see Conclusions).
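
The two standard deviations in the Random effects block of the summary can be combined into an intraclass correlation: the share of the variance left over after the fixed effects that is attributable to differences between sites. A small sketch using the rounded values printed above (16.6 for the site intercept, 15.8 for the residual); the variable names are ours.

```r
# Rounded standard deviations from the "Random effects" block (mg/L)
sd_site  <- 16.6   # between sites (random intercept)
sd_resid <- 15.8   # within site (residual)

# Intraclass correlation: between-site variance over total variance
icc <- sd_site^2 / (sd_site^2 + sd_resid^2)
round(icc, 2)
```

This comes to about 0.52, which suggests that roughly half of the remaining variation in calcium sits between sites rather than within them, consistent with the need to model site explicitly.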

  5. Conclusions:

After accounting for clustering within sites, our data show a positive relationship between Ca2+ and Na+ concentrations. This relationship is strongest at shallower well depths and at sites closest to the road, likely due to their proximity to the pollution source. Our summary shows a statistically significant relationship between Ca2+ and Na+ when controlling for site, well depth, and season. The limitations of our model include the small number of samples from Site E (closest to the road). We also had far more samples from spring and summer than from winter and fall. We could not analyze residuals for this model, as that would go beyond the scope of this class. Because measurements from the same site are not independent of each other, measurement errors may be correlated within a site, and a random intercept for site may not capture all of that dependence.

References:

  1. Rhodes, A. L.; Guswa, A. J.; Pufall, A. Fate and Transport of Road Salt Contamination Through a Calcareous Fen: Kampoosa Bog, Stockbridge, MA. Massachusetts Environmental Trust, 2005.
  2. Richburg, J. A.; Patterson III, W. A.; Lowenstein, F. Effects of Road Salt and Phragmites australis Invasion on the Vegetation of a Western Massachusetts Calcareous Lake-Basin Fen. Wetlands, 2001, 21(2), 247-255.
  3. Finucane, M. M.; Samet, J. H.; Horton, N. J. Translational Methods in Biostatistics: Linear Mixed Effect Regression Models of Alcohol Consumption and HIV Disease Progression Over Time. Epidemiologic Perspectives & Innovations, 2007, 4(8).