This is an attempt to reproduce the anaylses presented in the paper Parasitism alters three power laws of scaling in a metazoan community: Taylor’s law, density-mass allometry, and variance-mass allometry, by Lagrue, Poulin, and Cohen (http://www.pnas.org/content/112/6/1791). Data are here as txt, but need to be saved as csv.
[The study was conducted] in metazoan communities in lakes of Otago, New Zealand. In 13,752 samples comprising 1,037,058 organisms, we found that species of different lifestyles differed in taxonomic distribution and body mass and were well described by three power laws: a spatial Taylor’s law (the spatial variance in population density was a power-law function of the spatial mean population density); density-mass allometry (the spatial mean population density was a power-law function of mean body mass); and variance-mass allometry (the spatial variance in population density was a power-law function of mean body mass). To our knowledge, this constitutes the first empirical confirmation of variance-mass allometry for any animal community. We found that the parameter values of all three relationships differed for species with different lifestyles in the same communities. Taylor’s law and density-mass allometry accurately predicted the form and parameter values of variance-mass allometry. We conclude that species of different lifestyles in these metazoan communities obeyed the same major ecological power-law relationships but did so with parameters specific to each lifestyle […].
Let’s see if we are able to reproduce their results.
Clean R’s memory, load useful packages, load data.
# clean R's memory:
rm(list=ls())
# Load packages:
library(tidyr)
library(dplyr)
##
## Attaching package: 'dplyr'
##
## The following objects are masked from 'package:stats':
##
## filter, lag
##
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(lubridate)
library(stringr)
library(ggplot2)
library(RCurl)
## Loading required package: bitops
library(reshape2)
library(knitr)
# Load data:
dd <- read.csv(text=getURL("https://raw.githubusercontent.com/opetchey/RREEBES/Lagrue/Lagrue_etal_2014_PNAS/data/Lague2014.csv"), header=T)
# head(dd)
# str(dd)
# make some colnames more R-friendly:
colnames(dd)[which(names(dd) == "Mean.density..m..")] <- "Mean_density"
colnames(dd)[which(names(dd) == "Body.mass..mg.")] <- "Body_mass_mg"
Numbers (counts) and percentages of samples from broad taxonomic groups for each lifestyle separately.
#?aggregate
counts <- aggregate(x=dd$Species, by=list(Taxa=dd$Taxonomic.group, lifestyle=dd$Life.style), function(x) length(x))
colnames(counts)<-c("Taxa", "Lifestyle","Counts")
# same information in a different format:
counts2 <- tapply(dd$Species, list(Taxa=dd$Taxonomic.group, lifestyle=dd$Life.style), na.omit(length))
# replace NA's with zeroes:
counts2[is.na(counts2)]<- 0
total <- aggregate(x=dd$Species, by=list(Lifestyle=dd$Life.style), function(x) length(x))
colnames(total)<- c("Lifestyle", "Total")
tab1 <- merge(counts, total, by="Lifestyle")
tab1$percent <- tab1$Counts / tab1$Total * 100
tab1$percent <- round(tab1$percent, 0)
kable(tab1)
Lifestyle | Taxa | Counts | Total | percent |
---|---|---|---|---|
Free-living | Annelid | 26 | 329 | 8 |
Free-living | Crustacean | 107 | 329 | 33 |
Free-living | Fish | 5 | 329 | 2 |
Free-living | Insect | 122 | 329 | 37 |
Free-living | Mite | 12 | 329 | 4 |
Free-living | Mollusc | 35 | 329 | 11 |
Free-living | Other | 22 | 329 | 7 |
Free-living parasitised | Annelid | 18 | 151 | 12 |
Free-living parasitised | Crustacean | 18 | 151 | 12 |
Free-living parasitised | Fish | 58 | 151 | 38 |
Free-living parasitised | Insect | 45 | 151 | 30 |
Free-living parasitised | Mollusc | 12 | 151 | 8 |
Parasite | Acanthocephalan | 8 | 253 | 3 |
Parasite | Cestode | 2 | 253 | 1 |
Parasite | Mite | 4 | 253 | 2 |
Parasite | Nematoda | 26 | 253 | 10 |
Parasite | Trematode | 213 | 253 | 84 |
Parameter estimates using ordinary least squares linear regression to fit TL, DMA, and VMA to log- transformed data on the spatial variation of population density (individuals per square meter) of free-living unparasitized species, free-living parasitized species, and parasitic species; and the distribution of average body masses of species in each lifestyle.
# TL: variance = a × (mean density)^b
# ie Log.variance = log10a + Log.mean * b
lm.tl.unp <- lm(Log.variance ~ Log.mean, data=dd[dd$Life.style=="Free-living",])
lm.tl.prtz <- lm(Log.variance ~ Log.mean, data=dd[dd$Life.style=="Free-living parasitised",])
lm.tl.par <- lm(Log.variance ~ Log.mean, data=dd[dd$Life.style=="Parasite",])
# DMA: Mean density = u × (mean body mass)^v
lm.dma.unp <- lm(Log.mean ~ Log.body.mass, data=dd[dd$Life.style=="Free-living",])
lm.dma.prtz <- lm(Log.mean ~ Log.body.mass, data=dd[dd$Life.style=="Free-living parasitised",])
lm.dma.par <- lm(Log.mean ~ Log.body.mass, data=dd[dd$Life.style=="Parasite",])
# VMA: variance = c × (mean body mass)^d
lm.vma.unp <- lm(Log.variance ~ Log.body.mass, data=dd[dd$Life.style=="Free-living",])
lm.vma.prtz <- lm(Log.variance ~ Log.body.mass, data=dd[dd$Life.style=="Free-living parasitised",])
lm.vma.par <- lm(Log.variance ~ Log.body.mass, data=dd[dd$Life.style=="Parasite",])
# Create a dataframe where to store info:
table2 <- data.frame(model=c(rep("TL", 7), rep("DMA",7), rep("VMA",7)))
table2$values <- c("log10a", "log10a - 95%CI", "log10a + 95%CI",
"b", "b - 95%CI", "b + 95%CI", "adj.R2",
"log10u", "log10u - 95%CI", "log10u + 95%CI",
"v", "v - 95%CI", "v + 95%CI","adj.R2",
"log10c", "log10c - 95%CI", "log10c + 95%CI",
"d", "d - 95%CI", "d + 95%CI","adj.R2"
)
table2$free.living.unparasitized <- rep("NA",21)
table2$free.living.parasitized <- rep("NA",21)
table2$parasite <- rep("NA",21)
# enter info about TL
table2[1,3]<-round(summary(lm.tl.unp)$coef[1,1],4)
table2[2,3]<-round(summary(lm.tl.unp)$coef[1,1] - 1.96*summary(lm.tl.unp)$coef[1,2],4)
table2[3,3]<-round(summary(lm.tl.unp)$coef[1,1] + 1.96*summary(lm.tl.unp)$coef[1,2],4)
table2[4,3]<-round(summary(lm.tl.unp)$coef[2,1],4)
table2[5,3]<-round(summary(lm.tl.unp)$coef[2,1] - 1.96*summary(lm.tl.unp)$coef[2,2],4)
table2[6,3]<-round(summary(lm.tl.unp)$coef[2,1] + 1.96*summary(lm.tl.unp)$coef[2,2],4)
table2[7,3]<-round(summary(lm.tl.unp)$adj.r.squared,4)
table2[1,4]<-round(summary(lm.tl.prtz)$coef[1,1],4)
table2[2,4]<-round(summary(lm.tl.prtz)$coef[1,1] - 1.96*summary(lm.tl.prtz)$coef[1,2],4)
table2[3,4]<-round(summary(lm.tl.prtz)$coef[1,1] + 1.96*summary(lm.tl.prtz)$coef[1,2],4)
table2[4,4]<-round(summary(lm.tl.prtz)$coef[2,1],4)
table2[5,4]<-round(summary(lm.tl.prtz)$coef[2,1] - 1.96*summary(lm.tl.prtz)$coef[2,2],4)
table2[6,4]<-round(summary(lm.tl.prtz)$coef[2,1] + 1.96*summary(lm.tl.prtz)$coef[2,2],4)
table2[7,4]<-round(summary(lm.tl.prtz)$adj.r.squared,4)
table2[1,5]<-round(summary(lm.tl.par)$coef[1,1],4)
table2[2,5]<-round(summary(lm.tl.par)$coef[1,1] - 1.96*summary(lm.tl.par)$coef[1,2],4)
table2[3,5]<-round(summary(lm.tl.par)$coef[1,1] + 1.96*summary(lm.tl.par)$coef[1,2],4)
table2[4,5]<-round(summary(lm.tl.par)$coef[2,1],4)
table2[5,5]<-round(summary(lm.tl.par)$coef[2,1] - 1.96*summary(lm.tl.par)$coef[2,2],4)
table2[6,5]<-round(summary(lm.tl.par)$coef[2,1] + 1.96*summary(lm.tl.par)$coef[2,2],4)
table2[7,5]<-round(summary(lm.tl.par)$adj.r.squared,4)
# enter info about DMA
table2[1+7,3]<-round(summary(lm.dma.unp)$coef[1,1],4)
table2[2+7,3]<-round(summary(lm.dma.unp)$coef[1,1] - 1.96*summary(lm.dma.unp)$coef[1,2],4)
table2[3+7,3]<-round(summary(lm.dma.unp)$coef[1,1] + 1.96*summary(lm.dma.unp)$coef[1,2],4)
table2[4+7,3]<-round(summary(lm.dma.unp)$coef[2,1],4)
table2[5+7,3]<-round(summary(lm.dma.unp)$coef[2,1] - 1.96*summary(lm.dma.unp)$coef[2,2],4)
table2[6+7,3]<-round(summary(lm.dma.unp)$coef[2,1] + 1.96*summary(lm.dma.unp)$coef[2,2],4)
table2[7+7,3]<-round(summary(lm.dma.unp)$adj.r.squared,4)
table2[1+7,4]<-round(summary(lm.dma.prtz)$coef[1,1],4)
table2[2+7,4]<-round(summary(lm.dma.prtz)$coef[1,1] - 1.96*summary(lm.dma.prtz)$coef[1,2],4)
table2[3+7,4]<-round(summary(lm.dma.prtz)$coef[1,1] + 1.96*summary(lm.dma.prtz)$coef[1,2],4)
table2[4+7,4]<-round(summary(lm.dma.prtz)$coef[2,1],4)
table2[5+7,4]<-round(summary(lm.dma.prtz)$coef[2,1] - 1.96*summary(lm.dma.prtz)$coef[2,2],4)
table2[6+7,4]<-round(summary(lm.dma.prtz)$coef[2,1] + 1.96*summary(lm.dma.prtz)$coef[2,2],4)
table2[7+7,4]<-round(summary(lm.dma.prtz)$adj.r.squared,4)
table2[1+7,5]<-round(summary(lm.dma.par)$coef[1,1],4)
table2[2+7,5]<-round(summary(lm.dma.par)$coef[1,1] - 1.96*summary(lm.dma.par)$coef[1,2],4)
table2[3+7,5]<-round(summary(lm.dma.par)$coef[1,1] + 1.96*summary(lm.dma.par)$coef[1,2],4)
table2[4+7,5]<-round(summary(lm.dma.par)$coef[2,1],4)
table2[5+7,5]<-round(summary(lm.dma.par)$coef[2,1] - 1.96*summary(lm.dma.par)$coef[2,2],4)
table2[6+7,5]<-round(summary(lm.dma.par)$coef[2,1] + 1.96*summary(lm.dma.par)$coef[2,2],4)
table2[7+7,5]<-round(summary(lm.dma.par)$adj.r.squared,4)
# enter info about VMA
table2[1+14,3]<-round(summary(lm.vma.unp)$coef[1,1],4)
table2[2+14,3]<-round(summary(lm.vma.unp)$coef[1,1] - 1.96*summary(lm.vma.unp)$coef[1,2],4)
table2[3+14,3]<-round(summary(lm.vma.unp)$coef[1,1] + 1.96*summary(lm.vma.unp)$coef[1,2],4)
table2[4+14,3]<-round(summary(lm.vma.unp)$coef[2,1],4)
table2[5+14,3]<-round(summary(lm.vma.unp)$coef[2,1] - 1.96*summary(lm.vma.unp)$coef[2,2],4)
table2[6+14,3]<-round(summary(lm.vma.unp)$coef[2,1] + 1.96*summary(lm.vma.unp)$coef[2,2],4)
table2[7+14,3]<-round(summary(lm.vma.unp)$adj.r.squared,4)
table2[1+14,4]<-round(summary(lm.vma.prtz)$coef[1,1],4)
table2[2+14,4]<-round(summary(lm.vma.prtz)$coef[1,1] - 1.96*summary(lm.vma.prtz)$coef[1,2],4)
table2[3+14,4]<-round(summary(lm.vma.prtz)$coef[1,1] + 1.96*summary(lm.vma.prtz)$coef[1,2],4)
table2[4+14,4]<-round(summary(lm.vma.prtz)$coef[2,1],4)
table2[5+14,4]<-round(summary(lm.vma.prtz)$coef[2,1] - 1.96*summary(lm.vma.prtz)$coef[2,2],4)
table2[6+14,4]<-round(summary(lm.vma.prtz)$coef[2,1] + 1.96*summary(lm.vma.prtz)$coef[2,2],4)
table2[7+14,4]<-round(summary(lm.vma.prtz)$adj.r.squared,4)
table2[1+14,5]<-round(summary(lm.vma.par)$coef[1,1],4)
table2[2+14,5]<-round(summary(lm.vma.par)$coef[1,1] - 1.96*summary(lm.vma.par)$coef[1,2],4)
table2[3+14,5]<-round(summary(lm.vma.par)$coef[1,1] + 1.96*summary(lm.vma.par)$coef[1,2],4)
table2[4+14,5]<-round(summary(lm.vma.par)$coef[2,1],4)
table2[5+14,5]<-round(summary(lm.vma.par)$coef[2,1] - 1.96*summary(lm.vma.par)$coef[2,2],4)
table2[6+14,5]<-round(summary(lm.vma.par)$coef[2,1] + 1.96*summary(lm.vma.par)$coef[2,2],4)
table2[7+14,5]<-round(summary(lm.vma.par)$adj.r.squared,4)
kable(table2)
model | values | free.living.unparasitized | free.living.parasitized | parasite |
---|---|---|---|---|
TL | log10a | 0.806 | 0.2903 | 0.4333 |
TL | log10a - 95%CI | 0.7529 | 0.205 | 0.3553 |
TL | log10a + 95%CI | 0.859 | 0.3755 | 0.5114 |
TL | b | 1.6802 | 2.0193 | 2.102 |
TL | b - 95%CI | 1.6442 | 1.9743 | 2.057 |
TL | b + 95%CI | 1.7162 | 2.0642 | 2.147 |
TL | adj.R2 | 0.9623 | 0.981 | 0.9708 |
DMA | log10u | 0.8577 | 1.9905 | -0.2538 |
DMA | log10u - 95%CI | 0.732 | 1.738 | -0.4741 |
DMA | log10u + 95%CI | 0.9834 | 2.2431 | -0.0334 |
DMA | v | -0.2167 | -0.7983 | -0.9088 |
DMA | v - 95%CI | -0.296 | -0.8808 | -1.0642 |
DMA | v + 95%CI | -0.1375 | -0.7159 | -0.7535 |
DMA | adj.R2 | 0.0779 | 0.7055 | 0.341 |
VMA | log10c | 2.2423 | 4.4773 | -0.0572 |
VMA | log10c - 95%CI | 2.0236 | 4.0086 | -0.5363 |
VMA | log10c + 95%CI | 2.461 | 4.946 | 0.4219 |
VMA | d | -0.2967 | -1.6841 | -1.8657 |
VMA | d - 95%CI | -0.4346 | -1.8371 | -2.2036 |
VMA | d + 95%CI | -0.1587 | -1.5311 | -1.5279 |
VMA | adj.R2 | 0.0487 | 0.7559 | 0.3155 |
Test of TL. Log variance of population density was an approximately linear function of the log mean of population density for (A) free-living un- parasitized species, (B) free-living parasitized species, and (C) parasites. Each dot represents the log variance and log mean of population density of multiple samples of one life stage of one species at one lake in one season. The solid lines represent least-squares regressions. The estimated parameters and their 95% confidence intervals, the number of data points N (dots), and the adjusted R2 are given in Table 2.
First, explore the data by plotting untransformed variances versus untransformed means for each lifestyle:
# in ggplot2:
p <- qplot(Mean_density, Variance, data=dd)
p +
theme_bw((base_size = 18)) + facet_wrap(~Life.style) + guides(fill=F) +
stat_smooth(method="lm", se=TRUE, fill=NA,formula=y ~ poly(x, 2, raw=TRUE),colour="black") +
scale_x_continuous(name = "Mean density") +
scale_y_continuous(name = "Variance") +
theme(text=element_text(family="Times New Roman"), axis.title.x = element_text(vjust=-0.4),axis.title.y = element_text(vjust=0.2))
Although large mean values are not very well represented, there is a general trend toward an exponential increase in variance with increasing mean. Here is the reproduction of figure 1, with log(variances) versus log(means):
par(mfrow=c(1,3))
plot(Log.variance ~ Log.mean, data=dd[dd$Life.style=="Free-living",],
xlim=c(-3,5), ylim=c(-6,10), main="Free-living",
xlab="log mean", ylab="log variance")
abline(lm(Log.variance ~ Log.mean, data=dd[dd$Life.style=="Free-living",]))
plot(Log.variance ~ Log.mean, data=dd[dd$Life.style=="Free-living parasitised",],
xlim=c(-3,5), ylim=c(-6,10), main="Free-living parasitised",
xlab="log mean", ylab="log variance")
abline(lm(Log.variance ~ Log.mean, data=dd[dd$Life.style=="Free-living parasitised",]))
plot(Log.variance ~ Log.mean, data=dd[dd$Life.style=="Parasite",],
xlim=c(-3,5), ylim=c(-6,12), main="Parasite",
xlab="log mean", ylab="log variance")
abline(lm(Log.variance ~ Log.mean, data=dd[dd$Life.style=="Parasite",]))
Test of DMA. Log mean of population density was a linear function of the log mean body mass for (A) free-living unparasitized species, (B) free-living parasitized species, and (C) parasites. Plotting symbols are as in Fig. 1. The estimated parameters and their 95% confidence intervals, the number of data points N (dots), and the adjusted R2 are given in Table 2.
p <- qplot(Log.body.mass, Log.mean, data=dd)
p +
theme_bw((base_size = 18)) + facet_wrap(~Life.style) + guides(fill=F) +
stat_smooth(method="lm", se=TRUE, fill=NA,
formula=y ~ x,colour="black") +
scale_x_continuous(name = "log body mass") +
scale_y_continuous(name = "log mean") +
theme(text=element_text(family="Times New Roman"), axis.title.x = element_text(vjust=-0.4),axis.title.y = element_text(vjust=0.2))
Test of VMA. Log variance of population density was a linear function of the log mean body mass for (A) free-living unparasitized species, (B) free- living parasitized species, and (C) parasites. Plotting symbols are as in Fig. 1. The estimated parameters and their 95% confidence intervals, the number of data points N (dots), and the adjusted R2 are given in Table 2.
par(mfrow=c(1,3))
plot(Log.variance ~ Log.body.mass, data=dd[dd$Life.style=="Free-living",], main="Free-living unparasitized",axes = FALSE)
box()
axis(lwd=0,side=1,line=-0.4,at=c(seq(from=-3,to=6,by=1)))
axis(lwd=0,side=2,line=-0.4,at=c(seq(from=-6,to=10,by=2)))
fit<-lm(Log.variance ~ Log.body.mass, data=dd[dd$Life.style=="Free-living",])
abline(fit)
plot(Log.variance ~ Log.body.mass, data=dd[dd$Life.style=="Free-living parasitised",], main="Free-living parasitized",axes = FALSE)
box()
axis(lwd=0,side=1,line=-0.4,at=c(seq(from=0,to=6,by=1)))
axis(lwd=0,side=2,line=-0.4,at=c(seq(from=-6,to=10,by=2)))
fit2<-lm(Log.variance ~ Log.body.mass, data=dd[dd$Life.style=="Free-living parasitised",])
abline(fit2)
plot(Log.variance ~ Log.body.mass, data=dd[dd$Life.style=="Parasite",], main="Parasite",axes = FALSE)
box()
axis(lwd=0,side=1,line=-0.4,at=c(seq(from=-3,to=2,by=1)))
axis(lwd=0,side=2,line=-0.4,at=c(seq(from=-6,to=12,by=2)))
fit3<-lm(Log.variance ~ Log.body.mass, data=dd[dd$Life.style=="Parasite",])
abline(fit3)