Introduction

This is an attempt to reproduce the anaylses presented in the paper Parasitism alters three power laws of scaling in a metazoan community: Taylor’s law, density-mass allometry, and variance-mass allometry, by Lagrue, Poulin, and Cohen (http://www.pnas.org/content/112/6/1791). Data are here as txt, but need to be saved as csv.

[The study was conducted] in metazoan communities in lakes of Otago, New Zealand. In 13,752 samples comprising 1,037,058 organisms, we found that species of different lifestyles differed in taxonomic distribution and body mass and were well described by three power laws: a spatial Taylor’s law (the spatial variance in population density was a power-law function of the spatial mean population density); density-mass allometry (the spatial mean population density was a power-law function of mean body mass); and variance-mass allometry (the spatial variance in population density was a power-law function of mean body mass). To our knowledge, this constitutes the first empirical confirmation of variance-mass allometry for any animal community. We found that the parameter values of all three relationships differed for species with different lifestyles in the same communities. Taylor’s law and density-mass allometry accurately predicted the form and parameter values of variance-mass allometry. We conclude that species of different lifestyles in these metazoan communities obeyed the same major ecological power-law relationships but did so with parameters specific to each lifestyle […].

Let’s see if we are able to reproduce their results.

Housework

Clean R’s memory, load useful packages, load data.

# clean R's memory:
rm(list=ls())
# Load packages:
library(tidyr)
library(dplyr)
## 
## Attaching package: 'dplyr'
## 
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## 
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(lubridate)
library(stringr)
library(ggplot2)
library(RCurl)
## Loading required package: bitops
library(reshape2)
library(knitr)

# Load data:
dd <- read.csv(text=getURL("https://raw.githubusercontent.com/opetchey/RREEBES/Lagrue/Lagrue_etal_2014_PNAS/data/Lague2014.csv"), header=T)
# head(dd)
# str(dd)
# make some colnames more R-friendly:
colnames(dd)[which(names(dd) == "Mean.density..m..")] <- "Mean_density"
colnames(dd)[which(names(dd) == "Body.mass..mg.")] <- "Body_mass_mg"

Tables

Reproducing Table 1

Numbers (counts) and percentages of samples from broad taxonomic groups for each lifestyle separately.

#?aggregate

counts <- aggregate(x=dd$Species, by=list(Taxa=dd$Taxonomic.group, lifestyle=dd$Life.style), function(x) length(x))
colnames(counts)<-c("Taxa", "Lifestyle","Counts")
# same information in a different format:
counts2 <- tapply(dd$Species, list(Taxa=dd$Taxonomic.group, lifestyle=dd$Life.style), na.omit(length))
# replace NA's with zeroes:
counts2[is.na(counts2)]<- 0

total <- aggregate(x=dd$Species, by=list(Lifestyle=dd$Life.style), function(x) length(x))
colnames(total)<- c("Lifestyle", "Total")
tab1 <- merge(counts, total, by="Lifestyle")
tab1$percent <- tab1$Counts / tab1$Total * 100
tab1$percent <- round(tab1$percent, 0)

kable(tab1)
Lifestyle Taxa Counts Total percent
Free-living Annelid 26 329 8
Free-living Crustacean 107 329 33
Free-living Fish 5 329 2
Free-living Insect 122 329 37
Free-living Mite 12 329 4
Free-living Mollusc 35 329 11
Free-living Other 22 329 7
Free-living parasitised Annelid 18 151 12
Free-living parasitised Crustacean 18 151 12
Free-living parasitised Fish 58 151 38
Free-living parasitised Insect 45 151 30
Free-living parasitised Mollusc 12 151 8
Parasite Acanthocephalan 8 253 3
Parasite Cestode 2 253 1
Parasite Mite 4 253 2
Parasite Nematoda 26 253 10
Parasite Trematode 213 253 84

Reproducing Table 2

Parameter estimates using ordinary least squares linear regression to fit TL, DMA, and VMA to log- transformed data on the spatial variation of population density (individuals per square meter) of free-living unparasitized species, free-living parasitized species, and parasitic species; and the distribution of average body masses of species in each lifestyle.

# TL: variance = a × (mean density)^b
# ie Log.variance = log10a + Log.mean * b
lm.tl.unp <- lm(Log.variance ~ Log.mean, data=dd[dd$Life.style=="Free-living",])
lm.tl.prtz <- lm(Log.variance ~ Log.mean, data=dd[dd$Life.style=="Free-living parasitised",])
lm.tl.par <- lm(Log.variance ~ Log.mean, data=dd[dd$Life.style=="Parasite",])

# DMA: Mean density = u × (mean body mass)^v
lm.dma.unp <- lm(Log.mean ~ Log.body.mass, data=dd[dd$Life.style=="Free-living",])
lm.dma.prtz <- lm(Log.mean ~ Log.body.mass, data=dd[dd$Life.style=="Free-living parasitised",])
lm.dma.par <- lm(Log.mean ~ Log.body.mass, data=dd[dd$Life.style=="Parasite",])

# VMA: variance = c × (mean body mass)^d
lm.vma.unp <- lm(Log.variance ~ Log.body.mass, data=dd[dd$Life.style=="Free-living",])
lm.vma.prtz <- lm(Log.variance ~ Log.body.mass, data=dd[dd$Life.style=="Free-living parasitised",])
lm.vma.par <- lm(Log.variance ~ Log.body.mass, data=dd[dd$Life.style=="Parasite",])

# Create a dataframe where to store info:
table2 <- data.frame(model=c(rep("TL", 7), rep("DMA",7), rep("VMA",7)))
table2$values <- c("log10a", "log10a - 95%CI", "log10a + 95%CI",
                   "b", "b - 95%CI", "b + 95%CI", "adj.R2",
                   "log10u", "log10u - 95%CI", "log10u + 95%CI",
                   "v", "v - 95%CI", "v + 95%CI","adj.R2",
                   "log10c", "log10c - 95%CI", "log10c + 95%CI",
                   "d", "d - 95%CI", "d + 95%CI","adj.R2"
                   )
table2$free.living.unparasitized <- rep("NA",21)
table2$free.living.parasitized <- rep("NA",21)
table2$parasite <- rep("NA",21)

# enter info about TL
table2[1,3]<-round(summary(lm.tl.unp)$coef[1,1],4)
table2[2,3]<-round(summary(lm.tl.unp)$coef[1,1] - 1.96*summary(lm.tl.unp)$coef[1,2],4)
table2[3,3]<-round(summary(lm.tl.unp)$coef[1,1] + 1.96*summary(lm.tl.unp)$coef[1,2],4)
table2[4,3]<-round(summary(lm.tl.unp)$coef[2,1],4)
table2[5,3]<-round(summary(lm.tl.unp)$coef[2,1] - 1.96*summary(lm.tl.unp)$coef[2,2],4)
table2[6,3]<-round(summary(lm.tl.unp)$coef[2,1] + 1.96*summary(lm.tl.unp)$coef[2,2],4)
table2[7,3]<-round(summary(lm.tl.unp)$adj.r.squared,4)

table2[1,4]<-round(summary(lm.tl.prtz)$coef[1,1],4)
table2[2,4]<-round(summary(lm.tl.prtz)$coef[1,1] - 1.96*summary(lm.tl.prtz)$coef[1,2],4)
table2[3,4]<-round(summary(lm.tl.prtz)$coef[1,1] + 1.96*summary(lm.tl.prtz)$coef[1,2],4)
table2[4,4]<-round(summary(lm.tl.prtz)$coef[2,1],4)
table2[5,4]<-round(summary(lm.tl.prtz)$coef[2,1] - 1.96*summary(lm.tl.prtz)$coef[2,2],4)
table2[6,4]<-round(summary(lm.tl.prtz)$coef[2,1] + 1.96*summary(lm.tl.prtz)$coef[2,2],4)
table2[7,4]<-round(summary(lm.tl.prtz)$adj.r.squared,4)

table2[1,5]<-round(summary(lm.tl.par)$coef[1,1],4)
table2[2,5]<-round(summary(lm.tl.par)$coef[1,1] - 1.96*summary(lm.tl.par)$coef[1,2],4)
table2[3,5]<-round(summary(lm.tl.par)$coef[1,1] + 1.96*summary(lm.tl.par)$coef[1,2],4)
table2[4,5]<-round(summary(lm.tl.par)$coef[2,1],4)
table2[5,5]<-round(summary(lm.tl.par)$coef[2,1] - 1.96*summary(lm.tl.par)$coef[2,2],4)
table2[6,5]<-round(summary(lm.tl.par)$coef[2,1] + 1.96*summary(lm.tl.par)$coef[2,2],4)
table2[7,5]<-round(summary(lm.tl.par)$adj.r.squared,4)

# enter info about DMA
table2[1+7,3]<-round(summary(lm.dma.unp)$coef[1,1],4)
table2[2+7,3]<-round(summary(lm.dma.unp)$coef[1,1] - 1.96*summary(lm.dma.unp)$coef[1,2],4)
table2[3+7,3]<-round(summary(lm.dma.unp)$coef[1,1] + 1.96*summary(lm.dma.unp)$coef[1,2],4)
table2[4+7,3]<-round(summary(lm.dma.unp)$coef[2,1],4)
table2[5+7,3]<-round(summary(lm.dma.unp)$coef[2,1] - 1.96*summary(lm.dma.unp)$coef[2,2],4)
table2[6+7,3]<-round(summary(lm.dma.unp)$coef[2,1] + 1.96*summary(lm.dma.unp)$coef[2,2],4)
table2[7+7,3]<-round(summary(lm.dma.unp)$adj.r.squared,4)

table2[1+7,4]<-round(summary(lm.dma.prtz)$coef[1,1],4)
table2[2+7,4]<-round(summary(lm.dma.prtz)$coef[1,1] - 1.96*summary(lm.dma.prtz)$coef[1,2],4)
table2[3+7,4]<-round(summary(lm.dma.prtz)$coef[1,1] + 1.96*summary(lm.dma.prtz)$coef[1,2],4)
table2[4+7,4]<-round(summary(lm.dma.prtz)$coef[2,1],4)
table2[5+7,4]<-round(summary(lm.dma.prtz)$coef[2,1] - 1.96*summary(lm.dma.prtz)$coef[2,2],4)
table2[6+7,4]<-round(summary(lm.dma.prtz)$coef[2,1] + 1.96*summary(lm.dma.prtz)$coef[2,2],4)
table2[7+7,4]<-round(summary(lm.dma.prtz)$adj.r.squared,4)

table2[1+7,5]<-round(summary(lm.dma.par)$coef[1,1],4)
table2[2+7,5]<-round(summary(lm.dma.par)$coef[1,1] - 1.96*summary(lm.dma.par)$coef[1,2],4)
table2[3+7,5]<-round(summary(lm.dma.par)$coef[1,1] + 1.96*summary(lm.dma.par)$coef[1,2],4)
table2[4+7,5]<-round(summary(lm.dma.par)$coef[2,1],4)
table2[5+7,5]<-round(summary(lm.dma.par)$coef[2,1] - 1.96*summary(lm.dma.par)$coef[2,2],4)
table2[6+7,5]<-round(summary(lm.dma.par)$coef[2,1] + 1.96*summary(lm.dma.par)$coef[2,2],4)
table2[7+7,5]<-round(summary(lm.dma.par)$adj.r.squared,4)

# enter info about VMA
table2[1+14,3]<-round(summary(lm.vma.unp)$coef[1,1],4)
table2[2+14,3]<-round(summary(lm.vma.unp)$coef[1,1] - 1.96*summary(lm.vma.unp)$coef[1,2],4)
table2[3+14,3]<-round(summary(lm.vma.unp)$coef[1,1] + 1.96*summary(lm.vma.unp)$coef[1,2],4)
table2[4+14,3]<-round(summary(lm.vma.unp)$coef[2,1],4)
table2[5+14,3]<-round(summary(lm.vma.unp)$coef[2,1] - 1.96*summary(lm.vma.unp)$coef[2,2],4)
table2[6+14,3]<-round(summary(lm.vma.unp)$coef[2,1] + 1.96*summary(lm.vma.unp)$coef[2,2],4)
table2[7+14,3]<-round(summary(lm.vma.unp)$adj.r.squared,4)

table2[1+14,4]<-round(summary(lm.vma.prtz)$coef[1,1],4)
table2[2+14,4]<-round(summary(lm.vma.prtz)$coef[1,1] - 1.96*summary(lm.vma.prtz)$coef[1,2],4)
table2[3+14,4]<-round(summary(lm.vma.prtz)$coef[1,1] + 1.96*summary(lm.vma.prtz)$coef[1,2],4)
table2[4+14,4]<-round(summary(lm.vma.prtz)$coef[2,1],4)
table2[5+14,4]<-round(summary(lm.vma.prtz)$coef[2,1] - 1.96*summary(lm.vma.prtz)$coef[2,2],4)
table2[6+14,4]<-round(summary(lm.vma.prtz)$coef[2,1] + 1.96*summary(lm.vma.prtz)$coef[2,2],4)
table2[7+14,4]<-round(summary(lm.vma.prtz)$adj.r.squared,4)

table2[1+14,5]<-round(summary(lm.vma.par)$coef[1,1],4)
table2[2+14,5]<-round(summary(lm.vma.par)$coef[1,1] - 1.96*summary(lm.vma.par)$coef[1,2],4)
table2[3+14,5]<-round(summary(lm.vma.par)$coef[1,1] + 1.96*summary(lm.vma.par)$coef[1,2],4)
table2[4+14,5]<-round(summary(lm.vma.par)$coef[2,1],4)
table2[5+14,5]<-round(summary(lm.vma.par)$coef[2,1] - 1.96*summary(lm.vma.par)$coef[2,2],4)
table2[6+14,5]<-round(summary(lm.vma.par)$coef[2,1] + 1.96*summary(lm.vma.par)$coef[2,2],4)
table2[7+14,5]<-round(summary(lm.vma.par)$adj.r.squared,4)

kable(table2)
model values free.living.unparasitized free.living.parasitized parasite
TL log10a 0.806 0.2903 0.4333
TL log10a - 95%CI 0.7529 0.205 0.3553
TL log10a + 95%CI 0.859 0.3755 0.5114
TL b 1.6802 2.0193 2.102
TL b - 95%CI 1.6442 1.9743 2.057
TL b + 95%CI 1.7162 2.0642 2.147
TL adj.R2 0.9623 0.981 0.9708
DMA log10u 0.8577 1.9905 -0.2538
DMA log10u - 95%CI 0.732 1.738 -0.4741
DMA log10u + 95%CI 0.9834 2.2431 -0.0334
DMA v -0.2167 -0.7983 -0.9088
DMA v - 95%CI -0.296 -0.8808 -1.0642
DMA v + 95%CI -0.1375 -0.7159 -0.7535
DMA adj.R2 0.0779 0.7055 0.341
VMA log10c 2.2423 4.4773 -0.0572
VMA log10c - 95%CI 2.0236 4.0086 -0.5363
VMA log10c + 95%CI 2.461 4.946 0.4219
VMA d -0.2967 -1.6841 -1.8657
VMA d - 95%CI -0.4346 -1.8371 -2.2036
VMA d + 95%CI -0.1587 -1.5311 -1.5279
VMA adj.R2 0.0487 0.7559 0.3155

Figures

Reproducing Fig. 1

Test of TL. Log variance of population density was an approximately linear function of the log mean of population density for (A) free-living un- parasitized species, (B) free-living parasitized species, and (C) parasites. Each dot represents the log variance and log mean of population density of multiple samples of one life stage of one species at one lake in one season. The solid lines represent least-squares regressions. The estimated parameters and their 95% confidence intervals, the number of data points N (dots), and the adjusted R2 are given in Table 2.

First, explore the data by plotting untransformed variances versus untransformed means for each lifestyle:

# in ggplot2:
p <- qplot(Mean_density, Variance, data=dd)
p + 
theme_bw((base_size = 18)) + facet_wrap(~Life.style) + guides(fill=F) +
stat_smooth(method="lm", se=TRUE, fill=NA,formula=y ~ poly(x, 2, raw=TRUE),colour="black") + 
scale_x_continuous(name = "Mean density") + 
scale_y_continuous(name = "Variance") +
theme(text=element_text(family="Times New Roman"), axis.title.x = element_text(vjust=-0.4),axis.title.y = element_text(vjust=0.2))

Although large mean values are not very well represented, there is a general trend toward an exponential increase in variance with increasing mean. Here is the reproduction of figure 1, with log(variances) versus log(means):

par(mfrow=c(1,3))

plot(Log.variance ~ Log.mean, data=dd[dd$Life.style=="Free-living",],
     xlim=c(-3,5), ylim=c(-6,10), main="Free-living",
     xlab="log mean", ylab="log variance")
abline(lm(Log.variance ~ Log.mean, data=dd[dd$Life.style=="Free-living",]))

plot(Log.variance ~ Log.mean, data=dd[dd$Life.style=="Free-living parasitised",],
     xlim=c(-3,5), ylim=c(-6,10), main="Free-living parasitised",
     xlab="log mean", ylab="log variance")
abline(lm(Log.variance ~ Log.mean, data=dd[dd$Life.style=="Free-living parasitised",]))

plot(Log.variance ~ Log.mean, data=dd[dd$Life.style=="Parasite",],
     xlim=c(-3,5), ylim=c(-6,12), main="Parasite",
     xlab="log mean", ylab="log variance")
abline(lm(Log.variance ~ Log.mean, data=dd[dd$Life.style=="Parasite",]))

Reproducing Fig. 2

Test of DMA. Log mean of population density was a linear function of the log mean body mass for (A) free-living unparasitized species, (B) free-living parasitized species, and (C) parasites. Plotting symbols are as in Fig. 1. The estimated parameters and their 95% confidence intervals, the number of data points N (dots), and the adjusted R2 are given in Table 2.

p <- qplot(Log.body.mass, Log.mean, data=dd)
p + 
theme_bw((base_size = 18)) + facet_wrap(~Life.style) + guides(fill=F) +
stat_smooth(method="lm", se=TRUE, fill=NA,
              formula=y ~ x,colour="black") +
scale_x_continuous(name = "log body mass") + 
scale_y_continuous(name = "log mean") +
theme(text=element_text(family="Times New Roman"), axis.title.x = element_text(vjust=-0.4),axis.title.y = element_text(vjust=0.2))

Reproducing Fig. 3

Test of VMA. Log variance of population density was a linear function of the log mean body mass for (A) free-living unparasitized species, (B) free- living parasitized species, and (C) parasites. Plotting symbols are as in Fig. 1. The estimated parameters and their 95% confidence intervals, the number of data points N (dots), and the adjusted R2 are given in Table 2.

par(mfrow=c(1,3))
plot(Log.variance ~ Log.body.mass, data=dd[dd$Life.style=="Free-living",], main="Free-living unparasitized",axes = FALSE)
box()
axis(lwd=0,side=1,line=-0.4,at=c(seq(from=-3,to=6,by=1)))
axis(lwd=0,side=2,line=-0.4,at=c(seq(from=-6,to=10,by=2)))
fit<-lm(Log.variance ~ Log.body.mass, data=dd[dd$Life.style=="Free-living",])  
abline(fit) 

plot(Log.variance ~ Log.body.mass, data=dd[dd$Life.style=="Free-living parasitised",], main="Free-living parasitized",axes = FALSE)
box()
axis(lwd=0,side=1,line=-0.4,at=c(seq(from=0,to=6,by=1)))
axis(lwd=0,side=2,line=-0.4,at=c(seq(from=-6,to=10,by=2)))
fit2<-lm(Log.variance ~ Log.body.mass, data=dd[dd$Life.style=="Free-living parasitised",])  
abline(fit2) 

plot(Log.variance ~ Log.body.mass, data=dd[dd$Life.style=="Parasite",], main="Parasite",axes = FALSE)
box()
axis(lwd=0,side=1,line=-0.4,at=c(seq(from=-3,to=2,by=1)))
axis(lwd=0,side=2,line=-0.4,at=c(seq(from=-6,to=12,by=2)))
fit3<-lm(Log.variance ~ Log.body.mass, data=dd[dd$Life.style=="Parasite",])  
abline(fit3)