Introduction

This report reproduces the analyses in Ward, E.J., Holmes, E.E., Thorson, J.T. & Collen, B. (2014) Complexity is costly: A meta-analysis of parametric and non-parametric methods for short-term population forecasting. Oikos, 123, 652–661.

Eric Ward kindly provided a subset of the raw data, the processed data, and the R scripts used for model fitting; they can be found in the respective folders. His original repository can be found here. Because the data originate from a collaborative project, not all of the data are freely available yet; however, the data provided by Eric allow the analyses of the fish time series shown in the paper to be reproduced.

Data processing

## 
## Attaching package: 'lubridate'
## 
## The following object is masked from 'package:plyr':
## 
##     here
## 
## Loading required package: bitops
## Loading required package: nlme
## This is mgcv 1.8-6. For overview type 'help("mgcv-package")'.
## Nonparametric Kernel Methods for Mixed Datatypes (version 0.60-2)
## [vignette("np_faq",package="np") provides answers to frequently asked questions]
## locfit 1.5-9.1    2013-03-22
## Loading required package: zoo
## 
## Attaching package: 'zoo'
## 
## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric
## 
## Loading required package: timeDate
## This is forecast 6.1 
## 
## 
## Attaching package: 'forecast'
## 
## The following object is masked from 'package:nlme':
## 
##     getResponse
## 
## randomForest 4.6-10
## Type rfNews() to see new features/changes/bug fixes.

Read the time series data and the accompanying metadata from the repository:

# getURL() is provided by the RCurl package; it returns the file contents as a string
ts <- read.csv(text=getURL("https://raw.githubusercontent.com/opetchey/RREEBES/WARD_etal_2014_Oikos/WARD_etal_2014_Oikos/processed%20data/masterDat%20052015.csv"), header=TRUE, stringsAsFactors=FALSE)
metainfo <- read.csv(text=getURL("https://raw.githubusercontent.com/opetchey/RREEBES/WARD_etal_2014_Oikos/WARD_etal_2014_Oikos/processed%20data/Data%20and%20metadata%20052015.csv"), header=TRUE, stringsAsFactors=FALSE)
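The `read.csv(text = ...)` pattern used above parses a CSV that is already held in memory as a string. A minimal sketch of that step, assuming a made-up two-row string in place of what `getURL()` would return (the name `csv_text` and the rows are hypothetical, not the real data):

```r
# Hypothetical stand-in for the string returned by getURL(); toy rows only
csv_text <- "ID,Database,Year,Value\n1,toy,2001,1.5\n1,toy,2002,2.5\n"

# text= makes read.csv parse the string instead of opening a file
demo <- read.csv(text = csv_text, header = TRUE, stringsAsFactors = FALSE)

# With stringsAsFactors = FALSE the Database column stays character
str(demo)
```

Keeping character columns as plain strings avoids surprises later, for example when subsetting by `Database` or `Species`.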

# look at data structure
str(ts)
## 'data.frame':    53130 obs. of  8 variables:
##  $ X       : int  2380 2381 2382 2383 2384 2385 2386 2387 2388 2389 ...
##  $ ID      : int  62 62 62 62 62 62 62 62 62 62 ...
##  $ Database: chr  "salmon" "salmon" "salmon" "salmon" ...
##  $ Spcode  : int  1 1 1 1 1 1 1 1 1 1 ...
##  $ Year    : int  1951 1952 1953 1954 1955 1956 1957 1958 1959 1960 ...
##  $ Species : chr  "Chinook" "Chinook" "Chinook" "Chinook" ...
##  $ Class   : chr  "Actinopterygii" "Actinopterygii" "Actinopterygii" "Actinopterygii" ...
##  $ Value   : num  10 10.1 10.2 10.3 9.1 ...
head(ts)
##      X ID Database Spcode Year Species          Class     Value
## 1 2380 62   salmon      1 1951 Chinook Actinopterygii  9.998798
## 2 2381 62   salmon      1 1952 Chinook Actinopterygii 10.126631
## 3 2382 62   salmon      1 1953 Chinook Actinopterygii 10.239960
## 4 2383 62   salmon      1 1954 Chinook Actinopterygii 10.275051
## 5 2384 62   salmon      1 1955 Chinook Actinopterygii  9.104980
## 6 2385 62   salmon      1 1956 Chinook Actinopterygii  8.294050
# number of distinct fish time series
length(unique(ts$ID))
## [1] 1266
# recode the original IDs as consecutive integers (1, 2, ...); the originals are kept in ID_old
ts$ID_old <- as.factor(ts$ID)
ts$ID_new <- as.numeric(ts$ID_old)
ts$ID <- ts$ID_new
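The factor round trip above maps arbitrary IDs onto consecutive integers. A small sketch of what it does, using hypothetical IDs rather than values from the data set:

```r
# Hypothetical original IDs (not taken from the data)
ids <- c(62, 62, 105, 105, 7)

# as.factor() sorts the distinct values into levels (7, 62, 105);
# as.numeric() then returns each value's level index
new_ids <- as.numeric(as.factor(ids))
new_ids
# 2 2 3 3 1

# The same recoding written without factors
alt_ids <- match(ids, sort(unique(ids)))
```

The new IDs follow the sorted order of the old ones, so the mapping can always be recovered from the `ID_old` column.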

Explore some time series visually to get an idea of their variability. This figure does not appear in the paper. The time series were centered so that they can easily be plotted together [i.e. Year - mean(Year) and Value - mean(Value)].

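set.seed(12345678)
# sample() draws from the row-level ID column, so the same ID can be drawn more than once
# and fewer than 15 distinct series may be plotted
ggplot(data = subset(ts, ID %in% sample(ts$ID, size = 15)),
       aes(x = Year - mean(Year), y = Value - mean(Value))) +
  geom_line() +
  facet_wrap(ID ~ Species, ncol = 5, nrow = 3, scales = "free")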
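As far as I can tell, the `aes()` expressions above subtract the mean over all sampled rows rather than each series' own mean, and `sample()` on the row-level IDs can return the same series twice. A hedged sketch of a per-series alternative in base R, assuming a toy data frame (`toy` and its values are hypothetical) in place of `ts`:

```r
set.seed(1)

# Hypothetical toy data: three short series
toy <- data.frame(
  ID    = rep(1:3, each = 4),
  Year  = rep(2001:2004, times = 3),
  Value = c(1, 2, 3, 4, 10, 12, 14, 16, 5, 5, 6, 8)
)

# Centre Year and Value within each series; ave() applies FUN per group
centre <- function(x) x - mean(x)
toy$Year_c  <- ave(toy$Year,  toy$ID, FUN = centre)
toy$Value_c <- ave(toy$Value, toy$ID, FUN = centre)

# Sample distinct series by drawing from the unique IDs
picked  <- sample(unique(toy$ID), size = 2)
toy_sub <- toy[toy$ID %in% picked, ]
```

The centred columns could then be passed to `aes(x = Year_c, y = Value_c)`. With `scales = "free"` the panels would look much the same either way, so this mainly matters if the axes are shared.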