Thursday, 2 July 2015

Correlations Between Canadian ETFs

June was a difficult month for world equity markets. China was down double digits and most European markets fell between 3% and 5%. Canadian equity markets got caught up in the downdraft.

We know that equity correlation increased during the 2008-2009 financial crisis, but how does current equity correlation compare with the past. In order to investigate further, I took the position of a Canadian investor with a diversified portfolio of ETFs. For the analysis, I used daily data pulled from Yahoo Finance.

  "XIU.TO", # Canadian equities,
  "XRE.TO", # Canadian REITS,
  "XSP.TO", # US equities, SP500
  "XIN.TO", # EAFE equities,
  "XBB.TO"  # Canada bonds,


Here is how the ETFs have performed. Except for bonds, the other assets took a big hit during the financial crisis.




Here are what the correlations look like from 2002 until the end of June 2015. Bonds are negatively correlated with each of the equities with the largest negative correlation with the US. The largest positive correlation is between the US and other developed markets.



Here are what the correlations look like from 2002 until the end of December 2007. Each equity correlation is smaller (in absolute value) than the corresponding value over the full sample indicating greater diversification benefits over this first sub-sample. Notice that during this time period, bonds correlated positively with REITs.

 

Here are what the correlations look like during the financial crisis. Notice that correlations between equity ETFs were much higher during the financial crisis then during the previous period (2002 - 2007). For example, the correlation between XIU and XIN went from 58.6% to 85.9%!

 

Here are what the correlations look like over the post-crisis period (July 2009 until the end of June 2015). Correlations for the equity ETFs are smaller than they were during the financial crisis, but they have not yet returned to pre-crisis values. In some cases, bonds have actually gotten more negatively correlated with equities. The correlations between Canada and the US or Canada and developed international markets are very high (over 70%)

 

The overall takeaway is that in the post-crisis period, correlations between Canada, the US, and other developed markets remains high. This makes it difficult to diversify risk between these asset classes. The correlation between bonds and equities in the post-crisis period has, in some cases, gotten even more negative (possibly in anticipation of Central Bank interest rate increases).



Here is the R code.

#########################################################
#  Economic forecasting and analysis
#  Perry Sadorsky
#  Heat Map for a Canadian ETF portfolio
#  June 2015
##########################################################


rm(list=ls())
library(fpp)
library(quantmod)
library(gplots)



## define symbols
symbols = c(
  "XIU.TO", # Canadian equities,
  "XRE.TO", # Canadian REITS,
  "XSP.TO", # US equities, SP500
  "XIN.TO", # EAFE equities,
  "XBB.TO"  # Canada bonds,
)


## get data
getSymbols(symbols, from="1970-01-01")
m = length(symbols)
Y = Ad(XIU.TO)
for (i in 2:m) Y = cbind(Y, Ad(get(symbols[i])))
head(Y)
tail(Y)


## plots
par(mfrow = c(3, 2))
for (i in 1:m) plot(Y[, i],  main = symbols[i])
par(mfrow = c(1, 1))


## returns
ret <-  (na.omit(1.0 * diff(log(Y)) *
                    100))
colnames(ret) = c("XIU", "XRE", "XSP", "XIN", "XBB")
head(ret)
tail(ret)


## helper function from
## http://blog.revolutionanalytics.com/2014/08/quantitative-finance-applications-in-r-8.html
generate_heat_map <- function(correlationMatrix, title)
{
 
  heatmap.2(x = correlationMatrix,      # the correlation matrix input
            cellnote = correlationMatrix,    # places correlation value in each cell
            main = title,            # heat map title
            symm = TRUE,            # configure diagram as standard correlation matrix
            dendrogram="none",        # do not draw a row dendrogram
            Rowv = FALSE,            # keep ordering consistent
            trace="none",            # turns off trace lines inside the heat map
            density.info="none",        # turns off density plot inside color legend
            notecol="black")        # set font color of cell labels to black
 
}




# Convert each to percent format
corr1 <- cor(ret) * 100
corr2 <- cor(ret['2002-10/2007-12']) * 100
corr3 <- cor(ret['2008-10/2009-05']) * 100
corr4 <- cor(ret['2009-07/2015-06']) * 100

heatmap.2(corr1)

corr1 = round(corr1, digits = 1)
corr2 = round(corr2, digits = 1)
corr3 = round(corr3, digits = 1)
corr4 = round(corr4, digits = 1)

par(cex.main=.8)

generate_heat_map(corr1, "Correlations of Canadian ETF Returns, Oct 2002 - June 2015")
generate_heat_map(corr2, "Correlations of Canadian ETF Returns, Oct 2002 - Dec 2007")
generate_heat_map(corr3, "Correlations of Canadian ETF Returns, Oct 2008 - May 2009")
generate_heat_map(corr4, "Correlations of Canadian ETF Returns, July 2009 - June 2015")
par(cex.main=1.0)



Sunday, 21 June 2015

A Closer Look at the Oil Price - Gasoline Price Relationship in Ontario

In a recent post, Myles Harrison investigated the relationship between Ontario average gasoline prices and US oil prices. He concluded by showing a strong positive relationship between the two. The recent drop in oil prices has led to a drop in gasoline prices, but my question is whether gas prices are as low as they should be given that oil is currently a few cents below $60 per barrel.

In this post, I will conduct further analysis by extending his work in several directions. First, Myles loaded the gasoline price data from the Ontario government website using a call to wget. This is certainly an easy way to pull down data from websites. The only drawback is that two separate programs are being used: one to load the data, and another to analyze the data. I will load the data directly into R. In addition I will load additional data on oil prices, exchange rates, and the Canadian CPI. My approach results in code that is more lengthy than Myles, but has the advantage of being contained in one program. The data set is monthly starting in January 1990 and continuing until March of 2015. Second, I will investigate the gasoline price - oil price relationship using 4 different specifications of the data. From these relationships I will make a forecast of current gasoline prices based off of current oil prices. Third, I will provide some tests for residual stationarity to ensure that the the variables are cointegrated.

The gasoline price data comes from an Ontario government website. Additional data on oil prices, exchange rates, and Canadian consumer price index (CPI) comes from the Federal Reserve of St. Louis database (FRED data).

First. here are some plots of Ontario gasoline prices and US oil prices. As expected, the two series move together closely.



 I estimate four specifications.

S1: gasoline prices ~ oil prices (nominal values not adjusted for exchange rates)
S2: log(gasoline prices) ~ log(oil prices) (nominal values not adjusted for exchange rates)
S3:  gasoline prices ~ oil prices (adjusted for exchange rates and inflation)
S4: log( gasoline prices) ~ log(oil prices) (adjusted for exchange rates and inflation)

Specification S1 is the most basic and does not adjust for inflation or different currencies. Notice the strong positive correlation between gasoline prices and oil prices. The simple correlation between the two is 0.973.

 


Using this specification to forecast gas prices (for oil prices of $50, $55, $60, $65, labelled as points 1 through 4 respectively) results in the following table. For example, at oil prices of $60 (point 3) the forecast of average Ontario gasoline prices is 90.9 cents/liter. The most recent actual value is $1.16 which is beyond the upper 95% confidence level. Based on this analysis, gas prices are too high.


Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
1           82.5  74.3  90.6  70.0  95.0
2           86.7  78.5  94.9  74.2  99.2
3           90.9  82.8  99.1  78.4 103.4
4           95.1  87.0 103.3  82.6 107.7
 

Also notice the increase in variability towards the right of the plot. This can be controlled for by estimating a log - log specification (S2).

Specification 4 is the one that is most consistent because each variable is expressed in real Canadian dollars. The log - log specification accounts for changing variability in the data.




Here is a table comparing the gasoline price forecasts from each of the four specifications.


                      S1     S2     S3     S4
Oil coefficient    0.845  0.477  0.702  0.439
t stat            72.871 81.483 55.576 51.631
R-squared          0.946  0.957  0.911  0.899
Forecast (oil=50) 82.472 86.766 86.606 88.295
Forecast (oil=55) 86.696 90.801 90.116 92.068
Forecast (oil=60) 90.921 94.648 93.626 95.653
Forecast (oil=65) 95.146 98.330 97.136 99.074


The estimated coefficient on the oil price variable is positive and statistically significant in each specification. The dependent variable in each specification is different so it is not possible to compare across models, but clearly for each specification, oil prices have a statistically significant positive impact on Ontario gasoline prices. The gasoline price forecast for oil prices at $60 range between 90.9 cents/liter to 95.7 cents/liter. The interesting point is that the current price of gasoline at $1.16 per liter is higher than the upper 95% confidence bound in all cases.

Lastly, the table below shows the p-values from an ADF test on the residuals from each specification. The residuals from each specification are stationary at 5% which is consistent with a cointegrating relationship between gasoline prices and oil prices.


                            S1       S2       S3     S4
Residuals adf p value 0.000867 0.000264 0.000044 0.0464


The main take-away is that Ontario gasoline prices and oil prices are cointegrated and this is robust to several different regression specifications. Based on this analysis, gasoline prices are higher than the upper 95% confidence band.



Here is the R code.

#########################################################
#  Economic forecasting and analysis
#  Perry Sadorsky
#  June 2015
#  Forecasting Ontario gasoline prices
##########################################################

## basic reference is from this post
## http://www.r-bloggers.com/what-a-gas-the-falling-price-of-oil-and-ontario-gasoline-prices/


# load libraries
library(fpp)
library(quantmod)
library(reshape2)
library(urca)


##########################################################
## collect data from Ontario government
##########################################################

furl = c("http://www.energy.gov.on.ca/fuelupload/ONTREG1991.csv",
         "http://www.energy.gov.on.ca/fuelupload/ONTREG1992.csv",
         "http://www.energy.gov.on.ca/fuelupload/ONTREG1993.csv",
         "http://www.energy.gov.on.ca/fuelupload/ONTREG1994.csv",
         "http://www.energy.gov.on.ca/fuelupload/ONTREG1995.csv",
         "http://www.energy.gov.on.ca/fuelupload/ONTREG1996.csv",
         "http://www.energy.gov.on.ca/fuelupload/ONTREG1997.csv",
         "http://www.energy.gov.on.ca/fuelupload/ONTREG1998.csv",
         "http://www.energy.gov.on.ca/fuelupload/ONTREG1999.csv",
         "http://www.energy.gov.on.ca/fuelupload/ONTREG2000.csv",
         "http://www.energy.gov.on.ca/fuelupload/ONTREG2001.csv",
         "http://www.energy.gov.on.ca/fuelupload/ONTREG2002.csv",
         "http://www.energy.gov.on.ca/fuelupload/ONTREG2003.csv",
         "http://www.energy.gov.on.ca/fuelupload/ONTREG2004.csv",
         "http://www.energy.gov.on.ca/fuelupload/ONTREG2005.csv",
         "http://www.energy.gov.on.ca/fuelupload/ONTREG2006.csv",
         "http://www.energy.gov.on.ca/fuelupload/ONTREG2007.csv",
         "http://www.energy.gov.on.ca/fuelupload/ONTREG2008.csv",
         "http://www.energy.gov.on.ca/fuelupload/ONTREG2009.csv",
         "http://www.energy.gov.on.ca/fuelupload/ONTREG2010.csv",
         "http://www.energy.gov.on.ca/fuelupload/ONTREG2011.csv",
         "http://www.energy.gov.on.ca/fuelupload/ONTREG2012.csv",
         "http://www.energy.gov.on.ca/fuelupload/ONTREG2013.csv",
         "http://www.energy.gov.on.ca/fuelupload/ONTREG2014.csv"        
         )

gasp = c(1,2,3,4,5,6,7,8,9,10,11,12)
xx = read.csv("http://www.energy.gov.on.ca/fuelupload/ONTREG1990.csv",skip=1)
xx = head(xx,-3)
xx = tail(xx,12)
xx = xx[,"ON.Avg"]
gasp = cbind(gasp,xx)


for (i in 1:24) {
  xx = read.csv(furl[i],skip=1)
  xx = head(xx,-3)
  xx = tail(xx,12)
  xx = xx[,"ON.Avg"]
  gasp =cbind(gasp,xx)
}

## output is a nice matrix
View(gasp)


## stack the data into a column
s_gasp = melt(gasp)
s_gasp = s_gasp[-c(1:12),] 
View(s_gasp)


on_avg = s_gasp[,3]
tail(on_avg)
View(on_avg)


## append 2015 data
# "http://www.energy.gov.on.ca/fuelupload/ONTREG2015.csv"

yy = read.csv("http://www.energy.gov.on.ca/fuelupload/ONTREG2015.csv",skip=1)
yy = head(yy,-3)
yy = tail(yy,6)
yy= yy[,"ON.Avg"]
yy

on_avg_a = append(on_avg,yy)
tail(on_avg_a)
View(on_avg_a)

gas = ts(on_avg_a, start=1990, end=c(2015,3), frequency=12)
plot(gas,main="Ontario gas prices (cents per liter)", xlab="",ylab="")


##########################################################
## now load data on oil prices and FX
##########################################################

# load data from FRED into new environment
symbol.vec = c("MCOILWTICO" , "EXCAUS", "CANCPIALLMINMEI")
data <- new.env()
getSymbols(symbol.vec, src="FRED", env = data)


(data$MCOILWTICO["1990-01-01::"])-> oil
(data$EXCAUS["1990-01-01::"])-> fx
(data$CANCPIALLMINMEI["1990-01-01::"])-> cpi

# rebase cpi
tail(cpi,14)
cpi2014 = mean(  cpi[(300-11):300]  )
cpi_2014 = cpi/cpi2014 * 100
mean(  cpi_2014[(300-11):300]  )


par(mfrow = c(3, 1))
plot(oil,main="Oil prices")
plot(fx,main="$C/$US")
plot(cpi_2014,main="CPI")
par(mfrow = c(1, 1))


df_fred = cbind(oil,fx,cpi_2014)
head(df_fred)
tail(df_fred)
## latest cpi value is for March

df_fred_1 =  ts(df_fred, start=1990, end=c(2015,3), frequency=12)
nrow(df_fred_1)
length(gas)

oil_n = df_fred_1[,1]

par(mfrow = c(2, 1))
plot(gas,main="Ontario gas prices (cents per liter)", xlab="",ylab="")
plot(df_fred_1[,1],main="Oil prices ($/bbl)", xlab="",ylab="")
par(mfrow = c(1, 1))

cor(gas,df_fred_1[,1])


##########################################################
## linear regressions
##########################################################

table = matrix( NA, nrow=7, ncol=4)
table2 = matrix( NA, nrow=1, ncol=4)

lm1 = lm(gas ~ oil_n)

lm1s = summary(lm1)
table[1,1] = lm1s$coefficients[2,1]
table[2,1] = lm1s$coefficients[2,3]
table[3,1] = lm1s$r.squared


plot(gas ~ oil_n,
     ylab="Ontario gas prices (cents per liter)", xlab="WTI oil prices ($/bbl)")
abline(lm1)
adf1 =  ur.df(lm1$residuals, type="drift",selectlags="AIC", lags=12)
adf1@testreg$coefficients
table2[1,1] = adf1@testreg$coefficients[2,4]

## some forecasting
fcast <- forecast(lm1, newdata=data.frame(oil_n=c(50, 55, 60, 65)))
fcast
plot(fcast)


table[4,1] = fcast$mean[1]
table[5,1] = fcast$mean[2]
table[6,1] = fcast$mean[3]
table[7,1] = fcast$mean[4]
table



## log- log specification
lm2 = lm(log(gas) ~ log(oil_n))
summary(lm2)

lm2s = summary(lm2)
table[1,2] = lm2s$coefficients[2,1]
table[2,2] = lm2s$coefficients[2,3]
table[3,2] = lm2s$r.squared

plot(log(gas) ~ log(oil_n),
     ylab="Log of Ontario gas prices (cents per liter)", xlab="Log of WTI oil prices ($/bbl)")
abline(lm2)
adf2 =  ur.df(lm2$residuals, type="drift",selectlags="AIC", lags=12)
adf2@testreg$coefficients
table2[1,2] = adf2@testreg$coefficients[2,4]

## some forecasting
fcast <- forecast(lm2, newdata=data.frame(oil_n=c(50, 55, 60, 65)))
fcast
plot(fcast)

exp(fcast$mean)

table[4,2] = exp(fcast$mean[1])
table[5,2] = exp(fcast$mean[2])
table[6,2] = exp(fcast$mean[3])
table[7,2] = exp(fcast$mean[4])



## now do this in real Canadian dollars
gas_r = gas/df_fred_1[,3]*100
oil_r = (df_fred_1[,1]*df_fred_1[,2])/df_fred_1[,3]*100


lm3 = lm(gas_r ~ oil_r)
summary(lm3)

lm3s = summary(lm3)
table[1,3] = lm3s$coefficients[2,1]
table[2,3] = lm3s$coefficients[2,3]
table[3,3] = lm3s$r.squared

plot(gas_r ~ oil_r,
     ylab="Ontario real gas prices (cents per liter)", xlab="Real oil prices ($/bbl)")
abline(lm3)
adf3 =  ur.df(lm3$residuals, type="drift",selectlags="AIC", lags=12)
adf3@testreg$coefficients
table2[1,3] = adf3@testreg$coefficients[2,4]

## some forecasting
fcast <- forecast(lm3, newdata=data.frame(oil_r=c(50, 55, 60, 65)))
fcast
plot(fcast)

table[4,3] = fcast$mean[1]
table[5,3] = fcast$mean[2]
table[6,3] = fcast$mean[3]
table[7,3] = fcast$mean[4]



## log- log specification in real Canadian dollars
lm4 = lm(log(gas_r) ~ log(oil_r))
summary(lm4)

lm4s = summary(lm4)
table[1,4] = lm4s$coefficients[2,1]
table[2,4] = lm4s$coefficients[2,3]
table[3,4] = lm4s$r.squared


plot(log(gas_r) ~ log(oil_r),
     ylab="Log of real Ontario gas prices (cents per liter)", xlab="Log of real oil prices ($/bbl)")
abline(lm4)
adf4 =  ur.df(lm4$residuals, type="drift",selectlags="AIC", lags=12)
adf4@testreg$coefficients
table2[1,4] = adf4@testreg$coefficients[2,4]

## some forecasting
fcast <- forecast(lm4, newdata=data.frame(oil_r=c(50, 55, 60, 65)))
fcast
(fcast$mean)
(fcast$lower)
(fcast$upper)
# plot(fcast)
exp(fcast$mean)
exp(fcast$lower)
exp(fcast$upper)

fcast4 = cbind(exp(fcast$mean),exp(fcast$lower),  exp(fcast$upper))
colnames(fcast4) = c("Forecast", "Lo 80", "Lo 95", "Hi 80", "Hi 95")
fcast4

table[4,4] = exp(fcast$mean[1])
table[5,4] = exp(fcast$mean[2])
table[6,4] = exp(fcast$mean[3])
table[7,4] = exp(fcast$mean[4])


colnames(table) = c("S1", "S2", "S3", "S4")
rownames(table) = c("Oil coefficient", "t stat", "R-squared", "Forecast (oil=50)", "Forecast (oil=55)", "Forecast (oil=60)", "Forecast (oil=65)")

options(digits=3, scipen=10)
table


colnames(table2) = c("S1", "S2", "S3", "S4")
rownames(table2) = c("Residuals adf p value")
table2