In this post, I will conduct further analysis by extending his work in several directions. First, Myles loaded the gasoline price data from the Ontario government website using a call to wget. This is certainly an easy way to pull down data from websites. The only drawback is that two separate programs are being used: one to load the data, and another to analyze the data. I will load the data directly into R. In addition I will load additional data on oil prices, exchange rates, and the Canadian CPI. My approach results in code that is more lengthy than Myles, but has the advantage of being contained in one program. The data set is monthly starting in January 1990 and continuing until March of 2015. Second, I will investigate the gasoline price - oil price relationship using 4 different specifications of the data. From these relationships I will make a forecast of current gasoline prices based off of current oil prices. Third, I will provide some tests for residual stationarity to ensure that the the variables are cointegrated.
The gasoline price data comes from an Ontario government website. Additional data on oil prices, exchange rates, and Canadian consumer price index (CPI) comes from the Federal Reserve of St. Louis database (FRED data).
First. here are some plots of Ontario gasoline prices and US oil prices. As expected, the two series move together closely.
I estimate four specifications.
S1: gasoline prices ~ oil prices (nominal values not adjusted for exchange rates)
S2: log(gasoline prices) ~ log(oil prices) (nominal values not adjusted for exchange rates)
S3: gasoline prices ~ oil prices (adjusted for exchange rates and inflation)
S4: log( gasoline prices) ~ log(oil prices) (adjusted for exchange rates and inflation)
Specification S1 is the most basic and does not adjust for inflation or different currencies. Notice the strong positive correlation between gasoline prices and oil prices. The simple correlation between the two is 0.973.
 
Using this specification to forecast gas prices (for oil prices of $50, $55, $60, $65, labelled as points 1 through 4 respectively) results in the following table. For example, at oil prices of $60 (point 3) the forecast of average Ontario gasoline prices is 90.9 cents/liter. The most recent actual value is $1.16 which is beyond the upper 95% confidence level. Based on this analysis, gas prices are too high.
Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
1           82.5  74.3  90.6  70.0  95.0
2           86.7  78.5  94.9  74.2  99.2
3           90.9  82.8  99.1  78.4 103.4
4           95.1  87.0 103.3  82.6 107.7
 
Also notice the increase in variability towards the right of the plot. This can be controlled for by estimating a log - log specification (S2).
Specification 4 is the one that is most consistent because each variable is expressed in real Canadian dollars. The log - log specification accounts for changing variability in the data.
Here is a table comparing the gasoline price forecasts from each of the four specifications.
                      S1     S2     S3     S4
Oil coefficient    0.845  0.477  0.702  0.439
t stat            72.871 81.483 55.576 51.631
R-squared          0.946  0.957  0.911  0.899
Forecast (oil=50) 82.472 86.766 86.606 88.295
Forecast (oil=55) 86.696 90.801 90.116 92.068
Forecast (oil=60) 90.921 94.648 93.626 95.653
Forecast (oil=65) 95.146 98.330 97.136 99.074
The estimated coefficient on the oil price variable is positive and statistically significant in each specification. The dependent variable in each specification is different so it is not possible to compare across models, but clearly for each specification, oil prices have a statistically significant positive impact on Ontario gasoline prices. The gasoline price forecast for oil prices at $60 range between 90.9 cents/liter to 95.7 cents/liter. The interesting point is that the current price of gasoline at $1.16 per liter is higher than the upper 95% confidence bound in all cases.
Lastly, the table below shows the p-values from an ADF test on the residuals from each specification. The residuals from each specification are stationary at 5% which is consistent with a cointegrating relationship between gasoline prices and oil prices.
S1 S2 S3 S4 Residuals adf p value 0.000867 0.000264 0.000044 0.0464
The main take-away is that Ontario gasoline prices and oil prices are cointegrated and this is robust to several different regression specifications. Based on this analysis, gasoline prices are higher than the upper 95% confidence band.
Here is the R code.
#########################################################
# Economic forecasting and analysis
# Perry Sadorsky
# June 2015
# Forecasting Ontario gasoline prices
##########################################################
## basic reference is from this post
## http://www.r-bloggers.com/what-a-gas-the-falling-price-of-oil-and-ontario-gasoline-prices/
# load libraries
library(fpp)
library(quantmod)
library(reshape2)
library(urca)
##########################################################
## collect data from Ontario government
##########################################################
furl = c("http://www.energy.gov.on.ca/fuelupload/ONTREG1991.csv",
"http://www.energy.gov.on.ca/fuelupload/ONTREG1992.csv",
"http://www.energy.gov.on.ca/fuelupload/ONTREG1993.csv",
"http://www.energy.gov.on.ca/fuelupload/ONTREG1994.csv",
"http://www.energy.gov.on.ca/fuelupload/ONTREG1995.csv",
"http://www.energy.gov.on.ca/fuelupload/ONTREG1996.csv",
"http://www.energy.gov.on.ca/fuelupload/ONTREG1997.csv",
"http://www.energy.gov.on.ca/fuelupload/ONTREG1998.csv",
"http://www.energy.gov.on.ca/fuelupload/ONTREG1999.csv",
"http://www.energy.gov.on.ca/fuelupload/ONTREG2000.csv",
"http://www.energy.gov.on.ca/fuelupload/ONTREG2001.csv",
"http://www.energy.gov.on.ca/fuelupload/ONTREG2002.csv",
"http://www.energy.gov.on.ca/fuelupload/ONTREG2003.csv",
"http://www.energy.gov.on.ca/fuelupload/ONTREG2004.csv",
"http://www.energy.gov.on.ca/fuelupload/ONTREG2005.csv",
"http://www.energy.gov.on.ca/fuelupload/ONTREG2006.csv",
"http://www.energy.gov.on.ca/fuelupload/ONTREG2007.csv",
"http://www.energy.gov.on.ca/fuelupload/ONTREG2008.csv",
"http://www.energy.gov.on.ca/fuelupload/ONTREG2009.csv",
"http://www.energy.gov.on.ca/fuelupload/ONTREG2010.csv",
"http://www.energy.gov.on.ca/fuelupload/ONTREG2011.csv",
"http://www.energy.gov.on.ca/fuelupload/ONTREG2012.csv",
"http://www.energy.gov.on.ca/fuelupload/ONTREG2013.csv",
"http://www.energy.gov.on.ca/fuelupload/ONTREG2014.csv"
)
gasp = c(1,2,3,4,5,6,7,8,9,10,11,12)
xx = read.csv("http://www.energy.gov.on.ca/fuelupload/ONTREG1990.csv",skip=1)
xx = head(xx,-3)
xx = tail(xx,12)
xx = xx[,"ON.Avg"]
gasp = cbind(gasp,xx)
for (i in 1:24) {
xx = read.csv(furl[i],skip=1)
xx = head(xx,-3)
xx = tail(xx,12)
xx = xx[,"ON.Avg"]
gasp =cbind(gasp,xx)
}
## output is a nice matrix
View(gasp)
## stack the data into a column
s_gasp = melt(gasp)
s_gasp = s_gasp[-c(1:12),]
View(s_gasp)
on_avg = s_gasp[,3]
tail(on_avg)
View(on_avg)
## append 2015 data
# "http://www.energy.gov.on.ca/fuelupload/ONTREG2015.csv"
yy = read.csv("http://www.energy.gov.on.ca/fuelupload/ONTREG2015.csv",skip=1)
yy = head(yy,-3)
yy = tail(yy,6)
yy= yy[,"ON.Avg"]
yy
on_avg_a = append(on_avg,yy)
tail(on_avg_a)
View(on_avg_a)
gas = ts(on_avg_a, start=1990, end=c(2015,3), frequency=12)
plot(gas,main="Ontario gas prices (cents per liter)", xlab="",ylab="")
##########################################################
## now load data on oil prices and FX
##########################################################
# load data from FRED into new environment
symbol.vec = c("MCOILWTICO" , "EXCAUS", "CANCPIALLMINMEI")
data <- new.env()
getSymbols(symbol.vec, src="FRED", env = data)
(data$MCOILWTICO["1990-01-01::"])-> oil
(data$EXCAUS["1990-01-01::"])-> fx
(data$CANCPIALLMINMEI["1990-01-01::"])-> cpi
# rebase cpi
tail(cpi,14)
cpi2014 = mean( cpi[(300-11):300] )
cpi_2014 = cpi/cpi2014 * 100
mean( cpi_2014[(300-11):300] )
par(mfrow = c(3, 1))
plot(oil,main="Oil prices")
plot(fx,main="$C/$US")
plot(cpi_2014,main="CPI")
par(mfrow = c(1, 1))
df_fred = cbind(oil,fx,cpi_2014)
head(df_fred)
tail(df_fred)
## latest cpi value is for March
df_fred_1 = ts(df_fred, start=1990, end=c(2015,3), frequency=12)
nrow(df_fred_1)
length(gas)
oil_n = df_fred_1[,1]
par(mfrow = c(2, 1))
plot(gas,main="Ontario gas prices (cents per liter)", xlab="",ylab="")
plot(df_fred_1[,1],main="Oil prices ($/bbl)", xlab="",ylab="")
par(mfrow = c(1, 1))
cor(gas,df_fred_1[,1])
##########################################################
## linear regressions
##########################################################
table = matrix( NA, nrow=7, ncol=4)
table2 = matrix( NA, nrow=1, ncol=4)
lm1 = lm(gas ~ oil_n)
lm1s = summary(lm1)
table[1,1] = lm1s$coefficients[2,1]
table[2,1] = lm1s$coefficients[2,3]
table[3,1] = lm1s$r.squared
plot(gas ~ oil_n,
ylab="Ontario gas prices (cents per liter)", xlab="WTI oil prices ($/bbl)")
abline(lm1)
adf1 = ur.df(lm1$residuals, type="drift",selectlags="AIC", lags=12)
adf1@testreg$coefficients
table2[1,1] = adf1@testreg$coefficients[2,4]
## some forecasting
fcast <- forecast(lm1, newdata=data.frame(oil_n=c(50, 55, 60, 65)))
fcast
plot(fcast)
table[4,1] = fcast$mean[1]
table[5,1] = fcast$mean[2]
table[6,1] = fcast$mean[3]
table[7,1] = fcast$mean[4]
table
## log- log specification
lm2 = lm(log(gas) ~ log(oil_n))
summary(lm2)
lm2s = summary(lm2)
table[1,2] = lm2s$coefficients[2,1]
table[2,2] = lm2s$coefficients[2,3]
table[3,2] = lm2s$r.squared
plot(log(gas) ~ log(oil_n),
ylab="Log of Ontario gas prices (cents per liter)", xlab="Log of WTI oil prices ($/bbl)")
abline(lm2)
adf2 = ur.df(lm2$residuals, type="drift",selectlags="AIC", lags=12)
adf2@testreg$coefficients
table2[1,2] = adf2@testreg$coefficients[2,4]
## some forecasting
fcast <- forecast(lm2, newdata=data.frame(oil_n=c(50, 55, 60, 65)))
fcast
plot(fcast)
exp(fcast$mean)
table[4,2] = exp(fcast$mean[1])
table[5,2] = exp(fcast$mean[2])
table[6,2] = exp(fcast$mean[3])
table[7,2] = exp(fcast$mean[4])
## now do this in real Canadian dollars
gas_r = gas/df_fred_1[,3]*100
oil_r = (df_fred_1[,1]*df_fred_1[,2])/df_fred_1[,3]*100
lm3 = lm(gas_r ~ oil_r)
summary(lm3)
lm3s = summary(lm3)
table[1,3] = lm3s$coefficients[2,1]
table[2,3] = lm3s$coefficients[2,3]
table[3,3] = lm3s$r.squared
plot(gas_r ~ oil_r,
ylab="Ontario real gas prices (cents per liter)", xlab="Real oil prices ($/bbl)")
abline(lm3)
adf3 = ur.df(lm3$residuals, type="drift",selectlags="AIC", lags=12)
adf3@testreg$coefficients
table2[1,3] = adf3@testreg$coefficients[2,4]
## some forecasting
fcast <- forecast(lm3, newdata=data.frame(oil_r=c(50, 55, 60, 65)))
fcast
plot(fcast)
table[4,3] = fcast$mean[1]
table[5,3] = fcast$mean[2]
table[6,3] = fcast$mean[3]
table[7,3] = fcast$mean[4]
## log- log specification in real Canadian dollars
lm4 = lm(log(gas_r) ~ log(oil_r))
summary(lm4)
lm4s = summary(lm4)
table[1,4] = lm4s$coefficients[2,1]
table[2,4] = lm4s$coefficients[2,3]
table[3,4] = lm4s$r.squared
plot(log(gas_r) ~ log(oil_r),
ylab="Log of real Ontario gas prices (cents per liter)", xlab="Log of real oil prices ($/bbl)")
abline(lm4)
adf4 = ur.df(lm4$residuals, type="drift",selectlags="AIC", lags=12)
adf4@testreg$coefficients
table2[1,4] = adf4@testreg$coefficients[2,4]
## some forecasting
fcast <- forecast(lm4, newdata=data.frame(oil_r=c(50, 55, 60, 65)))
fcast
(fcast$mean)
(fcast$lower)
(fcast$upper)
# plot(fcast)
exp(fcast$mean)
exp(fcast$lower)
exp(fcast$upper)
fcast4 = cbind(exp(fcast$mean),exp(fcast$lower), exp(fcast$upper))
colnames(fcast4) = c("Forecast", "Lo 80", "Lo 95", "Hi 80", "Hi 95")
fcast4
table[4,4] = exp(fcast$mean[1])
table[5,4] = exp(fcast$mean[2])
table[6,4] = exp(fcast$mean[3])
table[7,4] = exp(fcast$mean[4])
colnames(table) = c("S1", "S2", "S3", "S4")
rownames(table) = c("Oil coefficient", "t stat", "R-squared", "Forecast (oil=50)", "Forecast (oil=55)", "Forecast (oil=60)", "Forecast (oil=65)")
options(digits=3, scipen=10)
table
colnames(table2) = c("S1", "S2", "S3", "S4")
rownames(table2) = c("Residuals adf p value")
table2
 
 
