Sunday, 21 June 2015

A Closer Look at the Oil Price - Gasoline Price Relationship in Ontario

In a recent post, Myles Harrison investigated the relationship between Ontario average gasoline prices and US oil prices and found a strong positive relationship between the two. The recent drop in oil prices has led to a drop in gasoline prices, but my question is whether gasoline prices are as low as they should be, given that oil is currently trading a few cents below $60 per barrel.

In this post, I extend his work in several directions. First, Myles loaded the gasoline price data from the Ontario government website using a call to wget. This is certainly an easy way to pull down data from websites. The only drawback is that two separate programs are being used: one to load the data and another to analyze it. I will load the data directly into R, and I will also load additional data on oil prices, exchange rates, and the Canadian CPI. My approach results in code that is lengthier than Myles's, but has the advantage of being contained in a single program. The data set is monthly, running from January 1990 to March 2015. Second, I will investigate the gasoline price - oil price relationship using four different specifications of the data, and from these relationships I will forecast current gasoline prices based on current oil prices. Third, I will provide some tests for residual stationarity to check that the variables are cointegrated.
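As a preview of the direct-download approach (the complete program is at the end of the post, and the URL and FRED series codes below are the ones it uses), a minimal sketch looks like this:

# minimal sketch: pull the data straight into R (see the full program at the end of the post)
library(quantmod)   # getSymbols() for the FRED series

# one year of Ontario pump prices, read directly from the government website
gas1990 <- read.csv("http://www.energy.gov.on.ca/fuelupload/ONTREG1990.csv", skip = 1)

# oil prices, the C$/US$ exchange rate, and the Canadian CPI from FRED
getSymbols(c("MCOILWTICO", "EXCAUS", "CANCPIALLMINMEI"), src = "FRED")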

The gasoline price data comes from an Ontario government website. Additional data on oil prices, exchange rates, and Canadian consumer price index (CPI) comes from the Federal Reserve of St. Louis database (FRED data).

First, here are some plots of Ontario gasoline prices and US oil prices. As expected, the two series move together closely.



I estimate four specifications (sketched in R just after the list):

S1: gasoline prices ~ oil prices (nominal values, not adjusted for exchange rates)
S2: log(gasoline prices) ~ log(oil prices) (nominal values, not adjusted for exchange rates)
S3: gasoline prices ~ oil prices (adjusted for exchange rates and inflation)
S4: log(gasoline prices) ~ log(oil prices) (adjusted for exchange rates and inflation)
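In R, using the variable names from the code at the end of the post (gas and oil_n for the nominal series, gas_r and oil_r for the series converted to real Canadian dollars), the four regressions amount to:

# S1 and S2: nominal Ontario gasoline prices on nominal US$ oil prices
lm1 <- lm(gas ~ oil_n)
lm2 <- lm(log(gas) ~ log(oil_n))

# S3 and S4: both series converted to real Canadian dollars first, i.e.
#   gas_r = gas / cpi * 100  and  oil_r = oil * fx / cpi * 100  (CPI rebased so 2014 = 100)
lm3 <- lm(gas_r ~ oil_r)
lm4 <- lm(log(gas_r) ~ log(oil_r))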

Specification S1 is the most basic and does not adjust for inflation or different currencies. Notice the strong positive correlation between gasoline prices and oil prices. The simple correlation between the two is 0.973.

 


Using this specification to forecast gasoline prices (for oil prices of $50, $55, $60, and $65, labelled as points 1 through 4 respectively) results in the following table. For example, at an oil price of $60 (point 3) the forecast of average Ontario gasoline prices is 90.9 cents/liter. The most recent actual value is $1.16 per liter, which is beyond the upper 95% confidence bound. Based on this analysis, gas prices are too high.


Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
1           82.5  74.3  90.6  70.0  95.0
2           86.7  78.5  94.9  74.2  99.2
3           90.9  82.8  99.1  78.4 103.4
4           95.1  87.0 103.3  82.6 107.7
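The table is just the output of the forecast() function (from the forecast package, loaded by fpp) applied to the fitted linear model, with the candidate oil prices supplied as new data; roughly:

# point forecasts with 80% and 95% intervals for gasoline at oil prices of $50-$65
library(fpp)                       # loads the forecast package
lm1 <- lm(gas ~ oil_n)
fcast <- forecast(lm1, newdata = data.frame(oil_n = c(50, 55, 60, 65)))
fcast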
 

Also notice the increase in variability towards the right of the plot. This can be controlled for by estimating a log-log specification (S2).

Specification S4 is the most consistent because each variable is expressed in real Canadian dollars, and the log-log form accounts for the changing variability in the data.




Here is a table comparing the estimates and gasoline price forecasts from each of the four specifications.


                      S1     S2     S3     S4
Oil coefficient    0.845  0.477  0.702  0.439
t stat            72.871 81.483 55.576 51.631
R-squared          0.946  0.957  0.911  0.899
Forecast (oil=50) 82.472 86.766 86.606 88.295
Forecast (oil=55) 86.696 90.801 90.116 92.068
Forecast (oil=60) 90.921 94.648 93.626 95.653
Forecast (oil=65) 95.146 98.330 97.136 99.074


The estimated coefficient on the oil price variable is positive and statistically significant in each specification. The dependent variable differs across specifications, so the coefficients are not directly comparable across models, but clearly, in each specification, oil prices have a statistically significant positive impact on Ontario gasoline prices. The gasoline price forecasts for oil at $60 range from 90.9 cents/liter to 95.7 cents/liter. The interesting point is that the current price of gasoline at $1.16 per liter is higher than the upper 95% confidence bound in all cases.

Lastly, the table below shows the p-values from an ADF test on the residuals of each specification. The residuals from each specification are stationary at the 5% level, which is consistent with a cointegrating relationship between gasoline prices and oil prices.


                            S1       S2       S3     S4
Residuals adf p value 0.000867 0.000264 0.000044 0.0464
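These p-values come from augmented Dickey-Fuller regressions on the residuals of each fitted model (the ur.df() calls in the code below). A more compact, alternative check uses adf.test() from the tseries package, which fpp loads; for the first specification it would look like this (the exact p-value will differ slightly from the table because the two functions report the test differently):

# stationarity check on the S1 residuals: a small p-value is consistent with cointegration
library(tseries)
adf.test(residuals(lm1))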


The main take-away is that Ontario gasoline prices and oil prices are cointegrated, and this result is robust to several different regression specifications. Based on this analysis, current gasoline prices are higher than the upper 95% confidence band of the forecasts.



Here is the R code.

#########################################################
#  Economic forecasting and analysis
#  Perry Sadorsky
#  June 2015
#  Forecasting Ontario gasoline prices
##########################################################

## basic reference is from this post
## http://www.r-bloggers.com/what-a-gas-the-falling-price-of-oil-and-ontario-gasoline-prices/


# load libraries
library(fpp)
library(quantmod)
library(reshape2)
library(urca)


##########################################################
## collect data from Ontario government
##########################################################

furl = c("http://www.energy.gov.on.ca/fuelupload/ONTREG1991.csv",
         "http://www.energy.gov.on.ca/fuelupload/ONTREG1992.csv",
         "http://www.energy.gov.on.ca/fuelupload/ONTREG1993.csv",
         "http://www.energy.gov.on.ca/fuelupload/ONTREG1994.csv",
         "http://www.energy.gov.on.ca/fuelupload/ONTREG1995.csv",
         "http://www.energy.gov.on.ca/fuelupload/ONTREG1996.csv",
         "http://www.energy.gov.on.ca/fuelupload/ONTREG1997.csv",
         "http://www.energy.gov.on.ca/fuelupload/ONTREG1998.csv",
         "http://www.energy.gov.on.ca/fuelupload/ONTREG1999.csv",
         "http://www.energy.gov.on.ca/fuelupload/ONTREG2000.csv",
         "http://www.energy.gov.on.ca/fuelupload/ONTREG2001.csv",
         "http://www.energy.gov.on.ca/fuelupload/ONTREG2002.csv",
         "http://www.energy.gov.on.ca/fuelupload/ONTREG2003.csv",
         "http://www.energy.gov.on.ca/fuelupload/ONTREG2004.csv",
         "http://www.energy.gov.on.ca/fuelupload/ONTREG2005.csv",
         "http://www.energy.gov.on.ca/fuelupload/ONTREG2006.csv",
         "http://www.energy.gov.on.ca/fuelupload/ONTREG2007.csv",
         "http://www.energy.gov.on.ca/fuelupload/ONTREG2008.csv",
         "http://www.energy.gov.on.ca/fuelupload/ONTREG2009.csv",
         "http://www.energy.gov.on.ca/fuelupload/ONTREG2010.csv",
         "http://www.energy.gov.on.ca/fuelupload/ONTREG2011.csv",
         "http://www.energy.gov.on.ca/fuelupload/ONTREG2012.csv",
         "http://www.energy.gov.on.ca/fuelupload/ONTREG2013.csv",
         "http://www.energy.gov.on.ca/fuelupload/ONTREG2014.csv"        
         )

gasp = 1:12                    # month index column to start the matrix
xx = read.csv("http://www.energy.gov.on.ca/fuelupload/ONTREG1990.csv",skip=1)
xx = head(xx,-3)               # drop the 3 footer rows
xx = tail(xx,12)               # keep the 12 monthly rows
xx = xx[,"ON.Avg"]             # Ontario average pump price
gasp = cbind(gasp,xx)


# append one column of 12 monthly prices per year, 1991-2014
for (i in seq_along(furl)) {
  xx = read.csv(furl[i],skip=1)
  xx = head(xx,-3)
  xx = tail(xx,12)
  xx = xx[,"ON.Avg"]
  gasp = cbind(gasp,xx)
}

## output is a nice matrix
View(gasp)


## stack the year columns into one long vector
s_gasp = melt(gasp)
s_gasp = s_gasp[-c(1:12),]     # drop the rows coming from the month index column
View(s_gasp)


on_avg = s_gasp[,3]
tail(on_avg)
View(on_avg)


## append 2015 data
# "http://www.energy.gov.on.ca/fuelupload/ONTREG2015.csv"

yy = read.csv("http://www.energy.gov.on.ca/fuelupload/ONTREG2015.csv",skip=1)
yy = head(yy,-3)
yy = tail(yy,6)
yy= yy[,"ON.Avg"]
yy

on_avg_a = append(on_avg,yy)
tail(on_avg_a)
View(on_avg_a)

gas = ts(on_avg_a, start=1990, end=c(2015,3), frequency=12)
plot(gas,main="Ontario gas prices (cents per liter)", xlab="",ylab="")


##########################################################
## now load data on oil prices and FX
##########################################################

# load data from FRED into new environment
symbol.vec = c("MCOILWTICO" , "EXCAUS", "CANCPIALLMINMEI")
data <- new.env()
getSymbols(symbol.vec, src="FRED", env = data)


(data$MCOILWTICO["1990-01-01::"])-> oil
(data$EXCAUS["1990-01-01::"])-> fx
(data$CANCPIALLMINMEI["1990-01-01::"])-> cpi

# rebase the CPI so that 2014 = 100 (observations 289-300 are the 12 months of 2014)
tail(cpi,14)
cpi2014 = mean(  cpi[(300-11):300]  )
cpi_2014 = cpi/cpi2014 * 100
mean(  cpi_2014[(300-11):300]  )   # check: should equal 100


par(mfrow = c(3, 1))
plot(oil,main="Oil prices")
plot(fx,main="$C/$US")
plot(cpi_2014,main="CPI")
par(mfrow = c(1, 1))


df_fred = cbind(oil,fx,cpi_2014)
head(df_fred)
tail(df_fred)
## latest cpi value is for March

df_fred_1 =  ts(df_fred, start=1990, end=c(2015,3), frequency=12)
nrow(df_fred_1)
length(gas)

oil_n = df_fred_1[,1]

par(mfrow = c(2, 1))
plot(gas,main="Ontario gas prices (cents per liter)", xlab="",ylab="")
plot(df_fred_1[,1],main="Oil prices ($/bbl)", xlab="",ylab="")
par(mfrow = c(1, 1))

cor(gas,df_fred_1[,1])


##########################################################
## linear regressions
##########################################################

# table holds the oil coefficient, t stat, R-squared, and four forecasts for each specification;
# table2 holds the residual ADF p-value for each specification
table = matrix( NA, nrow=7, ncol=4)
table2 = matrix( NA, nrow=1, ncol=4)

lm1 = lm(gas ~ oil_n)

lm1s = summary(lm1)
table[1,1] = lm1s$coefficients[2,1]
table[2,1] = lm1s$coefficients[2,3]
table[3,1] = lm1s$r.squared


plot(gas ~ oil_n,
     ylab="Ontario gas prices (cents per liter)", xlab="WTI oil prices ($/bbl)")
abline(lm1)
adf1 =  ur.df(lm1$residuals, type="drift",selectlags="AIC", lags=12)
adf1@testreg$coefficients
table2[1,1] = adf1@testreg$coefficients[2,4]

## some forecasting
fcast <- forecast(lm1, newdata=data.frame(oil_n=c(50, 55, 60, 65)))
fcast
plot(fcast)


table[4,1] = fcast$mean[1]
table[5,1] = fcast$mean[2]
table[6,1] = fcast$mean[3]
table[7,1] = fcast$mean[4]
table



## log-log specification
lm2 = lm(log(gas) ~ log(oil_n))
summary(lm2)

lm2s = summary(lm2)
table[1,2] = lm2s$coefficients[2,1]
table[2,2] = lm2s$coefficients[2,3]
table[3,2] = lm2s$r.squared

plot(log(gas) ~ log(oil_n),
     ylab="Log of Ontario gas prices (cents per liter)", xlab="Log of WTI oil prices ($/bbl)")
abline(lm2)
adf2 =  ur.df(lm2$residuals, type="drift",selectlags="AIC", lags=12)
adf2@testreg$coefficients
table2[1,2] = adf2@testreg$coefficients[2,4]

## some forecasting
fcast <- forecast(lm2, newdata=data.frame(oil_n=c(50, 55, 60, 65)))
fcast
plot(fcast)

exp(fcast$mean)

table[4,2] = exp(fcast$mean[1])
table[5,2] = exp(fcast$mean[2])
table[6,2] = exp(fcast$mean[3])
table[7,2] = exp(fcast$mean[4])



## now do this in real Canadian dollars:
## deflate gas prices by the CPI (2014 = 100); convert oil from US$ to C$ with the exchange rate, then deflate
gas_r = gas/df_fred_1[,3]*100
oil_r = (df_fred_1[,1]*df_fred_1[,2])/df_fred_1[,3]*100


lm3 = lm(gas_r ~ oil_r)
summary(lm3)

lm3s = summary(lm3)
table[1,3] = lm3s$coefficients[2,1]
table[2,3] = lm3s$coefficients[2,3]
table[3,3] = lm3s$r.squared

plot(gas_r ~ oil_r,
     ylab="Ontario real gas prices (cents per liter)", xlab="Real oil prices ($/bbl)")
abline(lm3)
adf3 =  ur.df(lm3$residuals, type="drift",selectlags="AIC", lags=12)
adf3@testreg$coefficients
table2[1,3] = adf3@testreg$coefficients[2,4]

## some forecasting
fcast <- forecast(lm3, newdata=data.frame(oil_r=c(50, 55, 60, 65)))
fcast
plot(fcast)

table[4,3] = fcast$mean[1]
table[5,3] = fcast$mean[2]
table[6,3] = fcast$mean[3]
table[7,3] = fcast$mean[4]



## log-log specification in real Canadian dollars
lm4 = lm(log(gas_r) ~ log(oil_r))
summary(lm4)

lm4s = summary(lm4)
table[1,4] = lm4s$coefficients[2,1]
table[2,4] = lm4s$coefficients[2,3]
table[3,4] = lm4s$r.squared


plot(log(gas_r) ~ log(oil_r),
     ylab="Log of real Ontario gas prices (cents per liter)", xlab="Log of real oil prices ($/bbl)")
abline(lm4)
adf4 =  ur.df(lm4$residuals, type="drift",selectlags="AIC", lags=12)
adf4@testreg$coefficients
table2[1,4] = adf4@testreg$coefficients[2,4]

## some forecasting
fcast <- forecast(lm4, newdata=data.frame(oil_r=c(50, 55, 60, 65)))
fcast
(fcast$mean)
(fcast$lower)
(fcast$upper)
# plot(fcast)
exp(fcast$mean)
exp(fcast$lower)
exp(fcast$upper)

fcast4 = cbind(exp(fcast$mean),exp(fcast$lower),  exp(fcast$upper))
colnames(fcast4) = c("Forecast", "Lo 80", "Lo 95", "Hi 80", "Hi 95")
fcast4

table[4,4] = exp(fcast$mean[1])
table[5,4] = exp(fcast$mean[2])
table[6,4] = exp(fcast$mean[3])
table[7,4] = exp(fcast$mean[4])


colnames(table) = c("S1", "S2", "S3", "S4")
rownames(table) = c("Oil coefficient", "t stat", "R-squared", "Forecast (oil=50)", "Forecast (oil=55)", "Forecast (oil=60)", "Forecast (oil=65)")

options(digits=3, scipen=10)
table


colnames(table2) = c("S1", "S2", "S3", "S4")
rownames(table2) = c("Residuals adf p value")
table2


Friday, 5 June 2015

Forecasting Ontario Housing Starts


With all of the talk about the housing market in Ontario, I decided to forecast Ontario housing starts using many of the methods that I teach in my Economic Forecasting and Analysis course. These methods are all covered in Rob Hyndman's textbook Forecasting: Principles and Practice, which I use for my course.

Forecasting housing starts is important because they are one of the key drivers of the real estate market.

Here is what Ontario housing starts look like over a 65 year period. Notice the two peaks in the housing starts data: one in the early 1970s and the other in the late 1980s. The data comes from CANSIM.

 
 
The two peaks in housing starts are unlikely to be repeated in the near future, so a shorter time frame is desirable for forecasting. The training period was set to the first quarter of 1994 through the fourth quarter of 2010, and the test period to the first quarter of 2011 through the first quarter of 2015. Forecasts were produced from each model and their accuracy measures compared.
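With the housing starts stored as a quarterly ts object y, as in the code at the end of the post, the split amounts to:

# training and test windows for the forecast comparison
train <- window(y, start = c(1994, 1), end = c(2010, 4))
test  <- window(y, start = 2011)
h <- 17   # 17 quarters: 2011Q1 through 2015Q1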

Here is what the forecast accuracy measures look like.


For model comparison, I like to focus on the out-of-sample (test) values. From the table, the seasonal naive method has the lowest MASE among the test values, indicating that, based on MASE, it is the best forecasting method. The next best methods are STL and ARIMA. Notice from the MPE values that there may be some opportunity to combine forecasts, but I do not do this here.


The seasonal naive method produces forecasts for Ontario housing starts of 59,688 units for each of 2015, 2016, and 2017. By comparison, Canada Mortgage and Housing Corporation (CMHC) forecasts Ontario housing starts of 61,700 and 60,600 for 2015 and 2016 respectively. Both sets of forecasts agree that Ontario housing starts are likely to remain flat for the next few years. Flat housing starts combined with strong demand for houses implies upward pressure on house prices, especially in the GTA. A good time to be a seller, but not such a good time to be a buyer. The forecasts are also useful for helping predict future sales for any company whose business is directly linked to housing starts.
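A quick way to see where the flat annual figure comes from: the seasonal naive method simply repeats the last observed year of quarterly values, so every forecast year sums to the same annual total. Using the both series defined in the code below, the check is roughly:

# seasonal naive: each forecast quarter equals the same quarter one year earlier,
# so any four consecutive forecast quarters sum to the same annual total
fit.os.1 <- snaive(both, h = 12)
sum(fit.os.1$mean[1:4])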


The R code is posted below. You will have to source the data yourself from CANSIM (the CANSIM data label is v730423).

#########################################################
#  Economic forecasting and analysis
#  Perry Sadorsky
#  Forecasting Ontario housing starts
##########################################################


# load libraries
library(fpp)


# import data
ontario.housing.starts <- read.csv("C:/house prices/ontario housing starts.csv")
View(ontario.housing.starts)

                                             
# tell R that data set is a time series
df = ts(ontario.housing.starts, start=1948, frequency=4)
df

hist(df[,"hs"], breaks=100)


# create a named variable
y = df[,"hs"]
                      
                      
# some graphs
hist(y)
plot(y)
Acf(y)


# add labels to the plot
plot(y, main="Ontario housing starts",  xlab="", ylab="",col = 'blue', lwd=4)



##########################################################
##########################################################
# set training data, test data, out of sample
train <- window(y,start=c(1994, 1),end=c(2010, 4))
train
plot(train)

test <- window(y, start=2011)
both <- window(y,start=c(1994, 1))
h=17   # forecast horizon: 17 quarters, 2011Q1 through 2015Q1

##########################################################
##########################################################


##########################################################
## forecast with benchmarks
##########################################################

fit1 <- meanf(train, h=h)
fit2 <- naive(train, h=h)
fit3 <- snaive(train, h=h)

plot(fit1)
plot(fit2)
plot(fit3)


# make a nice plot showing the forecasts
plot(fit1, plot.conf=FALSE,
main="Forecasts for Ontario housing starts")
lines(fit2$mean,col=2)
lines(fit3$mean,col=3)
legend("topleft",lty=1,col=c(4,2,3),
legend=c("Mean method","Naive method","Seasonal naive method"),bty="n")


# plot with forecasts and actual values
plot(fit1, plot.conf=FALSE,
     main="Forecasts for Ontario housing starts")
lines(fit2$mean,col=2)
lines(fit3$mean,col=3)
lines(both)   # actual values over the training and test periods
legend("topleft",lty=1,col=c(4,2,3),
       legend=c("Mean method","Naive method","Seasonal naive method"),bty="n")



##########################################################
# exponential smoothing approaches
##########################################################

# simple exponential smoothing
fit4 <- ses(train, h = h)
summary(fit4)
plot(fit4)


# holt's linear trend method
fit5 <- holt(train,  h=h)
summary(fit5)
plot(fit5)


# holt's exponential trend method
fit6 <- holt(train, exponential=TRUE, h=h)
summary(fit6)
plot(fit6)


# holt's damped trend method
fit7 <- holt(train, damped=TRUE, h=h)
summary(fit7)
plot(fit7)


# Holt-Winters method
fit8 <- hw(train, seasonal="multiplicative", h=h)
summary(fit8)
plot(fit8)


# ETS  method
y.ets <- ets(train, model="ZZZ")
summary(y.ets)
fit9 <- forecast(y.ets, h=h)
summary(fit9)
plot(fit9)


# STL  method
y.stl <- stl(train, t.window=15, s.window="periodic", robust=TRUE)
summary(y.stl)
fit10 <- forecast(y.stl, method="rwdrift",h=h)
summary(fit10)
plot(fit10)


# trend plus seasonal
tps <- tslm(train ~ trend + season)
summary(tps)
fit11 = forecast(tps,h=h)
plot(fit11)


# arima modelling and forecasting
# test for unit root
adf.test(train, alternative = "stationary")
kpss.test(train)
ndiffs(train)


y.arima <- auto.arima(train)
fit12 <- forecast(y.arima, h=h)
plot(fit12)

# take logs to reduce variability
y.arima.lambda <- auto.arima(train, lambda=0)
fit13 <- forecast(y.arima.lambda, h=h)
plot(fit13)
tsdiag(y.arima.lambda)


## ANN
ann <- nnetar(train, repeats=100,lambda=0)
fit14 = forecast(ann,h=h)
plot(fit14)

## bats
bats_fit <- bats(train)
fit15 = forecast(bats_fit,h=h)
plot(fit15)



# accuracy measures
a1 = accuracy(fit1, test)
a2 = accuracy(fit2, test)
a3 = accuracy(fit3, test)
a4 = accuracy(fit4, test)
a5 = accuracy(fit5, test)
a6 = accuracy(fit6, test)
a7 = accuracy(fit7, test)
a8 = accuracy(fit8, test)
a9 = accuracy(fit9, test)
a10 = accuracy(fit10, test)
a11 = accuracy(fit11, test)
a12 = accuracy(fit12, test)
a13 = accuracy(fit13, test)
a14 = accuracy(fit14, test)
a15 = accuracy(fit15, test)


#Combining forecast summary statistics into a table with row names
a.table<-rbind(a1, a2, a3, a4, a5, a6, a7, a8, a9, a10, a11, a12, a13, a14, a15)

row.names(a.table)<-c('Mean training','Mean test', 'Naive training', 'Naive test', 'S. Naive training', 'S. Naive test' ,
                      'ES training','ES test', 'Holt linear training', 'Holt linear test', 'Holt ES training', 'Holt ES test' ,
                      'Holt damped training','Holt damped test', 'Holt Winters training', 'Holt Winters test', 'ETS training', 'ETS test' ,
                  'STL training','STL test','linear trend training','linear trend test',
                  'ARIMA training', 'ARIMA test',
                  'ARIMA training (log)', 'ARIMA test (log)',
                 'ANN training', 'ANN test',
                  'BATS training', 'BATS test'
                )

# order the table according to MASE
a.table<-as.data.frame(a.table)
a.table<-a.table[order(a.table$MASE),]
a.table

# write table to csv file
## write.csv(a.table, "C:/house prices/atable.csv")
##########################################################
##########################################################


## based on MASE, choose the best models and forecast hs periods into the future
hs = 11   # forecast horizon: 11 quarters, 2015Q2 through 2017Q4

fit.os.1 <- snaive(both, h=hs)
plot(fit.os.1, include=48)
fit.os.1


fit.os.2 = stl(both, t.window=15, s.window="periodic", robust=TRUE)
fit.os.2.f  <- forecast(fit.os.2 , method="rwdrift",h=hs)
plot(fit.os.2.f, include=48)



Thursday, 21 May 2015

Carbon Trading Coming to Ontario

I recently gave an interview to 360 Energy about the newly announced plan to start a cap and trade program for carbon dioxide trading in Ontario.

Here is the link to the interview.

In the interview I cover some of the pros and cons of carbon trading and provide some thoughts on other carbon related topics.

Friday, 13 March 2015

Oil Prices at $20 a Barrel?

I recently read an interesting article by Anatole Kaletsky on $50 a barrel representing a new ceiling for oil prices. In the article, he describes the tradeoff between competition and monopoly power as prime drivers of oil prices. Over the past 40 years real (inflation adjusted) oil prices have traded in two regimes: 1) a high price regime associated with monopoly power, and 2) a low price regime associated with competitive market power. The dividing line between the two regimes is an oil price of approximately $50 a barrel. Consider the following chart of real oil prices with 2014 as the base year. Oil priced at $50 a barrel does seem to represent a dividing line between high priced monopoly regimes and low priced competitive regimes.



From 1974 to 1985 the oil market operated in a monopoly pricing regime as OPEC exerted its monopoly power. Between 1986 and 2004 oil prices were determined by competitive pricing. During this time period, new oil from the North Sea and Alaska helped to restore more competition in the world oil market. More recently, from 2005 until the summer of 2014, oil prices were again determined by monopoly pricing power as oil demand increased faster than supply. If $50 is the dividing line between the competitive and monopolistic regimes, it also looks like $20 is the floor for oil prices in the competitive regime. It is thus possible that if oil prices have switched into a competitive regime, then we could see prices bottom out at $20.

In a competitive market, the price of oil is determined by the marginal cost of production. According to Kaletsky, the marginal cost of production for conventional oil is around $20 per barrel, while the marginal cost of production from US fracking is around $50 per barrel. If we have entered a new competitive pricing regime, then US frackers are the new "swing producers".

This led me to think about a way to empirically determine the threshold price for oil. In response, I estimated a self-exciting threshold autoregressive (SETAR) model for real oil prices. SETAR models are designed for data whose behaviour changes depending on which regime it is in. The SETAR approach chose $57.42 as the threshold value. This is slightly higher than $50, but consistent with what the marginal cost of US fracking could be. Here is what the regime switching plot looks like.


According to the plot, oil prices are still in a high price regime. Remember, one sudden drop in prices does not indicate a regime switch. If, however, oil prices continue to stay below $57 a barrel, signifying a switch into the low price regime, then $20 looks like a price floor.


The R code is posted below. The monthly oil price and CPI data come from FRED. Notice that the oil price series needs to be constructed by splicing the old FRED oil price series (OILPRICE) with the more recent one (MCOILWTICO). The resulting SETAR model fits well and the residuals show no discernible pattern.
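The splice itself is not shown in the code below (which reads a pre-built CSV), but a rough sketch of how it could be done directly from FRED with quantmod follows; the cut-over date used here is only illustrative:

# sketch: build a long monthly WTI series by splicing the discontinued FRED series
# OILPRICE with the current MCOILWTICO series (the cut-over date is a judgment call)
library(quantmod)
getSymbols(c("OILPRICE", "MCOILWTICO", "CPIAUCSL"), src = "FRED")
old_oil <- OILPRICE["::2003-12-01"]      # old series up to the chosen cut-over date
new_oil <- MCOILWTICO["2004-01-01::"]    # current series from then on
oil_spliced <- rbind(old_oil, new_oil)
colnames(oil_spliced) <- "WTI"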


#########################################################
#  Economic forecasting and analysis
#  Perry Sadorsky
#  March 2015
#  SETAR model of oil prices
##########################################################


# load libraries
library(fpp)
library(tsDyn)


##########################################################
## read in data and prepare data
##########################################################


# import data on monthly oil prices
cpi_oil <- read.csv("C:/oil prices/cpi_oil.csv")
View(cpi_oil)


# tell R that data set is a time series
cpi_oil = ts(cpi_oil, start=1947, frequency=12)
cpi_oil


oil = cpi_oil[,"OILPRICE"]
cpi = cpi_oil[,"CPIAUCSL"]


oil =window(oil,start=1973)
cpi =window(cpi,start=1973)


# rebase the CPI so that 2014 = 100
# (index by date rather than position, since cpi was windowed to start in 1973)
tail(cpi,14)
cpi2014 = mean( window(cpi, start=c(2014,1), end=c(2014,12)) )
cpi_2014 = cpi/cpi2014 * 100


## http://www.project-syndicate.org/commentary/oil-prices-ceiling-and-floor-by-anatole-kaletsky-2015-01
rop = oil/cpi_2014 *100
plot(rop,main="Real oil prices ($/bbl)",xlab="",ylab="",xaxt="n",lwd=4)
axis(1,at=c(seq(from=1974,to=2014,by=2)), las=2)
abline(h=50)   # Kaletsky's $50 dividing line between regimes
abline(h=20)   # approximate floor in the competitive regime


rop.ret = 100* diff(log(rop))
plot(rop.ret)
tsdisplay(rop.ret)



#### SETAR model
# search over candidate thresholds and lag orders
selectSETAR(rop, m=6, d=2, nthresh=1)


# estimate the model with the threshold chosen above;
# mL and mH are the autoregressive orders in the low and high price regimes
mod.setar <- setar(rop, m=6, thDelay=0, th= 57.42232 , mL=6, mH=3)
mod.setar
summary(mod.setar)
plot(mod.setar)
par(mfrow=c(1,1))