MetricT

ARIMA.r

Feb 18th, 2019
### This R script downloads 10-year/2-year yield curve data from FRED and estimates
### the date the yield curve will invert, first using a traditional least-squares
### method, then using a more accurate ARIMA model.

### Go to https://www.quandl.com and sign up for a free account.  Once done, go
### to "Account Settings" and copy your API key.

### Now open R and run the following commands.  You can either save them as a .R
### script or just enter the commands manually.  Lines starting with "#" are
### comments and can be ignored.

### We need the following packages for this example.
packages <- c("Quandl", "lubridate", "lmtest", "forecast")

### Install (if needed) and load the packages.
new.packages <- packages[!(packages %in% installed.packages()[,"Package"])]
if(length(new.packages)) install.packages(new.packages, quiet=TRUE)
lapply(packages, "library", character.only=TRUE)

### Initialize Quandl with our API key
Quandl.api_key("<INSERT_YOUR_QUANDL_KEY_HERE>")

### Load the 10-2 yield curve data from FRED.  The FRED code for this data
### set is "T10Y2Y".  If you want to analyze a different data set, just
### Google for "FRED GRAPHS <insert data set here>" and it will usually
### return the code you want.  In this example, I'm using data since
### January 1, 2017 (roughly two years of data).
Data <- Quandl("FRED/T10Y2Y", start_date="2017-01-01")
attach(Data)

### Note that when importing dates, R imports them as "2018-03-05".  For
### purposes of computing a fit, a decimal year like "2018.173" is more useful.
Decimal_Date <- decimal_date(Date)
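
### A rough base-R sketch of what the decimal-date conversion does, assuming
### it amounts to "year + fraction of the year already elapsed".  The ex_*
### names are made up for this illustration and are not used elsewhere.
ex_date    <- as.Date("2018-03-05")
ex_year    <- as.numeric(format(ex_date, "%Y"))
ex_start   <- as.Date(paste0(ex_year, "-01-01"))        # first day of that year
ex_end     <- as.Date(paste0(ex_year + 1, "-01-01"))    # first day of next year
ex_decimal <- ex_year +
  as.numeric(ex_date - ex_start) / as.numeric(ex_end - ex_start)
ex_decimal   # 2018 + 63/365, roughly 2018.173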

### Do a linear regression fit of the data
Fit.lm <- lm(Value ~ Decimal_Date)
summary(Fit.lm)

### Calculate the inversion date given the simple LM fit: the fitted line
### Value = a + b * Decimal_Date crosses zero at Decimal_Date = -a / b.
InversionDate.lm <- - Fit.lm$coefficients[1] / Fit.lm$coefficients[2]
date_decimal(InversionDate.lm)
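
### A small self-contained check of the zero-crossing arithmetic, using
### simulated data instead of the FRED series.  The ex_* names and the made-up
### trend (which reaches zero at 2019.5) are assumptions for illustration only.
set.seed(1)
ex_t      <- seq(2017, 2019, length.out = 200)
ex_spread <- 1.25 - 0.5 * (ex_t - 2017) + rnorm(200, sd = 0.02)
ex_line   <- lm(ex_spread ~ ex_t)
### Zero crossing of the fitted line: -intercept / slope
ex_zero   <- -coef(ex_line)[["(Intercept)"]] / coef(ex_line)[["ex_t"]]
ex_zero   # should land close to 2019.5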

### According to a simple linear regression, the yield curve will invert
### on March 19, 2019.  Now let's examine the residuals of the fit...
plot(residuals(Fit.lm), type="l")

### The "eyeball test" shows that the residuals aren't random.  There appears
### to be a strong quasi-periodic component embedded in the residuals, which
### renders the data "non-stationary".  So we need to examine the residuals to
### see what's going on.
###
### The Durbin-Watson test checks whether there is significant
### autocorrelation in the residuals.  The test statistic d is bounded
### between 0 and 4.
###
### d = 2  -> residuals show no autocorrelation
### d < 2  -> residuals show positive autocorrelation
### d > 2  -> residuals show negative autocorrelation
dwtest(Value ~ Decimal_Date)
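
### To see where the statistic comes from, here is a hand-rolled sketch of
### d = sum((e[t] - e[t-1])^2) / sum(e[t]^2) applied to simulated data.  The
### ex_* names and both simulated series are assumptions for illustration;
### this does not replace dwtest(), which also reports a p-value.
set.seed(42)
ex_n     <- 500
ex_idx   <- seq_len(ex_n)
ex_white <- rnorm(ex_n)                                              # uncorrelated noise
ex_sticky <- as.numeric(arima.sim(model = list(ar = 0.95), n = ex_n))  # each value follows the last
ex_dw    <- function(e) sum(diff(e)^2) / sum(e^2)
ex_d_white  <- ex_dw(residuals(lm(ex_white ~ ex_idx)))    # expected near 2
ex_d_sticky <- ex_dw(residuals(lm(ex_sticky ~ ex_idx)))   # expected well below 2
c(white = ex_d_white, sticky = ex_d_sticky)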

### The Durbin-Watson test statistic is 0.099033, indicating strong
### positive autocorrelation in the residuals.  This is because
### bond price residuals *aren't* random (which a simple linear
### regression assumes).  Rather, the price of the bond on day N is
### usually strongly correlated with the price on day N-1.  So we need
### to use a more advanced model to capture that autocorrelation.
###
### The ARIMA regression model is designed to do exactly that.  Explaining
### it is well beyond what a basic tutorial can handle, so I'm just going
### to do it, and you can Google more about it if you want.
###
### A generic ARIMA model has three parameters:
###
### p -> the order (number of time lags) of the autoregressive model
### d -> the degree of differencing (number of times past values were subtracted)
### q -> the order of the moving-average model
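
### As a rough illustration of what (p,d,q) = (1,0,0) means: each value is a
### fraction of the previous value plus fresh noise, with no differencing and
### no moving-average term.  This sketch simulates such a series and fits it
### with base R's arima(); the ex_* names and the 0.8 coefficient are
### assumptions chosen for the example.
set.seed(7)
ex_ar1 <- arima.sim(model = list(ar = 0.8), n = 1000)   # true AR coefficient is 0.8
ex_fit <- arima(ex_ar1, order = c(1, 0, 0))             # p = 1, d = 0, q = 0
ex_phi <- coef(ex_fit)[["ar1"]]                         # estimate should land near 0.8
ex_phi

### d = 1 would instead model the step-to-step changes, i.e. the series below.
ex_changes <- diff(ex_ar1, differences = 1)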
###
### As I write this, the data is best modeled by a (1,0,0) model.  Sometimes
### the auto.arima function seems to pick an unnecessarily complex model
### because it lowers the AIC by a microscopic amount compared to a
### simple (1,0,0) model.  This is one of those areas where you need a bit
### of experience so you know when to use auto.arima and when to ignore it.
### Try both "aic" and "bic" in the "ic=" parameter to see which models
### each criterion selects:
### Fit.arima <- auto.arima(Value, xreg=Decimal_Date, stepwise=FALSE, approximation=FALSE, ic="bic")

Fit.arima <- Arima(Value, xreg=Decimal_Date, order=c(1,0,0))
summary(Fit.arima)
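
### A minimal sketch of the AIC-versus-BIC trade-off, using base R's arima()
### on simulated data rather than the FRED series.  The ex_* names and the
### simulated AR(1) process are illustrative assumptions.  BIC charges log(n)
### per parameter where AIC charges 2, so with 500 points it penalizes the
### extra AR term in the larger model more heavily.
set.seed(123)
ex_sim  <- arima.sim(model = list(ar = 0.9), n = 500)   # true process is AR(1)
ex_fit1 <- arima(ex_sim, order = c(1, 0, 0), method = "ML")
ex_fit2 <- arima(ex_sim, order = c(2, 0, 0), method = "ML")

### Information criteria for both candidates (lower is better).
ex_ic <- data.frame(
  model = c("(1,0,0)", "(2,0,0)"),
  aic   = c(AIC(ex_fit1), AIC(ex_fit2)),
  bic   = c(AIC(ex_fit1, k = log(500)), AIC(ex_fit2, k = log(500)))
)
print(ex_ic)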
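
### And the ARIMA model thinks the inversion date is...
### (For a (1,0,0) fit with one regressor, the coefficients are ordered
### ar1, intercept, slope, so we want the 2nd and 3rd.)
InversionDate.arima <- - Fit.arima$coef[2] / Fit.arima$coef[3]
date_decimal(InversionDate.arima)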
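
### A sketch of pulling the intercept and slope out by name instead of by
### position, shown with base R's arima() on simulated data.  The ex_* names,
### the column name "ex_time", and the simulated trend are assumptions made
### for illustration.  Looking coefficients up by name may be less fragile if
### the model order changes, since extra AR or MA terms shift the positions.
set.seed(99)
ex_time  <- seq(0, 4, length.out = 400)
ex_noise <- as.numeric(arima.sim(model = list(ar = 0.8), n = 400, sd = 0.05))
ex_y     <- 1.0 - 0.5 * ex_time + ex_noise          # true zero crossing at ex_time = 2
ex_fit   <- arima(ex_y, order = c(1, 0, 0), xreg = cbind(ex_time = ex_time))
ex_coef  <- coef(ex_fit)
ex_cross <- -ex_coef[["intercept"]] / ex_coef[["ex_time"]]
ex_cross   # should land close to 2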

### So the more advanced model predicts the yield curve will invert a few days
### later than the simple linear regression (03/25/19 vs 03/19/19).

### Now, let's check the residuals of the ARIMA fit against the residuals from
### our original LM fit...
plot(residuals(Fit.lm), type="l", col="red")
lines(residuals(Fit.arima), type="l", col="blue")

### Our residuals are now smaller, and look much more random.
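
### One way to put a number on "more random" is the lag-1 autocorrelation of
### each set of residuals.  This sketch uses simulated data and base R's
### arima(); the ex_* names and the simulated series are assumptions for
### illustration only.  Values near zero suggest little leftover day-to-day
### correlation.
set.seed(2019)
ex_time  <- seq(0, 2, length.out = 500)
ex_y     <- 1.2 - 0.5 * ex_time +
  as.numeric(arima.sim(model = list(ar = 0.9), n = 500, sd = 0.03))
ex_lm    <- lm(ex_y ~ ex_time)
ex_arima <- arima(ex_y, order = c(1, 0, 0), xreg = cbind(ex_time = ex_time))

ex_lag1    <- function(e) acf(as.numeric(e), lag.max = 1, plot = FALSE)$acf[2]
ex_r_lm    <- ex_lag1(residuals(ex_lm))      # expected to be strongly positive
ex_r_arima <- ex_lag1(residuals(ex_arima))   # expected to be close to zero
c(lm = ex_r_lm, arima = ex_r_arima)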