Compute prediction intervals and other information by applying the conformal PID control method.
Arguments
- object
An object of class
"cvforecast"
. It must have an argumentx
for original univariate time series, an argumentMEAN
for point forecasts andERROR
for forecast errors on validation set. See the results of a call tocvforecast
.- alpha
A numeric vector of significance levels to achieve a desired coverage level \(1-\alpha\).
- symmetric
If
TRUE
, symmetric nonconformity scores (i.e. \(|e_{t+h|t}|\)) are used. IfFALSE
, asymmetric nonconformity scores (i.e. \(e_{t+h|t}\)) are used, and then upper bounds and lower bounds are produced separately.- ncal
Length of the burn-in period for training the scorecaster. If
rolling = TRUE
, it is also used as the length of the trailing windows for learning rate calculation and the windows for the calibration set. Ifrolling = FALSE
, it is used as initial period of calibration sets and trailing windows for learning rate calculation.- rolling
If
TRUE
, a rolling window strategy will be adopted to form the trailing window for learning rate calculation and the calibration set for scorecaster if applicable. Otherwise, expanding window strategy will be used.- integrate
If
TRUE
, error integration will be included in the update process.- scorecast
If
TRUE
, scorecasting will be included in the update process, andscorecastfun
should be given.- scorecastfun
A scorecaster function to return an object of class
forecast
. Its first argument must be a univariate time series, and it must have an argumenth
for the forecast horizon.- lr
Initial learning rate used for quantile tracking.
- Tg
The time that is set to achieve the target absolute coverage guarantee before this.
- delta
The target absolute coverage guarantee is set to \(1-\alpha-\delta\).
- Csat
A positive constant ensuring that by time
Tg
, an absolute guarantee is of at least \(1-\alpha-\delta\) coverage.- KI
A positive constant to place the integrator on the same scale as the scores.
- ...
Other arguments are passed to the
scorecastfun
function.
Value
A list of class c("pid", "cpforecast", "forecast")
with the following components:
- x
The original time series.
- series
The name of the series
x
.- method
A character string "pid".
- cp_times
The number of times the conformal prediction is performed in cross-validation.
- MEAN
Point forecasts as a multivariate time series, where the \(h\)th column holds the point forecasts for forecast horizon \(h\). The time index corresponds to the period for which the forecast is produced.
- ERROR
Forecast errors given by \(e_{t+h|t} = y_{t+h}-\hat{y}_{t+h|t}\).
- LOWER
A list containing lower bounds for prediction intervals for each
level
. Each element within the list will be a multivariate time series with the same dimensional characteristics asMEAN
.- UPPER
A list containing upper bounds for prediction intervals for each
level
. Each element within the list will be a multivariate time series with the same dimensional characteristics asMEAN
.- level
The confidence values associated with the prediction intervals.
- call
The matched call.
- model
A list containing information abouth the conformal prediction model.
If mean
is included in the object
, the components mean
,
lower
, and upper
will also be returned, showing the information
about the forecasts generated using all available observations.
Details
The PID method combines three modules to make the final iteration:
$$q_{t+h|t}=\underbrace{q_{t+h-1|t-1} + \eta(\mathrm{err}_{t|t-h}-\alpha)}_{\mathrm{P}}+\underbrace{r_t\left(\sum_{i=1}^t\left(\mathrm{err}_{i|i-h}-\alpha\right)\right)}_{\mathrm{I}}+\underbrace{\hat{s}_{t+h|t}}_{\mathrm{D}}$$
for each individual forecast horizon h
, respectively, where
Quantile tracking part (P) is \(q_{t+h-1|t-1} + \eta(\mathrm{err}_{t|t-h}-\alpha)\), where \(q_{1+h|1}\) is set to 0 without a loss of generality, \(\mathrm{err}_{t|t-h}=1\) if \(s_{t|t-h}>q_{t|t-h}\), and \(\mathrm{err}_{t|t-h}=0\) if \(s_{t|t-h} \leq q_{t|t-h}\).
Error integration part (I) is \(r_t\left(\sum_{i=1}^t\left(\mathrm{err}_{i|i-h}-\alpha\right)\right)\). Here we use a nonlinear saturation function \(r_t(x)=K_{\mathrm{I}} \tan \left(x \log (t) /\left(t C_{\text {sat }}\right)\right)\), where we set \(\tan (x)=\operatorname{sign}(x) \cdot \infty\) for \(x \notin[-\pi / 2, \pi / 2]\), and \(C_{\text {sat }}, K_{\mathrm{I}}>0\) are constants that we choose heuristically.
Scorecasting part (D) is \(\hat{s}_{t+h|t}\) is forecast generated by training a scorecaster based on nonconformity scores available at time \(t\).
References
Angelopoulos, A., Candes, E., and Tibshirani, R. J. (2024). "Conformal PID control for time series prediction", Advances in Neural Information Processing Systems, 36, 23047–23074.
Examples
# Simulate time series from an AR(2) model
library(forecast)
series <- arima.sim(n = 1000, list(ar = c(0.8, -0.5)), sd = sqrt(1))
# Cross-validation forecasting
far2 <- function(x, h, level) {
Arima(x, order = c(2, 0, 0)) |>
forecast(h = h, level)
}
fc <- cvforecast(series, forecastfun = far2, h = 3, level = c(80, 95),
forward = TRUE, initial = 1, window = 100)
# PID setup
Tg <- 1000; delta <- 0.01
Csat <- 2 / pi * (ceiling(log(Tg) * delta) - 1 / log(Tg))
KI <- 2
lr <- 0.1
# PID without scorecaster
pidfc_nsf <- pid(fc, symmetric = FALSE, ncal = 100, rolling = TRUE,
integrate = TRUE, scorecast = FALSE,
lr = lr, KI = KI, Csat = Csat)
print(pidfc_nsf)
#> PID
#>
#> Call:
#> pid(object = fc, symmetric = FALSE, ncal = 100, rolling = TRUE,
#> integrate = TRUE, scorecast = FALSE, lr = lr, Csat = Csat,
#> KI = KI)
#>
#> cp_times = 898 (the forward step included)
#>
#> Forecasts of the forward step:
#> Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
#> 1001 0.264457085 -0.8488525 0.9538264 -1.496723 2.524613
#> 1002 0.232331391 -1.0526208 1.9153472 -2.058756 3.493363
#> 1003 -0.003700809 -1.6744563 1.3092852 -2.192779 2.643844
summary(pidfc_nsf)
#> PID
#>
#> Call:
#> pid(object = fc, symmetric = FALSE, ncal = 100, rolling = TRUE,
#> integrate = TRUE, scorecast = FALSE, lr = lr, Csat = Csat,
#> KI = KI)
#>
#> cp_times = 898 (the forward step included)
#>
#> Forecasts of the forward step:
#> Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
#> 1001 0.264457085 -0.8488525 0.9538264 -1.496723 2.524613
#> 1002 0.232331391 -1.0526208 1.9153472 -2.058756 3.493363
#> 1003 -0.003700809 -1.6744563 1.3092852 -2.192779 2.643844
#>
#> Cross-validation error measures:
#> ME MAE MSE RMSE MPE MAPE MASE RMSSE Winkler_95 MSIS_95
#> CV 0.014 0.961 1.444 1.079 182.346 263.405 0.922 0.814 6.053 5.898
# PID with a Naive model for the scorecaster
naivefun <- function(x, h) {
naive(x) |> forecast(h = h)
}
pidfc <- pid(fc, symmetric = FALSE, ncal = 100, rolling = TRUE,
integrate = TRUE, scorecast = TRUE, scorecastfun = naivefun,
lr = lr, KI = KI, Csat = Csat)