Skip to contents

Compute prediction intervals and other information by applying the conformal PID control method.

Usage

pid(
  object,
  alpha = 1 - 0.01 * object$level,
  symmetric = FALSE,
  ncal = 10,
  rolling = FALSE,
  integrate = TRUE,
  scorecast = !symmetric,
  scorecastfun = NULL,
  lr = 0.1,
  Tg = NULL,
  delta = NULL,
  Csat = 2/pi * (ceiling(log(Tg) * delta) - 1/log(Tg)),
  KI = max(abs(object$errors), na.rm = TRUE),
  ...
)

Arguments

object

An object of class "cvforecast". It must have an argument x for original univariate time series, an argument MEAN for point forecasts and ERROR for forecast errors on validation set. See the results of a call to cvforecast.

alpha

A numeric vector of significance levels to achieve a desired coverage level \(1-\alpha\).

symmetric

If TRUE, symmetric nonconformity scores (i.e. \(|e_{t+h|t}|\)) are used. If FALSE, asymmetric nonconformity scores (i.e. \(e_{t+h|t}\)) are used, and then upper bounds and lower bounds are produced separately.

ncal

Length of the burn-in period for training the scorecaster. If rolling = TRUE, it is also used as the length of the trailing windows for learning rate calculation and the windows for the calibration set. If rolling = FALSE, it is used as initial period of calibration sets and trailing windows for learning rate calculation.

rolling

If TRUE, a rolling window strategy will be adopted to form the trailing window for learning rate calculation and the calibration set for scorecaster if applicable. Otherwise, expanding window strategy will be used.

integrate

If TRUE, error integration will be included in the update process.

scorecast

If TRUE, scorecasting will be included in the update process, and scorecastfun should be given.

scorecastfun

A scorecaster function to return an object of class forecast. Its first argument must be a univariate time series, and it must have an argument h for the forecast horizon.

lr

Initial learning rate used for quantile tracking.

Tg

The time that is set to achieve the target absolute coverage guarantee before this.

delta

The target absolute coverage guarantee is set to \(1-\alpha-\delta\).

Csat

A positive constant ensuring that by time Tg, an absolute guarantee is of at least \(1-\alpha-\delta\) coverage.

KI

A positive constant to place the integrator on the same scale as the scores.

...

Other arguments are passed to the scorecastfun function.

Value

A list of class c("pid", "cpforecast", "forecast") with the following components:

x

The original time series.

series

The name of the series x.

method

A character string "pid".

cp_times

The number of times the conformal prediction is performed in cross-validation.

MEAN

Point forecasts as a multivariate time series, where the \(h\)th column holds the point forecasts for forecast horizon \(h\). The time index corresponds to the period for which the forecast is produced.

ERROR

Forecast errors given by \(e_{t+h|t} = y_{t+h}-\hat{y}_{t+h|t}\).

LOWER

A list containing lower bounds for prediction intervals for each level. Each element within the list will be a multivariate time series with the same dimensional characteristics as MEAN.

UPPER

A list containing upper bounds for prediction intervals for each level. Each element within the list will be a multivariate time series with the same dimensional characteristics as MEAN.

level

The confidence values associated with the prediction intervals.

call

The matched call.

model

A list containing information abouth the conformal prediction model.

If mean is included in the object, the components mean, lower, and upper will also be returned, showing the information about the forecasts generated using all available observations.

Details

The PID method combines three modules to make the final iteration: $$q_{t+h|t}=\underbrace{q_{t+h-1|t-1} + \eta(\mathrm{err}_{t|t-h}-\alpha)}_{\mathrm{P}}+\underbrace{r_t\left(\sum_{i=1}^t\left(\mathrm{err}_{i|i-h}-\alpha\right)\right)}_{\mathrm{I}}+\underbrace{\hat{s}_{t+h|t}}_{\mathrm{D}}$$ for each individual forecast horizon h, respectively, where

  • Quantile tracking part (P) is \(q_{t+h-1|t-1} + \eta(\mathrm{err}_{t|t-h}-\alpha)\), where \(q_{1+h|1}\) is set to 0 without a loss of generality, \(\mathrm{err}_{t|t-h}=1\) if \(s_{t|t-h}>q_{t|t-h}\), and \(\mathrm{err}_{t|t-h}=0\) if \(s_{t|t-h} \leq q_{t|t-h}\).

  • Error integration part (I) is \(r_t\left(\sum_{i=1}^t\left(\mathrm{err}_{i|i-h}-\alpha\right)\right)\). Here we use a nonlinear saturation function \(r_t(x)=K_{\mathrm{I}} \tan \left(x \log (t) /\left(t C_{\text {sat }}\right)\right)\), where we set \(\tan (x)=\operatorname{sign}(x) \cdot \infty\) for \(x \notin[-\pi / 2, \pi / 2]\), and \(C_{\text {sat }}, K_{\mathrm{I}}>0\) are constants that we choose heuristically.

  • Scorecasting part (D) is \(\hat{s}_{t+h|t}\) is forecast generated by training a scorecaster based on nonconformity scores available at time \(t\).

References

Angelopoulos, A., Candes, E., and Tibshirani, R. J. (2024). "Conformal PID control for time series prediction", Advances in Neural Information Processing Systems, 36, 23047–23074.

Examples

# Simulate time series from an AR(2) model
library(forecast)
series <- arima.sim(n = 1000, list(ar = c(0.8, -0.5)), sd = sqrt(1))

# Cross-validation forecasting
far2 <- function(x, h, level) {
  Arima(x, order = c(2, 0, 0)) |>
    forecast(h = h, level)
}
fc <- cvforecast(series, forecastfun = far2, h = 3, level = c(80, 95),
                 forward = TRUE, initial = 1, window = 100)

# PID setup
Tg <- 1000; delta <- 0.01
Csat <- 2 / pi * (ceiling(log(Tg) * delta) - 1 / log(Tg))
KI <- 2
lr <- 0.1

# PID without scorecaster
pidfc_nsf <- pid(fc, symmetric = FALSE, ncal = 100, rolling = TRUE,
                 integrate = TRUE, scorecast = FALSE,
                 lr = lr, KI = KI, Csat = Csat)
print(pidfc_nsf)
#> PID 
#> 
#> Call:
#>  pid(object = fc, symmetric = FALSE, ncal = 100, rolling = TRUE,  
#>      integrate = TRUE, scorecast = FALSE, lr = lr, Csat = Csat,  
#>      KI = KI) 
#> 
#>  cp_times = 898 (the forward step included) 
#> 
#> Forecasts of the forward step:
#>      Point Forecast      Lo 80     Hi 80     Lo 95    Hi 95
#> 1001    0.264457085 -0.8488525 0.9538264 -1.496723 2.524613
#> 1002    0.232331391 -1.0526208 1.9153472 -2.058756 3.493363
#> 1003   -0.003700809 -1.6744563 1.3092852 -2.192779 2.643844
summary(pidfc_nsf)
#> PID 
#> 
#> Call:
#>  pid(object = fc, symmetric = FALSE, ncal = 100, rolling = TRUE,  
#>      integrate = TRUE, scorecast = FALSE, lr = lr, Csat = Csat,  
#>      KI = KI) 
#> 
#>  cp_times = 898 (the forward step included) 
#> 
#> Forecasts of the forward step:
#>      Point Forecast      Lo 80     Hi 80     Lo 95    Hi 95
#> 1001    0.264457085 -0.8488525 0.9538264 -1.496723 2.524613
#> 1002    0.232331391 -1.0526208 1.9153472 -2.058756 3.493363
#> 1003   -0.003700809 -1.6744563 1.3092852 -2.192779 2.643844
#> 
#> Cross-validation error measures:
#>       ME   MAE   MSE  RMSE     MPE    MAPE  MASE RMSSE Winkler_95 MSIS_95
#> CV 0.014 0.961 1.444 1.079 182.346 263.405 0.922 0.814      6.053   5.898

# PID with a Naive model for the scorecaster
naivefun <- function(x, h) {
  naive(x) |> forecast(h = h)
}
pidfc <- pid(fc, symmetric = FALSE, ncal = 100, rolling = TRUE,
             integrate = TRUE, scorecast = TRUE, scorecastfun = naivefun,
             lr = lr, KI = KI, Csat = Csat)