Gaussian Processes
Introduction
For a given set of training points, there are potentially infinitely many functions that fit the data. Gaussian processes offer an elegant solution to this problem by assigning a probability to each of these functions. The mean of this probability distribution then represents the most probable characterization of the data. Furthermore, the probabilistic approach lets us incorporate the confidence of the prediction into the regression result. Gaussian processes also let us bring prior knowledge into our predictions about the data. (Source: visual exploration.)
To set up the distribution, we need to define the mean $\mu$ and the covariance matrix $\Sigma$. In Gaussian processes it is often assumed that $\mu = 0$, which simplifies the equations needed for conditioning; this assumption is easily satisfied by centering the data. The most important parameter is therefore $\Sigma$, which determines the characteristics of a Gaussian process.
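A minimal sketch of conditioning under the zero-mean assumption, written in Python with NumPy. The RBF kernel, the `noise` jitter, the function names `rbf_kernel` and `gp_posterior`, and the toy data are all assumptions chosen for illustration; they do not come from the text above. The targets are centered before conditioning and the mean is added back afterwards.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential (RBF) covariance between two 1-D point sets."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / length_scale**2)

def gp_posterior(x_train, y_train, x_test, noise=1e-6):
    """Condition a zero-mean GP on training data.

    Returns the posterior mean and covariance at x_test.
    """
    # Center the targets so the zero-mean assumption holds.
    y_mean = y_train.mean()
    y_centered = y_train - y_mean

    # Covariance blocks; `noise` is a small jitter for numerical stability.
    k_xx = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_xs = rbf_kernel(x_train, x_test)
    k_ss = rbf_kernel(x_test, x_test)

    # Solve with a Cholesky factor instead of forming an explicit inverse.
    chol = np.linalg.cholesky(k_xx)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_centered))
    v = np.linalg.solve(chol, k_xs)

    mean = k_xs.T @ alpha + y_mean   # undo the centering
    cov = k_ss - v.T @ v
    return mean, cov

# Toy data (assumed): a sine curve with a constant offset.
x_train = np.array([-2.0, -1.0, 0.0, 1.5])
y_train = np.sin(x_train) + 3.0
x_test = np.array([-1.0, 0.5, 4.0])

mean, cov = gp_posterior(x_train, y_train, x_test)
std = np.sqrt(np.clip(np.diag(cov), 0.0, None))
```

In this sketch the posterior standard deviation `std` is close to zero at a test point that coincides with a training point and grows for test points far from the data, which is the confidence information mentioned in the introduction.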
The covariance matrix is generated by evaluating the kernel function, often also called the covariance function, pairwise on all the points. The choice of kernel is based on our prior beliefs about the data. Kernels can be separated into stationary and non-stationary kernels. Stationary kernels, such as the RBF kernel or the periodic kernel, are invariant to translations: the covariance of two points depends only on their relative position. Non-stationary kernels, such as the linear kernel, do not have this constraint and depend on absolute location. Most importantly, kernels can be combined by addition or multiplication to form a new kernel, which is a powerful property for modeling complex data.
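The distinction between stationary and non-stationary kernels, and the combination of kernels, can be sketched in Python with NumPy as follows. The function names, hyperparameter values, and input grid are assumptions made for illustration only.

```python
import numpy as np

def rbf(a, b, length_scale=1.0):
    """RBF kernel: depends only on the difference a - b (stationary)."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length_scale**2)

def periodic(a, b, period=1.0, length_scale=1.0):
    """Periodic kernel: also depends only on a - b (stationary)."""
    d = np.abs(a[:, None] - b[None, :])
    return np.exp(-2.0 * np.sin(np.pi * d / period) ** 2 / length_scale**2)

def linear(a, b, offset=0.0):
    """Linear kernel: depends on absolute location (non-stationary)."""
    return (a[:, None] - offset) * (b[None, :] - offset)

x = np.linspace(0.0, 3.0, 7)
shift = 5.0

# Translating all inputs leaves a stationary covariance matrix unchanged.
rbf_shift_invariant = np.allclose(rbf(x, x), rbf(x + shift, x + shift))
periodic_shift_invariant = np.allclose(
    periodic(x, x), periodic(x + shift, x + shift)
)
linear_shift_invariant = np.allclose(linear(x, x), linear(x + shift, x + shift))

# Sums and elementwise products of kernels are again valid kernels.
k_sum = linear(x, x) + periodic(x, x)                   # trend plus seasonality
k_prod = rbf(x, x, length_scale=2.0) * periodic(x, x)   # locally periodic

# A valid covariance matrix has no significantly negative eigenvalues.
min_eig_sum = np.linalg.eigvalsh(k_sum).min()
min_eig_prod = np.linalg.eigvalsh(k_prod).min()
```

Here the two stationary kernels pass the shift check and the linear kernel does not, while both combined matrices remain positive semi-definite up to rounding error.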
Problems
GPs are generally a go-to approach for non-linear time series, with the reservation that a GP is not a mechanistic model. It is therefore poorly suited to replacing compartmental models such as SIR, SEIR, etc. A GP model might be useful for very short-term predictions, but since it does not account for changes in behavior (including specific interventions), it has limited use beyond that. That said, it might be interesting to incorporate a GP into a mechanistic model as a way of estimating some of the latent parameters and their dynamics 1.
However, there seems to be a solution: "Stationarity without mean reversion in improper Gaussian processes".
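One way to see the short-horizon limitation is to watch a forecast revert to the prior mean as the horizon grows. The sketch below assumes an RBF kernel, a centered (zero prior mean) GP, and a made-up linear trend as data; the variable names and hyperparameters are illustrative, not taken from any of the linked sources.

```python
import numpy as np

def rbf(a, b, length_scale=1.0):
    """RBF kernel between two 1-D point sets."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length_scale**2)

# Assumed toy series: a steadily increasing trend observed at t = 0..9.
t_train = np.arange(0.0, 10.0)
y_train = 0.5 * t_train
y_mean = y_train.mean()

# Fit weights for the centered targets; 1e-4 is a small jitter term.
length_scale = 2.0
k = rbf(t_train, t_train, length_scale) + 1e-4 * np.eye(len(t_train))
weights = np.linalg.solve(k, y_train - y_mean)

# Posterior mean at increasing forecast horizons past the last observation.
horizons = np.array([0.5, 2.0, 5.0, 20.0])
t_future = t_train[-1] + horizons
forecast = rbf(t_future, t_train, length_scale) @ weights + y_mean

# Distance between the forecast and the prior mean at each horizon.
gap_to_prior_mean = np.abs(forecast - y_mean)
```

Under these assumptions, the forecast at the longest horizon is essentially the training mean, even though the observed series was still rising, because the covariance with every training point has decayed to nearly zero.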
Examples
- Forecasting of CO2 level on Mauna Loa using Gaussian Process regression (see also: another blog on CO2, pymc example, pymc improved version, pymc discussion)
- An Introduction to Gaussian Process Regression
- PyData Berlin 2019: Gaussian Processes for Time Series Forecasting (scikit-learn)
- Gaussian Processes for Time Series Forecasting with PyMC3
- Time Series Modeling with HSGP: Baby Births Example