Rates of Convergence for Sparse Variational Inference in Gaussian Process Regression


Sparse variational inference in Gaussian process regression has theoretical guarantees that make it robust to overfitting. Additionally, it is well known that in the non-sparse regime, with $M \geq N$ inducing variables, full inference can be recovered. In this paper, we derive bounds on the KL divergence between the true posterior and a sparse variational approximation that show convergence for $M \asymp \log(N)$ inducing features with the squared exponential kernel. We additionally show that these bounds are sharp in a certain sense.
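The KL divergence studied here can be computed exactly in small examples: for Gaussian-noise regression with the optimal (collapsed) variational distribution over inducing variables, the KL between the approximate and true posteriors equals the gap between the exact log marginal likelihood and Titsias's collapsed evidence lower bound. The sketch below (an illustration, not code from the paper; all names and the toy data are my own) computes that gap for a squared exponential kernel and a growing number of evenly spaced inducing points:

```python
import numpy as np

def se_kernel(A, B, lengthscale=1.0, variance=1.0):
    # Squared exponential kernel: k(a, b) = variance * exp(-(a - b)^2 / (2 * lengthscale^2))
    d2 = (A[:, None] - B[None, :]) ** 2
    return variance * np.exp(-0.5 * d2 / lengthscale ** 2)

def gaussian_logpdf(y, K, noise):
    # log N(y; 0, K + noise * I) via a Cholesky factorization
    N = len(y)
    L = np.linalg.cholesky(K + noise * np.eye(N))
    alpha = np.linalg.solve(L, y)
    return -0.5 * alpha @ alpha - np.log(np.diag(L)).sum() - 0.5 * N * np.log(2 * np.pi)

def collapsed_elbo(y, X, Z, noise):
    # Titsias's collapsed bound:
    #   log N(y; 0, Qff + noise * I) - trace(Kff - Qff) / (2 * noise),
    # where Qff = Kfu Kuu^{-1} Kuf is the Nystrom approximation of Kff.
    Kuu = se_kernel(Z, Z) + 1e-8 * np.eye(len(Z))  # jitter for stability
    Kfu = se_kernel(X, Z)
    Qff = Kfu @ np.linalg.solve(Kuu, Kfu.T)
    Kff = se_kernel(X, X)
    return gaussian_logpdf(y, Qff, noise) - np.trace(Kff - Qff) / (2 * noise)

# Toy regression problem (illustrative data, not from the paper).
rng = np.random.default_rng(0)
N = 50
X = np.sort(rng.uniform(-3.0, 3.0, N))
y = np.sin(X) + 0.1 * rng.standard_normal(N)
noise = 0.01

lml = gaussian_logpdf(y, se_kernel(X, X), noise)  # exact log marginal likelihood

# KL[Q || P] = lml - ELBO for the optimal q(u); it shrinks as M grows.
gaps = []
for M in [2, 5, 10, 20]:
    Z = np.linspace(X.min(), X.max(), M)
    gaps.append(lml - collapsed_elbo(y, X, Z, noise))
```

On this smooth one-dimensional example the gap is nonnegative (the ELBO is a lower bound) and collapses rapidly with $M$, consistent with the abstract's claim that a number of inducing features growing only logarithmically in $N$ suffices for the squared exponential kernel.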

Proceedings of the 36th International Conference on Machine Learning (ICML 2019)
Best paper award at ICML 2019.
Previously presented at the first Symposium on Advances in Approximate Bayesian Inference (Dec 2018).