Advanced Sampling Methods for Solving Large-Scale Inverse Problems

Ahmed Attia
Seminar

Ensemble and variational techniques have gained wide popularity as the two main approaches for solving data assimilation (DA) and inverse problems. Most methods in these two approaches are derived (at least implicitly) under the assumption that the underlying probability distributions are Gaussian. It is widely accepted, however, that the Gaussianity assumption is too restrictive for large nonlinear models, nonlinear observation operators, and high levels of uncertainty.

In this talk, I will present a family of fully non-Gaussian ensemble-based DA algorithms that work by directly sampling the posterior distribution. The sampling strategy is based on a Hybrid/Hamiltonian Monte Carlo (HMC) approach that can handle non-normal probability distributions.
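To make the sampling idea concrete, here is a minimal sketch of a generic HMC sampler with a leapfrog integrator, written in plain Python. It is not the speaker's implementation: the one-dimensional target p(x) ∝ exp(-x^4/4), the unit mass, and the names `potential`, `grad_potential`, `leapfrog`, and `hmc_sample` are all assumptions chosen for illustration. In a DA setting the potential would be the negative log-posterior, and its gradient would typically require the adjoint model.

```python
import math
import random


def potential(x):
    # Negative log-density (up to a constant) of an illustrative
    # non-Gaussian target, p(x) proportional to exp(-x^4 / 4).
    return 0.25 * x ** 4


def grad_potential(x):
    # Derivative of the potential above.
    return x ** 3


def leapfrog(x, p, step, n_steps):
    # Leapfrog integration of Hamiltonian dynamics with unit mass:
    # half step in momentum, alternating full steps, final half step.
    p -= 0.5 * step * grad_potential(x)
    for i in range(n_steps):
        x += step * p
        if i < n_steps - 1:
            p -= step * grad_potential(x)
    p -= 0.5 * step * grad_potential(x)
    return x, p


def hmc_sample(n_samples, x0=0.0, step=0.1, n_steps=20, seed=0):
    # Returns the chain of samples and the observed acceptance rate.
    rng = random.Random(seed)
    x = x0
    samples = []
    accepted = 0
    for _ in range(n_samples):
        p0 = rng.gauss(0.0, 1.0)  # draw a fresh momentum
        h0 = potential(x) + 0.5 * p0 * p0
        x_new, p_new = leapfrog(x, p0, step, n_steps)
        h1 = potential(x_new) + 0.5 * p_new * p_new
        # Metropolis accept/reject step corrects the integration error.
        if rng.random() < math.exp(min(0.0, h0 - h1)):
            x = x_new
            accepted += 1
        samples.append(x)
    return samples, accepted / n_samples


if __name__ == "__main__":
    chain, rate = hmc_sample(2000)
    print("acceptance rate:", rate)
    print("sample mean:", sum(chain) / len(chain))
```

The step size and the number of leapfrog steps are illustrative values, not tuned settings. In practice they trade the cost of gradient evaluations against the acceptance rate.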

We start with the “HMC sampling filter”, an ensemble-based DA algorithm for solving the sequential filtering problem. Next, the HMC sampling approach is extended to the four-dimensional “smoothing” case, where several observations are assimilated simultaneously.

The HMC sampling smoother, in its original formulation, is computationally expensive because it requires running the forward and adjoint models repeatedly. Computationally efficient versions of the HMC sampling smoother, based on reduced-order approximations of the underlying model dynamics, are also discussed.

In the presence of nonlinear model dynamics, a nonlinear observation operator, or non-Gaussian errors, the prior distribution in the sequential DA framework is not analytically tractable. The Gaussian prior assumption of the original HMC filter is therefore relaxed. Specifically, a clustering step is introduced after the forecast phase of the filter, and the prior density function is estimated by fitting a Gaussian Mixture Model (GMM) to the prior ensemble. The resulting “cluster sampling filters” (ClHMC and MC-ClHMC) are designed to accommodate non-Gaussian priors and to guarantee that samples are drawn from the vicinity of every probability mode of the formulated posterior.
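The GMM-fitting step can be illustrated with a small expectation-maximization (EM) sketch in plain Python. This is an assumption-laden toy, not the ClHMC algorithm itself: the ensemble is scalar, the number of components is fixed in advance, and the names `normal_pdf` and `fit_gmm_1d` are hypothetical. How the actual filters choose the number of components or handle high-dimensional states is not covered here.

```python
import math
import random


def normal_pdf(x, mean, var):
    # Density of a univariate normal distribution.
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(ensemble, n_components=2, n_iter=100):
    # Fit a 1D Gaussian mixture to the ensemble members with EM.
    data = sorted(ensemble)
    n = len(data)
    # Initialize the means at evenly spaced quantiles of the ensemble.
    means = [data[(2 * k + 1) * n // (2 * n_components)] for k in range(n_components)]
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n
    variances = [overall_var] * n_components
    weights = [1.0 / n_components] * n_components
    for _ in range(n_iter):
        # E-step: responsibility of each component for each member.
        resp = []
        for x in data:
            dens = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: update weights, means, and variances.
        for k in range(n_components):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var_k = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var_k, 1e-6)  # floor to avoid collapse
    return weights, means, variances


if __name__ == "__main__":
    rng = random.Random(1)
    # Synthetic bimodal "prior ensemble" (illustrative data only).
    prior = [rng.gauss(-2.0, 0.5) for _ in range(150)]
    prior += [rng.gauss(3.0, 0.7) for _ in range(150)]
    print(fit_gmm_1d(prior))
```

The fitted mixture then plays the role of the prior density in the posterior that the sampler targets. The quantile-based initialization is a simple choice that works for well-separated modes and is not claimed to be what the filters use.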