# HybridVariationalInference (HVI)

[Stable documentation](https://EarthyScience.github.io/HybridVariationalInference.jl/stable/)
[Dev documentation](https://EarthyScience.github.io/HybridVariationalInference.jl/dev/)
[Build status](https://github.com/EarthyScience/HybridVariationalInference.jl/actions/workflows/CI.yml?query=branch%3Amain)
[Coverage](https://codecov.io/gh/EarthyScience/HybridVariationalInference.jl)
[Aqua QA](https://github.com/JuliaTesting/Aqua.jl)

Estimating uncertainty in hybrid models,
i.e. models that combine mechanistic and machine-learning parts,
by extending Variational Inference (VI), an approximate Bayesian inversion method.

## Problem

Consider the case of parameter learning, a special case of hybrid models,
where a machine learning model, $g_{\phi_g}$, uses known covariates, $X_{Mi}$, at site $i$
to predict a subset of the parameters, $\theta$, of the process-based model, $f$.

The analyst is interested in both
- the uncertainty of the hybrid model predictions, $\hat{y}$ (predictive posterior), and
- the uncertainty of the process-model parameters, $\theta$, including their correlations
  (posterior).

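Using the notation introduced below (site parameters $\theta_{Mi}$ predicted by the machine learning model, global parameters $\theta_P$, and process-model drivers $X_{Pi}$), the setup can be sketched as

```math
\theta_{Mi} = g_{\phi_g}(X_{Mi}), \qquad
\hat{y}_i = f(\theta_{Mi}, \theta_P, X_{Pi}),
```

and the quantities of interest are the parameter posterior, $p(\theta \mid y)$, and the predictive posterior of $\hat{y}$.
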
For example, consider a soil organic matter process-model that predicts carbon stocks at
different sites. We need to parameterize the unknown carbon use efficiency (CUE) of the soil
microbial community, which differs by site but is hypothesized to correlate with climate variables
and pedogenic factors, such as clay content.
We apply a machine learning model to estimate CUE and fit it end-to-end, together with the other
parameters of the process-model, to observed carbon stocks.
In addition to the predicted CUE, we are interested in its uncertainty and its correlation
with other parameters, i.e. in the entire posterior probability distribution of the model parameters.

To understand the background of HVI, refer to the [documentation](https://EarthyScience.github.io/HybridVariationalInference.jl/dev/).

## Usage

In order to apply HVI, the user constructs a `HybridProblem` object by specifying the following
components (illustrated in the sketch after this list):
- the machine learning model, $g$
- covariates, $X_{Mi}$, for each site, $i$
- the names of parameters that differ across sites, $\theta_M$, and of global parameters
  that are the same across sites, $\theta_P$
  - optionally, sub-blocks in the within-site correlation structure of the parameters
  - optionally, which global parameters should be provided to $g$ as additional covariates,
    to account for correlations between global and site parameters
- the parameter transformations from unconstrained scale to the scale relevant to the process model, $\theta = T(\zeta)$, e.g. `exp` for strictly positive parameters
- the process-model, $f$
- the drivers of the process-model, $X_{Pi}$, at each site, $i$
- the likelihood function of the observations given the model predictions, $p(y \mid \hat{y}, \theta)$

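As an illustration only, the user-supplied pieces might look like the following plain-Julia sketch
for a toy model with one site parameter and one global parameter. All names (`f_proc`, `g_ml`,
`transform`, `neg_logden`) and the linear machine learning model are hypothetical and not part of
the package API; the exact `HybridProblem` constructor arguments are given in the documentation
and in `test/test_HybridProblem.jl`.

```julia
# Illustrative sketch only: names and signatures are assumptions, not the package API.
using ComponentArrays

# Process model f: predicts observations at one site from parameters θ and drivers x_P.
# θ.r0 is the site parameter (predicted by the ML model), θ.K a global parameter.
f_proc(θ, x_P) = θ.r0 .* x_P ./ (θ.K .+ x_P)   # a saturating response, as an example

# Machine learning model g: maps site covariates x_M to the site parameter(s).
# Here a minimal linear model with weights ϕ_g; in practice a small neural network.
g_ml(ϕ_g, x_M) = ϕ_g.W * x_M .+ ϕ_g.b

# Transformation from unconstrained scale ζ to the process-model scale θ:
# strictly positive parameters are estimated on log-scale.
transform(ζ) = exp.(ζ)

# Negative log-density of observations y given predictions ŷ (independent Gaussian errors).
neg_logden(y, ŷ, σ) = sum(abs2, (y .- ŷ) ./ σ) / 2

# Quick check of the toy process model
θ   = ComponentVector(r0 = 0.5, K = 2.0)
x_P = [0.1, 1.0, 5.0]
ŷ   = f_proc(θ, x_P)

# These pieces, together with the covariates, parameter names, and transformations,
# are bundled into a HybridProblem; see test/test_HybridProblem.jl for the actual
# constructor arguments.
```
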
This problem is then passed to a `HybridPosteriorSolver`, which fits an approximation
of the posterior. It returns a NamedTuple with
- `ϕ`: the fitted parameters, a ComponentVector with components for
  - the machine learning model parameters (usually weights), $\phi_g$
  - the means of the global parameters, $\phi_P = \mu_{\zeta_P}$, at the transformed,
    unconstrained scale
  - additional parameters, $\phi_{unc}$, of the approximate posterior, $q(\zeta)$, such as
    coefficients that describe the scaling of the variance with magnitude
    and coefficients that parameterize the Cholesky factor of the correlation matrix
- `θP`: predicted means of the global parameters, $\theta_P$
- `resopt`: the original result object of the optimizer (useful for debugging)

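A minimal sketch of fitting and inspecting the result, assuming a CommonSolve-style
`solve(problem, solver)` entry point and a default `HybridPosteriorSolver()` construction
(both are assumptions; consult the documentation and tests for the actual calls):

```julia
# Sketch only: the entry point and solver construction below are assumptions,
# not the verified package API.
solver = HybridPosteriorSolver()   # hypothetical default construction
res = solve(prob, solver)          # `prob` is the HybridProblem built as described above

res.ϕ       # fitted ϕ_g (ML weights), global means, and uncertainty parameters ϕ_unc
res.θP      # predicted means of the global process-model parameters
res.resopt  # raw optimizer result, useful for convergence checks and debugging
```
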
TODO: describe how to obtain
- the means of the site parameters for each site
- samples of the posterior
- samples of the predictive posterior

## Example

TODO

See `test/test_HybridProblem.jl`.