Purpose
This repository contains an R package with a focus on Mixture of Experts (MoE) applied to Bayesian linear and censored regression.
Originally developed for a yet-to-be-published thesis, this publicly accessible package is the subject of ongoing work to make it easier to use.
Tasks remaining include:
- Adding pkgdown, then improving documentation (README, vignettes, etc.),
- Possibly migrating to Stan to avoid the JAGS ‘ones trick’,
- Removing unnecessary code and dependencies,
- Submitting to CRAN for wider use.
Installation
You can install the development version of bmoe from GitHub with:
remotes::install_github("nclJoshCowley/bmoe")
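If the remotes package is not already available, it can be installed from CRAN first:

# Only needed if 'remotes' is not already installed
install.packages("remotes")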
Model Description
We present the Mixture of Experts model as a finite mixture of K parametric linear regressions, where the mixture weights also depend on some (concomitant) predictors.
$$ f(\boldsymbol{y}_i | \boldsymbol{x}_i, \boldsymbol{\omega}, \boldsymbol{\theta}) = \sum_{k=1}^K \eta_k(\boldsymbol{x}_i | \boldsymbol{\omega}_k) f_k(\boldsymbol{y}_i | \boldsymbol{x}_i, \boldsymbol{\theta}_k) $$
Each component's distribution is currently limited to (conditionally independent) multiple linear regressions, where each response variable can potentially be left-censored.
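As a minimal sketch (not part of the package API), the density above can be made concrete by assuming softmax gating and Gaussian experts, omitting censoring; mixture_density, omega, beta and sigma are illustrative names rather than package objects.

# Evaluate the mixture density for a single observation (illustrative only).
# omega: K x p gating coefficients, beta: K x p regression coefficients,
# sigma: length-K residual standard deviations, x: length-p predictor vector.
mixture_density <- function(y, x, omega, beta, sigma) {
  eta <- exp(omega %*% x)
  eta <- eta / sum(eta)                            # softmax component weights
  f_k <- dnorm(y, mean = beta %*% x, sd = sigma)   # per-component Gaussian densities
  sum(eta * f_k)                                   # weighted sum over components
}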
Worked Example
Simulation
We can simulate data from these models using bmoe::simulate_bmoe(). Alternatively, one can use the example wrapper to utilise default arguments.
example_sim <- bmoe::example_simulate_bmoe()
example_sim$data
#> # A tibble: 180 × 4
#> y01 x01 x02 x03
#> <dbl> <dbl> <dbl> <dbl>
#> 1 -4.03 -1.07 -0.0269 -1.03
#> 2 1.26 -1.36 -1.34 0.901
#> 3 -4.38 1.37 -2.21 -0.277
#> 4 -0.490 -0.463 0.0581 0.0647
#> 5 -0.211 0.0454 -0.446 -0.406
#> 6 7.45 0.456 0.956 0.927
#> 7 4.92 1.03 1.78 2.56
#> 8 -0.140 -0.312 -0.428 0.222
#> 9 -0.162 0.195 0.0182 -0.836
#> 10 2.69 2.47 -1.46 -0.473
#> # ℹ 170 more rows
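The data shown above is one element of the returned simulation object; any other stored elements (for example, the generating parameter values, if the package keeps them) can be listed with base R.

# List the components of the simulation object (contents are package-specific)
names(example_sim)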
Fitting
All models require prior hyperparameters; the default is a vague prior for each parameter, but the number of components (K) is assumed known and must be set by the user.
example_prior <- bmoe::bmoe_prior(k = 3)
example_prior
#> $k
#> [1] 3
#>
#> $regr_prec
#> [1] 0.1
#>
#> $wt_prec
#> [1] 1
#>
#> $prec_shape
#> [1] 2
#>
#> $prec_rate
#> [1] 1
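Since the object printed above is a named list, individual hyperparameters can be adjusted in place before fitting; whether this is the intended workflow, rather than passing further arguments to bmoe_prior(), is an assumption.

# Copy the default prior and adjust one hyperparameter (illustrative value)
example_prior_alt <- example_prior
example_prior_alt$regr_prec <- 0.01   # e.g. a more diffuse prior on regression coefficients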
For simulation studies, the simulation object can be passed directly.
bmoe::bmoe(example_sim, prior = example_prior)
More generally, this package provides an extended formula-data interface.
example_fit <-
bmoe::bmoe(
y01 ~ x01 + x02 + x03,
data = example_sim$data,
prior = example_prior
)
This interface goes beyond the base R formula system as we can:
- model multiple response variables as conditionally independent (conditional on component membership) by joining them with the + symbol on the LHS;
- separate the predictors used for the regressions from those used for the component probability weighting, using the | symbol.

For example, y01 + y02 ~ x01 + x02 + x03 | x03 implies two response variables, y01 and y02, each regressed against the linear predictor formed from x01 + x02 + x03, with component probabilities based on the linear predictor formed from x03.
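As an illustrative variant of the earlier call (using only the single response available in the simulated data), the | separator restricts the component weighting to x03:

# Regress y01 on x01 + x02 + x03, with component weights driven by x03 only
example_fit_weighted <-
  bmoe::bmoe(
    y01 ~ x01 + x02 + x03 | x03,
    data = example_sim$data,
    prior = example_prior
  )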
Reporting
Analysis reports can be generated from any fitted object and a desired file name.
bmoe::render_bmoe_fit(example_fit, "report-name")