ABC sampling
ABC_sampling.Rmd
The Approximate Bayesian Computation Markov chain Monte Carlo (ABCMCMC) algorithm is implemented in our toolset an ABCMCMC()
.
In the ABC-MCMC algorithm, parameters are proposed and used to simulate the model. Initially, we can use an uninformative prior distribution that only encodes qualitative knowledge or knowledge about parameter ranges. This can be achieved using standard distributions such as normal or uniform. But if the procedure is repeated, then the last posterior distribution can be used as the prior distribution of the next round. This is necessary when new data is measured and added. However, the posterior density is characterized by the sample, it doesn’t have a computational form for evaluation. This requires a density estimator, which approximates the distribution; generic flowchart:
sample -> estimator -> density_function
With that function, we can now obtain a density value for any parameter vector. We use the VineCopula package for this purpose.
During ABC, the simulated output values are compared with the experimental data. Parameters with simulations close enough to the data are kept as the posterior sample
If additional data is added at a later stage, and ABC sampling is performed on the new data while approximating the prior using the old sampling results, the new sample will be somewhat incorrect depending on how good the approximation of the old posterior turned out.
It therefore interesting to check if a Markov-chain of sampled parameters is “in agreement” with an older set of experimental data that was only implicitly used in the prior, but not directly. This can be performed with function checkFitWithPreviousExperiments
.
But, one can of course also check compatibility with an experiment that wasn’t actually used at all.
R-functions
ABCMCMC
Build a chain of parameters approximately distributed as the posterior distribution of model parameters
mcmc <- ABCMCMC(objectiveFunction, startPar, nSims, Sigma0, delta, dprior, acceptanceProbability=NULL)
The ABCMCMC
function generates chains of sample parameters using the ABCMCMC algorithm. These samples are approximately distributed as the posterior probability of the model parameters. To build a chain, the algorithm considers the (current) last value in the chain and proposes a parameter that is sampled from a multivariate normal distribution that is cenetered at the current parameter and Sigma0
as covariance matrix. The covariance matrix is adaptively modified: if the chain gets stuck, a regularization is performed. After 4 regularizations, the chain is aborted. Then, to understand if the proposed parameter just sampled is “in agreement” with the experimental data objectiveFunction
is called with the proposed parameter vector: objectiveFunction
simulates the model and computes the distance between the experimental data and output functions of the model. If this distance is larger that the threshold delta
, then the proposed parameter is not likely to have generated the experimental data, and therefore rejected. The Markov chain will repeat for this point (it stands still). If the distance is less than the threshold delta
, the proposed parameter is accepted with probability equal to the ratio of prior probability density (calculated by dprior
) evaluated in the proposed and current parameter, respectively.
Input arguments * objectiveFunction
(function) - function that, given a (vectorial) parameter as input, simulated the model and outputs the distance between experimental data and data simulated from the model with the parameter provided in input * startPar
(numeric) - parameter (as vector) that corresponds to the starting parameter of the chain * nSims
(integer) - requested number of samples in the output chain * Sigma0
(numeric) - matrix with both dimensions equal to the length of the (vectorial) model parameter, corresponding to the covariance matrix of the desired proposed moves in the parameter space. In the ABCMCMC algorithm this covariance matrix is slightly modified and regularized to enhance the exploration of the parameter space and the convergence of the method * delta
(numeric) - ABC acceptance threshold * dprior
(function) - function that evaluates the prior probability density function of the parameter given in input
Output (list) The output of ABCMCMC
is a list containing the following data: * draws
(numeric) - a matrix with nSims
rows and a number of columns equal to the length of the model parameter. Each row corresponds to a sample in the ABCMCMC chain * scores
(numeric) - vector of length nSims
containing distances between experimental data and data simulated with the corresponding ABCMCMC samples (stored in draws
) * acceptanceRate
(numeric) - fraction of parameters in the ABCMCMC chain that were proposed and accepted. * nRegularizations
(integer) - number of regularizations performed on the covariance matrix for the moves proposed in the algorithm
checkFitWithPreviousExperiments
ABC acceptance of currently sampled values given old data (Prior)
oldObjective <- makeObjective(oldExperiments,modelName, distanceMeasure, parMap, simulate)
filteredDraws <- checkFitWithPreviousExperiments(draws=mcmc$draws, objectiveFunction=oldObjective, delta)
This function tests the samples in draws
with experimental data that have not been used to generate draws
. The experimental data needs to be used to construct the variable objectiveFunction
. The objectiveFunction
is used to simualte the model with the draws
as parameters, and compare the simulated data with the old experimental data. If the distances computed with objectiveFunction
are below the ABC threshold delta
, then the corresponding parameters in draws
are kept. Otherwise, they are discarded.
Input arguments * draws
(numeric) - matrix of sampled values (to be filtered) * objectiveFunction
(function) - function that, given a (vectorial) parameter as input, simulated the model and outputs the distance between experimental data and data simulated from the model with the parameter provided in input * delta
(numeric) - acceptance threshold
Output (numeric) Filtered subset (matrix) of acceptable parameter draws