Skip to contents

If the model can be described in terms of reactions, inputs, and outputs, then the most straightforward way to define it is through SBtab tables.

For completely different models (or solvers), the user must either write their own likelihood function (cost-function/objective-function) or write a vf file manually, to generate C code from.

It is also possible to directly write the model functions in C (for gsl_odeiv2).

The SBtab path uses these functions:

f <- dir(pattern="[.]tsv$",full.names=TRUE)          # file names
sb <- SBtabVFGEN::sbtab_from_tsv(f)                  # a list of data.frames
cl <- SBtabVFGEN::sbtab_to_vfgen(sb)                 # conservation laws, if any
ex <- SBtabVFGEN::sbtab.data(sb,cl)                  # includes the data

The SBtab model has now been loaded into R as data.frames, and also converted into an ODE. The data has been loaded as well.

The rest of this page describes the variables we just created, as well as the written files.


Real Example

library(uqsa)
library(SBtabVFGEN)

f <- uqsa_example('AKAR4')                           # file names
print(f[1])
#> [1] "/home/andreikr/.local/R/library/uqsa/extdata/AKAR4/AKAR4_100nM.tsv"
sb <- SBtabVFGEN::sbtab_from_tsv(f)                  # a list of data.frames
#> [tsv] file[1] «AKAR4_100nM.tsv» belongs to Document «AKAR4»
#>  I'll take this as the Model Name.
#> AKAR4_100nM.tsv  AKAR4_25nM.tsv  AKAR4_400nM.tsv  AKAR4_Compound.tsv  AKAR4_Experiments.tsv  AKAR4_Output.tsv  AKAR4_Parameter.tsv  AKAR4_Reaction.tsv
cl <- SBtabVFGEN::sbtab_to_vfgen(sb)                 # conservation laws, if any
#> Document Name: AKAR4.
#> SBtab has 8 tables.
#> The names of SBtab[[1]]:
#> !Time, >AKAR4pOUT, ~AKAR4pOUT
#>                 !Name                             !KineticLaw
#> reaction_1 reaction_1 kf_C_AKAR4*C*AKAR4 - kb_C_AKAR4*AKAR4_C
#> reaction_2 reaction_2                      kcat_AKARp*AKAR4_C
#>                  !ReactionFormula !IsReversible
#> reaction_1  C + AKAR4 <=> AKAR4_C             1
#> reaction_2 AKAR4_C <=> AKAR4p + C             0
#> There is no «Constant» Table in this model. This is OK.
#> There is no «Expression» Table in this model. This is OK.
#> compound names:
#> [1] "AKAR4"   "AKAR4_C" "AKAR4p"  "C"
#> Units:
#> [1] "micromole/liter" "micromole/liter" "micromole/liter" "micromole/liter"
#> ---
#> There is no «Input» Table in this model.
#> class(IsConstant): logical.
#> [1] Formula Unit   
#> <0 rows> (or 0-length row.names)
#> [1] Formula Unit   
#> <0 rows> (or 0-length row.names)
#> character(0)
#> ---
#> character
#> Reaction 1:line (a->b): «C + AKAR4 » ←→ « AKAR4_C»
#>  where a: [1] "C "      " AKAR4 "
#>    and b: [1] " AKAR4_C"
#> Products:
#> 1 × AKAR4_C          (AKAR4_C is compound 2)
#> Reactants:
#> 1 × C            (C is compound 4)
#> 1 × AKAR4            (AKAR4 is compound 1)
#> Reaction 2:line (a->b): «AKAR4_C » ←→ « AKAR4p + C»
#>  where a: [1] "AKAR4_C "
#>    and b: [1] " AKAR4p " " C"      
#> Products:
#> 1 × AKAR4p           (AKAR4p is compound 3)
#> 1 × C            (C is compound 4)
#> Reactants:
#> 1 × AKAR4_C          (AKAR4_C is compound 2)
#> Number of compounds: 4
#> Number of Reactions: 2
#>      [,1] [,2]
#> [1,]    0    1
#> [2,]    1    0
#> [3,]    0    1
#> [4,]    1   -1
#> Stoichiometric Matrix:
#>      [,1] [,2]
#> [1,]   -1    0
#> [2,]    1   -1
#> [3,]    0    1
#> [4,]   -1    1
#> ---
#> Conservation Law dimensions: 4 × 2
#> To check that the conservation laws apply: norm(t(StoichiometryMatrix) * ConservationLaw == 0.00000)
#>      [,1] [,2]
#> [1,]    0    1
#> [2,]    1    0
#> [3,]    0    1
#> [4,]    1   -1
#> [1] 2
#> [1] 0 1 0 1
#> [1] 0.2 0.0 0.0 0.0
#> 1*AKAR4_C+1*C
#> This will comment out compound 2 («AKAR4_C», initial value: 0), Conserved Constant = 0.000000
#> [1]  1  0  1 -1
#> [1] 0.2 0.0 0.0 0.0
#> 1*AKAR4+1*AKAR4p-1*C
#> This will comment out compound 1 («AKAR4», initial value: 0.2), Conserved Constant = 0.200000
#> StateVariable AKAR4 will be commented out as it was already defined as a Mass Conservation Law Expression.StateVariable AKAR4_C will be commented out as it was already defined as a Mass Conservation Law Expression.
#> The vf content was written to: AKAR4.vf
#> MOD: StateVariable AKAR4 will be commented out as it was already defined as a Mass Conservation Law Expression.
#> MOD: StateVariable AKAR4_C will be commented out as it was already defined as a Mass Conservation Law Expression.
#> The mod content was written to: AKAR4.mod
ex <- SBtabVFGEN::sbtab.data(sb,cl)                  # includes the data

The ex variable holds simulation instructions as a plain R list: inputs, initial values, measurement times, etc. – but also the actual data.

SBtab content:

## Reactions, as an example of SBtab in R:
print(sb$Reaction)
#>                 !Name                             !KineticLaw
#> reaction_1 reaction_1 kf_C_AKAR4*C*AKAR4 - kb_C_AKAR4*AKAR4_C
#> reaction_2 reaction_2                      kcat_AKARp*AKAR4_C
#>                  !ReactionFormula !IsReversible
#> reaction_1  C + AKAR4 <=> AKAR4_C             1
#> reaction_2 AKAR4_C <=> AKAR4p + C             0
## Conservation Laws, as text, and its attributes:
cl$Text
#> [1] "AKAR4_C_ConservedConst = AKAR4_C+1*C"     
#> [2] "AKAR4_ConservedConst = AKAR4+1*AKAR4p-1*C"
## data:
head(ex[[1]]$outputValues)
#>          AKAR4pOUT
#> E400T001    108.60
#> E400T002    111.35
#> E400T003    108.75
#> E400T004    111.40
#> E400T005    111.70
#> E400T006    113.25
head(ex[[1]]$errorValues)
#>          AKAR4pOUT
#> E400T001   14.7950
#> E400T002   14.9325
#> E400T003   14.8025
#> E400T004   14.9350
#> E400T005   14.9500
#> E400T006   15.0275

Data as a plot:

i <- seq(70)
t <- ex[[1]]$outputTimes[i]
y <- ex[[1]]$outputValues[[1]][i]
dy <- ex[[1]]$errorValues[[1]][i]
plot(t,y,bty='n',ylim=c(100,200))
arrows(t,y,t,y+dy,angle=90, length=0.01)
arrows(t,y,t,y-dy,angle=90, length=0.01)

Files

The function sbtab_to_vfgen will produce several files, using the model’s name (Document='model_name' attribute of SBtab):

  • model_name.vf, a vfgen compatible file
  • model_name.mod, a NEURON mod file
  • (optional) model_name.xml, an SBML file, if you have the R bindings to libsbml installed in R and libsbml in your system
  • model_name.tar.gz, compressed archive of a text representation of all ODE elements, no specific format
  • model_name.zip, the same as tar.gz, but as a zip archive

Let’s look at the state variables of the ode:

grep '<StateVariable' AKAR4.vf
#> <!-- <StateVariable Name="AKAR4" Description="removed compound" DefaultInitialCondition="0.2" Formula="-reaction_1"/> -->
#> <!-- <StateVariable Name="AKAR4_C" Description="removed compound" DefaultInitialCondition="0" Formula="+reaction_1-reaction_2"/> -->
#>  <StateVariable Name="AKAR4p" Description="compound" DefaultInitialCondition="0" Formula="+reaction_2"/>
#>  <StateVariable Name="C" Description="compound" DefaultInitialCondition="0" Formula="-reaction_1+reaction_2"/>

Two State variables have been commented out. These two have been replaced by vfgen expressions:

grep '<Expression' AKAR4.vf
#>  <Expression Name="AKAR4_C" Description="derived from conservation law 1" Formula="AKAR4_C_ConservedConst - (C)"/>
#>  <Expression Name="AKAR4" Description="derived from conservation law 2" Formula="AKAR4_ConservedConst - (AKAR4p-C)"/>
#>  <Expression Name="reaction_1" Description="flux" Formula="kf_C_AKAR4*C*AKAR4 - kb_C_AKAR4*AKAR4_C"/>
#>  <Expression Name="reaction_2" Description="flux" Formula="kcat_AKARp*AKAR4_C"/>

Finally, we clean up the generated files:

for t in vf tar.gz zip ; do rm ./AKAR4.$t ; done
[ -f AKAR4.xml ] && rm AKAR4.xml # since that one is optional
echo $?
#> 1

The Vector Field file

The vf file is compatible with the vfgen tool, which creates code in many languages from this vf file (matlab, cvode, octave, python, xppaut). Use this tool if you want to use the created source code for other programming languages. Some SBtab content will be missing here (e.g. scheduled events). The vfgen tool uses the ginac library to calculate Jacobians. Ginac cannot parse vf files with inequality expressions in them (e.g. a < b).

Converting the ODE to code

We use our tool, RPN-derivative, which reads the same file (or one of the archive files: zip/tar.gz) to create slightly different C code. The main advantages are:

  • a vector valued output function with a predictable name: int MODEL_func()
  • the error code of each function is the length of the output buffers that need to be allocated
  • more functions are created:
    • initial conditions: int MODEL_init()
    • default parameter values: int MODEL_default()

The repository of this tool, icpm-kth/RPN-derivative, contains a shell script ode.sh; this script uses one of three methods to calculate analytical derivatives for the Jacobians: 1. maxima, 2. yacas, 3. the bin/derivative binary that RPN-derivative contains (after make && make install)

Assuming that you followed all installation instructions:

ode -R --maxima ./myModel.vf > ./myModel.R 
##     both     ./myModel.{zip,tar.gz}     also work here
ode -C --maxima ./myModel.tar.gz > ./myModel_gvf.c

Note: ode.sh is compatible with the same vf files from SBtabVFGEN, but doesn’t support all of the output formats that VFGEN can do, and does nothing related to delays.

Background on Parameters vs Inputs

Our goal is to perform parameter estimation. A systems biology model typically has parameters that are either reaction rate coefficients \(k_f\), equilibrium constants \(K_d\), Hill coefficients, and many other quantities related to kinetc laws; they are often unknown or not uniquely determined yet. But, not all parameters of a model are unknown, and not all model parameters are intrinsic to the system we study. Some of them model the interventions we do to the model when performing an experiment. This could be the frequency of a driving force for a mechanical model, or the amount of a treatment dose (added to the system) that affects the time course of reactions:

  • a substrate, or buffer
  • an enzyme,
  • a silencing agent,
  • an inhibitor

These parameters are known to us, because they are written down in the protocol. If the input is time dependent, then its dynamics (or explicit algebraic functions) have to become part of the model. These input parameters can be different between experiments, but the intrinsic parameters are always the same.

The ordinary differential equation on the other hand does not need to know the distinction between known or unknown parameters. For these reasons:

parMCMC <- [...]             # some sampling variable
k <- parMap(parMCMC)         # a model compatible parameter vector
u <- experiments[[i]]$input  # a valid model input vector
p <- c(k,u)
# solve ODE using p

where k are intrinsic, unknown parameters (subject to sampling), and u are the known input parameters that are encoded in the input field of an experiment.

If input parameters u exist, then they are always concatenated (p <-c(k,u)) in that order and passed to the model, the model’s C code only sees p. The wrapper functions in the rgsl package do this concatenation when the solver is called. We sample in logarithmic space, but the solver gets k and appends the right u for each experiment, both in linear space.

So, if parMCMC is in log10-space, then parMap is the transformation to linear-space for the solver to work in:

parMap <- function(par) {
    return(10^(par))
}