Build your own Model
user_model.Rmd
In this article we will assume that you have a model that you want to
work on and that you have written the model in SBtab form. It may be a
google spreadsheet, an open document spreadsheet (ods
), MS
Excel file, a file for the numbers
application on an Apple
device or a .gnumeric
file.
There are three steps we need to make:
Steps | file (out) | tl;dr | |
---|---|---|---|
1 | convert SBtab to ODE | .vf | biology => math |
2 | convert the ODE to sources | .c | math => code |
3 | convert sources to shared library | .so | code => machine code |
RPN-derivative
calculates derivatives or offloads that work on maxima or yacas, install
them if you want to use the --maxima
or
--yacas
options.
Load the TSV files Convert and Compile the Model
library(uqsa)
library(SBtabVFGEN)
# find the tsv files with "?dir", in the directory of your model
f <- dir(Location,pattern='tsv$',full.names=TRUE)
# you don't need this 'Location' variable, just be in the directory where the TSV files are.
sb <- SBtabVFGEN::sbtab_from_tsv(f) # a list of data.frames
cl <- SBtabVFGEN::sbtab_to_vfgen(sb) # conservation laws, if any
ex <- SBtabVFGEN::sbtab.data(sb,cl) # includes the data
print(names(sb))
system2("ode", args=c("-C", "--maxima" ,"AKAR4cl.tar.gz"), stdout="AKAR4cl_gvf.c")
stopifnot(file.exists("AKAR4cl_gvf.c"))
modelName <- checkModel("AKAR4cl","AKAR4cl_gvf.c") # compiles the model if not yet compiled
This should produce this output:
library(uqsa)
library(SBtabVFGEN)
# find the tsv files with "?dir", in the directory of your model
f <- dir(Location,pattern='tsv$',full.names=TRUE)
# you don't need this 'Location' variable, just be in the directory where the TSV files are.
sb <- SBtabVFGEN::sbtab_from_tsv(f) # a list of data.frames
#> [tsv] file[1] «100nM.tsv» belongs to Document «AKAR4cl»
#> I'll take this as the Model Name.
#> 100nM.tsv 25nM.tsv 400nM.tsv Compound.tsv Experiments.tsv Output.tsv Parameter.tsv Reaction.tsv
cl <- SBtabVFGEN::sbtab_to_vfgen(sb) # conservation laws, if any
#> Document Name: AKAR4cl.
#> SBtab has 8 tables.
#> The names of SBtab[[1]]:
#> !Time, >AKAR4pOUT, ~AKAR4pOUT
#> !Name !KineticLaw
#> reaction_1 reaction_1 kf_C_AKAR4*C*AKAR4 - kb_C_AKAR4*AKAR4_C
#> reaction_2 reaction_2 kcat_AKARp*AKAR4_C
#> !ReactionFormula !IsReversible
#> reaction_1 C + AKAR4 <=> AKAR4_C 1
#> reaction_2 AKAR4_C <=> AKAR4p + C 0
#> There is no «Constant» Table in this model. This is OK.
#> There is no «Expression» Table in this model. This is OK.
#> compound names:
#> [1] "AKAR4" "AKAR4_C" "AKAR4p" "C"
#> Units:
#> [1] "micromole/liter" "micromole/liter" "micromole/liter" "micromole/liter"
#> ---
#> There is no «Input» Table in this model.
#> units in «!Unit» column:
#> [1] "micromole/liter" "1/(micromolarity*second)"
#> [3] "1/second" "liter"
#> automatically created sbml unit ids:
#> [1] "micromole_per_liter" "one_over_micromolarity_second"
#> [3] "one_over_second" "liter"
#> [1] "Compartment"
#> [1] "AKAR4"
#> unit of species 1 (AKAR4): «micromole_per_liter»
#> [1] "AKAR4_C"
#> unit of species 2 (AKAR4_C): «micromole_per_liter»
#> [1] "AKAR4p"
#> unit of species 3 (AKAR4p): «micromole_per_liter»
#> [1] "C"
#> unit of species 4 (C): «micromole_per_liter»
#> adding SBtab parameter 1: «kf_C_AKAR4»
#> unit of parameter 1 (kf_C_AKAR4): «one_over_micromolarity_second»
#> adding SBtab parameter 2: «kb_C_AKAR4»
#> unit of parameter 2 (kb_C_AKAR4): «one_over_second»
#> adding SBtab parameter 3: «kcat_AKARp»
#> unit of parameter 3 (kcat_AKARp): «one_over_second»
#> adding reaction 1: «reaction_1»
#> converting flux to mathml: «kf_C_AKAR4*C*AKAR4 - kb_C_AKAR4*AKAR4_C»
#> reaction rates in SBML need to have the unit «substance/time» rather than «concentration/time».
#> So conventional kinetics need to be multiplied by compartment volumes.
#> Since this model has only one Compartment, all kinetics will be multiplied by its volume for the SBML model.
#> [SBML] Reactants:
#> [1] "C" "AKAR4"
#> [SBML] Products:
#> [1] "AKAR4_C"
#> ---
#> [1] "C" "" "C"
#> [1] "AKAR4" "" "AKAR4"
#> [1] "AKAR4_C" "" "AKAR4_C"
#> adding reaction 2: «reaction_2»
#> converting flux to mathml: «kcat_AKARp*AKAR4_C»
#> reaction rates in SBML need to have the unit «substance/time» rather than «concentration/time».
#> So conventional kinetics need to be multiplied by compartment volumes.
#> Since this model has only one Compartment, all kinetics will be multiplied by its volume for the SBML model.
#> [SBML] Reactants:
#> [1] "AKAR4_C"
#> [SBML] Products:
#> [1] "AKAR4p" "C"
#> ---
#> [1] "AKAR4_C" "" "AKAR4_C"
#> [1] "AKAR4p" "" "AKAR4p"
#> [1] "C" "" "C"
#> output 1:
#> [1] "AKAR4pOUT"
#> unit of output 1 (AKAR4pOUT): «micromole_per_liter»
#> output 1 has formula:
#> [1] "108 + 380*AKAR4p"
#> 108 + 380 * AKAR4p
#> The sbml file has been named «AKAR4cl.xml».
#> class(IsConstant): logical.
#> [1] Formula Unit
#> <0 rows> (or 0-length row.names)
#> [1] Formula Unit
#> <0 rows> (or 0-length row.names)
#> character(0)
#> ---
#> character
#> Reaction 1:line (a->b): «C + AKAR4 » ←→ « AKAR4_C»
#> where a: [1] "C " " AKAR4 "
#> and b: [1] " AKAR4_C"
#> Products:
#> 1 × AKAR4_C (AKAR4_C is compound 2)
#> Reactants:
#> 1 × C (C is compound 4)
#> 1 × AKAR4 (AKAR4 is compound 1)
#> Reaction 2:line (a->b): «AKAR4_C » ←→ « AKAR4p + C»
#> where a: [1] "AKAR4_C "
#> and b: [1] " AKAR4p " " C"
#> Products:
#> 1 × AKAR4p (AKAR4p is compound 3)
#> 1 × C (C is compound 4)
#> Reactants:
#> 1 × AKAR4_C (AKAR4_C is compound 2)
#> Number of compounds: 4
#> Number of Reactions: 2
#> [,1] [,2]
#> [1,] 0 1
#> [2,] 1 0
#> [3,] 0 1
#> [4,] 1 -1
#> Stoichiometric Matrix:
#> [,1] [,2]
#> [1,] -1 0
#> [2,] 1 -1
#> [3,] 0 1
#> [4,] -1 1
#> ---
#> Conservation Law dimensions: 4 × 2
#> To check that the conservation laws apply: norm(t(StoichiometryMatrix) * ConservationLaw == 0.00000)
#> [,1] [,2]
#> [1,] 0 1
#> [2,] 1 0
#> [3,] 0 1
#> [4,] 1 -1
#> [1] 2
#> [1] 0 1 0 1
#> [1] 0.2 0.0 0.0 0.0
#> 1*AKAR4_C+1*C
#> This will comment out compound 2 («AKAR4_C», initial value: 0), Conserved Constant = 0.000000
#> [1] 1 0 1 -1
#> [1] 0.2 0.0 0.0 0.0
#> 1*AKAR4+1*AKAR4p-1*C
#> This will comment out compound 1 («AKAR4», initial value: 0.2), Conserved Constant = 0.200000
#> Loading required package: hdf5r
#> StateVariable AKAR4 will be commented out as it was already defined as a Mass Conservation Law Expression.StateVariable AKAR4_C will be commented out as it was already defined as a Mass Conservation Law Expression.
#> The vf content was written to: AKAR4cl.vf
#> MOD: StateVariable AKAR4 will be commented out as it was already defined as a Mass Conservation Law Expression.
#> MOD: StateVariable AKAR4_C will be commented out as it was already defined as a Mass Conservation Law Expression.
#> The mod content was written to: AKAR4cl.mod
ex <- SBtabVFGEN::sbtab.data(sb,cl) # includes the data
system2("ode", args=c("-C", "--maxima" ,"AKAR4cl.tar.gz"), stdout="AKAR4cl_gvf.c")
stopifnot(file.exists("AKAR4cl_gvf.c"))
modelName <- checkModel("AKAR4cl","AKAR4cl_gvf.c") # compiles the model if not yet compiled
#> building a shared library from c source, and using GSL odeiv2 as backend (pkg-config is used here).
#> cc -shared -fPIC `pkg-config --cflags gsl` -o './AKAR4cl.so' 'AKAR4cl_gvf.c' `pkg-config --libs gsl`
We have created the c source code, with these functions:
grep AKAR4cl_ ./AKAR4cl_gvf.c
#> int AKAR4cl_vf(double t, const double y_[], double f_[], void *par)
#> int AKAR4cl_netflux(double t, double y_[], double *flux, void *par){
#> int AKAR4cl_fwdflux(double t, double y_[], double *flux, void *par){
#> int AKAR4cl_bwdflux(double t, double y_[], double *flux, void *par){
#> int AKAR4cl_jac(double t, const double y_[], double *jac_, double *dfdt_, void *par)
#> int AKAR4cl_jacp(double t, const double y_[], double *jacp_, double *dfdt_, void *par)
#> int AKAR4cl_func(double t, const double y_[], double *func_, void *par)
#> int AKAR4cl_funcJac(double t, const double y_[], double *funcJac_, void *par)
#> int AKAR4cl_funcJacp(double t, const double y_[], double *funcJacp_, void *par)
#> int AKAR4cl_default(double t, void *par)
#> int AKAR4cl_init(double t, double *y_, void *par)
Simulate
## we create a closure called "sim" a function that internally remembers the "experiments" (ex)
sim <- simulator.c(ex,modelName) # ex holds the instructions for the solver
#> Loading required package: rgsl
par <- as.matrix(sb$Parameter[["!DefaultValue"]]) # one column - but can be several sets (as columns)
## sim can be called with any parameter vector - or several and will always simulate the same set of experiments
sr <- sim(par) # simulation results (one per experiment)
stopifnot(length(sr) == length(ex))
i <- seq(70)
t <- ex[[1]]$outputTimes[i]
y <- ex[[1]]$outputValues[[1]][i]
dy <- ex[[1]]$errorValues[[1]][i]
z <- sr[[1]]$func[1,i,1] # third dimension is the par column (just one here)
plot(t,z,bty='n',type='l', ylim=c(90,200))
points(t,y)
arrows(t,y,t,y+dy,angle=90, length=0.01)
arrows(t,y,t,y-dy,angle=90, length=0.01)
legend("bottomright",legend=c("simulation","data"), lty=c(1,1), pch=c(NA,1))
The simulation result sr
is a list with several
components, at least 2: - state
state variable trajectories
- func
outout functions
The output functions usually correspond to the data (in some way), but may need some sort of normalization. In this case the relationship is direct.
Both state
and func
have three indexes:
sr[[l]]$state[i,j,k] # is a number
This corresponds to experiment l
, state variable
i
, time-point j
, and parameter column
k
(again, relevant when par
is a matrix with
multiple columns).
The previously created modelName
variable has a comment
attribute (that you can also set yourself, there is no magic there),
this attribute inidicates the location of the shared library used for
simulations. You don’t have to re-make it evry time - as long as the
model doesn’t change (checkModel
does not need to be
called):
comment(modelName) <- "./AKAR4cl.so" # this is sufficient if the .so file already exists
Simulations of Several Parameter Sets
t <- as.numeric(ex[[1]]$outputTimes)
np <- NROW(par)
REPS <- 50
P <- matrix(runif(np*REPS,min=0,max=as.numeric(par)*2),np,REPS)
dim(P)
#> [1] 3 50
stm <- Sys.time()
sr <- sim(P)
etm <- Sys.time()
difftime(etm,stm)
#> Time difference of 2.111909 secs
let’s plot all trajectories in one picture:
T <- rep(c(t,NA),REPS) # the NA value will break the line
Z <- as.numeric(sr[[1]]$func[1,c(seq_along(t),NA),]) # at the end, so it doesn't loop
plot(T,Z,type='l')
The new result is (just like the old one) an array with three indexes:
sr[[l]]$func[i,j,k] # a number
-
i
output function -
j
time index -
k
selects the trajectories that belong toP[,k]
cleanup:
gf <- sprintf("AKAR4cl%s",c(".vf",".tar.gz",".zip",".so"))
print(gf)
#> [1] "AKAR4cl.vf" "AKAR4cl.tar.gz" "AKAR4cl.zip" "AKAR4cl.so"
file.remove(gf)
#> [1] TRUE TRUE TRUE TRUE
Other Formats
The SBtab content can be a normal spreadsheet.
If it is a google spreadsheet (which can be good for collaborative
work), you should write it until failry satisfied with all components
and then export it as an .xlsx
, or .ods
file.
The gnumeric application can automatically convert both
.xlsx
and .ods
to a set of .tsv
files if you want the benefits of version control (which works on text
files). But, .ods
and .xlsx
both work:
-
SBtabVFGEN::sbtab_from_ods
for ODS files -
SBtabVFGEN::sbtab_from_excel
for XLSX files
Our scripts find the tables by name:
- TSV files:
TableName
attribute is the name of the table - ODS files: the Sheet’s name is the name of the table,
TableName
attribute is not parsed - Excel files: the Sheet’s name (same as ODS)
Tab Separated Values
For more information on how to solve common tasks regarding tsv files, see the tsv topic. We obtain a list of all tsv files in the current directory, and then import the contents, like this:
modelFiles <- dir(pattern='[.]tsv$')
SBtab <- sbtab_from_tsv(modelFiles)
Open Document Spreadsheet
These files can be created with Libre Office, Apple’s numbers program, gnumeric, and any web-hosted spreadsheet application (like google spreadsheets).
modelFiles <- "DemoModel.ods"
SBtab <- sbtab_from_ods(modelFiles)
Excel
These can be created with MS Excel, and loaded like this:
modelFiles <- "DemoModel.xlsx"
SBtab <- sbtab_from_excel(modelFiles)
Parameters
The R code generated by ode.sh
includes a list with the
same functions, but with generic function names:
-
model$vf()
vector field -
model$jac()
Jacobian -
model$init()
initial values -
model$par()
default parameters in linear scale- regardless of what the scale was in the SBtab file
- this function returns the long parameter vector (Parameter and Input)
Observe that model$par()
returns the ODE model
parameters, which can be more than the biological parameters, the
components are:
- biological parameters (
SBtab$Parameter
) - origignal input parameters (
SBtab$Input
) - derived input parameters, from conservationlaws (conserved constants)
The inputs are allowed to be different for each experiment (by
definition), the biological parameters are the same for all experiments
simulated in one go. Therefore, we only supply the biological parameters
(the first n
) to the simultor and it retrieves the others
from the experiments
variable, by concatenation:
Usually, the number of leading parameters
(sb$Parameter[[“!DefaultValue”]]) does not change. This is all the
simulator needs as the inputs are already in ex
.