Build your own Model • uqsa

In this article we will assume that you have a model that you want to work on and that you have written the model in SBtab form. It may be a google spreadsheet, an open document spreadsheet (ods), MS Excel file, a file for the numbers application on an Apple device or a .gnumeric file.

There are three steps we need to make:

	Steps	file (out)	tl;dr
1	convert SBtab to ODE	.vf	biology => math
2	convert the ODE to sources	.c	math => code
3	convert sources to shared library	.so	code => machine code

RPN-derivative calculates derivatives or offloads that work on maxima or yacas, install them if you want to use the --maxima or --yacas options.

Load the TSV files Convert and Compile the Model

library(uqsa)
library(SBtabVFGEN)

# find the tsv files with "?dir", in the directory of your model
f <- dir(Location,pattern='tsv$',full.names=TRUE)
# you don't need this 'Location' variable, just be in the directory where the TSV files are.
sb <- SBtabVFGEN::sbtab_from_tsv(f)                  # a list of data.frames
cl <- SBtabVFGEN::sbtab_to_vfgen(sb)                 # conservation laws, if any
ex <- SBtabVFGEN::sbtab.data(sb,cl)                  # includes the data

print(names(sb))

system2("ode", args=c("-C", "--maxima" ,"AKAR4cl.tar.gz"), stdout="AKAR4cl_gvf.c")
stopifnot(file.exists("AKAR4cl_gvf.c"))
modelName <- checkModel("AKAR4cl","AKAR4cl_gvf.c") # compiles the model if not yet compiled

This should produce this output:

library(uqsa)
library(SBtabVFGEN)

# find the tsv files with "?dir", in the directory of your model
f <- dir(Location,pattern='tsv$',full.names=TRUE)
# you don't need this 'Location' variable, just be in the directory where the TSV files are.
sb <- SBtabVFGEN::sbtab_from_tsv(f)                  # a list of data.frames
#> [tsv] file[1] «100nM.tsv» belongs to Document «AKAR4cl»
#>  I'll take this as the Model Name.
#> 100nM.tsv  25nM.tsv  400nM.tsv  Compound.tsv  Experiments.tsv  Output.tsv  Parameter.tsv  Reaction.tsv
cl <- SBtabVFGEN::sbtab_to_vfgen(sb)                 # conservation laws, if any
#> Document Name: AKAR4cl.
#> SBtab has 8 tables.
#> The names of SBtab[[1]]:
#> !Time, >AKAR4pOUT, ~AKAR4pOUT
#>                 !Name                             !KineticLaw
#> reaction_1 reaction_1 kf_C_AKAR4*C*AKAR4 - kb_C_AKAR4*AKAR4_C
#> reaction_2 reaction_2                      kcat_AKARp*AKAR4_C
#>                  !ReactionFormula !IsReversible
#> reaction_1  C + AKAR4 <=> AKAR4_C             1
#> reaction_2 AKAR4_C <=> AKAR4p + C             0
#> There is no «Constant» Table in this model. This is OK.
#> There is no «Expression» Table in this model. This is OK.
#> compound names:
#> [1] "AKAR4"   "AKAR4_C" "AKAR4p"  "C"
#> Units:
#> [1] "micromole/liter" "micromole/liter" "micromole/liter" "micromole/liter"
#> ---
#> There is no «Input» Table in this model.
#> class(IsConstant): logical.
#> [1] Formula Unit   
#> <0 rows> (or 0-length row.names)
#> [1] Formula Unit   
#> <0 rows> (or 0-length row.names)
#> character(0)
#> ---
#> character
#> Reaction 1:line (a->b): «C + AKAR4 » ←→ « AKAR4_C»
#>  where a: [1] "C "      " AKAR4 "
#>    and b: [1] " AKAR4_C"
#> Products:
#> 1 × AKAR4_C          (AKAR4_C is compound 2)
#> Reactants:
#> 1 × C            (C is compound 4)
#> 1 × AKAR4            (AKAR4 is compound 1)
#> Reaction 2:line (a->b): «AKAR4_C » ←→ « AKAR4p + C»
#>  where a: [1] "AKAR4_C "
#>    and b: [1] " AKAR4p " " C"      
#> Products:
#> 1 × AKAR4p           (AKAR4p is compound 3)
#> 1 × C            (C is compound 4)
#> Reactants:
#> 1 × AKAR4_C          (AKAR4_C is compound 2)
#> Number of compounds: 4
#> Number of Reactions: 2
#>      [,1] [,2]
#> [1,]    0    1
#> [2,]    1    0
#> [3,]    0    1
#> [4,]    1   -1
#> Stoichiometric Matrix:
#>      [,1] [,2]
#> [1,]   -1    0
#> [2,]    1   -1
#> [3,]    0    1
#> [4,]   -1    1
#> ---
#> Conservation Law dimensions: 4 × 2
#> To check that the conservation laws apply: norm(t(StoichiometryMatrix) * ConservationLaw == 0.00000)
#>      [,1] [,2]
#> [1,]    0    1
#> [2,]    1    0
#> [3,]    0    1
#> [4,]    1   -1
#> [1] 2
#> [1] 0 1 0 1
#> [1] 0.2 0.0 0.0 0.0
#> 1*AKAR4_C+1*C
#> This will comment out compound 2 («AKAR4_C», initial value: 0), Conserved Constant = 0.000000
#> [1]  1  0  1 -1
#> [1] 0.2 0.0 0.0 0.0
#> 1*AKAR4+1*AKAR4p-1*C
#> This will comment out compound 1 («AKAR4», initial value: 0.2), Conserved Constant = 0.200000
#> Loading required package: hdf5r
#> StateVariable AKAR4 will be commented out as it was already defined as a Mass Conservation Law Expression.StateVariable AKAR4_C will be commented out as it was already defined as a Mass Conservation Law Expression.
#> The vf content was written to: AKAR4cl.vf
#> MOD: StateVariable AKAR4 will be commented out as it was already defined as a Mass Conservation Law Expression.
#> MOD: StateVariable AKAR4_C will be commented out as it was already defined as a Mass Conservation Law Expression.
#> The mod content was written to: AKAR4cl.mod
ex <- SBtabVFGEN::sbtab.data(sb,cl)                  # includes the data


system2("ode", args=c("-C", "--maxima" ,"AKAR4cl.tar.gz"), stdout="AKAR4cl_gvf.c")
stopifnot(file.exists("AKAR4cl_gvf.c"))
modelName <- checkModel("AKAR4cl","AKAR4cl_gvf.c") # compiles the model if not yet compiled
#> building a shared library from c source, and using GSL odeiv2 as backend (pkg-config is used here).
#> cc -shared -fPIC `pkg-config --cflags gsl` -o './AKAR4cl.so' 'AKAR4cl_gvf.c' `pkg-config --libs gsl`

We have created the c source code, with these functions:

grep AKAR4cl_ ./AKAR4cl_gvf.c
#> int AKAR4cl_vf(double t, const double y_[], double f_[], void *par)
#> int AKAR4cl_netflux(double t, double y_[], double *flux, void *par){
#> int AKAR4cl_fwdflux(double t, double y_[], double *flux, void *par){
#> int AKAR4cl_bwdflux(double t, double y_[], double *flux, void *par){
#> int AKAR4cl_jac(double t, const double y_[], double *jac_, double *dfdt_, void *par)
#> int AKAR4cl_jacp(double t, const double y_[], double *jacp_, double *dfdt_, void *par)
#> int AKAR4cl_func(double t, const double y_[], double *func_, void *par)
#> int AKAR4cl_funcJac(double t, const double y_[], double *funcJac_, void *par)
#> int AKAR4cl_funcJacp(double t, const double y_[], double *funcJacp_, void *par)
#> int AKAR4cl_default(double t, void *par)
#> int AKAR4cl_init(double t, double *y_, void *par)

Simulate

## we create a closure called "sim" a function that internally remembers the "experiments" (ex)
sim <- simulator.c(ex,modelName)                   # ex holds the instructions for the solver
#> Loading required package: rgsl
par <- as.matrix(sb$Parameter[["!DefaultValue"]])  # one column - but can be several sets (as columns)

## sim can be called with any parameter vector - or several and will always simulate the same set of experiments
sr <- sim(par)                                     # simulation results (one per experiment)
stopifnot(length(sr) == length(ex))

i <- seq(70)
t <- ex[[1]]$outputTimes[i]
y <- ex[[1]]$outputValues[[1]][i]
dy <- ex[[1]]$errorValues[[1]][i]

z <- sr[[1]]$func[1,i,1]                           # third dimension is the par column (just one here)

plot(t,z,bty='n',type='l', ylim=c(90,200))
points(t,y)
arrows(t,y,t,y+dy,angle=90, length=0.01)
arrows(t,y,t,y-dy,angle=90, length=0.01)
legend("bottomright",legend=c("simulation","data"), lty=c(1,1), pch=c(NA,1))

The simulation result sr is a list with several components, at least 2: - state state variable trajectories - func outout functions

The output functions usually correspond to the data (in some way), but may need some sort of normalization. In this case the relationship is direct.

Both state and func have three indexes:

sr[[l]]$state[i,j,k]        # is a number

This corresponds to experiment l, state variable i, time-point j, and parameter column k (again, relevant when par is a matrix with multiple columns).

The previously created modelName variable has a comment attribute (that you can also set yourself, there is no magic there), this attribute inidicates the location of the shared library used for simulations. You don’t have to re-make it evry time - as long as the model doesn’t change (checkModel does not need to be called):

comment(modelName) <- "./AKAR4cl.so" # this is sufficient if the .so file already exists

Simulations of Several Parameter Sets

t <- as.numeric(ex[[1]]$outputTimes)
np <- NROW(par)
REPS <- 50
P <- matrix(runif(np*REPS,min=0,max=as.numeric(par)*2),np,REPS)
dim(P)
#> [1]  3 50


stm <- Sys.time()
sr <- sim(P)
etm <- Sys.time()
difftime(etm,stm)
#> Time difference of 0.2752998 secs

let’s plot all trajectories in one picture:

T <- rep(c(t,NA),REPS)                                  # the NA value will break the line
Z <- as.numeric(sr[[1]]$func[1,c(seq_along(t),NA),])    # at the end, so it doesn't loop
plot(T,Z,type='l')

The new result is (just like the old one) an array with three indexes:

sr[[l]]$func[i,j,k] # a number

i output function
j time index
k selects the trajectories that belong to P[,k]

cleanup:

gf <- sprintf("AKAR4cl%s",c(".vf",".tar.gz",".zip",".so"))
print(gf)
#> [1] "AKAR4cl.vf"     "AKAR4cl.tar.gz" "AKAR4cl.zip"    "AKAR4cl.so"
file.remove(gf)
#> [1] TRUE TRUE TRUE TRUE

Other Formats

The SBtab content can be a normal spreadsheet.

If it is a google spreadsheet (which can be good for collaborative work), you should write it until failry satisfied with all components and then export it as an .xlsx, or .ods file. The gnumeric application can automatically convert both .xlsx and .ods to a set of .tsv files if you want the benefits of version control (which works on text files). But, .ods and .xlsx both work:

SBtabVFGEN::sbtab_from_ods for ODS files
SBtabVFGEN::sbtab_from_excel for XLSX files

Our scripts find the tables by name:

TSV files: TableName attribute is the name of the table
ODS files: the Sheet’s name is the name of the table, TableName attribute is not parsed
Excel files: the Sheet’s name (same as ODS)

Tab Separated Values

For more information on how to solve common tasks regarding tsv files, see the tsv topic. We obtain a list of all tsv files in the current directory, and then import the contents, like this:

modelFiles <- dir(pattern='[.]tsv$')
SBtab <- sbtab_from_tsv(modelFiles)

Open Document Spreadsheet

These files can be created with Libre Office, Apple’s numbers program, gnumeric, and any web-hosted spreadsheet application (like google spreadsheets).

modelFiles <- "DemoModel.ods"
SBtab <- sbtab_from_ods(modelFiles)

Excel

These can be created with MS Excel, and loaded like this:

modelFiles <- "DemoModel.xlsx"
SBtab <- sbtab_from_excel(modelFiles)

Parameters

The R code generated by ode.sh includes a list with the same functions, but with generic function names:

model$vf() vector field
model$jac() Jacobian
model$init() initial values
model$par() default parameters in linear scale
- regardless of what the scale was in the SBtab file
- this function returns the long parameter vector (Parameter and Input)

Observe that model$par() returns the ODE model parameters, which can be more than the biological parameters, the components are:

biological parameters (SBtab$Parameter)
origignal input parameters (SBtab$Input)
derived input parameters, from conservationlaws (conserved constants)

The inputs are allowed to be different for each experiment (by definition), the biological parameters are the same for all experiments simulated in one go. Therefore, we only supply the biological parameters (the first n) to the simultor and it retrieves the others from the experiments variable, by concatenation:

Usually, the number of leading parameters (sb$Parameter[[“!DefaultValue”]]) does not change. This is all the simulator needs as the inputs are already in ex.