ODE Simulation and Inference
ODE Simulation and Inference
The purpose of the functions in this module are to:
- Take a set of parameters and times as input, and predict the number of each of the three proteins there will be at every timepoint (tested in
test_repressilator_simulation_class()). - Take protein number traces as input, and determine by optimisation the governing parameters that generated them.
Getting started
Navigate to
Repressilator_tests/tests and verify that the test fails:python -m pytest ode_inference_test.py
The inference code can take quite a long time to run, so you may just want to start by testing
test_repressilator_simulation_class(). It's worth making a branch if you're changing the codebase:git checkout -b ode_fix
Now we can make changes to the tests and
repressilator_analysis module, and can choose to merge them back later. It's often worth being able to print test outputs, so you may like to add test_repressilator_simulation_class() to start with. You can then see the same test output by running:python ode_simulation_and_inference.py
The repressilator paper can be found in the
docs/ folder.
You should read box 1 on page 3, which talks about the ordinary differential equation model.
As you read, note the form of the differential equations for mRNA and protein and which parameters are defined (α, α₀, β, n).
If you have not encoutered it before, you may be interested in the Hill equation which is used in the ODE model to describe binding kinetics.The Test Functions
test_repressilator_simulation_class()
Tests a function
simulate() that is a member of a class ode_inference.RepressilatorModel. Even though the model simulates the number of all three proteins in the repressilator circuit, only two protein number values are assessed.- Arguments:
- A list of parameters, provided in the order defined by
RepressilatorModel._param_names - The times of the experiment in seconds
- Returns: A
numpy.ndarrayof predicted protein numbers, with column 0 being the nuclear numbers and column 1 the cytosolic.
test_infer_parameters()
Tests a single function:
ode_inference.infer_parameters()- Arguments:
- A list of times in seconds (one-dimensional
numpy.ndarray) - A two-dimensional
numpy.ndarrayof shape(n_times, 2)in the same format as returned byode_inference.RepressilatorModel.simulate()
- Returns: A list of parameters in the same order as
RepressilatorModel._param_names
test_repressilator_simulation_class()
AssertionError assert RMSE(values[:,0], test_nuclear_protein)<36
This indicates that the root mean squared error difference between the predicted and actual recorded protein numbers is greater than 36.
What is the code doing before the error?
First let's modify the test code to plot the protein numbers from
simulate() vs. the protein numbers stored as test dataLook at the actual
simulate() function and understand what it does.Compare the mRNA and protein equations from the code with the equations in box 1. Are they the same equation? Which terms differ?
Do the parameters used in the model (for example, those used in _param_names) agree with the description of box 1 in the paper?
How can we resolve the error?
Let's change the simulation code to match the equations from the paper
If you inspect the parameters in the
params_003.json file# after `params` is loaded for key in params.keys(): print("{0}:{1}".format(key, params[key]))
you will notice that there are more parameter values than included in the _param_names list. Update the code to incorporate these values, according to the description of their usage in box 1 in the paper.
initial_ values refer to the initial values of m1-3 and p1-3 (where n_x_1 is x1, n_x_2 is x_2 and c_x is x3, where x is either m or p), and T_e refers to the translation efficiency of an mRNA molecule. You can assume that all the transcription factors act as monomers (which is relevant in the context of the K_m parameter). Furthermore, the differential equation model is only valid after using the non-dimensionalisation procedure outlined in the Elowitz paper. Non-dimensionalisation rescales the variables so that the equations have no physical units; it simplifies the maths and means the parameters (such as α) end up as dimnesionless values rather than quantities with awkward units. The paper describes how to do this; the key effect on the code is that time must be rescaled and α must be converted before being passed to the solver.The rapid oscillation is because the time units still haven't been non-dimensionalised
In the previous solution block we're out by a constant factor; the differential equation model expresses proteins in units of K_m, but the test data is just raw protein count
test_infer_parameters()
KeyError or AssertionError assert RMSE(sim[:,0], true_array[:,1])<36
If you've been following along with the previous solution block, you'll get a KeyError, as the
_param_names variable in RepressilatorModel is now incompatible - if you run the LLM code as-is, you'll get an assertion error. In the following solution blocks, I'm going to totally rewrite the code to make use of the PINTS (parameter inference for noisy timeseries) library and the CMA-ES algorithm, which is a robust gradient-free optimiser, because that's what I'm used to (and it can handle multiple columns). For any optimisation function, you will need to- Specify parameters, and the order with which they are fed into
simulate - Specify parameter bounds (i.e. the parameter space in which the optimiser will search)
- Specify the initial search point.
- Specify optimiser options
- Find some way of dealing with multiple output columns (as
simulatereturns the cytosolic and nuclear protein numbers)
I also usually include the following
- Random initialisation
- Multiple independent optimisation runs
Rewriting the inference code
The arguments to the function are as in the LLM implementation (although without optimiser choice)
def infer_parameters( times: np.ndarray, observations: np.ndarray, ):
and we setup the model in the same way
model = RepressilatorModel(times)
The next step is to define boundaries for each parameter. The actual simulation values are centred around the values given in the paper in the
docs folderWe then need to set up the optimiser
Wrapping up
Once you're satisfied that all tests pass:
python -m pytest ode_inference_test.py
Remove any diagnostic code you added to
ode_inference_test.py, then merge your changes back into master:git checkout master git merge ode_fix




