This question is a follow-up to this previous question.
Background
With this simulation, I am investigating a system in which enzymes replicate inside cells. During the replication of enzymes, parasites can arise by mutation. The parasites can drive the system to extinction. I am interested in where in parameter space coexistence is possible.
I have made the changes recommended by HoboProber, namely correcting the style and implementing the model with NumPy. The system is now a two-dimensional array. Each row of the array is a cell. The first column holds the number of enzymes in each cell and the second column holds the number of parasites.
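As a minimal sketch of that layout (the sizes and counts below are made-up illustration values, not my actual simulation parameters):

```python
import numpy as np

# Illustrative values only: 4 cells, each starting with 10 enzymes and 0 parasites.
population = np.zeros((4, 2), dtype=np.int64)
population[:, 0] = 10   # column 0: number of enzymes in each cell
population[1, 1] = 5    # column 1: number of parasites; cell 1 gets 5 of them

# Fitness of a cell = enzymes / (enzymes + parasites), computed for all cells at once.
fitness = population[:, 0] / population.sum(axis=1)
print(fitness)
```

Cells without parasites have fitness 1, and the cell with 10 enzymes and 5 parasites has fitness 10/15.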
My request
The new implementation is already much faster than the previous one, but I would like to increase population_size and gen_max further. Every little increase in performance counts.
So far I have studied the system with population sizes between 100 and 1000 cells and with a maximum of 10,000 generations. How far I can increase the population size depends on performance. One million cells would be a perfectly reasonable assumption for the modeled system. The maximum number of generations should be 20,000 to 30,000.
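For a sense of scale, here is a rough estimate of the memory the state array would need at the target size. This is only a sketch: it assumes 64-bit integer counts (I use `np.int64` explicitly here, because what `dtype=int` maps to depends on the platform and NumPy version), and it ignores the temporary arrays created in each step.

```python
import numpy as np

target_cells = 1_000_000  # the population size mentioned above

# Assumption: counts stored as 64-bit integers, two columns per cell.
state = np.zeros((target_cells, 2), dtype=np.int64)

state_mib = state.nbytes / 2**20
print(f"{state.nbytes} bytes = {state_mib:.1f} MiB")
```

The state array alone is 16,000,000 bytes, so I expect running time rather than the size of this array to be the limiting factor.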
- Does the code use vectorization and NumPy as effectively as possible in the first place? For example, should I pay attention to C ordering versus F ordering, and if so, where?
- If it does, could performance benefit from multithreading or multiprocessing? For example, when the molecules proliferate, the replication events in different cells are independent of each other, so they could possibly happen in parallel.
- Could performance benefit from using static typing and compilation? For example, with Cython or Numba.
Of course, any other advice is greatly appreciated! (For example, on writing data to a file more efficiently.)
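To make the ordering question concrete, here is a minimal sketch of the two memory layouts I mean. The names `n_cells`, `pop_c` and `pop_f` are made up for this illustration, and I have not measured which layout is faster for my code.

```python
import numpy as np

n_cells = 1000  # illustrative size only

# NumPy's default is C (row-major) order: the two counts of one cell are adjacent in memory.
pop_c = np.zeros((n_cells, 2), dtype=np.int64)

# F (column-major) order: each column (e.g. all enzyme counts) is contiguous instead.
pop_f = np.asfortranarray(pop_c)

# Strides are the byte steps taken when moving one index along each axis.
print(pop_c.flags["C_CONTIGUOUS"], pop_c.strides)
print(pop_f.flags["F_CONTIGUOUS"], pop_f.strides)
```

With C order, `population[:, 0]` is a strided view that skips over the parasite counts. With F order it is one contiguous run, but a single row is then split across memory.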
The code
```python
# -*- coding: utf-8 -*-
"""
Collect data on an enzyme-parasite system that explicitly assumes compartmentalization.

Functions
---------
simulation()
    Simulate the mentioned system.
write_out_file()
    Write data to the CSV output file.
"""
import csv
import time

import numpy as np


def simulation(population_size, cell_size, replication_rate_p, mutation_rate, gen_max):
    """
    Simulate an enzyme-parasite system that explicitly assumes compartmentalization.

    Parameters
    ----------
    population_size : int
        The number of cells.
    cell_size : int
        The maximum number of replicators in a cell, at which cell division occurs.
    replication_rate_p : int or float
        The fitness (replication rate) of the parasites
        relative to the fitness (replication rate) of the enzymes.
        Example
        -------
        $ replication_rate_p = 2
        This means that the fitness of the parasites is twice that of the enzymes.
    mutation_rate : int or float
        The probability of a mutation during a replication event.
    gen_max : int
        The maximum number of generations.
        One generation corresponds to one cycle of the outer while loop.
        If the system goes extinct, the number of generations does not reach gen_max.

    Yields
    ------
    generator object
        Contains data about the simulated system.
    """

    def fitness(population):
        """
        Calculate the fitness of cells.

        Fitness of a cell = number of enzymes / (number of enzymes + number of parasites)

        Parameters
        ----------
        population : ndarray
            The system itself.

        Returns
        -------
        ndarray
            The fitness of every cell of the system.
        """
        return population[:, 0]/population.sum(axis=1)

    def population_stats(population):
        """
        Calculate the statistics of the system.

        Parameters
        ----------
        population : ndarray
            The system itself.

        Returns
        -------
        tuple
            Contains statistics of the simulated system.
        """
        gyak_sums = population.sum(axis=0)
        gyak_means = population.mean(axis=0)
        gyak_variances = population.var(axis=0)
        gyak_percentiles_25 = np.percentile(population, 25, axis=0)
        gyak_medians = np.median(population, axis=0)
        gyak_percentiles_75 = np.percentile(population, 75, axis=0)
        fitness_list = fitness(population)
        return (
            gyak_sums[0], gyak_sums[1], (population[:, 0] > 1).sum(),
            gyak_means[0], gyak_variances[0],
            gyak_percentiles_25[0], gyak_medians[0], gyak_percentiles_75[0],
            gyak_means[1], gyak_variances[1],
            gyak_percentiles_25[1], gyak_medians[1], gyak_percentiles_75[1],
            fitness_list.mean(), fitness_list.var(),
            np.percentile(fitness_list, 25),
            np.median(fitness_list),
            np.percentile(fitness_list, 75)
            )

    # Creating the system with the initial state:
    # half-full cells that contain only enzymes.
    population = np.zeros((population_size, 2), dtype=int)
    population[:, 0] = int(cell_size//2)
    gen = 0
    yield (gen, *population_stats(population), population_size,
           cell_size, mutation_rate, replication_rate_p, "aft")
    print(f"N = {population_size}, rMax = {cell_size}, "
          f"aP = {replication_rate_p}, U = {mutation_rate}")

    while population.size > 0 and gen < gen_max:
        gen += 1

        # Replicator proliferation until cell_size in each cell.
        while np.any(population.sum(axis=1) < cell_size):

            # Calculate the probability of selecting a parasite for replication.
            repl_probs_p = population[population.sum(axis=1) < cell_size].copy()
            repl_probs_p[:, 1] *= replication_rate_p
            repl_probs_p = repl_probs_p[:, 1]/repl_probs_p.sum(axis=1)

            # Determine whether an enzyme or a parasite replicates,
            # and if an enzyme replicates, whether it mutates into a parasite.
            # (The outcome can differ among cells. Parasites do not mutate.)
            repl_choices = np.random.random_sample(repl_probs_p.shape[0])
            mut_choices = np.random.random_sample(repl_probs_p.shape[0])
            lucky_replicators = np.zeros(repl_probs_p.shape[0], dtype=int)
            lucky_replicators[
                (repl_choices < repl_probs_p) | (mut_choices < mutation_rate)
                ] = 1
            population[population.sum(axis=1) < cell_size, lucky_replicators] += 1

        if gen % 100 == 0:
            yield (gen, *population_stats(population), population_size,
                   cell_size, mutation_rate, replication_rate_p, "bef")

        # Each cell divides.
        new_population = np.empty_like(population)
        new_population[:, 0] = np.random.binomial(population[:, 0], 0.5)
        new_population[:, 1] = np.random.binomial(population[:, 1], 0.5)
        population -= new_population

        # Discarding dead cells.
        population = np.concatenate([population[population[:, 0] > 1, :],
                                     new_population[new_population[:, 0] > 1, :]])

        # Select surviving cells according to their fitness
        # if there are more viable cells than population_size.
        # Hence population_size or fewer cells move on to the next generation.
        if (population.size > 0) & (population.shape[0] > population_size):
            fitness_list = fitness(population)
            fitness_list = fitness_list/fitness_list.sum()
            population = population[np.random.choice(population.shape[0],
                                                     population_size,
                                                     replace=False,
                                                     p=fitness_list), :]
        elif population.size == 0:
            for i in range(2):
                yield (gen+i, *(0, 0)*9, population_size,
                       cell_size, mutation_rate, replication_rate_p, "aft")
            print(f"{gen} generations are done. Cells are extinct.")

        if (gen % 100 == 0) & (population.size > 0):
            yield (gen, *population_stats(population), population_size,
                   cell_size, mutation_rate, replication_rate_p, "aft")

        if (gen % 1000 == 0) & (population.size > 0):
            print(f"{gen} generations are done.")


def write_out_file(result, n_run):
    """
    Write data to the CSV output file.

    Parameters
    ----------
    result : generator object or list of generator objects
        Contains data about the simulated system.
    n_run : int
        The number of consecutive runs.
    """
    local_time = time.strftime("%m_%d_%H_%M_%S_%Y", time.localtime(time.time()))
    with open("output_data_" + local_time + ".csv", "w", newline="") as out_file:
        out_file.write(
            "gen;"
            "eSzamSum;pSzamSum;alive;"
            "eSzamAtl;eSzamVar;eSzamAKv;eSzamMed;eSzamFKv;"
            "pSzamAtl;pSzamVar;pSzamAKv;pSzamMed;pSzamFKv;"
            "fitAtl;fitVar;fitAKv;fitMed;fitFKv;"
            "N;rMax;U;aP;boaSplit\n"
            )
        out_file = csv.writer(out_file, delimiter=";")
        counter = 0
        print(counter, "/", n_run)
        for i in result:
            out_file.writerows(i)
            counter += 1
            print(counter, "/", n_run)


RESULT = [simulation(100, 20, 1, 0, 10000)]
RESULT.append(simulation(100, 20, 1, 1, 10000))
N_RUN = 2
write_out_file(RESULT, N_RUN)
# I usually call the functions from another script;
# these last 4 lines are for example only.
```