Both empirical and theoretical population genetics are increasingly dependent on evolutionary simulations. How did historical processes lead to the patterns of genetic variation observed in your data set? How do selection, recombination, and drift interact to shape the genome during population divergence? These and many other questions that require statistical model testing or that would be analytically intractable can benefit from the judicious application of tools designed to model evolutionary processes in silico. As a result, our conclusions are increasingly dependent on the strength of these tools, how we apply them, and the assumptions they make. Using programs that emphasize flexibility and reproducibility give us the best shot at minimizing the potential pitfalls to this approach.
SLiM 2 (an acronym for “Selection on Linked Mutations) is a relatively new and unusually buzzed-about framework that achieves all of these goals. Though SLiM 1.0 was launched 2013, its successor is less an update than what the developers refer to as a ground-up rewrite. The primary feature that distinguishes it from its own ancestor and from most (but not all; see simuPOP) other simulators out there is its scriptabilty. What this means is that you interact with SLiM 2 through a novel, integrated programming language called Eidos. By design, its syntax has strong similarities to R
and Python
, which most users will probably be familiar with.
Eidos’ scriptability dramatically increases the breadth of cases evolutionary simulations can be useful for. Using Eidos, you aren’t restricted to a set of arbitrary pre-existing scenarios (e.g. isolation with migration, or exponential population growth after a bottleneck), and you can draw on powerful statements like for loops and conditionals. It requires a bit more thought than entering values in a params files, sure. But you also don’t need to know
C++
or another high level programming language. A simple example from SLiM’s readable, extensive manual of how to test for Hill-Robertson interference (the phenomenon of competing beneficial mutations reducing each other’s fixation probabilities below theoretical expectations) does a nice job highlighting its strengths:
initialize() {
initializeMutationRate(1e-6);
initializeMutationType("m1", 0.5, "f", 0.05);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
Here, we initialize a suite of basic simulation parameters, including a constant mutation rate (1e-6
) for mutation type "m1"
, a genomic element type to draw on the mutation type at a given frequency, a specific genomic element of that type and of a fixed length (1000000
sites), and fixed recombination rate (1e-8
).
1 {
sim.addSubpop("p1", 1000);
}
Next, we add a subpopulation ("p1"
) of size 1000.
6000 late() {
// Calculate the fixation probability for a beneficial mutation
s = 0.05;
N = 1000;
p_fix = (1 - exp(-2 * s)) / (1 - exp(-4 * N * s));
// Calculate the expected number of fixed mutations
n_gens = 1000; // first 5000 generations were burn-in
mu = 1e-6;
locus_size = 100000;
expected = mu * locus_size * n_gens * 2 * N * p_fix;
// Figure out the actual number of fixations after burn-in
subs = sim.substitutions;
actual = sum(subs.fixationGeneration >= 5000);
At generation 6000
, we’ve run the simulation for 5000 steps of burnin and an additional 1000 steps of our “real” experimental run. We can then use Eidos to do some basic math, and calculate: 1) the fixation probability (p_fix
) of a beneficial mutation given our selection coefficient for m1 and our population size; 2) the expected number of fixed mutations given our probability, our mutation rate, the length of our locus, and the number of generations the simulation has been running; and 3) sum the actual number of fixed mutations following the burnin period.
// Print a summary of our findings
cat("P(fix) = " + p_fix + "\n");
cat("Expected fixations: " + expected + "\n");
cat("Actual fixations: " + actual + "\n");
cat("Ratio, actual/expected: " + (actual/expected) + "\n");
}
P(fix) = 0.0951626
Expected fixations: 19032.5
Actual fixations: 331
Ratio, actual/expected: 0.0173913
Finally, we print out our summary using the cat()
command, and find beneficial mutations do appear to be interfering with each other.
SLiM downloads with a useful GUI / IDE, similar to RStudio. Though you can run the program from your command line as well, this provides multiple advantages: real time visualizations the fitness of all individuals and mutations along chromosomes, plotting of model design and summary statistic output, and scripting help. Even if you plan on implementing your final runs on a high performance computing cluster, the GUI dramatically eases troubleshooting and experimental design. (Unfortunately, it’s currently only available for Mac OS X.)
What’s underneath the hood? SLiM works in forward-time (as opposed to backwards-time, e.g. using coalescent theory), simulating the the entire history of all individuals from the past to the present. Its base assumptions are more or less the Wright-Fisher model you’re familiar with: diploid individuals with nonoverlapping generations, whose probability of parentage is proportional to fitness, and whose offspring are generated by recombinding parental chromosomes and adding mutations. SLiM 2 is genetically explicit in that it models mutations in distinct genetic regions. It does not, however, model nucleotides themselves, or the probability they mutate into one another (though 4 allele states and backmutation can be approximated) — so if that’s your goal, it may be better to look elsewhere. For most other applications, though, it’s hard to imagine a more powerful simulator.
References:
Haller, B.C., Messer, P.W. 2016. SLiM 2: flexible, interactive forward genetic simulations. Molecular Biology and Evolution, 34(1), pp.230-240. DOI: 10.1093/molbev/msw211
Hill, W.G., Robertson, A. 1966. The effect of linkage on limits to artificial selection. Genetics Research, 8(3), pp.269-294.
Messer, P.W. 2013. SLiM: simulating evolution with selection and linkage. Genetics, 194(4), pp.1037-1039. DOI: 10.1534/genetics.113.152181