Applications and exercises

We are going to ask you to do a number of things now that will (hopefully) teach you something about the coalescent process and more about msprime.

You will have to look things up in the msprime documentation:

http://msprime.readthedocs.io

Demographic processes

We are going to look at what various demographic processes do to data.

We will first look at the effects on trees and then the effects on "data", by which we mean a genotype matrix of simulated SNP data.

Population growth

What is the effect of recent growth on genetic variation?

The model has one or, optionally, two parameters:

  1. The growth rate
  2. The time in the past when growth started
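
To make the two parameters concrete, here is a minimal sketch of the population size history they imply. It assumes the convention used by msprime's exponential model, where a population of present-day size N0 growing at rate alpha had size N0 * exp(-alpha * t) at t generations in the past, and it assumes the size was constant before growth started. The function and argument names are ours, not msprime's.

```python
import math


def population_size(t, present_size, growth_rate, growth_start=None):
    """Population size t generations in the past.

    Assumes exponential growth (forward in time) at `growth_rate` per
    generation. If `growth_start` is given, growth began that many
    generations ago and the size was constant before then.
    """
    if growth_start is not None and t > growth_start:
        # Older than the onset of growth: size is frozen at its value then.
        t = growth_start
    return present_size * math.exp(-growth_rate * t)


# Size history for growth at 1% per generation starting 100 generations ago.
for t in (0, 50, 100, 500):
    print(t, round(population_size(t, 10_000, 0.01, growth_start=100)))
```

Plotting this function for a few parameter values is a quick way to see which part of the tree each parameter can affect.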

Questions

What is the effect of varying these parameters on the following quantities:

  • Total time on the tree
  • TMRCA
  • The mean time leading to sample lineages
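
One possible way to get these three quantities from a single tree is sketched below. We read "the mean time leading to sample lineages" as the mean length of the terminal branches, which is an interpretation on our part. `tree_summaries` only assumes an object that behaves like a tskit `Tree` with a single root. `simulate_one_tree` is a hypothetical helper that assumes the legacy `msprime.simulate` interface and a version recent enough to provide `ts.first()`; check the manual for the version you have installed.

```python
def tree_summaries(tree):
    """Return (total branch length, TMRCA, mean terminal branch length).

    `tree` is assumed to expose the tskit Tree interface and to have
    exactly one root (true for a fully coalesced simulation).
    """
    total = tree.total_branch_length
    tmrca = tree.time(tree.root)
    # One terminal branch per sample node.
    terminal = [tree.branch_length(u) for u in tree.samples()]
    return total, tmrca, sum(terminal) / len(terminal)


def simulate_one_tree(sample_size=10, Ne=1000, growth_rate=0.0, seed=1):
    """Simulate one tree without recombination (legacy msprime API)."""
    import msprime  # imported here so tree_summaries can be used on its own

    config = msprime.PopulationConfiguration(
        sample_size=sample_size, initial_size=Ne, growth_rate=growth_rate
    )
    ts = msprime.simulate(population_configurations=[config], random_seed=seed)
    return ts.first()
```

A single tree is very noisy, so in practice you would call `tree_summaries(simulate_one_tree(...))` over many seeds and compare averages across parameter values.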

Hints

  • It is fine to simulate without recombination. With recombination, the solutions use the same logic, but require taking the weighted average of results over trees, where the weights are the relative segment lengths.
  • The functions you need are in the previous set of slides...
  • ...except for growth rates, which you need to find in the msprime manual.
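
The weighted average mentioned in the first hint can be sketched as follows. The function name is ours. It takes the genomic length covered by each tree and the per-tree value of whatever statistic you are interested in; with a tree sequence you would collect these while iterating over `ts.trees()` (recent tskit releases expose the length covered by a tree as `tree.span`, but check the name in your version).

```python
def span_weighted_mean(spans, values):
    """Average per-tree values, weighting each tree by the length it covers."""
    total_span = sum(spans)
    return sum(s * v for s, v in zip(spans, values)) / total_span


# Two trees: the first covers 1 unit of sequence, the second covers 3.
print(span_weighted_mean([1.0, 3.0], [10.0, 2.0]))  # (1*10 + 3*2) / 4
```

Dividing by the total span is what turns the absolute segment lengths into the relative weights.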

Population bottlenecks

  1. How do you simulate them?
  2. Does "simple stepwise" recovery differ appreciably from exponential growth?
  3. It is often claimed that two parameters of the bottleneck model (the reduction in $N_e$ and the duration) may be replaced with their ratio, representing a single "severity" parameter. How would you verify that claim, and does it hold?
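
As a starting point for the third question, here is a small analytical sketch rather than a simulation. It assumes a diploid population held at a constant size for the duration of the bottleneck, so that a pair of lineages coalesces at rate 1/(2N) per generation. Under that assumption, the probability that two lineages entering the bottleneck coalesce before leaving it depends only on the ratio of duration to size. The function name is ours.

```python
import math


def pairwise_coalescence_probability(duration, bottleneck_size):
    """P(two lineages coalesce within the bottleneck).

    Assumes a constant diploid size `bottleneck_size` for `duration`
    generations, i.e. a pairwise coalescence rate of 1 / (2N).
    """
    return 1.0 - math.exp(-duration / (2.0 * bottleneck_size))


# A short, sharp bottleneck and a long, mild one with the same ratio.
print(pairwise_coalescence_probability(10, 100))
print(pairwise_coalescence_probability(100, 1000))
```

This only covers a single pair of lineages. Whether whole-tree summaries and the simulated data also match for equal ratios is what the simulations are meant to show.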

Ancient samples

  1. How do you code them up?
  2. How do you compare them to "modern" samples when processing the output?
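
One possible approach to both questions is sketched below, assuming the legacy `msprime.simulate` interface, in which each sample can be given its own sampling time through `msprime.Sample`. `simulate_with_ancient` is a hypothetical helper with parameter values chosen only for illustration. `split_samples_by_age` only assumes an object that behaves like a tskit `Tree`, and separates samples by whether their node time is greater than zero.

```python
def simulate_with_ancient(n_modern=5, n_ancient=5, ancient_time=100,
                          Ne=1000, seed=1):
    """Simulate one tree with modern and ancient samples (legacy msprime API)."""
    import msprime  # imported here so split_samples_by_age works on its own

    samples = (
        [msprime.Sample(population=0, time=0)] * n_modern
        + [msprime.Sample(population=0, time=ancient_time)] * n_ancient
    )
    ts = msprime.simulate(samples=samples, Ne=Ne, random_seed=seed)
    return ts.first()


def split_samples_by_age(tree):
    """Return (modern, ancient) lists of sample node IDs.

    A sample counts as ancient if its node time is greater than zero.
    """
    modern, ancient = [], []
    for u in tree.samples():
        if tree.time(u) > 0:
            ancient.append(u)
        else:
            modern.append(u)
    return modern, ancient
```

With the two groups of node IDs in hand, per-sample quantities such as terminal branch lengths can be averaged separately for each group and compared.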