Classes and functions related to simulating data under coalescent models
[Molecular Population Genetics]

Classes

class  Sequence::SimData
 Data from coalescent simulations. More...
class  Sequence::SimParams
 Parameters for Hudson's simulation program. More...
struct  Sequence::segment
 A portion of a recombining chromosome. More...
struct  Sequence::chromosome
 A chromosome is a container of segments. More...
struct  Sequence::node
 A point on a marginal tree at which a coalescent event occurs. More...
struct  Sequence::marginal
 The genealogy of a portion of a chromosome on which no recombination has occurred. More...
class  Sequence::newick_stream_marginal_tree
 Class that provides a typecast-on-output of a marginal tree to a newick tree Example use:. More...

Typedefs

typedef std::pair< std::vector
< double >, std::vector
< std::string > > 
Sequence::gamete_storage_type
 an object to store simulated gametes An object of this type will tend to exist in the calling environment of your program. If you are simulating a sample of n chromosomes, you would initialize the object as follows:
typedef std::list< marginal > Sequence::arg
 Ancestral Recombination Graph.

Functions

bool Sequence::isseg (chromosome::const_iterator seg, const int nsegs, const int pos, int *offset)
 ask if a chromosome beginning at seg and containing nsegs contains a segment containing the position pos
int Sequence::coalesce (const double &time, const int &ttl_nsam, const int &current_nsam, const int &c1, const int &c2, const int &nsites, int *nlinks, std::vector< chromosome > *sample, arg *sample_history)
 Common ancestor routine for coalescent simulation. Merges chromosome segments and updates marginal trees.
int Sequence::sample_length (const std::vector< std::pair< int, int > > &fragments)
 When simulating partially linked regions, return the total length of sample material that we are simulating.
int Sequence::total_length (const std::vector< std::pair< int, int > > &fragments)
 When simulating partially linked regions, return the total length of the region.
void Sequence::calculate_scales (const std::vector< std::pair< int, int > > &fragments, std::vector< std::pair< double, double > > *sample_scale, std::vector< std::pair< double, double > > *mutation_scale)
 This is a helper function that rescales physical distance in base pairs to continuous distance on the interval 0,1.
void Sequence::rescale_mutation_positions (SimData *d, const std::vector< std::pair< double, double > > &sample_scale, const std::vector< std::pair< double, double > > &mutation_scale)
 Rescales the positions of the mutations in d from the scale given in sample_scale to that given in mutation_scale.
void Sequence::rescale_arg (arg *sample_history, const std::vector< std::pair< int, int > > &fragments)
 Rescales the beginnings of marginal trees in an ancestral recombination graph from a genetic scale to a physical scale.
double Sequence::integrate_genetic_map (const std::vector< chromosome > &sample, const int &current_nsam, const std::vector< double > &genetic_map, std::vector< double > *reclens)
 When simulating non-uniform recombination rates, the probability of recombination at each point in the simulation needs to be obtained by integrating over the genetic map and the current sample configuration. This function does that.
std::vector< chromosome > Sequence::init_sample (const std::vector< int > &pop_config, const int &nsites)
 A simple function to initialize a sample of chromosomes.
marginal Sequence::init_marginal (const int &nsam)
 Simple function to initialize a marginal tree.
std::pair< int, int > Sequence::pick_uniform_spot (const double &random_01, const int &nlinks, std::vector< chromosome >::const_iterator sample_begin, const unsigned &current_nsam)
 Pick a crossover point for the model where recombination rates are constant across a recion. Picks a positions uniformly amongst all chromosomes at which a recombination event will occur.
int Sequence::crossover (const int &current_nsam, const int &chromo, const int &pos, std::vector< chromosome > *sample, arg *sample_history)
 Recombination function.
double Sequence::total_time (const marginal::const_iterator beg, const int &nsam)
 Calculate total time on a marginal tree.
int Sequence::pick_branch (marginal::const_iterator beg, const int &nsam, const double &rtime)
 pick a random branch of a marginal tree
std::vector< int > Sequence::get_all_descendants (marginal::const_iterator beg, const int &nsam, const int &branch)
 Find all the descendants of a branch on a marginal tree.
bool Sequence::is_descendant (marginal::const_iterator beg, const int &ind, const int &branch)
 Ask if a tip of a tree is a descendant of a particular branch.
double Sequence::total_time_on_arg (const Sequence::arg &sample_history, const int &total_number_of_sites)
 Returns the total time on an ancestral recombination graph.
template<typename uniform_generator , typename uniform01_generator , typename exponential_generator , typename poisson_generator >
Sequence::SimData Sequence::neutral_sample (uniform_generator &uni, uniform01_generator &uni01, exponential_generator &expo, poisson_generator &poiss, const double &theta, const double &rho, const int &nsites, const int &nsam, std::vector< chromosome > *sample, arg *sample_history, unsigned *max_chromosomes=NULL, const unsigned &max_chromosomes_inc=0)
 A simple function to generate samples under a neutral equilibrium model.
template<typename uniform_generator >
std::pair< int, int > Sequence::pick2_in_deme (const uniform_generator &uni, const std::vector< Sequence::chromosome > &sample, const int &current_nsam, const int &deme_nsam, const int &deme)
 Choose two random chromosomes from the same deme.
template<typename uniform_generator >
std::pair< int, int > Sequence::pick2 (uniform_generator &uni, const int &nsam)
template<typename uniform_generator >
std::pair< int, int > Sequence::pick2 (const uniform_generator &uni, const int &nsam)
template<typename uniform_generator , typename uniform01_generator , typename exponential_generator >
arg Sequence::bottleneck (uniform_generator &uni, uniform01_generator &uni01, exponential_generator &expo, const std::vector< chromosome > &initialized_sample, const marginal &initialized_marginal, const double &tr, const double &d, const double &f, const double &rho=0., const bool &exponential_recovery=false, const double &recovered_size=1.)
 Coalescent simulation of a population bottleneck Simulate a single, bottlenecked, population according to the Wright-Fisher model without selection. The population can recover from the bottleneck either instantaneously ("stepwise bottleneck"), or according to an exponential growth model. For the case of a stepwise bottleneck, this function is equivalent to the following options in Dick Hudson's program "ms": -eN 0 recovered_size -eN tr f -eN (tr+d) 1. For the case where recovery from the bottleneck is by exponential growth, the equivalent "ms" options are: -eN 0 recovered_size -eG tr (log(recovered_size)-log(f))/d -eG (tr+d) 0 -eN (tr+d) 1.
template<typename uniform_generator , typename uniform01_generator , typename exponential_generator >
arg Sequence::bottleneck (const uniform_generator &uni, const uniform01_generator &uni01, const exponential_generator &expo, const std::vector< chromosome > &initialized_sample, const marginal &initialized_marginal, const double &tr, const double &d, const double &f, const double &rho=0., const bool &exponential_recovery=false, const double &recovered_size=1.)
 Coalescent simulation of a population bottleneck Simulate a single, bottlenecked, population according to the Wright-Fisher model without selection. The population can recover from the bottleneck either instantaneously ("stepwise bottleneck"), or according to an exponential growth model. For the case of a stepwise bottleneck, this function is equivalent to the following options in Dick Hudson's program "ms": -eN 0 recovered_size -eN tr f -eN (tr+d) 1. For the case where recovery from the bottleneck is by exponential growth, the equivalent "ms" options are: -eN 0 recovered_size -eG tr (log(recovered_size)-log(f))/d -eG (tr+d) 0 -eN (tr+d) 1.
template<typename uniform_generator , typename uniform01_generator , typename exponential_generator >
arg Sequence::exponential_change (uniform_generator &uni, uniform01_generator &uni01, exponential_generator &expo, const std::vector< chromosome > &initialized_sample, const marginal &initialized_marginal, const double &G, const double &t_begin, const double &t_end, const double &rho=0., const double &size_at_end=-1)
 Coalescent simulation of exponential change in population size Simulate a single population whose size changes exponentially during some period of time. The relevant command line options for Hudson's program "ms" would be: -eG t_begin G -eG t_end 0. -eN t_end size_at_end.
template<typename uniform_generator , typename uniform01_generator , typename exponential_generator >
arg Sequence::exponential_change (const uniform_generator &uni, const uniform01_generator &uni01, const exponential_generator &expo, const std::vector< chromosome > &initialized_sample, const marginal &initialized_marginal, const double &G, const double &t_begin, const double &t_end, const double &rho=0., const double &size_at_end=-1)
 Coalescent simulation of exponential change in population size Simulate a single population whose size changes exponentially during some period of time. The relevant command line options for Hudson's program "ms" would be: -eG t_begin -eG t_end 0. -eN t_end size_at_end.
template<typename uniform_generator >
void Sequence::add_S_inf_sites (uniform_generator &uni, marginal::const_iterator history, const double &tt, const int &beg, const int &end, const int &nsam, const int &nsites, const int &S, const int &first_snp_index, gamete_storage_type *gametes)
 Add S segregating sites to sample with a particular marginal history, according to the infinitely-many sites model.
template<typename uniform_generator >
void Sequence::add_S_inf_sites (const uniform_generator &uni, marginal::const_iterator history, const double &tt, const int &beg, const int &end, const int &nsam, const int &nsites, const int &S, const int &first_snp_index, gamete_storage_type *gametes)
 Add S segregating sites to sample with a particular marginal history, according to the infinitely-many sites model.
template<typename poisson_generator , typename uniform_generator >
int Sequence::infinite_sites (poisson_generator &poiss, uniform_generator &uni, gamete_storage_type *gametes, const int &nsites, const arg &history, const double &theta)
 Apply the infinitely-many sites mutation model to an ancetral recombination graph.
template<typename poisson_generator , typename uniform_generator >
int Sequence::infinite_sites (const poisson_generator &poiss, const uniform_generator &uni, gamete_storage_type *gametes, const int &nsites, const arg &history, const double &theta)
 Apply the infinitely-many sites mutation model to an ancetral recombination graph.
template<typename uniform_generator >
int Sequence::infinite_sites (uniform_generator &uni, gamete_storage_type *gametes, const int &nsites, const arg &history, const double *total_times, const unsigned *segsites)
 Apply the infinitely-many sites mutation model to an ancetral recombination graph with a fixed number of segregating sites.
template<typename uniform_generator >
int Sequence::infinite_sites (const uniform_generator &uni, gamete_storage_type *gametes, const int &nsites, const arg &history, const double *total_times, const unsigned *segsites)
 Apply the infinitely-many sites mutation model to an ancetral recombination graph with a fixed number of segregating sites.
template<typename poisson_generator , typename uniform_generator >
SimData Sequence::infinite_sites_sim_data (poisson_generator &poiss, uniform_generator &uni, const int &nsites, const arg &history, const double &theta)
 Apply the infinitely-many sites mutation model to an ancetral recombination graph.
template<typename poisson_generator , typename uniform_generator >
SimData Sequence::infinite_sites_sim_data (const poisson_generator &poiss, const uniform_generator &uni, const int &nsites, const arg &history, const double &theta)
 Apply the infinitely-many sites mutation model to an ancetral recombination graph.
template<typename uniform_generator >
SimData Sequence::infinite_sites_sim_data (uniform_generator &uni, const int &nsites, const arg &history, const double *total_times, const unsigned *segsites)
 Apply the infinitely-many sites mutation model to an ancetral recombination graph with a fixed number of segregating sites.
template<typename uniform_generator >
SimData Sequence::infinite_sites_sim_data (const uniform_generator &uni, const int &nsites, const arg &history, const double *total_times, const unsigned *segsites)
 Apply the infinitely-many sites mutation model to an ancetral recombination graph with a fixed number of segregating sites.
template<typename uniform01_generator >
std::pair< int, int > Sequence::pick_spot (uniform01_generator &uni01, const double &total_reclen, const std::vector< double > &reclens, std::vector< chromosome >::const_iterator sample_begin, const unsigned &current_nsam, const double *rec_map)
template<typename uniform01_generator >
std::pair< int, int > Sequence::pick_spot (const uniform01_generator &uni01, const double &total_reclen, const std::vector< double > &reclens, std::vector< chromosome >::const_iterator sample_begin, const unsigned &current_nsam, const double *rec_map)

Variables

int Sequence::MAX_SEGSITES
 controls allocation of simulated gametes You must define this in namespace Sequence in your program. A value of 200 works well.
int Sequence::MAX_SEGS_INC
 controls (re)allocation of simulated gametes You must define this in namespace Sequence in your program. A value of 100 works well

Typedef Documentation

typedef std::list<marginal> Sequence::arg

Ancestral Recombination Graph.

An arg is an "ancestral recombination graph", which is a linked list of marginal histories.

Note:
The implementation of the crossover function ensures that the marginal trees are sorted in ascending order determined by marginal::beg
Examples:
bottleneck.cc, fragments.cc, freerec.cc, ms--.cc, and msbeta.cc.

Definition at line 215 of file SimTypes.hpp.

typedef std::pair< std::vector<double>, std::vector<std::string> > Sequence::gamete_storage_type

an object to store simulated gametes An object of this type will tend to exist in the calling environment of your program. If you are simulating a sample of n chromosomes, you would initialize the object as follows:

    gamete_storage_type gamete_bucket( std::vector<double>(MAX_SEGSITES,0.),
    std::vector< std::string >(n,std::string(MAX_SEGSITES,'0')) );
Examples:
freerec.cc, and msbeta.cc.

Definition at line 40 of file Mutation.hpp.


Function Documentation

template<typename uniform_generator >
void Sequence::add_S_inf_sites ( const uniform_generator &  uni,
marginal::const_iterator  history,
const double &  tt,
const int &  beg,
const int &  end,
const int &  nsam,
const int &  nsites,
const int &  S,
const int &  first_snp_index,
gamete_storage_type *  gametes 
) [inline]

Add S segregating sites to sample with a particular marginal history, according to the infinitely-many sites model.

This routine places a fixed number of segregating sites on a marginal history that begins at position beg and ends at position end-1. Mutations are assigned positions randomly on the continuous, half-open interval [ beg/L,end/L ), where L = nsites-1.

Parameters:
uni a uniform random number generator that takes two doubles as arguments
history the history onto which mutations will be placed
tt the total time on history
beg the first site in the history (0 <= beg < nsites)
end the last site in the history ( beg < end < nsites )
nsam the total sample size being simulated
nsites the number of mutational sites simulated
S the number of mutations to drop on history
first_snp_index the index at which mutations will begin to be added into gametes
gametes a container in which you are storing the mutations. Must be allocated in the calling environment.

Definition at line 264 of file Mutation.tcc.

template<typename uniform_generator >
void Sequence::add_S_inf_sites ( uniform_generator &  uni,
marginal::const_iterator  history,
const double &  tt,
const int &  beg,
const int &  end,
const int &  nsam,
const int &  nsites,
const int &  S,
const int &  first_snp_index,
gamete_storage_type *  gametes 
) [inline]

Add S segregating sites to sample with a particular marginal history, according to the infinitely-many sites model.

This routine places a fixed number of segregating sites on a marginal history that begins at position beg and ends at position end-1. Mutations are assigned positions randomly on the continuous, half-open interval [ beg/L,end/L ), where L = nsites-1.

Parameters:
uni a uniform random number generator that takes two doubles as arguments
history the history onto which mutations will be placed
tt the total time on history
beg the first site in the history (0 <= beg < nsites)
end the last site in the history ( beg < end < nsites )
nsam the total sample size being simulated
nsites the number of mutational sites simulated
S the number of mutations to drop on history
first_snp_index the index at which mutations will begin to be added into gametes
gametes a container in which you are storing the mutations. Must be allocated in the calling environment.

Definition at line 228 of file Mutation.tcc.

template<typename uniform_generator , typename uniform01_generator , typename exponential_generator >
arg Sequence::bottleneck ( const uniform_generator &  uni,
const uniform01_generator &  uni01,
const exponential_generator &  expo,
const std::vector< chromosome > &  initialized_sample,
const marginal &  initialized_marginal,
const double &  tr,
const double &  d,
const double &  f,
const double &  rho = 0.,
const bool &  exponential_recovery = false,
const double &  recovered_size = 1. 
) [inline]

Coalescent simulation of a population bottleneck Simulate a single, bottlenecked, population according to the Wright-Fisher model without selection. The population can recover from the bottleneck either instantaneously ("stepwise bottleneck"), or according to an exponential growth model. For the case of a stepwise bottleneck, this function is equivalent to the following options in Dick Hudson's program "ms": -eN 0 recovered_size -eN tr f -eN (tr+d) 1. For the case where recovery from the bottleneck is by exponential growth, the equivalent "ms" options are: -eN 0 recovered_size -eG tr (log(recovered_size)-log(f))/d -eG (tr+d) 0 -eN (tr+d) 1.

Parameters:
uni A binary function object (or equivalent) that returns a random deviate between a and b such that a <= x < b. a and b are the arguments to operator() of uni
uni01 A function object (or equivalent) whose operator() takes no arguments and returns a random deviate 0 <= x < 1.
expo A unary function object whose operator() takes the mean of an exponential process as an argument and returns a deviate from an exponential distribution with that mean
initialized_sample An initialized vector of chromosomes for a single population. For example, this may be the return value of init_sample. This object is used to copy-construct a non-const sample for the simulation
initialized_marginal An initialized marginal tree of the appropriate sample size for the simulation. For example, the return value of init_marginal.
tr The time at which the population recovers from the bottleneck. In units of 4N0 generations, where N0 is the effective size before the bottleneck.
d The duration of the bottleneck, in units of 4N0 generations, where N0 is the effective size before the bottleneck.
f Bottleneck severity. Define Nb as the effective size during the bottleneck, and N0 the effective size prior to the bottleneck. f=Nb/N0.
rho The population recombination rate 4N0r. The number of "sites" simulated is not neccesary, as it can be obtained from initialized_sample[0].last()+1.
exponential_recovery If true, the population recovers from the bottleneck according to an exponential growth model. If false, a stepwise bottleneck is assumed.
recovered_size If 1, the population recovers to N0 at time tr. If 0.5, the population recovers to 1/2 the pre-bottleneck size, etc.
Returns:
The ancestral recombination graph (arg) describing the sample history.
Precondition:
d>0 and tr>0 and f>0 and rho>=0 and recovered_size>0 and initialized_marginal.nsam == initialized_sample.size()
Note:
Preconditions are checked by the assert macro, and are therefore disabled when compiling with -DNDEBUG. No checks or warnings are otherwised performed nor given.

Definition at line 264 of file DemographicModels.tcc.

template<typename uniform_generator , typename uniform01_generator , typename exponential_generator >
arg Sequence::bottleneck ( uniform_generator &  uni,
uniform01_generator &  uni01,
exponential_generator &  expo,
const std::vector< chromosome > &  initialized_sample,
const marginal &  initialized_marginal,
const double &  tr,
const double &  d,
const double &  f,
const double &  rho = 0.,
const bool &  exponential_recovery = false,
const double &  recovered_size = 1. 
) [inline]

Coalescent simulation of a population bottleneck Simulate a single, bottlenecked, population according to the Wright-Fisher model without selection. The population can recover from the bottleneck either instantaneously ("stepwise bottleneck"), or according to an exponential growth model. For the case of a stepwise bottleneck, this function is equivalent to the following options in Dick Hudson's program "ms": -eN 0 recovered_size -eN tr f -eN (tr+d) 1. For the case where recovery from the bottleneck is by exponential growth, the equivalent "ms" options are: -eN 0 recovered_size -eG tr (log(recovered_size)-log(f))/d -eG (tr+d) 0 -eN (tr+d) 1.

Parameters:
uni A binary function object (or equivalent) that returns a random deviate between a and b such that a <= x < b. a and b are the arguments to operator() of uni
uni01 A function object (or equivalent) whose operator() takes no arguments and returns a random deviate 0 <= x < 1.
expo A unary function object whose operator() takes the mean of an exponential process as an argument and returns a deviate from an exponential distribution with that mean
initialized_sample An initialized vector of chromosomes for a single population. For example, this may be the return value of init_sample. This object is used to copy-construct a non-const sample for the simulation
initialized_marginal An initialized marginal tree of the appropriate sample size for the simulation. For example, the return value of init_marginal.
tr The time at which the population recovers from the bottleneck. In units of 4N0 generations, where N0 is the effective size before the bottleneck.
d The duration of the bottleneck, in units of 4N0 generations, where N0 is the effective size before the bottleneck.
f Bottleneck severity. Define Nb as the effective size during the bottleneck, and N0 the effective size prior to the bottleneck. f=Nb/N0.
rho The population recombination rate 4N0r. The number of "sites" simulated is not neccesary, as it can be obtained from initialized_sample[0].last()+1.
exponential_recovery If true, the population recovers from the bottleneck according to an exponential growth model. If false, a stepwise bottleneck is assumed.
recovered_size If 1, the population recovers to N0 at time tr. If 0.5, the population recovers to 1/2 the pre-bottleneck size, etc.
Returns:
The ancestral recombination graph (arg) describing the sample history.
Precondition:
d>0 and tr>0 and f>0 and rho>=0 and recovered_size>0 and initialized_marginal.nsam == initialized_sample.size()
Note:
Preconditions are checked by the assert macro, and are therefore disabled when compiling with -DNDEBUG. No checks or warnings are otherwised performed nor given.
Examples:
bottleneck.cc.

Definition at line 227 of file DemographicModels.tcc.

void Sequence::calculate_scales ( const std::vector< std::pair< int, int > > &  fragments,
std::vector< std::pair< double, double > > *  sample_scale,
std::vector< std::pair< double, double > > *  mutation_scale 
)

This is a helper function that rescales physical distance in base pairs to continuous distance on the interval 0,1.

Parameters:
fragments A vector of pairs, representing physical distance in bp. For each pair, the first element is the distance to the next fragment, and the second element is the length of the fragment. For example, two 1kb fragments separated by 10kb would be represented by the pairs (0,1000) (10000,1000).
sample_scale This vector will be filled with values representing the positions of the fragments on the continuous interval, without any space betwen them. This is because we will actually do the simulation using a non-uniform genetic map to represent the high recombination rates between fragments
mutation_scale This is a direct mapping of the data contained in fragments to the continuous scale, and can be used to rescale the positions of mutations
Examples:
fragments.cc.
int Sequence::coalesce ( const double &  time,
const int &  ttl_nsam,
const int &  current_nsam,
const int &  c1,
const int &  c2,
const int &  nsites,
int *  nlinks,
std::vector< chromosome > *  sample,
arg *  sample_history 
)

Common ancestor routine for coalescent simulation. Merges chromosome segments and updates marginal trees.

Common ancestor routine for coalescent simulation. This routine performs the merging of two lineages by a coalescent event. Such merges usually require two sorts of operations. The first is an update to the segments contained in a chromosome, and the second is an update of the nodes on a marginal tree.

Parameters:
time the time at which the coalecent event is occuring
ttl_nsam the total sample size being simulated
current_nsam the current sample size in the simulation
c1 the array index of the first chromosome involved in the coalescent event
c2 the array index of the second chromosome involved in the coalescent event
nsites the total mutational length of the region begin simulated. In the language of Hudson (1983), this is the number of infinitely-many-alleles loci in the simulation.
nlinks a pointer to the number of "links" currently in the simulation. A link is the region between two sites, such that a chromosome currently with k sites has k-1 links
sample a pointer to the vector of chromosomes which makes up the sample
sample_history a pointer to the ancestral recombination graph
Returns:
the decrease in current_nsam due to the coalescent event. Usually, the return value is 1. Sometimes, however, it is two, when the two chromosomes being merged have no ancestral material on the same marginal tree.
Examples:
fragments.cc, freerec.cc, and msbeta.cc.
int Sequence::crossover ( const int &  current_nsam,
const int &  chromo,
const int &  pos,
std::vector< chromosome > *  sample,
arg *  sample_history 
)

Recombination function.

Parameters:
current_nsam the current sample size in the simulation
chromo the chromosome on which the crossover event is to occur
pos the crossover event happens between sites pos and pos+1 (0<= pos < nsites)
sample the sample of chromosomes being simulated
sample_history the genealogy of the sample
Returns:
the number of links lost due to the crossover event
Note:
as the type arg is based on std::list, and insertions into lists are done in constant time, this routine keeps the ancestral recombination graph sorted
Examples:
fragments.cc, and msbeta.cc.
template<typename uniform_generator , typename uniform01_generator , typename exponential_generator >
arg Sequence::exponential_change ( const uniform_generator &  uni,
const uniform01_generator &  uni01,
const exponential_generator &  expo,
const std::vector< chromosome > &  initialized_sample,
const marginal &  initialized_marginal,
const double &  G,
const double &  t_begin,
const double &  t_end,
const double &  rho = 0.,
const double &  size_at_end = -1 
) [inline]

Coalescent simulation of exponential change in population size Simulate a single population whose size changes exponentially during some period of time. The relevant command line options for Hudson's program "ms" would be: -eG t_begin -eG t_end 0. -eN t_end size_at_end.

Parameters:
uni A binary function object (or equivalent) that returns a random deviate between a and b such that a <= x < b. a and b are the arguments to operator() of uni
uni01 A function object (or equivalent) whose operator() takes no arguments and returns a random deviate 0 <= x < 1.
expo A unary function object whose operator() takes the mean of an exponential process as an argument and returns a deviate from an exponential distribution with that mean
initialized_sample An initialized vector of chromosomes for a single population. For example, this may be the return value of init_sample. This object is used to copy-construct a non-const sample for the simulation
initialized_marginal An initialized marginal tree of the appropriate sample size for the simulation. For example, the return value of init_marginal.
G The rate of exponential change in effective size. If G>0, the population grows exponentially (forwards in time). If G<0, it shrinks (again, forwards in time).
t_begin The time in the past (in units of 4Ne generations) at which population size change begins (i.e., ends, moving forward in time)
t_end The time in the past (in units of 4Ne generations) at which populations size change ends (begins forward in time)
rho The population recombination rate 4N0r. The number of "sites" simulated is not neccesary, as it can be obtained from initialized_sample[0].last()+1.
size_at_end At time t_end in the past, the population size is set to size_at_end. If size_at_end = 1, the population is set to the same size that is was at t=0 (i.e. the beginning of the simulation). If size_at_and < 0, the population size is not adjusted at t_end. In other words, it is left at whatever it grew or shrank to.
Returns:
The ancestral recombination graph (arg) describing the sample history.
Precondition:
t_begin>=0 and t_end>=0 and t_end>=t_begin and rho>=0 and initialized_marginal.nsam == initialized_sample.size()
Note:
Preconditions are checked by the assert macro, and are therefore disabled when compiling with -DNDEBUG. No checks or warnings are otherwised performed nor given.

Definition at line 337 of file DemographicModels.tcc.

template<typename uniform_generator , typename uniform01_generator , typename exponential_generator >
arg Sequence::exponential_change ( uniform_generator &  uni,
uniform01_generator &  uni01,
exponential_generator &  expo,
const std::vector< chromosome > &  initialized_sample,
const marginal &  initialized_marginal,
const double &  G,
const double &  t_begin,
const double &  t_end,
const double &  rho = 0.,
const double &  size_at_end = -1 
) [inline]

Coalescent simulation of exponential change in population size Simulate a single population whose size changes exponentially during some period of time. The relevant command line options for Hudson's program "ms" would be: -eG t_begin G -eG t_end 0. -eN t_end size_at_end.

Parameters:
uni A binary function object (or equivalent) that returns a random deviate between a and b such that a <= x < b. a and b are the arguments to operator() of uni
uni01 A function object (or equivalent) whose operator() takes no arguments and returns a random deviate 0 <= x < 1.
expo A unary function object whose operator() takes the mean of an exponential process as an argument and returns a deviate from an exponential distribution with that mean
initialized_sample An initialized vector of chromosomes for a single population. For example, this may be the return value of init_sample. This object is used to copy-construct a non-const sample for the simulation
initialized_marginal An initialized marginal tree of the appropriate sample size for the simulation. For example, the return value of init_marginal.
G The rate of exponential change in effective size. If G>0, the population grows exponentially (forwards in time). If G<0, it shrinks (again, forwards in time).
t_begin The time in the past (in units of 4Ne generations) at which population size change begins (i.e., ends, moving forward in time)
t_end The time in the past (in units of 4Ne generations) at which populations size change ends (begins forward in time)
rho The population recombination rate 4N0r. The number of "sites" simulated is not neccesary, as it can be obtained from initialized_sample[0].last()+1.
size_at_end At time t_end in the past, the population size is set to size_at_end. If size_at_end = 1, the population is set to the same size that is was at t=0 (i.e. the beginning of the simulation). If size_at_and < 0, the population size is not adjusted at t_end. In other words, it is left at whatever it grew or shrank to.
Returns:
The ancestral recombination graph (arg) describing the sample history.
Precondition:
t_begin>=0 and t_end>=0 and t_end>=t_begin and rho>=0 and initialized_marginal.nsam == initialized_sample.size()
Note:
Preconditions are checked by the assert macro, and are therefore disabled when compiling with -DNDEBUG. No checks or warnings are otherwised performed nor given.

Definition at line 301 of file DemographicModels.tcc.

std::vector< int > Sequence::get_all_descendants ( marginal::const_iterator  beg,
const int &  nsam,
const int &  branch 
)

Find all the descendants of a branch on a marginal tree.

Parameters:
beg A pointer to the beginning of a marginal tree, i.e. the return value of marginal::begin()
nsam the total sample size simulated
branch the index of the branch of the tree whose descendants you want.
Note:
branch must be <= 2*nsam-2, which is checked by assert
template<typename uniform_generator >
int Sequence::infinite_sites ( const uniform_generator &  uni,
gamete_storage_type *  gametes,
const int &  nsites,
const arg &  history,
const double *  total_times,
const unsigned *  segsites 
) [inline]

Apply the infinitely-many sites mutation model to an ancetral recombination graph with a fixed number of segregating sites.

Parameters:
uni a uniform random number generator that takes two doubles as an argument
gametes object in which to store the simulated gametes
nsites the length of the region begin simulated
history the list of marginal histories for the sample
total_times the total times on each marginal tree in history
segsites the number of segregating sites to place on each tree
Returns:
the number of mutations placed on the tree
Note:
total_times and segsites must contain a number of elements equal to history.size()

Definition at line 370 of file Mutation.tcc.

template<typename uniform_generator >
int Sequence::infinite_sites ( uniform_generator &  uni,
gamete_storage_type *  gametes,
const int &  nsites,
const arg &  history,
const double *  total_times,
const unsigned *  segsites 
) [inline]

Apply the infinitely-many sites mutation model to an ancetral recombination graph with a fixed number of segregating sites.

Parameters:
uni a uniform random number generator that takes two doubles as an argument
gametes object in which to store the simulated gametes
nsites the length of the region begin simulated
history the list of marginal histories for the sample
total_times the total times on each marginal tree in history
segsites the number of segregating sites to place on each tree
Returns:
the number of mutations placed on the tree
Note:
total_times and segsites must contain a number of elements equal to history.size()

Definition at line 346 of file Mutation.tcc.

template<typename poisson_generator , typename uniform_generator >
int Sequence::infinite_sites ( const poisson_generator &  poiss,
const uniform_generator &  uni,
gamete_storage_type *  gametes,
const int &  nsites,
const arg &  history,
const double &  theta 
) [inline]

Apply the infinitely-many sites mutation model to an ancetral recombination graph.

Parameters:
poiss a Poisson random number generator which takes the mean of the poisson as an argument
uni a uniform random number generator that takes two doubles as an argument
gametes object in which to store the simulated gametes
nsites the length of the region begin simulated
history the list of marginal histories for the sample
theta the coalescent-scaled mutation rate
Returns:
the number of mutations placed on the tree

Definition at line 324 of file Mutation.tcc.

template<typename poisson_generator , typename uniform_generator >
int Sequence::infinite_sites ( poisson_generator &  poiss,
uniform_generator &  uni,
gamete_storage_type *  gametes,
const int &  nsites,
const arg &  history,
const double &  theta 
) [inline]

Apply the infinitely-many sites mutation model to an ancetral recombination graph.

Parameters:
poiss a Poisson random number generator which takes the mean of the poisson as an argument
uni a uniform random number generator that takes two doubles as an argument
gametes object in which to store the simulated gametes
nsites the length of the region begin simulated
history the list of marginal histories for the sample
theta the coalescent-scaled mutation rate
Returns:
the number of mutations placed on the tree
Examples:
freerec.cc, and msbeta.cc.

Definition at line 301 of file Mutation.tcc.

template<typename uniform_generator >
SimData Sequence::infinite_sites_sim_data ( const uniform_generator &  uni,
const int &  nsites,
const arg &  history,
const double *  total_times,
const unsigned *  segsites 
) [inline]

Apply the infinitely-many sites mutation model to an ancetral recombination graph with a fixed number of segregating sites.

Parameters:
uni a uniform random number generator that takes two doubles as an argument
nsites the length of the region begin simulated
history the list of marginal histories for the sample
total_times the total times on each marginal tree in history
segsites the number of segregating sites to place on each tree
Returns:
an object of type SimData that represent the sample. for analysis
Note:
total_times and segsites must contain a number of elements equal to history.size()

Definition at line 463 of file Mutation.tcc.

template<typename uniform_generator >
SimData Sequence::infinite_sites_sim_data ( uniform_generator &  uni,
const int &  nsites,
const arg &  history,
const double *  total_times,
const unsigned *  segsites 
) [inline]

Apply the infinitely-many sites mutation model to an ancetral recombination graph with a fixed number of segregating sites.

Parameters:
uni a uniform random number generator that takes two doubles as an argument
nsites the length of the region begin simulated
history the list of marginal histories for the sample
total_times the total times on each marginal tree in history
segsites the number of segregating sites to place on each tree
Returns:
an object of type SimData that represent the sample. for analysis
Note:
total_times and segsites must contain a number of elements equal to history.size()

Definition at line 440 of file Mutation.tcc.

template<typename poisson_generator , typename uniform_generator >
SimData Sequence::infinite_sites_sim_data ( const poisson_generator &  poiss,
const uniform_generator &  uni,
const int &  nsites,
const arg &  history,
const double &  theta 
) [inline]

Apply the infinitely-many sites mutation model to an ancetral recombination graph.

Parameters:
poiss a Poisson random number generator which takes the mean of the poisson as an argument
uni a uniform random number generator that takes two doubles as an argument
nsites the length of the region begin simulated
history the list of marginal histories for the sample
theta the coalescent-scaled mutation rate
Returns:
an object of type SimData that represent the sample. (the gametes are also stored in gametes). The SimData object can be passed directly into class PolySIM for analysis

Definition at line 418 of file Mutation.tcc.

template<typename poisson_generator , typename uniform_generator >
SimData Sequence::infinite_sites_sim_data ( poisson_generator &  poiss,
uniform_generator &  uni,
const int &  nsites,
const arg &  history,
const double &  theta 
) [inline]

Apply the infinitely-many sites mutation model to an ancetral recombination graph.

Parameters:
poiss a Poisson random number generator which takes the mean of the poisson as an argument
uni a uniform random number generator that takes two doubles as an argument
nsites the length of the region begin simulated
history the list of marginal histories for the sample
theta the coalescent-scaled mutation rate
Returns:
an object of type SimData that represent the sample. (the gametes are also stored in gametes). The SimData object can be passed directly into class PolySIM for analysis
Examples:
bottleneck.cc, and fragments.cc.

Definition at line 395 of file Mutation.tcc.

marginal Sequence::init_marginal ( const int &  nsam  ) 

Simple function to initialize a marginal tree.

Parameters:
nsam the total sample size (i.e. summed over all populations) that you want to simulate
Examples:
bottleneck.cc, fragments.cc, freerec.cc, ms--.cc, and msbeta.cc.
std::vector< chromosome > Sequence::init_sample ( const std::vector< int > &  pop_config,
const int &  nsites 
)

A simple function to initialize a sample of chromosomes.

Parameters:
pop_config For a k-population model, this vector contains the sample size for each pop. Individuals are labeled as beloning to population 0 to k-1, in the order specified in this vector
nsites The number of sites at which mutations occur. For a k-site model, recombination occurs at any of the k-1 "links" between sites. Eaach chromosome is assigned a single segment starting at position 0 and ending at nsites-1.
Examples:
bottleneck.cc, fragments.cc, freerec.cc, ms--.cc, and msbeta.cc.
double Sequence::integrate_genetic_map ( const std::vector< chromosome > &  sample,
const int &  current_nsam,
const std::vector< double > &  genetic_map,
std::vector< double > *  reclens 
)

When simulating non-uniform recombination rates, the probability of recombination at each point in the simulation needs to be obtained by integrating over the genetic map and the current sample configuration. This function does that.

Parameters:
sample the vector containing the current state of all chromosomes in the sample
current_nsam the current sample size in the simulation
genetic_map a vector containing rho/"link" for each link in the sample. For the i-th base-pair in the chromosome, the "link" is the "space between" positions i and i+1. The value of genetic_map[i] is therefore 4Nr between site i and i+1 (sometimes called 4Nr/"site").
reclens a vector of doubles. This vector will be resized to current_nsam in this function, and filled with current_nsam values, each of which is the sum(genetic_map[beg],genetic_map[end-1]) for each chromosome in the sample, where beg and end are the first and last positions in each chromosome. These data are needed by the function pick_spot (Sequence/Coalescent/Recombination.hpp).
Returns:
the cummulative recombination rate in the sample, which is obtained by integrating over the ancestral material in the sample and the genetic map.
Examples:
fragments.cc.
bool Sequence::is_descendant ( marginal::const_iterator  beg,
const int &  ind,
const int &  branch 
)

Ask if a tip of a tree is a descendant of a particular branch.

Parameters:
beg A pointer to the beginning of a marginal tree, i.e. the return value of marginal::begin()
ind the index of the putative descendant node
branch the index of the branch of the tree which may be the ancestor of ind
Note:
This function does not check whether ind or branch go out of bounds, and so the programmer must ensure that both values are <= 2*nsam-2, where nsam is the total sample size simulated
bool Sequence::isseg ( chromosome::const_iterator  seg,
const int  nsegs,
const int  pos,
int *  offset 
)

ask if a chromosome beginning at seg and containing nsegs contains a segment containing the position pos

Parameters:
seg a pointer to a segment of a chromosome (this should be the 1st segment, such as the return value of chromosome::begin())
nsegs the number of segs in the chromosome pointed to by seg
offset a pointer to an integer. This integer is used for repeated pointer arithmetic, and should be initalized to 0 before the first call.
pos a position a long a chromosome. This function asks if pos is contained in the ancestral material of the chromosome whose segments begin at seg
Returns:
true if a segment exists that contains the point pos
Note:
only used by the function coalesce
template<typename uniform_generator , typename uniform01_generator , typename exponential_generator , typename poisson_generator >
Sequence::SimData Sequence::neutral_sample ( uniform_generator &  uni,
uniform01_generator &  uni01,
exponential_generator &  expo,
poisson_generator &  poiss,
const double &  theta,
const double &  rho,
const int &  nsites,
const int &  nsam,
std::vector< chromosome > *  sample,
arg *  sample_history,
unsigned *  max_chromosomes = NULL,
const unsigned &  max_chromosomes_inc = 0 
) [inline]

A simple function to generate samples under a neutral equilibrium model.

A simple function to generate samples under a neutral equilibrium model with infinite-sites mutation and a constant recombination rate accross the region.

Parameters:
uni a function/object capable of returning a random double uniformly from [0,k)
uni01 a function/object capable of returning a random probability uniformly from [0,1)
expo a function/object capable of returning an exponentially distributed random variable. The function must take a single double as an argument, which is the mean of the exponential distribution
poiss a function/object capable of returning an poisson distributed random variable. The function must take a single double as an argument, which is the mean of the poisson distribution
theta 4Nu, the coalescent-scaled mutation rate
rho 4Nr, the recombination rate for the whole region
nsites the number of mutational sites to simulate. Recombination is equally likely between any two sites.
nsites the total sample size. (There is no population structure in this routine)
sample A pointer to the sample of chromosomes you wish to simulate. This must be properly initialized, for example using the function init_sample in <Sequence/Coalescent/Initialize.hpp>
sample_history a pointer to the ancestral recombination graph. This must be initialized in the calling enviroment. In general, you can use init_marginal in <Sequence/Coalescent/Initialize.hpp>
max_chromosomes This is a pointer to an integer in the calling environment which you can use to reserve memory in the array containing the sample of chromosomes. If the size of sample ever gets larger than this, max_chromosomes is incremented by max_chromosomes_inc
max_chromosomes_inc the amount by which to increment max_chromosomes
Note:
This function does require a bit of work to use, although not much. Please see the example code that comes with the library, in particular ms--.cc
Examples:
ms--.cc.

Definition at line 16 of file NeutralSample.hpp.

template<typename uniform_generator >
std::pair< int, int > Sequence::pick2 ( const uniform_generator &  uni,
const int &  nsam 
) [inline]
Parameters:
uni a random number function/object capable of returning a double-precision random number between 0 and nsam-1
nsam the current sample size in the simulation
Returns:
A pair of integers which contains the indexes of two chromosomes in sample

Definition at line 102 of file Coalesce.tcc.

template<typename uniform_generator >
std::pair< int, int > Sequence::pick2 ( uniform_generator &  uni,
const int &  nsam 
) [inline]
Parameters:
uni a random number function/object capable of returning a double-precision random number between 0 and nsam-1
nsam the current sample size in the simulation
Returns:
A pair of integers which contains the indexes of two chromosomes in sample
Examples:
fragments.cc, freerec.cc, and msbeta.cc.

Definition at line 89 of file Coalesce.tcc.

template<typename uniform_generator >
std::pair< int, int > Sequence::pick2_in_deme ( const uniform_generator &  uni,
const std::vector< Sequence::chromosome > &  sample,
const int &  current_nsam,
const int &  deme_nsam,
const int &  deme 
) [inline]

Choose two random chromosomes from the same deme.

Parameters:
uni A random number generator taking two arguments, a and b, and returning a random variable distributed uniformly over [a,b)
sample The current state of the simulated sample
current_nsam The total sample size being simuled (the sum of sample sizes over all demes)
deme_nsam The sample size of the deme from which you wish to sample
deme The index ( 0 <= deme < # populations ) of the deme from which you wish to sample
Returns:
A pair of integers which contains the indexes of two chromosomes in sample

Definition at line 68 of file Coalesce.tcc.

int Sequence::pick_branch ( marginal::const_iterator  beg,
const int &  nsam,
const double &  rtime 
)

pick a random branch of a marginal tree

Parameters:
beg A pointer to the beginning of a marginal tree, i.e. the return value of marginal::begin()
nsam the total sample size simulated
rtime a (preferably random) double between 0 and the total_time on the marginal tree from which beg is the iterator
template<typename uniform01_generator >
std::pair< int, int > Sequence::pick_spot ( const uniform01_generator &  uni01,
const double &  total_reclen,
const std::vector< double > &  reclens,
std::vector< chromosome >::const_iterator  sample_begin,
const unsigned &  current_nsam,
const double *  rec_map 
) [inline]

Picks a positions amongst all chromosomes at which a recombination event will occur, based on an arbitrary genetic map

Parameters:
uni01 a function/object which takes no arguments and can return a U[0,1)
total_reclen the total recombination length of all chromosomes in the sample
reclens a vector of the proportion of total_reclen contributed by each chromosome. This needs to be ordered in the same order as sample_begin to (sample_begin + current_nsam - 1)
sample_begin an iterator pointing to the beginning of the sample
current_nsam the current sample size in the simulation
rec_map an array of probabilities describing the recombination map. The map is completely up to the programmer, and it is not checked for sanity at all in this function. For a region of k sites, indexes 0 to k-2 of this array should be filled. The i-th element should contain the probability that a crossover occurs between position i and i+1. The sum of all elements should be 1, such that the array describes the recombination map in terms of a probability distribution function. An example of how to do this is in the file examples/msbeta.cc that comes with the source for this library.
Returns:
a pair of integers containing the index of the recombinant chromosome (.first), and the position at which the crossover will occur (.second)

Definition at line 84 of file Recombination.tcc.

template<typename uniform01_generator >
std::pair< int, int > Sequence::pick_spot ( uniform01_generator &  uni01,
const double &  total_reclen,
const std::vector< double > &  reclens,
std::vector< chromosome >::const_iterator  sample_begin,
const unsigned &  current_nsam,
const double *  rec_map 
) [inline]

Picks a positions amongst all chromosomes at which a recombination event will occur, based on an arbitrary genetic map

Parameters:
uni01 a function/object which takes no arguments and can return a U[0,1)
total_reclen the total recombination length of all chromosomes in the sample
reclens a vector of the proportion of total_reclen contributed by each chromosome. This needs to be ordered in the same order as sample_begin to (sample_begin + current_nsam - 1)
sample_begin an iterator pointing to the beginning of the sample
current_nsam the current sample size in the simulation
rec_map an array of probabilities describing the recombination map. The map is completely up to the programmer, and it is not checked for sanity at all in this function. For a region of k sites, indexes 0 to k-2 of this array should be filled. The i-th element should contain the probability that a crossover occurs between position i and i+1. The sum of all elements should be 1, such that the array describes the recombination map in terms of a probability distribution function. An example of how to do this is in the file examples/msbeta.cc that comes with the source for this library.
Returns:
a pair of integers containing the index of the recombinant chromosome (.first), and the position at which the crossover will occur (.second)
Examples:
fragments.cc, and msbeta.cc.

Definition at line 54 of file Recombination.tcc.

std::pair< int, int > Sequence::pick_uniform_spot ( const double &  random_01,
const int &  nlinks,
std::vector< chromosome >::const_iterator  sample_begin,
const unsigned &  current_nsam 
)

Pick a crossover point for the model where recombination rates are constant across a recion. Picks a positions uniformly amongst all chromosomes at which a recombination event will occur.

Parameters:
random_01 a random uniform deviate U[0,1)
nlinks the number of links currently in the simulation
sample_begin an iterator pointing to the beginning of the sample
current_nsam the current sample size in the simulation
Returns:
a pair of integers containing the index of the recombinant chromosome (.first), and the position at which the crossover will occur (.second)
void Sequence::rescale_mutation_positions ( SimData *  d,
const std::vector< std::pair< double, double > > &  sample_scale,
const std::vector< std::pair< double, double > > &  mutation_scale 
)

Rescales the positions of the mutations in d from the scale given in sample_scale to that given in mutation_scale.

Note:
See documentation for calcualate_scales
Examples:
fragments.cc.
int Sequence::sample_length ( const std::vector< std::pair< int, int > > &  fragments  ) 

When simulating partially linked regions, return the total length of sample material that we are simulating.

Returns:
The sum of fragments[i].second for i=0 to i=fragments.size()-1
Examples:
fragments.cc.
int Sequence::total_length ( const std::vector< std::pair< int, int > > &  fragments  ) 

When simulating partially linked regions, return the total length of the region.

Returns:
The sum of fragments[i].first + fragments[i].second for i=0 to i=fragments.size()-1
Examples:
fragments.cc.
double Sequence::total_time ( const marginal::const_iterator  beg,
const int &  nsam 
)

Calculate total time on a marginal tree.

Parameters:
beg A pointer to the beginning of a marginal tree, i.e. the return value of marginal::begin()
nsam the total sample size simulated
Returns:
The total time on the tree.
Note:
The scaling of time in the simulation is up to you
double Sequence::total_time_on_arg ( const Sequence::arg sample_history,
const int &  total_number_of_sites 
)

Returns the total time on an ancestral recombination graph.

Parameters:
sample_history an ancestral recombination graph
total_number_of_sites the number of "sites" simulated on the ARG
Returns:
The total time on an ancestral recombination graph.
Note:
The time is in terms of whatever units are recorded on the nodes of the mariginals of the ARG
Exceptions:
Sequence::SeqException if the beginning of any marginal tree is >= total_number_of_sites

Generated on Mon Jul 12 15:22:02 2010 for libsequence by  doxygen 1.6.1