Sequence Namespace Reference

The namespace in which this library resides. More...


Classes

struct  newick_stream_marginal_tree_impl
class  AlignStream
 Virtual interface to alignment streams. More...
class  ClustalW
 ClustalW streams. More...
class  Comeron95
 Ka and Ks by Comeron's (1995) method. More...
struct  ComplementBase
struct  ProductMoment
 Pearson's product-moment correlation. More...
struct  SpearmansRank
 Spearman's rank correlation. More...
struct  upperCrit
 Find the upper critical value of a sorted list. More...
struct  lowerCrit
 Find the lower critical value of a sorted list. More...
class  Sums
class  Fasta
 FASTA sequence stream. More...
class  Fastq
 FASTQ sequence stream. More...
class  FST
 analysis of population structure using $F_{ST}$ More...
class  Grantham
 Grantham's distances. More...
class  GranthamWeights2
 Weights paths by Grantham's distances for codons differing at 2 sites. More...
class  GranthamWeights3
 Weights paths by Grantham's distances for codons differing at 3 sites. More...
struct  HKAdata
 Data from a single locus for an HKA test. More...
struct  HKAresults
 results of calculations of the HKA test More...
class  Kimura80
 Kimura's 2-parameter distance. More...
class  phylipData
struct  countStates
 Functor to count the number of states, excluding gaps and missing data, in a range of characters. More...
struct  countDerivedStates
 Functor to count the number of derived states, excluding gaps and missing data, in a range of characters. More...
struct  ssh
 Calculate nucleotide diversity from a polymorphic site. More...
class  nmuts
 Calculate the number of mutations at a polymorphic site. More...
class  PolySIM
 Analysis of coalescent simulation data. More...
class  PolySites
 Polymorphism tables for sequence data. More...
class  PolySNP
 Molecular population genetic analysis. More...
struct  _PolySNPImpl
class  PolyTable
 The base class for polymorphism tables. More...
class  PolyTableSlice
 A container class for "sliding windows" along a polymorphism table. More...
class  RedundancyCom95
 Calculate redundancy of a genetic code using Comeron's counting scheme. More...
class  Seq
 Abstract interface to sequence objects. More...
class  SeqException
 Base class for exceptions that may be thrown. More...
class  badFormat
class  shortestPath
 Calculate shortest path between 2 codons. More...
class  SimData
 Data from coalescent simulations. More...
class  SimParams
 Parameters for Hudson's simulation program. More...
class  SimpleSNP
 SNP table data format. More...
class  SingleSub
 Deal with codons differing at 1 position. More...
class  Sites
 Calculate length statistics for divergence calculations. More...
class  stateCounter
 keep track of state counts at a site in an alignment or along a sequence More...
class  ThreeSubs
 Deal with codons differing at all 3 positions. More...
class  TwoSubs
 Deal with codons differing at 2 positions. More...
class  Unweighted2
 weights all pathways equally More...
class  Unweighted3
 weights all pathways equally More...
class  WeightingScheme2
 abstract interface to weighting schemes when codons differ at 2 positions More...
class  WeightingScheme3
 abstract interface to weighting schemes when codons differ at 3 positions More...
struct  randomShuffleAdaptor
struct  segment
 A portion of a recombining chromosome. More...
struct  chromosome
 A chromosome is a container of segments. More...
struct  node
 A point on a marginal tree at which a coalescent event occurs. More...
struct  marginal
 The genealogy of a portion of a chromosome on which no recombination has occurred. More...
class  newick_stream_marginal_tree
 Class that provides a typecast-on-output of a marginal tree to a Newick tree. More...

Namespaces

namespace  Alignment
 Routines fundamental to aligned data.
namespace  Recombination
 Methods dealing with recombination.

Typedefs

typedef SimpleSNP Hudson2001
typedef std::vector< std::pair< std::string, int > > CodonUsageTable
typedef std::pair< double, std::string > polymorphicSite
typedef std::vector< polymorphicSite > polySiteVector
typedef std::istringstream istr
typedef std::ostringstream ostr
typedef std::pair< std::vector< double >, std::vector< std::string > > gamete_storage_type
 An object to store simulated gametes. An object of this type will tend to exist in the calling environment of your program. If you are simulating a sample of n chromosomes, you would initialize the object as follows:
typedef std::list< marginal > arg
 Ancestral Recombination Graph.
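
The initialization mentioned in the gamete_storage_type description is not reproduced in this listing. The following is a minimal sketch only, assuming that the first member holds one position per potential segregating site and the second holds one string per sampled chromosome. The helper name make_gamete_storage, the local typedef gamete_storage, and the fill character '0' are hypothetical; max_segsites stands in for Sequence::MAX_SEGSITES.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Local stand-in with the same shape as Sequence::gamete_storage_type.
typedef std::pair<std::vector<double>, std::vector<std::string> > gamete_storage;

// Hypothetical helper (not part of the library): pre-allocate room for
// max_segsites mutations in a sample of n chromosomes. Positions start at 0
// and every gamete starts as a run of '0' characters (an assumed placeholder).
inline gamete_storage make_gamete_storage(std::size_t n, std::size_t max_segsites)
{
    assert(n > 0);
    return gamete_storage(
        std::vector<double>(max_segsites, 0.0),
        std::vector<std::string>(n, std::string(max_segsites, '0')));
}
```

Allocating once in the calling environment and reusing the object across replicates is the usual motivation for this kind of pre-sized storage.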

Enumerations

enum  Nucleotides {
  A, T, G, C,
  N, GAP
}
enum  GeneticCodes { UNIVERSAL }
enum  Mutations { Unknown, Ts, Tv }

Functions

bool isseg (chromosome::const_iterator seg, const int nsegs, const int pos, int *offset)
 Ask if a chromosome beginning at seg and containing nsegs segments contains a segment containing the position pos.
int coalesce (const double &time, const int &ttl_nsam, const int &current_nsam, const int &c1, const int &c2, const int &nsites, int *nlinks, std::vector< chromosome > *sample, arg *sample_history)
 Common ancestor routine for coalescent simulation. Merges chromosome segments and updates marginal trees.
int sample_length (const std::vector< std::pair< int, int > > &fragments)
 When simulating partially linked regions, return the total length of sample material that we are simulating.
int total_length (const std::vector< std::pair< int, int > > &fragments)
 When simulating partially linked regions, return the total length of the region.
void calculate_scales (const std::vector< std::pair< int, int > > &fragments, std::vector< std::pair< double, double > > *sample_scale, std::vector< std::pair< double, double > > *mutation_scale)
 This is a helper function that rescales physical distance in base pairs to continuous distance on the interval 0,1.
void rescale_mutation_positions (SimData *d, const std::vector< std::pair< double, double > > &sample_scale, const std::vector< std::pair< double, double > > &mutation_scale)
 Rescales the positions of the mutations in d from the scale given in sample_scale to that given in mutation_scale.
void rescale_arg (arg *sample_history, const std::vector< std::pair< int, int > > &fragments)
 Rescales the beginnings of marginal trees in an ancestral recombination graph from a genetic scale to a physical scale.
double integrate_genetic_map (const std::vector< chromosome > &sample, const int &current_nsam, const std::vector< double > &genetic_map, std::vector< double > *reclens)
 When simulating non-uniform recombination rates, the probability of recombination at each point in the simulation needs to be obtained by integrating over the genetic map and the current sample configuration. This function does that.
std::vector< chromosome > init_sample (const std::vector< int > &pop_config, const int &nsites)
 A simple function to initialize a sample of chromosomes.
marginal init_marginal (const int &nsam)
 Simple function to initialize a marginal tree.
void output_gametes (FILE *fp, const unsigned &segsites, const unsigned &nsam, const gamete_storage_type &gametes)
 Write an object of type gamete_storage_type to a C-style file stream. This function is used when you need to output simulated gametes using a method faster than the operator<< for class SimData.
std::pair< int, int > pick_uniform_spot (const double &random_01, const int &nlinks, std::vector< chromosome >::const_iterator sample_begin, const unsigned &current_nsam)
 Pick a crossover point for the model where recombination rates are constant across a region. Picks a position uniformly amongst all chromosomes at which a recombination event will occur.
int crossover (const int &current_nsam, const int &chromo, const int &pos, std::vector< chromosome > *sample, arg *sample_history)
 Recombination function.
std::ostream & operator<< (std::ostream &s, const chromosome &c)
 Output operator for chromosome types in coalescent simulation. Outputs the segments contained by the chromosome.
std::ostream & operator<< (std::ostream &s, const marginal &m)
 Write a marginal tree to an ostream.
std::ostream & operator<< (std::ostream &o, const newick_stream_marginal_tree &n)
std::istream & operator>> (std::istream &i, newick_stream_marginal_tree &n)
double total_time (const marginal::const_iterator beg, const int &nsam)
 Calculate total time on a marginal tree.
int pick_branch (marginal::const_iterator beg, const int &nsam, const double &rtime)
 pick a random branch of a marginal tree
std::vector< int > get_all_descendants (marginal::const_iterator beg, const int &nsam, const int &branch)
 Find all the descendants of a branch on a marginal tree.
bool is_descendant (marginal::const_iterator beg, const int &ind, const int &branch)
 Ask if a tip of a tree is a descendant of a particular branch.
double total_time_on_arg (const Sequence::arg &sample_history, const int &total_number_of_sites)
 Returns the total time on an ancestral recombination graph.
void minimize_arg (Sequence::arg *sample_history)
CodonUsageTable makeCodonUsageTable (const Seq *sequence)
CodonUsageTable makeCodonUsageTable (const std::string &sequence)
CodonUsageTable makeCodonUsageTable (std::string::const_iterator beg, std::string::const_iterator end)
Mutations TsTv (char i, char j)
Mutations TsTv (int i, int j)
bool Different (const std::string &seq1, const std::string &seq2, bool skip_missing, bool nucleic_acid)
unsigned NumDiffs (const std::string &seq1, const std::string &seq2, bool skip_missing, bool nucleic_acid)
bool Gapped (const std::string &s)
bool NotAGap (const char &c)
HKAresults calcHKA (const std::vector< HKAdata > &data)
void Intermediates2 (string *intermediates, const std::string &codon1, const std::string &codon2)
 Calculate the intermediate codons between a pair of codons diverged at 2 positions.
void Intermediates3 (string *intermediates, const std::string &codon1, const std::string &codon2)
 Calculate the intermediate codons between a pair of codons diverged at 3 positions.
bool containsCharacter (const PolyTable *t, const char &ch)
void fillIn (PolyTable *t, const unsigned &refseq, const char &identical)
void addIdentityChar (PolyTable *t, const unsigned &refseq, const char &identical)
void RemoveGaps (PolyTable *table, const char &gapchar) throw (SeqException)
void RemoveInvariantColumns (PolyTable *table, const bool &skipOutgroup, const unsigned &outgroup) throw (SeqException)
bool PolyTableValid (const PolyTable *table)
polySiteVector rotatePolyTable (const Sequence::PolyTable *data)
std::pair< unsigned, unsigned > mutsShortestPath (const std::string &codon1, const std::string &codon2, const Sequence::GeneticCodes &code) throw (Sequence::SeqException)
std::pair< unsigned, shortestPath::pathType > diffType (const std::string &codon1, const std::string &codon2, const Sequence::GeneticCodes &code) throw (Sequence::SeqException)
boost::tuple< shortestPath::pathType, shortestPath::pathType, shortestPath::pathType > diffTypeMulti (const std::string &codon1, const std::string &codon2, const Sequence::GeneticCodes &code) throw (Sequence::SeqException)
std::ostream & operator<< (std::ostream &stream, class SimParams &object)
std::istream & operator>> (std::istream &s, SimParams &c)
double Snn_statistic (const unsigned individuals[], const std::vector< std::vector< double > > &dkj, const unsigned config[], const size_t &npop, const unsigned &nsam)
std::string Translate (std::string::const_iterator beg, std::string::const_iterator end, Sequence::GeneticCodes genetic_code, const char &gapchar) throw (Sequence::SeqException)
template<typename T>
std::istream & operator>> (std::istream &s, AlignStream< T > &c)
template<typename T>
std::ostream & operator<< (std::ostream &s, const AlignStream< T > &c)
template<typename T>
bool notDifferent (const T &l, const T &r)
template<typename Iterator>
bool Gapped (Iterator beg, Iterator end, const char &gapchar= '-')
template<typename iter1, typename iter2, typename correlation_type, typename comparison_function, typename UniformIntGenerator>
ensureFloating< typename
std::iterator_traits< iter1 >
::value_type, typename
std::iterator_traits< iter2 >
::value_type >::type 
PermuteCorrelation (iter1 beg_x, iter1 end_x, iter2 beg_y, const correlation_type &c, const comparison_function &comp, UniformIntGenerator &rand, const unsigned &NPERM=10000)
template<typename iter1, typename iter2, typename correlation_type, typename comparison_function, typename UniformIntGenerator>
ensureFloating< typename
std::iterator_traits< iter1 >
::value_type, typename
std::iterator_traits< iter2 >
::value_type >::type 
PermuteCorrelation (iter1 beg_x, iter1 end_x, iter2 beg_y, const correlation_type &c, const comparison_function &comp, const UniformIntGenerator &rand, const unsigned &NPERM=10000)
template<typename key, typename value>
std::vector< std::pair< key,
value > > 
operator+ (const std::vector< std::pair< key, value > > &lhs, const std::vector< std::pair< key, value > > &rhs)
template<typename key, typename value>
std::vector< std::pair< key,
value > > 
operator+= (std::vector< std::pair< key, value > > &lhs, const std::vector< std::pair< key, value > > &rhs)
template<typename key, typename value, typename comparison>
std::map< key, value, comparison > operator+ (const std::map< key, value, comparison > &lhs, const std::map< key, value, comparison > &rhs)
template<typename key, typename value, typename comparison>
std::map< key, value, comparison > operator+= (std::map< key, value, comparison > &lhs, const std::map< key, value, comparison > &rhs)
template<typename iterator>
double mean (iterator beg, iterator end)
template<typename iterator>
double variance (iterator beg, iterator end)
template<typename ForwardIterator>
std::pair< double, double > meanAndVar (ForwardIterator beg, ForwardIterator end)
template<typename T>
const Sums< T > operator+ (const Sums< T > &lhs, const Sums< T > &rhs)
template<typename T>
const Sums< T > operator+ (const Sums< T > &lhs, const T &rhs)
void Intermediates2 (std::string *intermediates, const std::string &codon1, const std::string &codon2)
void Intermediates3 (std::string *intermediates, const std::string &codon1, const std::string &codon2)
std::istream & operator>> (std::istream &s, PolyTable &c)
std::ostream & operator<< (std::ostream &o, const PolyTable &c)
std::ostream & operator<< (std::ostream &s, const Seq &c)
std::istream & operator>> (std::istream &s, Seq &c)
std::ostream & operator<< (std::ostream &s, const SeqException &c)
template<typename Iter>
bool validSeq (Iter beg, Iter end, const char *_pattern=Sequence::basic_dna_alphabet, const bool icase=true)
template<typename Iterator>
std::map< typename
std::iterator_traits< Iterator >
::value_type, unsigned > 
makeCountList (Iterator beg, Iterator end)
template<typename Iterator>
bool internalGapCheck (Iterator beg, Iterator end, const char &gapchar= '-', const unsigned &mod=3)
template<typename uniform_int_generator>
std::pair< double, double > Snn_test (const PolyTable &snpTable, const unsigned config[], const size_t &npop, uniform_int_generator &uni_int, const unsigned &nperms=10000)
 Conducts a permutation-test of Hudson's Snn (sequence nearest-neighbor) statistic.
template<typename uniform_int_generator>
std::vector< std::vector
< double > > 
Snn_test_pairwise (const PolyTable &snpTable, const unsigned config[], const size_t &npop, uniform_int_generator &uni_int, const unsigned &nperms=10000)
 Conducts a permutation-test of Hudson's Snn (sequence nearest-neighbor) statistic, for all pairwise combinations of populations.
template<typename uniform_int_generator>
std::pair< double, double > Snn_test (const PolyTable &snpTable, const unsigned config[], const size_t &npop, const uniform_int_generator &uni_int, const unsigned &nperms=10000)
 Conducts a permutation-test of Hudson's Snn (sequence nearest-neighbor) statistic.
template<typename uniform_int_generator>
std::vector< std::vector
< double > > 
Snn_test_pairwise (const PolyTable &snpTable, const unsigned config[], const size_t &npop, const uniform_int_generator &uni_int, const unsigned &nperms=10000)
 Conducts a permutation-test of Hudson's Snn (sequence nearest-neighbor) statistic, for all pairwise combinations of populations.
template<typename iter1, typename iter2, typename correlation_type, typename comparison_function, typename UniformIntGenerator>
ensureFloating< typename
std::iterator_traits< iter1 >
::value_type, typename
std::iterator_traits< iter2 >
::value_type >::type 
PermuteCorrelation_details (iter1 beg_x, iter1 end_x, iter2 beg_y, const correlation_type &corr, const comparison_function &comp, UniformIntGenerator &rand, const unsigned &NPERM)
template<typename uniform_int_generator>
std::pair< double, double > Snn_test_details (const PolyTable &snpTable, const unsigned config[], const size_t &npop, uniform_int_generator &uni_int, const unsigned &nperms)
template<typename uniform_int_generator>
std::vector< std::vector
< double > > 
Snn_test_pairwise_details (const PolyTable &snpTable, const unsigned config[], const size_t &npop, uniform_int_generator uni_int, const unsigned &nperms)
template<typename iterator, typename generator>
void random_shuffle (iterator __first, iterator __last, generator &__g)
template<typename iterator, typename generator>
void random_shuffle (iterator __first, iterator __last, const generator &__g)
template<typename uniform_generator>
std::pair< int, int > pick2_in_deme (uniform_generator &uni, const std::vector< Sequence::chromosome > &sample, const int &ttl_nsam, const int &deme_nsam, const int &deme)
template<typename uniform_generator>
std::pair< int, int > pick2_in_deme (const uniform_generator &uni, const std::vector< Sequence::chromosome > &sample, const int &current_nsam, const int &deme_nsam, const int &deme)
 Choose two random chromosomes from the same deme.
template<typename uniform_generator>
std::pair< int, int > pick2 (uniform_generator &uni, const int &nsam)
template<typename uniform_generator>
std::pair< int, int > pick2 (const uniform_generator &uni, const int &nsam)
template<typename uniform_generator, typename uniform01_generator, typename exponential_generator>
arg bottleneck (uniform_generator &uni, uniform01_generator &uni01, exponential_generator &expo, const std::vector< chromosome > &initialized_sample, const marginal &initialized_marginal, const double &tr, const double &d, const double &f, const double &rho=0., const bool &exponential_recovery=false, const double &recovered_size=1.)
 Coalescent simulation of a population bottleneck. Simulate a single, bottlenecked population according to the Wright-Fisher model without selection. The population can recover from the bottleneck either instantaneously ("stepwise bottleneck") or according to an exponential growth model. For the case of a stepwise bottleneck, this function is equivalent to the following options in Dick Hudson's program "ms": -eN 0 recovered_size -eN tr f -eN (tr+d) 1. For the case where recovery from the bottleneck is by exponential growth, the equivalent "ms" options are: -eN 0 recovered_size -eG tr (log(recovered_size)-log(f))/d -eG (tr+d) 0 -eN (tr+d) 1.
template<typename uniform_generator, typename uniform01_generator, typename exponential_generator>
arg bottleneck (const uniform_generator &uni, const uniform01_generator &uni01, const exponential_generator &expo, const std::vector< chromosome > &initialized_sample, const marginal &initialized_marginal, const double &tr, const double &d, const double &f, const double &rho=0., const bool &exponential_recovery=false, const double &recovered_size=1.)
 Coalescent simulation of a population bottleneck. Simulate a single, bottlenecked population according to the Wright-Fisher model without selection. The population can recover from the bottleneck either instantaneously ("stepwise bottleneck") or according to an exponential growth model. For the case of a stepwise bottleneck, this function is equivalent to the following options in Dick Hudson's program "ms": -eN 0 recovered_size -eN tr f -eN (tr+d) 1. For the case where recovery from the bottleneck is by exponential growth, the equivalent "ms" options are: -eN 0 recovered_size -eG tr (log(recovered_size)-log(f))/d -eG (tr+d) 0 -eN (tr+d) 1.
template<typename uniform_generator, typename uniform01_generator, typename exponential_generator>
arg exponential_change (uniform_generator &uni, uniform01_generator &uni01, exponential_generator &expo, const std::vector< chromosome > &initialized_sample, const marginal &initialized_marginal, const double &G, const double &t_begin, const double &t_end, const double &rho=0., const double &size_at_end=-1)
 Coalescent simulation of exponential change in population size. Simulate a single population whose size changes exponentially during some period of time. The relevant command line options for Hudson's program "ms" would be: -eG t_begin G -eG t_end 0. -eN t_end size_at_end.
template<typename uniform_generator, typename uniform01_generator, typename exponential_generator>
arg exponential_change (const uniform_generator &uni, const uniform01_generator &uni01, const exponential_generator &expo, const std::vector< chromosome > &initialized_sample, const marginal &initialized_marginal, const double &G, const double &t_begin, const double &t_end, const double &rho=0., const double &size_at_end=-1)
 Coalescent simulation of exponential change in population size. Simulate a single population whose size changes exponentially during some period of time. The relevant command line options for Hudson's program "ms" would be: -eG t_begin G -eG t_end 0. -eN t_end size_at_end.
template<typename uniform_generator, typename uniform01_generator, typename exponential_generator>
arg snm (uniform_generator &uni, uniform01_generator &uni01, exponential_generator &expo, const std::vector< chromosome > &initialized_sample, const marginal &initialized_marginal, const double &rho)
template<typename uniform_generator, typename uniform01_generator, typename exponential_generator>
arg snm (const uniform_generator &uni, const uniform01_generator &uni01, const exponential_generator &expo, const std::vector< chromosome > &initialized_sample, const marginal &initialized_marginal, const double &rho)
template<typename uniform_generator>
void add_S_inf_sites (uniform_generator &uni, marginal::const_iterator history, const double &tt, const int &beg, const int &end, const int &nsam, const int &nsites, const int &S, const int &first_snp_index, gamete_storage_type *gametes)
 Add S segregating sites to sample with a particular marginal history, according to the infinitely-many sites model.
template<typename uniform_generator>
void add_S_inf_sites (const uniform_generator &uni, marginal::const_iterator history, const double &tt, const int &beg, const int &end, const int &nsam, const int &nsites, const int &S, const int &first_snp_index, gamete_storage_type *gametes)
 Add S segregating sites to sample with a particular marginal history, according to the infinitely-many sites model.
template<typename poisson_generator, typename uniform_generator>
int infinite_sites (poisson_generator &poiss, uniform_generator &uni, gamete_storage_type *gametes, const int &nsites, const arg &history, const double &theta)
 Apply the infinitely-many sites mutation model to an ancestral recombination graph.
template<typename poisson_generator, typename uniform_generator>
int infinite_sites (const poisson_generator &poiss, const uniform_generator &uni, gamete_storage_type *gametes, const int &nsites, const arg &history, const double &theta)
 Apply the infinitely-many sites mutation model to an ancestral recombination graph.
template<typename uniform_generator>
int infinite_sites (uniform_generator &uni, gamete_storage_type *gametes, const int &nsites, const arg &history, const double *total_times, const unsigned *segsites)
 Apply the infinitely-many sites mutation model to an ancestral recombination graph with a fixed number of segregating sites.
template<typename uniform_generator>
int infinite_sites (const uniform_generator &uni, gamete_storage_type *gametes, const int &nsites, const arg &history, const double *total_times, const unsigned *segsites)
 Apply the infinitely-many sites mutation model to an ancestral recombination graph with a fixed number of segregating sites.
template<typename poisson_generator, typename uniform_generator>
SimData infinite_sites_sim_data (poisson_generator &poiss, uniform_generator &uni, const int &nsites, const arg &history, const double &theta)
 Apply the infinitely-many sites mutation model to an ancestral recombination graph.
template<typename poisson_generator, typename uniform_generator>
SimData infinite_sites_sim_data (const poisson_generator &poiss, const uniform_generator &uni, const int &nsites, const arg &history, const double &theta)
 Apply the infinitely-many sites mutation model to an ancestral recombination graph.
template<typename uniform_generator>
SimData infinite_sites_sim_data (uniform_generator &uni, const int &nsites, const arg &history, const double *total_times, const unsigned *segsites)
 Apply the infinitely-many sites mutation model to an ancestral recombination graph with a fixed number of segregating sites.
template<typename uniform_generator>
SimData infinite_sites_sim_data (const uniform_generator &uni, const int &nsites, const arg &history, const double *total_times, const unsigned *segsites)
 Apply the infinitely-many sites mutation model to an ancestral recombination graph with a fixed number of segregating sites.
template<typename uniform_generator, typename uniform01_generator, typename exponential_generator, typename poisson_generator>
Sequence::SimData neutral_sample (uniform_generator &uni, uniform01_generator &uni01, exponential_generator &expo, poisson_generator &poiss, const double &theta, const double &rho, const int &nsites, const int &nsam, std::vector< chromosome > *sample, arg *sample_history, unsigned *max_chromosomes=NULL, const unsigned &max_chromosomes_inc=0)
 A simple function to generate samples under a neutral equilibrium model.
template<typename uniform01_generator>
std::pair< int, int > pick_spot (uniform01_generator &uni01, const double &total_reclen, const std::vector< double > &reclens, std::vector< chromosome >::const_iterator sample_begin, const unsigned &current_nsam, const double *rec_map)
template<typename uniform01_generator>
std::pair< int, int > pick_spot (const uniform01_generator &uni01, const double &total_reclen, const std::vector< double > &reclens, std::vector< chromosome >::const_iterator sample_begin, const unsigned &current_nsam, const double *rec_map)
template<typename uni01_generator>
void ConditionalTraj (uni01_generator &uni01, std::vector< double > *traj, const unsigned &N, const double &s, const double &dt, const double &initial_frequency, const double &final_frequency=1.)
template<typename uni01_generator>
void ConditionalTraj (const uni01_generator &uni01, std::vector< double > *traj, const unsigned &N, const double &s, const double &dt, const double &initial_frequency, const double &final_frequency=1.)
template<typename uni01_generator>
void ConditionalTrajNeutral (uni01_generator &uni01, std::vector< double > *traj, const double &dt, const double &initial_freq=1., const double &final_freq=0.)
template<typename uni01_generator>
void ConditionalTrajNeutral (const uni01_generator &uni01, std::vector< double > *traj, const double &dt, const double &initial_freq=1., const double &final_freq=0.)
template<typename uni01_generator>
void ConditionalTraj_details (uni01_generator &uni01, std::vector< double > *traj, const unsigned &N, const double &s, const double &dt, const double &initial_frequency, const double &final_frequency)
template<typename uni01_generator>
void ConditionalTrajNeutral_details (uni01_generator &uni01, std::vector< double > *traj, const double &dt, const double &initial_freq, const double &final_freq)
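
The correspondence between the bottleneck parameters and the "ms" options quoted in the bottleneck description can be written out mechanically. The sketch below only formats the option string from tr, d, f, and recovered_size exactly as that description states; the function name bottleneck_ms_options is hypothetical and is not part of this library or of "ms".

```cpp
#include <cassert>
#include <cmath>
#include <sstream>
#include <string>

// Hypothetical helper: build the "ms" option string that the bottleneck
// documentation describes as equivalent to a given parameter set.
inline std::string bottleneck_ms_options(double tr, double d, double f,
                                         bool exponential_recovery = false,
                                         double recovered_size = 1.0)
{
    assert(d > 0.0 && f > 0.0 && recovered_size > 0.0);
    std::ostringstream o;
    o << "-eN 0 " << recovered_size << ' ';
    if (exponential_recovery) {
        // Growth rate taken from the description:
        // (log(recovered_size) - log(f)) / d
        const double G = (std::log(recovered_size) - std::log(f)) / d;
        o << "-eG " << tr << ' ' << G << " -eG " << (tr + d) << " 0 ";
    } else {
        // Stepwise bottleneck: size f between tr and tr+d.
        o << "-eN " << tr << ' ' << f << ' ';
    }
    o << "-eN " << (tr + d) << " 1";
    return o.str();
}
```

For example, tr = 0.5, d = 0.25, f = 0.5 with a stepwise recovery yields "-eN 0 1 -eN 0.5 0.5 -eN 0.75 1".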

Variables

const unsigned SEQMAXUNSIGNED = std::numeric_limits<unsigned>::max()
const double SEQMAXDOUBLE = std::numeric_limits<double>::max()
const char * basic_dna_alphabet = "[^AGTCN\\-]"
const char * full_dna_alphabet = "[^AGCTNXMRWSKVHDB\\-]"
const char * pep_alphabet = "[^ARNDBCQEZGHILKMFPSTWYV\\-]"
int MAX_SEGSITES
 Controls allocation of simulated gametes. You must define this in namespace Sequence in your program. A value of 200 works well.
int MAX_SEGS_INC
 Controls (re)allocation of simulated gametes. You must define this in namespace Sequence in your program. A value of 100 works well.
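
The three alphabet variables are negated character classes: each pattern matches any character that is not in the alphabet. A minimal sketch of how such a pattern can be used for validation follows, assuming (this listing does not confirm it) that a sequence is valid when the pattern finds no match. The function looks_valid is a hypothetical stand-in written with C++11 std::regex; it is not the library's validSeq.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Same pattern text as Sequence::basic_dna_alphabet: matches any character
// that is NOT A, G, T, C, N, or '-'.
static const char *const dna_pattern = "[^AGTCN\\-]";

// Hypothetical stand-in for validSeq: accept a sequence when the pattern
// finds no offending character anywhere in it.
inline bool looks_valid(const std::string &seq,
                        const char *pattern = dna_pattern,
                        bool icase = true)
{
    assert(pattern != 0);
    std::regex::flag_type flags = std::regex::ECMAScript;
    if (icase)
        flags |= std::regex::icase; // treat 'a' like 'A', and so on
    const std::regex offending(pattern, flags);
    return !std::regex_search(seq, offending);
}
```

With the default arguments, looks_valid("ACGT-N") is true and looks_valid("ACGU") is false, because U falls outside the basic DNA alphabet.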


Detailed Description

The namespace in which this library resides.

The entirety of this library is defined in namespace Sequence.


Typedef Documentation

typedef std::vector< std::pair<std::string,int> > Sequence::CodonUsageTable

A CodonUsageTable is a vector of pairs. In each pair, the first element is the codon, and the second element is an integer counting the number of occurrences of the codon.

Examples:
codons.cc.

Definition at line 54 of file typedefs.hpp.
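
To illustrate the shape of a CodonUsageTable, here is a minimal sketch that counts codons in a std::string. The typedef codon_table and the function count_codons are hypothetical; this is not the library's makeCodonUsageTable. Reading from position 0 in non-overlapping triplets and ignoring a trailing partial codon are assumptions of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Same shape as Sequence::CodonUsageTable.
typedef std::vector<std::pair<std::string, int> > codon_table;

// Hypothetical counter: tally non-overlapping triplets starting at
// position 0; a trailing partial codon is ignored.
inline codon_table count_codons(const std::string &seq)
{
    assert(seq.size() >= 3);
    std::map<std::string, int> counts;
    for (std::size_t i = 0; i + 3 <= seq.size(); i += 3)
        ++counts[seq.substr(i, 3)];
    // Copy the (codon, count) pairs out in the map's sorted order.
    return codon_table(counts.begin(), counts.end());
}
```

For the input "ATGAAAATGTT" the result holds two pairs, ("AAA", 1) and ("ATG", 2), and the final "TT" is dropped.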

typedef std::pair< double, std::string > Sequence::polymorphicSite

For polymorphism data, a site can be represented as a position (a double) and the characters at that position (a std::string).

Definition at line 61 of file typedefs.hpp.

typedef std::vector< polymorphicSite > Sequence::polySiteVector

A polymorphism data set can be represented as a vector of polymorphicSite objects.

Definition at line 67 of file typedefs.hpp.
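
The site-major layout described by polymorphicSite and polySiteVector can be obtained from a haplotype-major table by transposing it. The sketch below shows that idea with standard containers only; the typedefs site and site_vector and the function rotate_table are hypothetical names, and the code is not the library's rotatePolyTable.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

typedef std::pair<double, std::string> site; // same shape as polymorphicSite
typedef std::vector<site> site_vector;       // same shape as polySiteVector

// Hypothetical transpose: haplotypes[i][j] is the state of sequence i at
// the site whose position is positions[j]. The result stores one
// (position, column) pair per site.
inline site_vector rotate_table(const std::vector<double> &positions,
                                const std::vector<std::string> &haplotypes)
{
    site_vector sites;
    for (std::size_t j = 0; j < positions.size(); ++j) {
        std::string column;
        for (std::size_t i = 0; i < haplotypes.size(); ++i) {
            assert(haplotypes[i].size() == positions.size());
            column += haplotypes[i][j];
        }
        sites.push_back(site(positions[j], column));
    }
    return sites;
}
```

With positions 1 and 5 and haplotypes "AT", "AC", "GT", the result is (1, "AAG") followed by (5, "TCT").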


Enumeration Type Documentation

enum Sequence::GeneticCodes

Only UNIVERSAL (=0) is currently supported. The order of the genetic codes is that of NCBI's code tables, available at http://www.ncbi.nlm.nih.gov/htbin-post/Taxonomy/wprintgc?mode=c#SG2

Definition at line 46 of file SeqEnums.hpp.

enum Sequence::Mutations

Values: Unknown=0, Ts, and Tv.
Unknown means unknown, Ts means transition, and Tv means transversion.

Definition at line 51 of file SeqEnums.hpp.
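
As an illustration of the three Mutations values, the sketch below classifies a pair of nucleotides using the standard definitions: a transition is purine to purine (A, G) or pyrimidine to pyrimidine (C, T), and anything else between two known bases is a transversion. The function classify is hypothetical and is not the library's TsTv; returning Unknown for identical or non-ACGT characters is an assumption of the sketch.

```cpp
#include <cassert>
#include <cctype>

enum Mutations { Unknown, Ts, Tv }; // mirrors Sequence::Mutations

inline bool is_purine(char c) { return c == 'A' || c == 'G'; }
inline bool is_pyrimidine(char c) { return c == 'C' || c == 'T'; }

// Hypothetical classifier for a pair of nucleotide characters.
inline Mutations classify(char i, char j)
{
    assert(Unknown == 0); // the documentation states Unknown=0
    i = static_cast<char>(std::toupper(static_cast<unsigned char>(i)));
    j = static_cast<char>(std::toupper(static_cast<unsigned char>(j)));
    const bool known = (is_purine(i) || is_pyrimidine(i)) &&
                       (is_purine(j) || is_pyrimidine(j));
    if (!known || i == j)
        return Unknown; // assumption: no difference, or a non-ACGT state
    // Same chemical class on both sides means a transition.
    return (is_purine(i) == is_purine(j)) ? Ts : Tv;
}
```

Under these assumptions, classify('A', 'G') gives Ts, classify('A', 'C') gives Tv, and classify('A', 'N') gives Unknown.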

enum Sequence::Nucleotides

An enum type for nucleotide data. Comes in handy when you need to iterate over all possible bases, etc. The enum values are: A=0, T, G, C, N, GAP.

Definition at line 40 of file SeqEnums.hpp.
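
Because the enumerators are consecutive integers starting at A=0, iterating over the bases reduces to an integer loop. A minimal sketch, with a local copy of the enum and two hypothetical helpers (to_char and all_bases) that are not part of the library:

```cpp
#include <cassert>
#include <string>

enum Nucleotides { A, T, G, C, N, GAP }; // mirrors Sequence::Nucleotides

// Hypothetical mapping from an enum value to a display character.
inline char to_char(Nucleotides n)
{
    static const char symbols[] = "ATGCN-";
    assert(n >= A && n <= GAP);
    return symbols[n];
}

// Visit the four unambiguous bases in enum order (A through C).
inline std::string all_bases()
{
    std::string out;
    for (int i = A; i <= C; ++i)
        out += to_char(static_cast<Nucleotides>(i));
    return out;
}
```

Here all_bases() returns "ATGC", which reflects the enum's declared order rather than alphabetical order.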


Function Documentation

void Sequence::addIdentityChar ( PolyTable *  t,
const unsigned &  refseq,
const char &  identical 
)

Fill in a PolyTable with characters representing identity to some reference sequence ("refseq") in the data.

Definition at line 79 of file PolyTableFunctions.cc.
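
The idea behind identity characters can be sketched on a plain vector of strings: every state that matches the reference sequence is replaced by a marker. The function mark_identity below is a hypothetical stand-in that works on std::vector<std::string>, not on a PolyTable, and the default marker '.' is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical illustration: in every row except the reference row,
// replace each character equal to the reference state with `identical`.
inline void mark_identity(std::vector<std::string> *rows,
                          std::size_t refseq, char identical = '.')
{
    assert(rows != 0 && refseq < rows->size());
    const std::string ref = (*rows)[refseq]; // copy, so the reference stays intact
    for (std::size_t i = 0; i < rows->size(); ++i) {
        if (i == refseq)
            continue;
        std::string &row = (*rows)[i];
        assert(row.size() == ref.size());
        for (std::size_t j = 0; j < row.size(); ++j)
            if (row[j] == ref[j])
                row[j] = identical;
    }
}
```

With rows "ACGT", "ACTT", "GCGT" and row 0 as the reference, the table becomes "ACGT", "..T.", "G...".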

template<typename uni01_generator>
void Sequence::ConditionalTraj ( const uni01_generator &  uni01,
std::vector< double > *  traj,
const unsigned &  N,
const double &  s,
const double &  dt,
const double &  initial_frequency,
const double &  final_frequency 
) [inline]

Stochastic trajectory of beneficial mutations, following Coop and Griffiths (2004).

Parameters:
uni01 a random number generator returning a U(0,1].
traj A vector of length L, such that, for dt = 1/(k*2N), L/(k*2N) is the length of the sweep, in units of 2N generations.
dt The amount by which to increment time during simulation.
initial_frequency Initial frequency of the beneficial allele (i.e. 1/2N).
final_frequency Final frequency of the beneficial allele (1 means fixation).
Note:
For a diploid population of size N, this function will return trajectories equivalent to what one would get from a Wright-Fisher simulation of a haploid population of size 2N.

Definition at line 103 of file Trajectories.tcc.
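
The relationship between dt, the trajectory length L, and the duration of the sweep is simple arithmetic. The two helpers below are hypothetical (they are not library functions) and only restate the formulas from the traj parameter description.

```cpp
#include <cassert>
#include <cstddef>

// Time step for k updates per generation in a diploid population of size N,
// in units of 2N generations: dt = 1/(k*2N).
inline double time_step(unsigned k, unsigned N)
{
    assert(k > 0 && N > 0);
    return 1.0 / (static_cast<double>(k) * 2.0 * N);
}

// Duration, in units of 2N generations, of a trajectory that holds L
// frequencies spaced dt apart: L * dt = L/(k*2N).
inline double sweep_duration(std::size_t L, double dt)
{
    return static_cast<double>(L) * dt;
}
```

For example, with N = 1000 and k = 4 the step is dt = 1/8000 = 0.000125, so a trajectory of length L = 16000 spans 2 units of 2N generations, that is, 4000 generations.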

template<typename uni01_generator>
void Sequence::ConditionalTraj ( uni01_generator &  uni01,
std::vector< double > *  traj,
const unsigned &  N,
const double &  s,
const double &  dt,
const double &  initial_frequency,
const double &  final_frequency 
) [inline]

Stochastic trajectory of beneficial mutations, following Coop and Griffiths (2004).

Parameters:
uni01 a random number generator returning a U(0,1].
traj A vector of length L, such that, for dt = 1/(k*2N), L/(k*2N) is the length of the sweep, in units of 2N generations.
dt The amount by which to increment time during simulation.
initial_frequency Initial frequency of the beneficial allele (i.e. 1/2N).
final_frequency Final frequency of the beneficial allele (1 means fixation).
Note:
For a diploid population of size N, this function will return trajectories equivalent to what one would get from a Wright-Fisher simulation of a haploid population of size 2N.

Definition at line 78 of file Trajectories.tcc.

template<typename uni01_generator>
void Sequence::ConditionalTraj_details ( uni01_generator &  uni01,
std::vector< double > *  traj,
const unsigned &  N,
const double &  s,
const double &  dt,
const double &  initial_frequency,
const double &  final_frequency 
) [inline]

Implementation details

Definition at line 12 of file Trajectories.tcc.

template<typename uni01_generator>
void Sequence::ConditionalTrajNeutral ( const uni01_generator &  uni01,
std::vector< double > *  traj,
const double &  dt,
const double &  initial_freq,
const double &  final_freq 
) [inline]

Stochastic trajectory of a neutral allele, following Coop & Griffiths (2004) TPB, and Przeworski et al. (2005) Evolution. The simulation is backwards in time.

Parameters:
uni01 a random number generator returning a U(0,1] deviate.
traj A vector of length L, and for dt=1/(k*2N), the length of time, in units of 2N generations, is given by L/(k*2N). The vector describes the change in allele frequency from initial_freq to final_freq, in jumps in time of dt.
dt the amount by which to increment time during the simulation.
initial_freq Initial frequency of the neutral allele.
final_freq Final frequency of the neutral allele.

Definition at line 151 of file Trajectories.tcc.

template<typename uni01_generator>
void Sequence::ConditionalTrajNeutral ( uni01_generator &  uni01,
std::vector< double > *  traj,
const double &  dt,
const double &  initial_freq,
const double &  final_freq 
) [inline]

Stochastic trajectory of a neutral allele, following Coop & Griffiths (2004) TPB, and Przeworski et al. (2005) Evolution. The simulation is backwards in time.

Parameters:
uni01 a random number generator returning a U(0,1] deviate.
traj A vector of length L, and for dt=1/(k*2N), the length of time, in units of 2N generations, is given by L/(k*2N). The vector describes the change in allele frequency from initial_freq to final_freq, in jumps in time of dt.
dt the amount by which to increment time during the simulation.
initial_freq Initial frequency of the neutral allele.
final_freq Final frequency of the neutral allele.

Definition at line 128 of file Trajectories.tcc.

template<typename uni01_generator>
void Sequence::ConditionalTrajNeutral_details ( uni01_generator &  uni01,
std::vector< double > *  traj,
const double &  dt,
const double &  initial_freq,
const double &  final_freq 
) [inline]

Implementation details

Definition at line 49 of file Trajectories.tcc.

bool Sequence::containsCharacter ( const PolyTable *  t,
const char &  ch 
)

Returns:
true if t contains ch, false otherwise

Definition at line 33 of file PolyTableFunctions.cc.

bool Sequence::Different ( const std::string &  seq1,
const std::string &  seq2,
bool  skip_missing,
bool  nucleic_acid 
)

Ask if two strings are different. While this can normally be done by asking if (seq1 != seq2) {}, missing data pose a problem here. If skip_missing == 1, missing data (the 'N' character for nucleotide data, 'X' for amino acid) are not used to determine if the sequences are different. If nucleic_acid == 1, nucleotide data are assumed; if nucleic_acid == 0, protein data are assumed.

Note:
case-insensitive
Returns:
true if the seqs are different, false otherwise. If the two sequences are of different length, true is returned.

Definition at line 114 of file Comparisons.cc.
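
The comparison rules described above can be sketched without the library as follows. This is an illustrative reimplementation of the documented behaviour, not the code in Comparisons.cc; the namespace sketch and the function name different are hypothetical.

```cpp
#include <cctype>
#include <string>

namespace sketch {

// Upper-case a single character safely (std::toupper needs an unsigned char value).
inline char upper(char c)
{
    return static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
}

// True if the strings differ. Missing data ('N' for nucleotides, 'X' for
// amino acids) are ignored when skip_missing is true. Case-insensitive.
inline bool different(const std::string& seq1, const std::string& seq2,
                      bool skip_missing, bool nucleic_acid)
{
    if (seq1.size() != seq2.size()) {
        return true;  // documented: unequal lengths count as different
    }
    const char missing = nucleic_acid ? 'N' : 'X';
    for (std::string::size_type i = 0; i < seq1.size(); ++i) {
        const char a = upper(seq1[i]);
        const char b = upper(seq2[i]);
        if (skip_missing && (a == missing || b == missing)) {
            continue;  // this site carries no information
        }
        if (a != b) {
            return true;
        }
    }
    return false;
}

} // namespace sketch
```

With skip_missing set, "ACNT" and "ACGT" compare as not different; with it cleared, the N at the third site counts as a mismatch.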

std::pair< unsigned, shortestPath::pathType > Sequence::diffType ( const std::string &  codon1,
const std::string &  codon2,
const Sequence::GeneticCodes code 
) throw (Sequence::SeqException)

Parameters:
codon1 a std::string of length 3 representing a codon
codon2 a std::string of length 3 representing a codon
code the genetic code to use in translating the codons
Returns:
a std::pair<unsigned,shortestPath::pathType>. The first member of the pair takes a value of 0, 1, or 2, depending on the site at which the two codons differ (1st, 2nd, or 3rd position, respectively). If the codons differ at more than 1 site, or contain characters other than {A,G,C,T}, the first member will be set to Sequence::SEQMAXUNSIGNED. The second member will have the value Sequence::shortestPath::pathType::N if the change is nonsynonymous, Sequence::shortestPath::pathType::S if synonymous, Sequence::shortestPath::pathType::NONE if the codons don't differ, and Sequence::shortestPath::pathType::AMBIG if either codon contains characters other than {A,G,C,T}.
Precondition:
(codon1.length()==3 && codon2.length() == 3)

Definition at line 425 of file shortestPath.cc.
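
The rule for the first member of the returned pair can be sketched on its own. The code below is a hypothetical helper, not part of the library: it does no translation, so it says nothing about the second member, it accepts upper-case characters only, and it assumes that identical codons also map to the sentinel value, which the text above does not state.

```cpp
#include <limits>
#include <string>

namespace sketch {

// Stand-in for Sequence::SEQMAXUNSIGNED, documented as the maximum unsigned value.
const unsigned MAXUNSIGNED = std::numeric_limits<unsigned>::max();

inline bool isACGT(char c)
{
    return c == 'A' || c == 'C' || c == 'G' || c == 'T';
}

// Returns 0, 1 or 2 when the codons differ at exactly one position,
// and MAXUNSIGNED in every other case.
inline unsigned singleDiffPosition(const std::string& codon1,
                                   const std::string& codon2)
{
    if (codon1.size() != 3 || codon2.size() != 3) {
        return MAXUNSIGNED;  // precondition violated
    }
    unsigned pos = MAXUNSIGNED;
    unsigned ndiffs = 0;
    for (unsigned i = 0; i < 3; ++i) {
        if (!isACGT(codon1[i]) || !isACGT(codon2[i])) {
            return MAXUNSIGNED;  // ambiguous character
        }
        if (codon1[i] != codon2[i]) {
            pos = i;
            ++ndiffs;
        }
    }
    return ndiffs == 1 ? pos : MAXUNSIGNED;
}

} // namespace sketch
```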

boost::tuple< shortestPath::pathType, shortestPath::pathType, shortestPath::pathType > Sequence::diffTypeMulti ( const std::string &  codon1,
const std::string &  codon2,
const Sequence::GeneticCodes code 
) throw (Sequence::SeqException)

Returns:
a tuple representing the type of single position changes between codon1 and codon2. There is one value in the tuple for each codon position.
Note:
The values are assigned as follows: for each position i, swap the i-th state between codon1 and codon2. If this results in a replacement change in both cases, record shortestPath::N. If it's synonymous in both cases, record shortestPath::S. If the swap results in no change at all (i.e. the two bases are identical), record shortestPath::NONE. For all other cases, record shortestPath::AMBIG. This function is most useful for identifying mutations that can be unambiguously classified as silent or replacement. Note that, if one considers the pathways possible between codons, all sites can be assigned as N or S. For such applications, use Sequence::shortestPath.

Definition at line 477 of file shortestPath.cc.

void Sequence::fillIn ( PolyTable *  t,
const unsigned &  refseq,
const char &  identical 
)

Sometimes polymorphism data contain a special character that means that a particular state is identical to a reference sequence in the data. This function replaces that character with the state of the reference sequence.

Parameters:
t a PolyTable
refseq the index of the reference sequence
identical the character used to represent identity to the refseq

Definition at line 51 of file PolyTableFunctions.cc.
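
A minimal sketch of the substitution described above, operating on a plain std::vector<std::string> instead of a PolyTable. The container choice and the function name are assumptions made so that the example is self-contained; the library routine takes a PolyTable pointer.

```cpp
#include <cstddef>
#include <string>
#include <vector>

namespace sketch {

// Replace every occurrence of `identical` with the reference sequence's
// state at the same site. The reference row itself is left untouched.
inline void fillIn(std::vector<std::string>& rows, std::size_t refseq,
                   char identical)
{
    if (refseq >= rows.size()) {
        return;  // no such reference sequence
    }
    const std::string ref = rows[refseq];  // copy, so rows can be edited freely
    for (std::size_t i = 0; i < rows.size(); ++i) {
        if (i == refseq) {
            continue;
        }
        for (std::size_t j = 0; j < rows[i].size() && j < ref.size(); ++j) {
            if (rows[i][j] == identical) {
                rows[i][j] = ref[j];
            }
        }
    }
}

} // namespace sketch
```

For rows "ACGT", ".C.A" and "T..T" with reference index 0 and identity character '.', the second and third rows become "ACGA" and "TCGT".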

template<typename Iterator>
bool Sequence::Gapped ( Iterator  beg,
Iterator  end,
const char &  gapchar = '-' 
) [inline]

Parameters:
beg an iterator
end an iterator
gapchar a character representing an alignment gap
Returns:
true if gapchar is present in the range [beg,end), false otherwise
Examples:
gestimator.cc.

Definition at line 65 of file Comparisons.hpp.
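
The documented contract (true if gapchar occurs anywhere in [beg,end)) matches what std::find provides, so a sketch is short. This is an illustration in a hypothetical namespace, not the contents of Comparisons.hpp.

```cpp
#include <algorithm>
#include <string>

namespace sketch {

// True if gapchar is present in the half-open range [beg, end).
template <typename Iterator>
inline bool gapped(Iterator beg, Iterator end, const char& gapchar = '-')
{
    return std::find(beg, end, gapchar) != end;
}

} // namespace sketch
```

Passing a different third argument, such as '.', covers alignments that use another gap symbol.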

bool Sequence::Gapped ( const std::string &  s  ) 

Ask if the std::string contains a gap character.

Returns:
true if the string contains gaps, false otherwise
Note:
The only gap character checked so far is '-'. Use the template version for other gap characters.
Deprecated:

Definition at line 196 of file Comparisons.cc.

template<typename Iterator>
bool Sequence::internalGapCheck ( Iterator  beg,
Iterator  end,
const char &  gapchar = '-',
const unsigned &  mod = 3 
) [inline]

This function checks a range for internal gaps that meet a certain length requirement. The requirement is that length % mod == 0 for each gap. The value true is returned if this is not the case, false otherwise. One use of this function is to check that the internal gaps in an aligned CDS sequence are all multiples of 3 in length.

Definition at line 87 of file SeqUtilities.hpp.

CodonUsageTable Sequence::makeCodonUsageTable ( std::string::const_iterator  beg,
std::string::const_iterator  end 
)

Parameters:
beg a const_iterator to the beginning of a std::string or Sequence::Seq
end a const_iterator to the end of a std::string or Sequence::Seq
Returns:
an object of type Sequence::CodonUsageTable
Note:
beg and end can be adjusted to point to the first and last positions in a CDS

Definition at line 101 of file CodonTable.cc.

CodonUsageTable Sequence::makeCodonUsageTable ( const std::string &  sequence  ) 

Parameters:
sequence an object of type std::string
Returns:
an object of type Sequence::CodonUsageTable
Note:
Assumes first character of sequence is a first codon position

Definition at line 91 of file CodonTable.cc.

CodonUsageTable Sequence::makeCodonUsageTable ( const Seq *  sequence  ) 

Parameters:
sequence an object of type Sequence::Seq
Returns:
an object of type Sequence::CodonUsageTable
Note:
Assumes first character of sequence is a first codon position.

Definition at line 81 of file CodonTable.cc.
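
The idea behind the three overloads, tallying codons from a range whose first character is a first codon position, can be sketched as below. The layout of Sequence::CodonUsageTable is not described on this page, so the example uses a std::map<std::string, unsigned> as a hypothetical stand-in; it records only codons that actually occur and drops a trailing partial codon, both of which are assumptions.

```cpp
#include <map>
#include <string>

namespace sketch {

// Hypothetical stand-in for a codon usage table: codon -> number of occurrences.
typedef std::map<std::string, unsigned> CodonCounts;

// Count non-overlapping triplets, assuming *beg is a first codon position.
inline CodonCounts countCodons(std::string::const_iterator beg,
                               std::string::const_iterator end)
{
    CodonCounts counts;
    while (end - beg >= 3) {
        ++counts[std::string(beg, beg + 3)];
        beg += 3;
    }
    return counts;  // any 1 or 2 leftover characters are ignored
}

} // namespace sketch
```

To restrict the count to a CDS embedded in a longer string, offset the two iterators before the call.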

template<typename Iterator>
std::map< typename std::iterator_traits< Iterator >::value_type, unsigned > Sequence::makeCountList ( Iterator  beg,
Iterator  end 
) [inline]

Parameters:
beg an iterator
end an iterator
Returns:
a std::map< type, unsigned >, where type is the iterator_traits<Iterator>::value_type of Iterator. The keys are the (unique) elements present in the range, and the unsigned values are the number of times each element occurs.
Note:
This function can be used as an alternative to Sequence::stateCounter if you want to count more than just strict DNA characters.
Examples:
baseComp.cc.

Definition at line 62 of file SeqUtilities.hpp.
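
The return type in the signature above suggests a straightforward implementation on top of std::map::operator[]. The following is a sketch in a hypothetical namespace, not a copy of SeqUtilities.hpp.

```cpp
#include <iterator>
#include <map>
#include <string>

namespace sketch {

// Map each distinct element of [beg, end) to the number of times it occurs.
template <typename Iterator>
inline std::map<typename std::iterator_traits<Iterator>::value_type, unsigned>
makeCountList(Iterator beg, Iterator end)
{
    std::map<typename std::iterator_traits<Iterator>::value_type, unsigned> counts;
    for (; beg != end; ++beg) {
        ++counts[*beg];  // operator[] value-initialises new entries to 0
    }
    return counts;
}

} // namespace sketch
```

Because the key type is deduced from the iterator, the same function counts characters in a std::string or integers in a std::vector<int>.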

template<typename ForwardIterator>
std::pair< double, double > Sequence::meanAndVar ( ForwardIterator  beg,
ForwardIterator  end 
) [inline]

A function to calculate the mean and variance of the values stored in a container. The rationale is that when both the mean and the variance (a sum of squares) are needed, it is more efficient to calculate them together, because you only go over the data once.

Examples:
critical_values.cc.

Definition at line 66 of file descriptiveStats.tcc.

void Sequence::minimize_arg ( Sequence::arg sample_history  ) 

Takes an arg (Ancestral Recombination Graph) and removes redundant marginal trees. Specifically, for two adjacent trees i and j (j->beg > i->beg), j is removed from the arg if the topology and branch lengths of i and j are identical.

Parameters:
sample_history the arg to minimize
Examples:
fragments.cc.

Definition at line 165 of file CoalescentTreeOperations.cc.

bool Sequence::NotAGap ( const char &  c  ) 

Returns:
true if c is not a gap character, false otherwise.
Note:
Currently, only '-' is considered to be a gap character

Definition at line 208 of file Comparisons.cc.

unsigned Sequence::NumDiffs ( const std::string &  seq1,
const std::string &  seq2,
bool  skip_missing,
bool  nucleic_acid 
)

Returns:
the number of differences between two std::strings. Can skip missing data in the same fashion as Comparisons::Different. If one sequence is shorter than the other, the number of positions compared is the length of the shorter sequence.

Definition at line 156 of file Comparisons.cc.

std::ostream & Sequence::operator<< ( std::ostream &  o,
const newick_stream_marginal_tree &  n 
)

Returns:
n.print(o);

Definition at line 444 of file CoalescentSimTypes.cc.

std::istream & Sequence::operator>> ( std::istream &  i,
newick_stream_marginal_tree &  n 
)

Returns:
n.read(i);

Definition at line 452 of file CoalescentSimTypes.cc.

void Sequence::output_gametes ( FILE *  fp,
const unsigned &  segsites,
const unsigned &  nsam,
const gamete_storage_type &  gametes 
)

Write an object of type gamete_storage_type to a C-style file stream. This function is used when you need to output simulated gametes using a method faster than the operator<< for class SimData.

Parameters:
fp pointer to an open C-style output stream
segsites the number of segregating sites in gametes
nsam the number of individuals in gametes
gametes the simulated sample. Must be allocated to hold at least segsites positions, and nsam strings of length segsites
Examples:
freerec.cc, and msbeta.cc.

Definition at line 28 of file CoalescentMutation.cc.

template<typename iter1, typename iter2, typename correlation_type, typename comparison_function, typename UniformIntGenerator>
ensureFloating< typename std::iterator_traits< iter1 >::value_type, typename std::iterator_traits< iter2 >::value_type >::type Sequence::PermuteCorrelation ( iter1  beg_x,
iter1  end_x,
iter2  beg_y,
const correlation_type &  corr,
const comparison_function &  comp,
const UniformIntGenerator &  rand,
const unsigned &  NPERM 
) [inline]

Obtain the p-value of a correlation coefficient by permutation. This function can be used to get 1- or 2-tailed p-values by using different comparison_function objects. For example, using std::greater_equal<double> will return the 1-tailed probability of observing a correlation >= the observed value.

Parameters:
beg_x pointer to the beginning of the range of the 1st vector
end_x pointer to the end of the range of the 1st vector
beg_y pointer to the beginning of the range of the 2nd vector
corr a function object to calculate the correlation statistic (e.g. ProductMoment)
comp a comparison function
NPERM number of permutations to do
rand a function returning a random integer (must be compatible with std::random_shuffle)
Note:
This function keeps the order of the 2 containers intact.

Definition at line 237 of file Correlations.tcc.

template<typename iter1, typename iter2, typename correlation_type, typename comparison_function, typename UniformIntGenerator>
ensureFloating< typename std::iterator_traits< iter1 >::value_type, typename std::iterator_traits< iter2 >::value_type >::type Sequence::PermuteCorrelation ( iter1  beg_x,
iter1  end_x,
iter2  beg_y,
const correlation_type &  corr,
const comparison_function &  comp,
UniformIntGenerator &  rand,
const unsigned &  NPERM 
) [inline]

Obtain the p-value of a correlation coefficient by permutation. This function can be used to get 1- or 2-tailed p-values by using different comparison_function objects. For example, using std::greater_equal<double> will return the 1-tailed probability of observing a correlation >= the observed value.

Parameters:
beg_x pointer to the beginning of the range of the 1st vector
end_x pointer to the end of the range of the 1st vector
beg_y pointer to the beginning of the range of the 2nd vector
corr a function object to calculate the correlation statistic (e.g. ProductMoment)
comp a comparison function
NPERM number of permutations to do
rand a function returning a random integer (must be compatible with std::random_shuffle)
Note:
This function keeps the order of the 2 containers intact.
Examples:
correlations.cc.

Definition at line 208 of file Correlations.tcc.
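
The permutation procedure can be sketched in standalone form as follows. This is not the code in Correlations.tcc: it fixes the statistic to Pearson's correlation, works on std::vector<double>, and shuffles with std::shuffle and a standard random engine instead of the std::random_shuffle-compatible generator the library expects. All names in the namespace sketch are hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <functional>
#include <random>
#include <vector>

namespace sketch {

// Pearson's product-moment correlation. Assumes equal lengths and non-constant data.
inline double pearson(const std::vector<double>& x, const std::vector<double>& y)
{
    const std::size_t n = x.size();
    double sx = 0, sy = 0, sxx = 0, syy = 0, sxy = 0;
    for (std::size_t i = 0; i < n; ++i) {
        sx += x[i];
        sy += y[i];
        sxx += x[i] * x[i];
        syy += y[i] * y[i];
        sxy += x[i] * y[i];
    }
    const double cov = sxy - sx * sy / n;
    const double vx = sxx - sx * sx / n;
    const double vy = syy - sy * sy / n;
    return cov / std::sqrt(vx * vy);
}

// Fraction of nperm shuffles for which comp(permuted r, observed r) holds.
// Assumes nperm > 0.
template <typename Compare, typename Rng>
inline double permuteCorrelation(const std::vector<double>& x,
                                 const std::vector<double>& y,
                                 const Compare& comp, Rng& rng, unsigned nperm)
{
    const double observed = pearson(x, y);
    std::vector<double> shuffled(y);  // work on a copy: caller's order is kept
    unsigned hits = 0;
    for (unsigned p = 0; p < nperm; ++p) {
        std::shuffle(shuffled.begin(), shuffled.end(), rng);
        if (comp(pearson(x, shuffled), observed)) {
            ++hits;
        }
    }
    return static_cast<double>(hits) / nperm;
}

} // namespace sketch
```

Passing std::greater_equal<double>() gives an upper-tail p-value and std::less_equal<double>() a lower-tail one; a two-tailed test would need a comparison on absolute values.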

bool Sequence::PolyTableValid ( const PolyTable *  table  ) 

Returns:
true if the following conditions are met: first, the length of every row in the table is equal to the length of the vector of positions; second, all of the characters in the data are members of the set {A,G,C,T,N,-} (case-insensitive).
This function is useful if you play around with PolyTable objects in non-const contexts, or read them in from files and need to check that the data are compatible with other routines in this library. This routine can be thought of as a PolyTable equivalent to Alignment::validForPolyAnalysis, which works on ranges of Sequence::Seq objects.

Definition at line 185 of file PolyTableFunctions.cc.
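
The two conditions can be sketched against plain containers. The example assumes a table is represented as a vector of positions plus a vector of row strings, which is a simplification introduced here; the library function takes a PolyTable pointer.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

namespace sketch {

// True if every row is as long as `positions` and contains only
// characters from {A,G,C,T,N,-}, compared case-insensitively.
inline bool polyTableValid(const std::vector<double>& positions,
                           const std::vector<std::string>& rows)
{
    const std::string alphabet("AGCTN-");
    for (std::size_t i = 0; i < rows.size(); ++i) {
        if (rows[i].size() != positions.size()) {
            return false;  // condition 1: row length matches number of positions
        }
        for (std::size_t j = 0; j < rows[i].size(); ++j) {
            const char c = static_cast<char>(
                std::toupper(static_cast<unsigned char>(rows[i][j])));
            if (alphabet.find(c) == std::string::npos) {
                return false;  // condition 2: character outside the allowed set
            }
        }
    }
    return true;
}

} // namespace sketch
```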

void Sequence::RemoveGaps ( PolyTable *  table,
const char &  gapchar 
) throw (SeqException)

Removes all positions containing gapchar from table

Exceptions:
SeqException if unable to assign transformed data to table (via PolyTable::assign)

Definition at line 108 of file PolyTableFunctions.cc.
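
A sketch of the column filter, again using a vector of positions and a vector of row strings as a simplified stand-in for a PolyTable. It assumes every row has exactly positions.size() characters and does not model the SeqException path.

```cpp
#include <cstddef>
#include <string>
#include <vector>

namespace sketch {

// Drop every column (site) in which at least one row carries gapchar.
// Precondition: each row has exactly positions.size() characters.
inline void removeGaps(std::vector<double>& positions,
                       std::vector<std::string>& rows, char gapchar)
{
    std::vector<double> keptPos;
    std::vector<std::string> keptRows(rows.size());
    for (std::size_t j = 0; j < positions.size(); ++j) {
        bool hasGap = false;
        for (std::size_t i = 0; i < rows.size(); ++i) {
            if (rows[i][j] == gapchar) {
                hasGap = true;
                break;
            }
        }
        if (!hasGap) {
            keptPos.push_back(positions[j]);
            for (std::size_t i = 0; i < rows.size(); ++i) {
                keptRows[i] += rows[i][j];
            }
        }
    }
    positions.swap(keptPos);
    rows.swap(keptRows);
}

} // namespace sketch
```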

void Sequence::RemoveInvariantColumns ( PolyTable *  table,
const bool &  skipOutgroup,
const unsigned &  outgroup 
) throw (SeqException)

Goes through the data and removes any columns that contain only 1 state. There is an option to ignore the state of the outgroup in this calculation.

Exceptions:
SeqException if unable to assign transformed data to table (via PolyTable::assign)

Definition at line 144 of file PolyTableFunctions.cc.
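
The invariant-column filter, including the outgroup option, can be sketched as below. The table is modelled as a vector of positions and a vector of row strings, every character (including N or a gap) is counted as a state, and each row is assumed to have positions.size() characters; these are simplifications for illustration, not documented behaviour.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

namespace sketch {

// Keep only columns with more than one distinct state. When skipOutgroup is
// true, the row at index `outgroup` does not contribute to that decision.
inline void removeInvariantColumns(std::vector<double>& positions,
                                   std::vector<std::string>& rows,
                                   bool skipOutgroup, std::size_t outgroup)
{
    std::vector<double> keptPos;
    std::vector<std::string> keptRows(rows.size());
    for (std::size_t j = 0; j < positions.size(); ++j) {
        std::set<char> states;
        for (std::size_t i = 0; i < rows.size(); ++i) {
            if (skipOutgroup && i == outgroup) {
                continue;
            }
            states.insert(rows[i][j]);
        }
        if (states.size() > 1) {
            keptPos.push_back(positions[j]);
            for (std::size_t i = 0; i < rows.size(); ++i) {
                keptRows[i] += rows[i][j];  // the outgroup row is kept in the output
            }
        }
    }
    positions.swap(keptPos);
    rows.swap(keptRows);
}

} // namespace sketch
```

A column that varies only because of the outgroup is retained when skipOutgroup is false and removed when it is true.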

Mutations Sequence::TsTv ( int  i,
int  j 
)

Takes two ints, assumed to be integer representations of nucleotides. The way to ensure that the int represents a nucleotide in a valid way is to use Sequence::Nucleotides. The return value is determined by a call to Comparisons::TsTv(int i, int j), where the ints are defined in turn by Sequence::Nucleotides

Definition at line 90 of file Comparisons.cc.

Mutations Sequence::TsTv ( char  i,
char  j 
)

Takes two chars, assumed to be nucleotides. The integer returned by this function is a member of the enumeration type Sequence::Mutations.

Definition at line 32 of file Comparisons.cc.
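
The classification itself follows the standard definition: a change within the purines (A, G) or within the pyrimidines (C, T) is a transition, and a change between the two classes is a transversion. The sketch below uses a local enum with hypothetical enumerator names, because the members of Sequence::Mutations are not listed on this page; returning Unknown for identical or non-ACGT input is likewise an assumption.

```cpp
#include <cctype>

namespace sketch {

// Hypothetical local enum standing in for Sequence::Mutations.
enum Mutations { Unknown = 0, Ts, Tv };

inline bool isPurine(char c) { return c == 'A' || c == 'G'; }
inline bool isPyrimidine(char c) { return c == 'C' || c == 'T'; }

// Classify the change between two nucleotides (case-insensitive).
inline Mutations tsTv(char i, char j)
{
    i = static_cast<char>(std::toupper(static_cast<unsigned char>(i)));
    j = static_cast<char>(std::toupper(static_cast<unsigned char>(j)));
    const bool known = (isPurine(i) || isPyrimidine(i)) &&
                       (isPurine(j) || isPyrimidine(j));
    if (!known || i == j) {
        return Unknown;  // nothing to classify
    }
    // Same chemical class: transition. Different class: transversion.
    return (isPurine(i) == isPurine(j)) ? Ts : Tv;
}

} // namespace sketch
```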

template<typename Iter>
bool Sequence::validSeq ( Iter  beg,
Iter  end,
const char *  _pattern = Sequence::basic_dna_alphabet,
const bool  icase = true 
) [inline]

Parameters:
beg an iterator to the beginning of a range
end an iterator to the end of a range
_pattern the (complement of the) alphabet as a regular expression.
icase defaults to case-insensitive matching. Pass "false" to make matching case-sensitive. The character set is complemented because we test for characters that are not in the alphabet.
Returns:
true if beg and end define a range of valid characters. The range is valid if and only if all characters in the range are present in the pattern (i.e. are not part of the set of characters that complement the pattern)
Note:
requires the boost_regex library to compile (see http://www.boost.org)
Examples:
valid_dna.cc.

Definition at line 56 of file SeqRegexes.hpp.
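
The complement-pattern approach can be sketched with std::regex from the C++11 standard library in place of boost_regex, which is a substitution made for this example. A range is valid when a search for any character outside the alphabet finds nothing. The constant below copies the documented value of Sequence::basic_dna_alphabet into a hypothetical namespace.

```cpp
#include <regex>
#include <string>

namespace sketch {

// Same pattern as the documented Sequence::basic_dna_alphabet: it matches
// any single character that is NOT in the minimal DNA alphabet.
const char* const basic_dna_alphabet = "[^AGTCN\\-]";

// True if no character in [beg, end) matches the complement pattern.
// Requires bidirectional iterators over char.
template <typename Iter>
inline bool validSeq(Iter beg, Iter end,
                     const char* pattern = basic_dna_alphabet,
                     bool icase = true)
{
    const std::regex::flag_type flags =
        icase ? (std::regex::ECMAScript | std::regex::icase)
              : std::regex::ECMAScript;
    const std::regex invalid(pattern, flags);
    return !std::regex_search(beg, end, invalid);
}

} // namespace sketch
```

Searching for the complement lets the function stop at the first offending character instead of matching the whole range against the alphabet.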


Variable Documentation

const char* Sequence::basic_dna_alphabet = "[^AGTCN\\-]"

A regex for the complement of the minimal DNA alphabet

Definition at line 41 of file SeqRegexes.hpp.

const char* Sequence::full_dna_alphabet = "[^AGCTNXMRWSKVHDB\\-]"

A regex for the complement of a complete DNA alphabet

Examples:
valid_dna.cc.

Definition at line 46 of file SeqRegexes.hpp.

const char* Sequence::pep_alphabet = "[^ARNDBCQEZGHILKMFPSTWYV\\-]"

A regex for the complement of an amino acid alphabet

Definition at line 51 of file SeqRegexes.hpp.

const double Sequence::SEQMAXDOUBLE = std::numeric_limits<double>::max()

The maximum value of a double.

Definition at line 50 of file SeqConstants.cc.

const unsigned Sequence::SEQMAXUNSIGNED = std::numeric_limits<unsigned>::max()

The maximum value of an unsigned integer.

Definition at line 46 of file SeqConstants.cc.


Generated on Wed Feb 4 09:31:49 2009 for libsequence by  doxygen 1.5.6