Sequence::_PolySNPImpl Struct Reference

#include <PolySNPimpl.hpp>


Public Member Functions

void preprocess (void)
 _PolySNPImpl (const Sequence::PolyTable *data, bool haveOutgroup, unsigned outgroup, bool totMuts)

Public Attributes

const PolyTable * _data
unsigned _nsites
unsigned _nsam
unsigned _outgroup
bool _haveOutgroup
bool _totMuts
unsigned _totsam
unsigned _DVK
double _DVH
bool _counted_singletons
bool _know_pi
bool _CalculatedDandV
double _pi
unsigned _singletons
unsigned _walls_Bprime
unsigned _NumPoly
double _walls_B
double _walls_Q
bool _calculated_wall_stats
std::vector< Sequence::stateCounter > _counts
std::vector< std::pair< bool, Sequence::stateCounter > > _derivedCounts
bool _preprocessed

Detailed Description

Implementation details for PolySNP. This class is visible so that it can be accessed from classes derived from PolySNP. A PolySNP object holds a pointer to an instance of this class as a protected member.

Definition at line 30 of file PolySNPimpl.hpp.


Member Function Documentation

void Sequence::_PolySNPImpl::preprocess (void)

This routine takes the data and obtains count information for each possible character state at each site. The various summary statistics that depend on frequency spectrum information are all O(n x S) to compute, where n is the sample size and S is the number of sites. However, these statistics all depend on essentially the same features of the data, so it is inefficient to recount the states for every statistic. This function obtains the counts for each site once. The efficiency gain is large, especially for large data sets, because each summary statistic is reduced from an O(n x S) to an O(S) calculation. The increase in run-time efficiency comes at the cost of allocating 2 vectors whose sizes are linear in S.
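The idea can be illustrated with a minimal, self-contained sketch. It is not the libsequence implementation: SiteCounts, count_states, num_polymorphic and nucleotide_diversity are hypothetical names, the sample is assumed to be a vector of equal-length strings over A, C, G and T, and only one vector of counts is kept (no outgroup or derived-state counts).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical per-site tally of character states
// (a stand-in for illustration, not Sequence::stateCounter).
struct SiteCounts {
    std::array<unsigned, 4> n{};  // counts of A, C, G, T
    unsigned total() const { return n[0] + n[1] + n[2] + n[3]; }
};

// Single O(n x S) pass: tally the states observed at every site.
std::vector<SiteCounts> count_states(const std::vector<std::string>& sample) {
    if (sample.empty()) return {};
    const std::size_t S = sample.front().size();
    std::vector<SiteCounts> counts(S);  // storage is linear in S
    for (const std::string& seq : sample) {
        assert(seq.size() == S);  // assumption: aligned, equal-length sequences
        for (std::size_t s = 0; s < S; ++s) {
            switch (seq[s]) {
                case 'A': ++counts[s].n[0]; break;
                case 'C': ++counts[s].n[1]; break;
                case 'G': ++counts[s].n[2]; break;
                case 'T': ++counts[s].n[3]; break;
                default: break;  // other characters are ignored in this sketch
            }
        }
    }
    return counts;
}

// O(S): number of sites with more than one observed state.
unsigned num_polymorphic(const std::vector<SiteCounts>& counts) {
    unsigned k = 0;
    for (const SiteCounts& c : counts) {
        unsigned states = 0;
        for (unsigned x : c.n) states += (x > 0);
        k += (states > 1);
    }
    return k;
}

// O(S): sum over sites of 1 - sum_i x_i (x_i - 1) / (n (n - 1)),
// i.e. the mean number of pairwise differences, from the cached counts.
double nucleotide_diversity(const std::vector<SiteCounts>& counts) {
    double pi = 0.0;
    for (const SiteCounts& c : counts) {
        const double n = c.total();
        if (n < 2.0) continue;  // need at least two observations at a site
        double homozygosity = 0.0;
        for (unsigned x : c.n) homozygosity += x * (x - 1.0) / (n * (n - 1.0));
        pi += 1.0 - homozygosity;
    }
    return pi;
}
```

Under these assumptions, count_states is called once per data set, and each statistic then reads the cached counts without touching the sequences again.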

Definition at line 92 of file PolySNP.cc.


The documentation for this struct was generated from the following files:

Generated on Mon Jul 12 15:22:04 2010 for libsequence by  doxygen 1.6.1