
ecology

Watterson estimator

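*Population mutation rate, population genetic diversity

Estimated by counting the number of polymorphic sites.

Relates the observed nucleotide diversity to the effective population size and the per-generation mutation rate.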


Watterson's estimator is a method for describing the genetic diversity in a population.

It is estimated by counting the number of polymorphic sites.

It is a measure of the "population mutation rate" from the observed nucleotide diversity of a population:

$\theta = 4N_e\mu$, where $N_e$ is the effective population size and $\mu$ is the per-generation (neutral) mutation rate of the population of interest (Watterson (1975)).
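The relation between the population mutation rate and the effective population size can be inverted when an independent estimate of the mutation rate is available. The sketch below assumes the diploid convention $\theta = 4N_e\mu$ and assumes that both inputs are on the same scale (both per site or both per locus). The function name and the input values are hypothetical and chosen only for illustration.

```python
def effective_population_size(theta: float, mu: float) -> float:
    """Solve theta = 4 * N_e * mu for N_e (diploid convention, an assumption)."""
    if mu <= 0:
        raise ValueError("mu must be positive")
    return theta / (4.0 * mu)


# Illustrative inputs only, not measurements:
# theta = 0.001 per site and mu = 1e-8 per site per generation.
n_e = effective_population_size(theta=0.001, mu=1e-8)
print(round(n_e))  # 25000
```

In practice the result is only as reliable as the assumed mutation rate, so it is best read as an order-of-magnitude figure.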


The assumptions made are that there is a sample of $n$ haploid individuals from the population of interest,

that there are infinitely many sites capable of varying (so that mutations never overlay or reverse one another),

and that $n \ll N_e$.

Because the number of segregating sites counted will increase with the number of sequences looked at, the correction factor $a_n$ is used.


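The estimate of $\theta$, often denoted as $\hat{\theta}_w$, is

$\hat{\theta}_w = \frac{K}{a_n}$,

where $K$ is the number of segregating sites (an example of a segregating site would be a single-nucleotide polymorphism) in the sample and

$a_n = \sum_{i=1}^{n-1} \frac{1}{i}$

is the $(n-1)$th harmonic number.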
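As a minimal sketch of this calculation, the code below counts segregating sites in a small alignment and divides by the harmonic correction factor. It assumes equal-length haploid sequences given as plain strings, with no gaps or ambiguity codes; the function names and the toy alignment are hypothetical. The result is a per-locus value; dividing by the alignment length would give a per-site value.

```python
from typing import Sequence


def harmonic_correction(n: int) -> float:
    """Return a_n = sum_{i=1}^{n-1} 1/i for a sample of n sequences."""
    if n < 2:
        raise ValueError("need at least two sequences")
    return sum(1.0 / i for i in range(1, n))


def count_segregating_sites(seqs: Sequence[str]) -> int:
    """Count alignment columns that carry more than one distinct base."""
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def watterson_theta(seqs: Sequence[str]) -> float:
    """Per-locus Watterson estimate: K / a_n."""
    return count_segregating_sites(seqs) / harmonic_correction(len(seqs))


# Toy alignment (made up): columns 2 and 5 are polymorphic, so K = 2.
sample = ["ATGCA", "ATGCT", "ACGCA", "ATGCA"]
# With n = 4, a_n = 1 + 1/2 + 1/3 = 11/6, so the estimate is 12/11.
print(watterson_theta(sample))
```

Because $a_n$ grows only logarithmically with $n$, adding sequences to an already large sample changes the divisor slowly.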


This estimate is based on coalescent theory.

Watterson's estimator is commonly used for its simplicity. 

When its assumptions are met, the estimator is unbiased and

the variance of the estimator decreases with increasing sample size or recombination rate.


However, the estimator can be biased by population structure.

For example, $\hat{\theta}_w$ is downwardly biased in an exponentially growing population.


It can also be biased by violation of the infinite-sites mutational model; 

if multiple mutations can overwrite one another, Watterson's estimator will be biased downward.


Comparing the value of the Watterson estimator to nucleotide diversity

is the basis of Tajima's D, which allows inference of the evolutionary regime of a given locus.
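The comparison can be sketched in code as follows. This is an illustration under stated assumptions, not a reference implementation: sequences are equal-length strings without gaps, nucleotide diversity is taken as the mean number of pairwise differences per locus, and the normalizing constants follow the standard formulation of Tajima's D. The function names and the toy alignment are hypothetical.

```python
from itertools import combinations
from math import sqrt
from typing import Sequence


def segregating_sites(seqs: Sequence[str]) -> int:
    """Count alignment columns with more than one distinct base."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def nucleotide_diversity(seqs: Sequence[str]) -> float:
    """Mean number of pairwise differences (per locus, not per site)."""
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(x != y for x, y in zip(s, t)) for s, t in pairs)
    return diffs / len(pairs)


def tajimas_d(seqs: Sequence[str]) -> float:
    """Normalized difference between nucleotide diversity and K / a_n."""
    n = len(seqs)
    if n < 4:
        # For n < 4 the variance term below is zero, so D is undefined.
        raise ValueError("need at least four sequences")
    k = segregating_sites(seqs)
    if k == 0:
        raise ValueError("D is undefined without segregating sites")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    return (nucleotide_diversity(seqs) - k / a1) / sqrt(e1 * k + e2 * k * (k - 1))


# Toy alignment (made up): both polymorphisms are singletons.
sample = ["ATGCA", "ATGCT", "ACGCA", "ATGCA"]
print(nucleotide_diversity(sample), tajimas_d(sample))
```

In this toy sample the nucleotide diversity is lower than the Watterson estimate, so D comes out negative, which is the direction expected when rare variants are in excess.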