Hardy-Weinberg principle

The Hardy-Weinberg principle (HWP), also called Hardy-Weinberg equilibrium (HWE), states that, under certain conditions, after one generation of random mating the genotype frequencies at a single gene (or locus) become fixed at a particular equilibrium value. It also specifies that those equilibrium frequencies are a simple function of the allele frequencies at that locus. In the simplest case of a single locus with two alleles A and a, with allele frequencies p and q respectively (so p + q = 1), the HWP predicts genotype frequencies of p² for the AA homozygote, 2pq for the Aa heterozygote, and q² for the aa homozygote.
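The claim that equilibrium is reached after a single generation can be checked numerically. The following is a minimal sketch, assuming the idealized conditions of the principle; the function names and the starting frequencies are illustrative choices, not part of the original formulation.

```python
def allele_freqs(f_AA, f_Aa, f_aa):
    """Return (p, q) from genotype frequencies that sum to 1."""
    p = f_AA + 0.5 * f_Aa  # each heterozygote carries one copy of A
    q = f_aa + 0.5 * f_Aa  # and one copy of a
    return p, q


def random_mating(f_AA, f_Aa, f_aa):
    """Genotype frequencies after one generation of random mating."""
    p, q = allele_freqs(f_AA, f_Aa, f_aa)
    return p * p, 2 * p * q, q * q


# Arbitrary starting population that is NOT in Hardy-Weinberg proportions
# (hypothetical values chosen only for illustration).
start = (0.5, 0.2, 0.3)
gen1 = random_mating(*start)  # first generation of random mating
gen2 = random_mating(*gen1)   # second generation
```

Under these assumptions, gen1 and gen2 agree up to floating-point rounding: the allele frequencies do not change, so the genotype frequencies stay at p², 2pq and q² from the first generation onward.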

The Hardy-Weinberg principle is an expression of the notion of a population in "genetic equilibrium" and is a basic principle of population genetics. It was first formulated independently in 1908 by the English mathematician G. H. Hardy and the German physician Wilhelm Weinberg. The original assumptions for Hardy-Weinberg equilibrium (HWE) were that populations are:

and experience:

no drift (i.e. an "infinite" population)
no selection
no mutation
no migration (gene flow)

Derivation of the Hardy-Weinberg principle

A more statistical description of the HWP is that the two alleles each individual in the next generation receives are chosen independently. Consider two alleles, A and a, with frequencies p and q, respectively, in the population. The different ways to form new genotypes can then be derived using a Punnett square, where the size of each cell is proportional to the fraction of each genotype in the next generation:

                      Females
                      A (p)       a (q)
Males    A (p)        AA (p²)     Aa (pq)
         a (q)        aA (qp)     aa (q²)

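Because the two heterozygote cells (Aa and aA) represent the same genotype, their frequencies add to 2pq. If the alleles are drawn independently, the three possible genotype frequencies in the offspring are therefore: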

p² (AA)
2pq (Aa)
q² (aa)

The HWP generalizes to more than two alleles through the multinomial expansion of (p₁ + ... + p_n)². If the n alleles A₁,...,A_n at a given locus have frequencies p₁,...,p_n, then the frequency of the A_iA_j genotype is:

2p_ip_j if i≠j, and
p_i² if i=j.
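The multi-allele case can be sketched by enumerating every ordered pair of independently drawn alleles and merging the two orderings of each heterozygote. This is an illustrative sketch only; the allele names and frequencies below are hypothetical.

```python
from itertools import product


def genotype_freqs(allele_freqs):
    """Map each unordered genotype (a sorted pair of allele names)
    to its expected frequency under independent allele draws."""
    freqs = {}
    # Enumerate all ordered (maternal, paternal) allele pairs.
    for (a1, p1), (a2, p2) in product(allele_freqs.items(), repeat=2):
        genotype = tuple(sorted((a1, a2)))  # A1A2 and A2A1 are one genotype
        freqs[genotype] = freqs.get(genotype, 0.0) + p1 * p2
    return freqs


# Hypothetical three-allele locus; frequencies must sum to 1.
alleles = {"A1": 0.5, "A2": 0.3, "A3": 0.2}
hw = genotype_freqs(alleles)
```

Each heterozygote is reached by two ordered pairs, which is where the factor of 2 in 2p_ip_j comes from, while each homozygote is reached by only one.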

Deviation from the HWP is generally tested with Pearson's chi-square test, which compares the observed genotype counts in the data with the counts expected under the HWP. For systems with large numbers of alleles, many possible genotypes may be empty or have low counts, because the sample often contains too few individuals to represent all genotype classes adequately. In that case the asymptotic chi-square approximation no longer holds, and it may be necessary to use a form of Fisher's "exact test" (see likelihood-ratio test).
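For the two-allele case, the chi-square test can be sketched with the standard library alone. The sketch assumes one degree of freedom (three genotype classes, minus one, minus one allele frequency estimated from the data) and uses the identity that the upper tail of a chi-square distribution with one degree of freedom equals erfc(√(x/2)). The function name and the sample counts are made up for illustration.

```python
import math


def hwe_chi_square(n_AA, n_Aa, n_aa):
    """Pearson chi-square test for Hardy-Weinberg proportions at a
    two-allele locus. Returns (chi2, p_value).

    Assumes both alleles are present in the sample, so that no
    expected count is zero."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)  # estimated frequency of allele A
    q = 1 - p
    observed = (n_AA, n_Aa, n_aa)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper-tail probability for 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Made-up genotype counts, used only to exercise the function.
chi2, p_value = hwe_chi_square(298, 489, 213)
```

A large p_value gives no evidence against Hardy-Weinberg proportions; as noted above, the approximation is unreliable when expected counts are small.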


See also