So what can cause a departure from Hardy-Weinberg equilibrium if we conduct the test and reject the hypothesis? What is the biological mechanism leading to that rejection? Of course, we can't tell directly from the rejection what the cause was, but we can certainly consider a range of possibilities. Natural selection would be one, where some genotypes are favored over others. Mutation might be one, where the allele frequencies have changed. There may be migration coming into the population. There are many possible causes for departures from Hardy-Weinberg. The most common one that we focus on is unrecognized population structure. The population we've sampled, rather than being homogeneous, might in fact consist of a series of subpopulations. If we're studying a genetic disease, the population may have a group of people with the disease and a group without. If the allele frequencies for a SNP, just by chance, were different in the cases with the disease and the controls without the disease, then a Hardy-Weinberg test on the whole population could very well indicate a departure. There can be many reasons for population structure. It could be geographic: people living in different areas would have a greater tendency to marry within that one area. So geographic separation can lead, over time, to maybe even quite small differences in allele frequencies. There might be cultural barriers tending to favor marriages and children within one group rather than between groups. So for whatever reason there might be subdivision within the population. As a simple example, suppose the population had two groups. These might be two geographic regions that we have chosen to ignore. There might be religious differences, with people tending to marry within their own religion rather than outside it, even though they have the same ancestral background.
Suppose there are two groups and they are equally frequent in the population, roughly 50% in each group, but for whatever reason, and it could simply be chance, the allele frequencies are now different. In subpopulation one, the allele frequencies are pA and pB; in subpopulation two, they are qA and qB. With random mating, Hardy-Weinberg equilibrium will result within each of those two groups, so we know the genotype proportions are squares and products of the p's or the q's. The population as a whole, however, has an allele frequency which is the average of the two separate frequencies, and the three genotype proportions in the whole population are averages of those three pairs. When we come to test for Hardy-Weinberg with a sample from the whole population, the genotype proportions we see may not be simply the squares and products of the allele proportions in that population. If we put all that information together, we can look at the population inbreeding coefficient by the same equation as before: take one minus the ratio of the actual heterozygosity to the expected heterozygosity. That turns out to be (pA - qA) squared, divided by four times the product of the average allele frequencies. That quantity, the induced inbreeding coefficient, the induced departure from Hardy-Weinberg, is always positive, because it has a square on the top line. So this population subdivision will lead to an excess of homozygosity over what we expect under Hardy-Weinberg. That phenomenon, where subdivision gives more homozygotes than expected, is called the Wahlund effect, and it is an effect that holds no matter how many subpopulations there are, what their allele frequencies are, and what the proportions of the subpopulations are. It's a very nice result, but it certainly complicates subsequent analysis. We shouldn't proceed with many of our epidemiology calculations if there is evidence of population structure; we should take it into account.
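The pooling argument above can be checked numerically. The following is a minimal sketch, not part of the lecture: the function names `pooled_genotypes` and `induced_f` are made up for illustration, and each subpopulation is assumed to be in exact Hardy-Weinberg equilibrium with known allele frequencies (no sampling noise).

```python
def pooled_genotypes(freqs, weights):
    """Genotype proportions (AA, AB, BB) in the pooled population.

    freqs   : frequency of allele A in each subpopulation
    weights : proportion of the pooled population in each subpopulation
    Assumes every subpopulation is itself in Hardy-Weinberg equilibrium.
    """
    aa = sum(w * p * p for p, w in zip(freqs, weights))
    ab = sum(w * 2 * p * (1 - p) for p, w in zip(freqs, weights))
    bb = sum(w * (1 - p) ** 2 for p, w in zip(freqs, weights))
    return aa, ab, bb


def induced_f(freqs, weights):
    """Induced inbreeding coefficient: 1 - (actual het.) / (expected het.)."""
    aa, ab, bb = pooled_genotypes(freqs, weights)
    p_bar = aa + ab / 2                      # pooled frequency of allele A
    expected_het = 2 * p_bar * (1 - p_bar)   # Hardy-Weinberg expectation
    return 1 - ab / expected_het


# Two equal-sized groups with allele A frequencies 0.6 and 0.4.
f_two = induced_f([0.6, 0.4], [0.5, 0.5])

# Three unequal groups with arbitrary illustrative frequencies.
f_three = induced_f([0.1, 0.3, 0.7], [0.2, 0.5, 0.3])

print(round(f_two, 4), round(f_three, 4))
```

For the two-group case the result agrees with the closed form (0.6 - 0.4) squared over 4 x 0.5 x 0.5, that is 0.04. The three-group case is there only to illustrate that the induced coefficient stays positive for unequal group sizes.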
That leads us to a little more in-depth discussion of population structure. So, still with this little example, suppose the two equal-sized subgroups had allele frequencies the same distance from a half: for the first subgroup, the proportion was a half plus a small amount epsilon, and for the second subgroup it was a half minus epsilon. If we use that, it turns out that the induced inbreeding coefficient is four times the square of epsilon, and the noncentrality parameter for testing Hardy-Weinberg is the sample size times 16 times epsilon to the power of four. So how likely are we to detect a departure when epsilon is 0.1? One subgroup has frequencies of 0.4 and 0.6, and the other has the opposite, 0.6 and 0.4. That induced inbreeding coefficient is 0.04, not an unusually high value for human populations. With a sample size of 1000, the power, the probability of detecting that inbreeding level, is 0.24. That's not particularly high: a 24% chance of detecting what is actually quite a large degree of population structure. We'll do better, of course, if we have more SNPs, although the calculations become more complicated, because the many SNPs are unlikely to all have the same allele frequencies. What else can we say about Hardy-Weinberg testing? The chi-square test we've talked about is simple, and it's been around a long time. It has some underlying assumptions, including that the allele proportions in the sample have a normal distribution. That can't be exactly true, because the allele counts in the sample are discrete, whereas the normal distribution is for continuous variables. Partly for that reason, the chi-square test doesn't perform particularly well if the minor allele proportions are small and the minor allele counts are close to zero.
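The power figure quoted above can be reproduced with a short calculation. This is a sketch under stated assumptions: the lecture does not give the significance level, so a 5% level is assumed here, and the function name `hwe_power` is hypothetical. It uses the fact that a noncentral chi-square variable with one degree of freedom is the square of a normal variable with mean equal to the square root of the noncentrality parameter.

```python
from math import sqrt
from statistics import NormalDist


def hwe_power(f, n, alpha=0.05):
    """Approximate power of the 1-df chi-square Hardy-Weinberg test.

    f     : inbreeding coefficient under the alternative
    n     : number of individuals sampled
    alpha : significance level (ASSUMED 0.05; not stated in the lecture)
    Noncentrality parameter is n * f**2.
    """
    z = NormalDist()                      # standard normal
    crit = z.inv_cdf(1 - alpha / 2)       # sqrt of the chi-square critical value
    shift = sqrt(n * f * f)               # sqrt of the noncentrality parameter
    # P(|Z + shift| > crit) for a standard normal Z
    return z.cdf(shift - crit) + z.cdf(-shift - crit)


epsilon = 0.1
f = 4 * epsilon ** 2                      # induced inbreeding coefficient
noncentrality = 1000 * 16 * epsilon ** 4  # same as n * f**2
power = hwe_power(f, n=1000)

print(round(f, 2), round(noncentrality, 2), round(power, 2))
```

With these inputs the coefficient is 0.04, the noncentrality parameter is 1.6, and the power comes out at about 0.24, which matches the value in the lecture and suggests the 5% level is the one that was used.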
We do better with a test that uses the binomial distribution for allele frequencies instead of the normal. That is called an exact test, and the p-value we calculate is based on the binomial rather than the normal distribution. So that test is better. Even then, though, the tests don't do especially well for small minor allele counts. It turns out that testing over a large number of SNPs can be severely affected if there's unbalanced data. If not all the SNPs are typed on all the individuals, if there's some missing data, then that affects the performance of the tests quite severely. We might get spurious indications of departures from Hardy-Weinberg. Another thing that we don't often pay much attention to is that we might get departures from Hardy-Weinberg because of the structure induced by sex. If we simply combine the data from males and females in our study, and if those groups had different allele frequencies, then we'll get departures from Hardy-Weinberg. So one possibility is to do Hardy-Weinberg tests separately in males and females.
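To make the idea of an exact test concrete, here is a minimal sketch of one common formulation, which conditions on the observed allele counts and sums the probabilities of all heterozygote counts that are no more probable than the one observed. Whether this is precisely the test the lecture has in mind is an assumption; the function name `hwe_exact_pvalue` and the example counts are illustrative only.

```python
from math import exp, lgamma, log


def log_factorial(k):
    """log(k!) via the log-gamma function, to avoid overflow."""
    return lgamma(k + 1)


def hwe_exact_pvalue(n_aa, n_ab, n_bb):
    """Exact Hardy-Weinberg p-value for one biallelic SNP.

    Conditions on the allele counts and enumerates every possible
    heterozygote count (ASSUMED formulation, see lead-in).
    """
    n = n_aa + n_ab + n_bb
    n_a = 2 * n_aa + n_ab          # count of allele A
    n_b = 2 * n - n_a              # count of allele B

    def log_prob(het):
        hom_a = (n_a - het) // 2
        hom_b = (n_b - het) // 2
        return (log_factorial(n) - log_factorial(hom_a)
                - log_factorial(het) - log_factorial(hom_b)
                + het * log(2)
                + log_factorial(n_a) + log_factorial(n_b)
                - log_factorial(2 * n))

    # Heterozygote count must have the same parity as the allele counts.
    hets = range(n_a % 2, min(n_a, n_b) + 1, 2)
    probs = {h: exp(log_prob(h)) for h in hets}
    observed = probs[n_ab]
    # Small tolerance so ties with the observed probability are included.
    return sum(p for p in probs.values() if p <= observed * (1 + 1e-9))


p_balanced = hwe_exact_pvalue(25, 50, 25)   # genotypes at HW proportions
p_deficit = hwe_exact_pvalue(40, 20, 40)    # strong heterozygote deficit

print(p_balanced, p_deficit)
```

Running the same function separately on male and female genotype counts would be one way to carry out the sex-stratified testing suggested above.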