Tuesday, January 09, 2007

To test or not to test, that is the question

A recent paper by Zou & Donner (2006) questions whether testing for Hardy-Weinberg equilibrium (HW-eqm) in case-control association studies is a viable strategy.

The main point they make is that genotyping error is unlikely to be detected by testing for departure from HW-eqm. They give two reasons. The first is the set of assumptions underlying HW-eqm. The second (perhaps more important) is that a two-stage analysis, in which markers are screened for deviation from HW-eqm and only those that do not deviate are taken forward, inflates the Type 1 error rate, because a non-significant p-value from the screening test cannot be interpreted as evidence that alleles are independent (i.e. that HW-eqm holds).

This is of course appealing, as it reduces the computational burden (particularly when performing whole-genome screens by association, where multiple testing becomes a big problem), but also because a large proportion of associations are seen where loci do deviate from HW-eqm.
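For reference, the HW-eqm screen under discussion is usually a one-degree-of-freedom chi-square goodness-of-fit test on the genotype counts at each marker. The sketch below is a minimal Python illustration of that textbook test, not code from Zou & Donner; the function name and the example counts are made up for illustration.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for Hardy-Weinberg proportions.

    Takes the observed counts of the three genotypes (AA, AB, BB) at a
    biallelic marker and returns (chi2, p_value) on 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes observed")

    # Estimate the allele frequencies from the genotype counts.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        raise ValueError("marker is monomorphic; the test is undefined")

    # Genotype counts expected if the two alleles combine independently.
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Illustrative counts only, not data from any study.
chi2, p_value = hwe_chi_square(298, 489, 213)
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
```

Note that a large p-value from this test only means the data do not contradict HW-eqm; it is not evidence that HW-eqm holds, which is exactly why using it as a filter is questionable.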

The authors propose a new adjusted χ2 test based on the difference in variance of the estimated allele frequencies in cases and controls, which is essentially the same as a Cochran-Armitage trend test (Sasieni, 1997). The power and performance of this test are discussed in an upcoming paper in Annals of Human Genetics (Ahn et al. 2007).
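To make the trend test concrete, here is a minimal Python sketch of the standard Cochran-Armitage test on a 2x3 table of genotype counts. It is the textbook form with additive scores (0, 1, 2), not the adjusted statistic from the paper; the function name, the default scores and the example table are my own assumptions.

```python
import math


def cochran_armitage_trend(cases, controls, scores=(0, 1, 2)):
    """Cochran-Armitage trend test for a 2 x k table of genotype counts.

    `cases` and `controls` hold the counts per genotype (e.g. AA, AB, BB)
    and `scores` are the weights given to each genotype. Returns
    (chi2, p_value) on 1 degree of freedom.
    """
    if not (len(cases) == len(controls) == len(scores)):
        raise ValueError("cases, controls and scores must have equal length")

    totals = [r + s for r, s in zip(cases, controls)]  # column totals
    n_cases, n_controls = sum(cases), sum(controls)
    n = n_cases + n_controls

    sum_xr = sum(x * r for x, r in zip(scores, cases))
    sum_xn = sum(x * t for x, t in zip(scores, totals))
    sum_x2n = sum(x * x * t for x, t in zip(scores, totals))

    # Trend statistic and its variance under the null of no association.
    t_stat = n * sum_xr - n_cases * sum_xn
    variance = (n_cases * n_controls / n) * (n * sum_x2n - sum_xn ** 2)
    if variance == 0:
        raise ValueError("variance is zero; the test is undefined")

    chi2 = t_stat ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # chi-square, 1 df
    return chi2, p_value


# Illustrative table only: genotype counts (AA, AB, BB).
chi2, p_value = cochran_armitage_trend(cases=(10, 20, 30), controls=(30, 20, 10))
print(f"chi2 = {chi2:.2f}, p = {p_value:.2e}")
```

The statistic works directly on genotype counts and does not use HW-eqm to derive its variance, which is what makes it attractive as a first-pass screen.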

So the upshot of all of this is that, as a first-pass screen, you're probably better off using a robust test rather than worrying about deviations from Hardy-Weinberg equilibrium.

References and Links

  • Zou GY, Donner A (2006) The Merits of Testing Hardy-Weinberg Equilibrium in the Analysis of Unmatched Case-Control Data: A Cautionary Note. Annals of Human Genetics 70:923-933

  • Ahn K, Haynes C, Kim W, Fleur RS, Gordon D, Finch SJ (2007) The Effects of SNP Genotyping Errors on the Power of the Cochran-Armitage Linear Trend Test for Case/Control Association Studies. Annals of Human Genetics AOP:

  • Sasieni PD (1997) From genotypes to genes: doubling the sample size. Biometrics 53:1253-1261

Monday, January 01, 2007

Fast Genome Wide Association Analysis

Recently a new R package that provides an interface for running PBAT in parallel was published in Bioinformatics. The package is described in the Open Access article here.

This is the first package that I've come across that provides an easy interface for analysing the massive amounts of data being generated by the new genotyping arrays, such as Illumina's bead station and Affymetrix's SNP chip. Basically, the data is split up and small chunks are analysed on each processor, with a separate instance of PBAT running on each processor. The graphical R front end gives users an easy way to select their analysis options and then handles splitting the data, sending off jobs and amalgamating the results when they are completed.
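The split, analyse and merge pattern described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not how the R package or PBAT is implemented: `analyse_chunk` is a hypothetical stand-in for whatever a worker does with its share of the markers, and a thread pool is used here simply to keep the example self-contained (a real setup would start a separate analysis process per chunk).

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, n_chunks):
    """Split a list into at most n_chunks contiguous, near-equal pieces."""
    n_chunks = max(1, min(n_chunks, len(items)))
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks get one additional item.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def analyse_chunk(markers):
    """Hypothetical worker: returns one (marker, result) row per marker.

    The 'result' is just the length of the marker name, as a placeholder
    for a real association statistic.
    """
    return [(marker, len(marker)) for marker in markers]


def run_in_parallel(markers, worker, n_workers):
    """Split the markers, analyse each chunk concurrently, merge the rows."""
    chunks = split_into_chunks(markers, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # map() returns the per-chunk results in the order of the chunks.
        per_chunk = list(pool.map(worker, chunks))
    return [row for rows in per_chunk for row in rows]


# Made-up marker names, for illustration only.
markers = ["rs1", "rs22", "rs333", "rs4444", "rs5"]
for marker, result in run_in_parallel(markers, analyse_chunk, n_workers=2):
    print(marker, result)
```

Because the chunks are contiguous and the results are collected in chunk order, the merged output lists the markers in the same order as the input.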

References and Links