"Statistical methods for genomic associations" by Roderick D. Ball, Scion (NZ Forest Research Institute Ltd.) Quantitative trait locus (QTL) and association mapping attempt to find locations in the genome associated with variation in traits of interest. QTL mapping exploits the correlations (so called `linkage disequilibrium') between marker loci and loci putatively associated with a trait, that are generated in a pedigree or family. For example the probability that 2 loci on a chromosome are inherited from the same grandparent is 1-r where the recombination rate, r, ranges from 0 to 0.5 and is an increasing function of distance between loci within a chromosome and is 0.5 for loci on different chromosome. QTL mapping aims to infer the genetic architecture of traits by estimating the number, locations and effects for QTL loci whose genotypes and locations are a priori unknown. Association mapping (also known as linkage disequilibrium mapping) exploits linkage disequilibrium between loci in a population. Association mapping exploits recombinations that have occurred in the whole population history, hence has potentially much greater resolution. However population linkage disequilibrium is not a monotonic function of distance between loci, and to exploit the high resolution requires genotyping many markers on large sample sizes of individuals. Many published associations are spurious. Spurious associations could arise from undiagnosed population structure. Another explanation is that the evidence was never strong. When re-evaluated with Bayesian methods, the evidence was found to be weak (e.g. Bayes factor less than 1), or in some cases moderately strong but still insufficient to overcome the low prior probability per marker for genomic associations. This includes a number of associations from recent large scale genome-wide association studies (Diabetes genome initiative of Harvard, MIT, Lund Universities and Novartis; and the Wellcome Trust of Oxford University). 
We will introduce the biological background for gene mapping and discuss Bayesian experimental design and statistical methods, ranging from closed-form single-locus calculation of Bayes factors for case-control studies and test statistics (Association Mapping in Plants, Chapters 7, 8; Springer 2007) to approximate posterior probabilities for models in multilocus methods for Bayesian inference of the genetic architecture (Ball, Genetics 2001). For association mapping with 500,000 or more SNP marker loci, brute-force evaluation of all possible models is not possible, so we must resort to a search strategy such as Markov chain Monte Carlo (MCMC) simulation, with the goal of finding a subset of models that accounts for a high percentage of the posterior probability. The Bayesian model selection framework (in which each model allows only a specified set of selected markers to have non-zero effects) is useful, or even necessary, to make the algebra feasible (e.g. evaluating the full X'X matrix or its inverse is not possible).
Many MCMC methods and variants have been used in the genetics literature, but there is a large gap between theory and practice. Inference from MCMC assumes the sampler has converged. Convergence is guaranteed by theory under general ergodicity conditions. However, these conditions are rarely verified; moreover, theoretical bounds on the number of iterations required for convergence are orders of magnitude greater than the number of iterations used, and thought to be needed, in practice for (apparent) convergence. The methods are often presented with a single example, there has been little or no attention to the convergence of the MCMC algorithms by the authors or subsequent researchers, and there is a lack of papers comparing or reconciling different methods. As a result, one cannot rely on sampler convergence or on the correctness of current gene mapping MCMC methods, models and computer implementations.
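
The idea of searching over models, each defined by a set of selected markers, can be sketched as a simple Metropolis-Hastings sampler over marker subsets. This is a minimal illustration under stated assumptions, not the method of the cited references. The marginal likelihood is replaced by a BIC approximation, the prior inclusion probability per marker is a made-up value, the proposal just flips one randomly chosen marker, and the data are simulated by a hypothetical helper `simulate_markers`. Each model is fitted using only its selected columns, so the full X'X matrix is never formed.

```python
import numpy as np


def simulate_markers(n=300, p=20, seed=1):
    """Toy data: p biallelic markers coded 0/1/2; markers 3 and 11 are causal."""
    rng = np.random.default_rng(seed)
    X = rng.binomial(2, 0.3, size=(n, p)).astype(float)
    y = 1.0 * X[:, 3] - 0.8 * X[:, 11] + rng.normal(size=n)
    return X, y


def log_model_score(X, y, included, prior_inclusion):
    """Approximate log posterior (up to a constant) for one marker subset.

    Assumption: log marginal likelihood is approximated by -BIC / 2.
    Only the selected columns enter the least-squares fit.
    """
    n, p = X.shape
    cols = np.flatnonzero(included)
    k = len(cols)
    design = np.column_stack([np.ones(n), X[:, cols]])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ coef) ** 2))
    log_marginal = -0.5 * (n * np.log(rss / n) + (k + 1) * np.log(n))
    log_prior = k * np.log(prior_inclusion) + (p - k) * np.log1p(-prior_inclusion)
    return log_marginal + log_prior


def sample_models(X, y, n_iter=3000, prior_inclusion=0.05, seed=0):
    """Metropolis-Hastings over marker subsets; returns visit frequencies."""
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    included = np.zeros(p, dtype=bool)  # start from the empty model
    score = log_model_score(X, y, included, prior_inclusion)
    counts = {}
    for _ in range(n_iter):
        # Symmetric proposal: flip the inclusion status of one random marker.
        j = int(rng.integers(p))
        proposal = included.copy()
        proposal[j] = not proposal[j]
        new_score = log_model_score(X, y, proposal, prior_inclusion)
        if np.log(rng.random()) < new_score - score:
            included, score = proposal, new_score
        key = tuple(int(i) for i in np.flatnonzero(included))
        counts[key] = counts.get(key, 0) + 1
    return {model: c / n_iter for model, c in counts.items()}


if __name__ == "__main__":
    X, y = simulate_markers()
    freqs = sample_models(X, y)
    # Report the most frequently visited models (no burn-in is discarded here).
    for model, freq in sorted(freqs.items(), key=lambda kv: -kv[1])[:5]:
        print(f"markers {model}: sample frequency {freq:.3f}")
```

The visit frequencies only estimate posterior model probabilities if the chain has converged, which is exactly the assumption questioned above. With 20 markers the toy problem is easy, and nothing here says how such a sampler behaves with 500,000 loci.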
Diagnostics based on the sampled chains exist and can often detect problems with a sampler, and in common-or-garden statistical models the sampling algorithm can generally be adjusted or tuned to converge rapidly. However, there is no guarantee, and diagnostics only show apparent convergence, which in worst-case scenarios can persist for thousands or millions of iterations. With the large number of possible parameters corresponding to a dense marker map covering the genome, it is desirable that samplers converge automatically and that we have some confidence in their convergence.
Current research on improving and verifying convergence of MCMC samplers for genomic associations will be outlined. This includes using analytically calculated probabilities to adaptively adjust sampling probabilities with respect to a Bayesian model selection framework (to which the parameter space for a range of samplers can be mapped), so that sample frequencies for models converge to the approximate values. A perfect sampler would be desirable but may not be possible or practical. A regeneration sampler and/or bounds on convergence (found e.g. using Nummelin splitting and the analytically calculated probabilities) would be useful alternatives, since the chain after a regeneration is independent of the starting point, and independence of tours between successive regenerations assures properties of ergodic averages.
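
The difference between apparent and actual convergence, and the value of having analytically calculated probabilities to check against, can be seen in a toy example. The sketch below is an assumption-laden illustration, not a gene mapping sampler. The target is an invented distribution on 21 states with two well-separated modes, sampled by a random-walk Metropolis chain started in one mode. A within-chain check (comparing the two halves of the run) looks stable, while the total variation distance to the known exact probabilities shows that about half of the probability mass has never been visited.

```python
import numpy as np


def bimodal_target(n_states=21, modes=(4, 16)):
    """Exact probabilities for a toy target with two well-separated modes."""
    s = np.arange(n_states)
    w = sum(np.exp(-0.5 * (s - m) ** 2) for m in modes)
    return w / w.sum()


def random_walk_metropolis(target, start, n_iter, seed=0):
    """Random-walk Metropolis on states 0..len(target)-1 with +/-1 proposals."""
    rng = np.random.default_rng(seed)
    state = start
    chain = np.empty(n_iter, dtype=int)
    for t in range(n_iter):
        step = 1 if rng.random() < 0.5 else -1
        proposal = state + step
        # Proposals outside the state space are rejected (the chain stays put).
        if 0 <= proposal < len(target):
            if rng.random() < target[proposal] / target[state]:
                state = proposal
        chain[t] = state
    return chain


def total_variation(chain, target):
    """Total variation distance between sample frequencies and exact values."""
    freq = np.bincount(chain, minlength=len(target)) / len(chain)
    return 0.5 * float(np.abs(freq - target).sum())


if __name__ == "__main__":
    target = bimodal_target()
    chain = random_walk_metropolis(target, start=4, n_iter=20000)
    half = len(chain) // 2
    # Within-chain check: the two halves agree, suggesting convergence.
    print("mean of first half :", chain[:half].mean())
    print("mean of second half:", chain[half:].mean())
    # Check against exact probabilities: the chain is far from the target.
    print("total variation distance:", total_variation(chain, target))
```

In real problems the exact probabilities are not available, which is why analytically calculated (approximate) probabilities, regeneration, or explicit bounds are attractive as external checks.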