Mat-1.600 Seminar on Computational Science and Engineering

5.4.2004  14.15  U322

Jaakko Peltonen, Laboratory of Computer and Information Science
Visualizations for Assessing Convergence and Mixing of MCMC and Informative Discriminant Analysis

1. Visualizations for Assessing Convergence and Mixing of MCMC
Bayesian inference often requires approximating the posterior distribution with Markov chain Monte Carlo (MCMC) sampling. A central problem with MCMC is detecting whether the simulation has converged; the samples come from the true posterior distribution only after convergence. A common solution is to start several simulations from different starting points and measure the overlap of the resulting chains. We point out that Linear Discriminant Analysis (LDA) minimizes the overlap as measured by the usual multivariate overlap measure. Hence, LDA is a justified method for visualizing convergence. However, LDA makes restrictive assumptions about the distributions of the chains and their relationships. These restrictions can be relaxed by the extension discussed below:
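The multi-chain idea above can be sketched in code. The following minimal example (an illustration, not the method from the talk) runs Fisher LDA on samples from two hypothetical chains and projects them onto the discriminant direction; if the chains have not mixed, the projections separate clearly, which is exactly what a convergence visualization would reveal.

```python
import numpy as np

def lda_direction(chains):
    """Fisher LDA: direction maximizing between-chain scatter
    relative to within-chain scatter."""
    means = [c.mean(axis=0) for c in chains]
    grand = np.mean(np.vstack(chains), axis=0)
    d = chains[0].shape[1]
    Sw = np.zeros((d, d))  # within-chain scatter
    Sb = np.zeros((d, d))  # between-chain scatter
    for c, m in zip(chains, means):
        Sw += (c - m).T @ (c - m)
        Sb += len(c) * np.outer(m - grand, m - grand)
    # leading generalized eigenvector of Sb with respect to Sw
    vals, vecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    w = np.real(vecs[:, np.argmax(np.real(vals))])
    return w / np.linalg.norm(w)

rng = np.random.default_rng(0)
# two synthetic "chains": chain_b is shifted along the first axis
# to mimic a simulation that has not yet converged
chain_a = rng.normal(0.0, 1.0, size=(500, 3))
chain_b = rng.normal(0.0, 1.0, size=(500, 3)) + np.array([2.0, 0.0, 0.0])

w = lda_direction([chain_a, chain_b])
proj_a, proj_b = chain_a @ w, chain_b @ w
# large separation along the LDA direction indicates poor overlap
sep = abs(proj_a.mean() - proj_b.mean())
```

Plotting histograms of `proj_a` and `proj_b` would give the one-dimensional convergence visualization: overlapping histograms suggest mixing, separated ones do not.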

2. Informative Discriminant Analysis
We introduce a probabilistic model that generalizes classical linear discriminant analysis and gives an interpretation for the components as informative, or relevant, components of the data. The components maximize the predictability of the class distribution, which is asymptotically equivalent to (i) maximizing mutual information with the classes, and (ii) finding principal components in the so-called learning or Fisher metric. The Fisher metric measures only distances that are relevant to the classes, that is, distances that cause changes in the class distribution. The components have applications in data exploration, visualization, and dimensionality reduction.
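A small sketch of the Fisher-metric idea, under a hypothetical two-class logistic model p(c=1|x) = sigmoid(w·x) (the model and weights are illustrative assumptions, not the talk's estimator): the Fisher information matrix of the class distribution is averaged over the data, and its leading eigenvector is the direction along which movement changes the class distribution most, i.e. the most informative component.

```python
import numpy as np

def fisher_metric(x, w):
    """Fisher information matrix of p(c|x) for a two-class
    logistic model p(c=1|x) = sigmoid(w @ x).
    E_c[grad log p(c|x) grad log p(c|x)^T] = p(1-p) w w^T."""
    p = 1.0 / (1.0 + np.exp(-w @ x))
    return p * (1.0 - p) * np.outer(w, w)

rng = np.random.default_rng(1)
w = np.array([3.0, 0.5, 0.0])  # classes depend mostly on dimension 0
xs = rng.normal(size=(200, 3))

# average the metric over the data sample; the top eigenvector of the
# averaged matrix is the most class-relevant (informative) direction
J = sum(fisher_metric(x, w) for x in xs) / len(xs)
vals, vecs = np.linalg.eigh(J)
top = vecs[:, -1]  # leading principal direction in the Fisher metric
```

Here the recovered direction aligns with `w` and ignores the third coordinate, which never affects the class distribution; distances along that coordinate are "irrelevant" in the Fisher metric.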