Parameter estimation in models for sequence alignment

Ana Arribas-Gil (February 23, 2010)

Abstract

Models for pairwise alignment based on the TKF indel process (Thorne, Kishino and Felsenstein 1991) fit into the pair hidden Markov model (pair-HMM) framework. In a pair-HMM, the observations are the pair of sequences to be aligned, and the hidden alignment is a Markov chain. Many efficient algorithms have been developed to estimate alignments and evolutionary parameters in this context. From a theoretical point of view, it is also interesting to investigate the statistical properties of the estimators computed by these algorithms. In particular, we will discuss the consistency of maximum likelihood and Bayesian estimators. In the context of multiple alignment, the classical TKF indel evolution process for the sequences yields a complex hidden variable model for the alignment, in which the phylogenetic relationships between the sequences must be taken into account. We provide a theoretical framework for this model and study, as in the pairwise case, the consistency of the estimators.
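To make the pair-HMM setting concrete, the following is a minimal sketch of the forward recursion that sums over all hidden alignments to obtain the likelihood of an observed sequence pair, which is the quantity a maximum likelihood estimator would maximize. It is an illustration under stated assumptions, not the method of the talk: the three states (M for an aligned pair, X and Y for a letter emitted in one sequence only), the function name `pair_hmm_likelihood`, and all numerical parameters are hypothetical, the transition probabilities are not the TKF parameterization, and no end state is modelled.

```python
# Generic three-state pair-HMM (assumed, not TKF-specific):
#   M emits an aligned pair (x_i, y_j), X emits x_i only, Y emits y_j only.
STATES = ("M", "X", "Y")
# How far each state advances in (x, y) when it emits.
MOVES = {"M": (1, 1), "X": (1, 0), "Y": (0, 1)}


def pair_hmm_likelihood(x, y, init, trans, emit_pair, emit_single):
    """Forward algorithm: probability of observing (x, y), summed over
    all hidden alignments. No end state is modelled (a simplification)."""
    n, m = len(x), len(y)
    # f[i][j][s] = P(x[:i], y[:j] emitted, last hidden state is s)
    f = [[{s: 0.0 for s in STATES} for _ in range(m + 1)]
         for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(m + 1):
            for s, (di, dj) in MOVES.items():
                prev_i, prev_j = i - di, j - dj
                if prev_i < 0 or prev_j < 0:
                    continue  # state s cannot emit here
                if prev_i == 0 and prev_j == 0:
                    # First emission: use the initial distribution.
                    incoming = init[s]
                else:
                    incoming = sum(f[prev_i][prev_j][r] * trans[r][s]
                                   for r in STATES)
                if s == "M":
                    e = emit_pair[(x[i - 1], y[j - 1])]
                elif s == "X":
                    e = emit_single[x[i - 1]]
                else:
                    e = emit_single[y[j - 1]]
                f[i][j][s] = incoming * e
    return sum(f[n][m].values())


# Hypothetical parameters on a two-letter alphabet, for illustration only.
init = {"M": 0.6, "X": 0.2, "Y": 0.2}
trans = {
    "M": {"M": 0.8, "X": 0.1, "Y": 0.1},
    "X": {"M": 0.5, "X": 0.4, "Y": 0.1},
    "Y": {"M": 0.5, "X": 0.1, "Y": 0.4},
}
emit_single = {"A": 0.5, "B": 0.5}
emit_pair = {("A", "A"): 0.4, ("B", "B"): 0.4,
             ("A", "B"): 0.1, ("B", "A"): 0.1}

likelihood = pair_hmm_likelihood("AB", "AAB", init, trans,
                                 emit_pair, emit_single)
print(likelihood)
```

In this sketch, parameter estimation would amount to maximizing `pair_hmm_likelihood` over the transition and emission parameters for the observed pair; the consistency question discussed in the abstract concerns how such estimators behave as the sequence lengths grow.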