Seminar: Laura Kubatko - Using Algebraic Statistics to Estimate Phylogenetic Trees, with Application to Food Security in East Africa

September 17, 2019
10:20AM - 11:15AM
MBI Auditorium, Jennings Hall 355


Laura Kubatko

Co-Director of MBI and Professor in the Departments of Statistics and Evolution, Ecology, and Organismal Biology, The Ohio State University


The advent of rapid and inexpensive DNA sequencing technologies has necessitated the development of computationally efficient methods for analyzing sequence data for many genes simultaneously in an evolutionary framework. This is particularly important for systems in which evolution occurs rapidly in response to environmental conditions and for which it is important to quickly diagnose which species are present. The multispecies coalescent is the most commonly used model for estimating species-level phylogenetic trees from multi-locus data, but inference under the coalescent model is computationally daunting in the typical inference frameworks (e.g., the likelihood and Bayesian frameworks) because of the dimensionality of the space of both gene trees and species trees. By viewing the data arising under the phylogenetic coalescent model as a collection of site patterns, the algebraic structure associated with the probability distribution on the site patterns can be used to develop computationally efficient methods for inference. In this talk, I will describe how identifiability results for four-taxon species trees based on site pattern probabilities can be used to build a quartet-based inference algorithm for trees of arbitrary size. The method will be applied to data on viruses that infect cassava plants worldwide, with losses of approximately US$100 million annually in East Africa alone.

The methods discussed in this talk are the result of joint work with former MBI post-docs Julia Chifman and Colby Long.
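To make the phrase "a collection of site patterns" concrete, the following is a minimal Python sketch, not the algorithm presented in the talk. It counts the column patterns in a four-taxon alignment and arranges their relative frequencies in a 16 x 16 matrix indexed by one candidate split of the taxa. The function names, the toy alignment, and the choice to skip columns with gaps are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"


def site_pattern_counts(alignment, taxa):
    """Count how often each column pattern (one base per taxon) occurs."""
    seqs = [alignment[t] for t in taxa]
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    counts = Counter()
    for column in zip(*seqs):
        # Assumption: columns with gaps or ambiguity codes are skipped.
        if all(base in BASES for base in column):
            counts["".join(column)] += 1
    return counts


def flattening(counts, split=((0, 1), (2, 3))):
    """Arrange pattern frequencies in a 16 x 16 matrix for one split.

    Rows are indexed by the states of the first taxon pair and columns
    by the states of the second pair.
    """
    pairs = ["".join(p) for p in product(BASES, repeat=2)]
    index = {pair: i for i, pair in enumerate(pairs)}
    total = sum(counts.values())
    matrix = [[0.0] * 16 for _ in range(16)]
    (a, b), (c, d) = split
    for pattern, n in counts.items():
        row = index[pattern[a] + pattern[b]]
        col = index[pattern[c] + pattern[d]]
        matrix[row][col] += n / total
    return matrix


# Hypothetical toy alignment; real data would span many loci.
toy = {
    "A": "ACGTAC",
    "B": "ACGTAA",
    "C": "ACTTGC",
    "D": "ACTTG-",
}
counts = site_pattern_counts(toy, ["A", "B", "C", "D"])
matrix_ab_cd = flattening(counts)
```

Building the same matrix for the other two splits, `((0, 2), (1, 3))` and `((0, 3), (1, 2))`, gives three arrangements of the same data. Comparing algebraic properties of those matrices is one way the structure mentioned in the abstract could be examined, but that comparison step is not shown here.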

 

 
