Special Lecture Archives for Academic Year 2014

Key Words: phylogeny estimation, multiple sequence alignment, metagenomics, large dataset analysis, phylogenetic placement, Hidden Markov Models

When: Thu, September 26, 2013 - 11:00am
Where: AVW 2460
Speaker: Professor Tandy Warnow (The University of Texas at Austin)
Abstract: The estimation of the "Tree of Life" from biomolecular sequence data presents many interesting computational, mathematical, and statistical challenges. In particular, the most accurate methods have depended on having an accurate multiple sequence alignment, and estimating such an alignment is itself a challenging problem. In this talk, I will present several new methods that combine machine learning, chordal graph theory, and local search strategies to obtain improved large-scale multiple sequence alignment and phylogeny estimation. I will begin with "fast-converging methods", which are algorithms guaranteed to reconstruct the true tree from polynomial-length sequences, under the standard assumption that the sequences are generated by a Markov model down an unknown model tree. I will then present DACTAL, a method for estimating trees without a multiple sequence alignment, and methods for ultra-large alignment estimation (SATe, PASTA (unpublished), and UPP (unpublished)). Finally, if time permits, I will present results for TIPP (unpublished), a method for taxon identification of short metagenomic reads. I will also present results on simulated datasets with up to 1,000,000 sequences and on biological datasets with up to 100,000 sequences.

A generalized inequality for elliptic solutions by Calabi

When: Tue, October 1, 2013 - 4:30pm
Where: MTH 3206, Colloquium
Speaker: Ryan Hunter (UMCP)

Introduction to Hyperbolic Dynamics

When: Wed, October 2, 2013 - 12:30pm
Where: MATH 0102
Speaker: Prof. Michael Jakobson (Dept. of Math, UMCP)

Posterior Concentration in High-Dimensional Data Analysis

When: Mon, December 9, 2013 - 2:00pm
Where: MATH 3206
Speaker: Nate Strawn (Duke)
Abstract: From megapixel imagery to DNA microarrays, modern data collection methods are creating data sets with higher and higher dimensionality. Despite the high ambient dimensionality, most data sets exhibit much lower complexity. For example, it is well known that natural images admit sparse approximations in wavelet bases. The existence of sparse approximations indicates that the lower complexity of the data is in part due to the low intrinsic dimensionality of the data. Compressed sensing is one example of exploiting low dimensionality to perform estimation and model selection in high-dimensional data analysis.

In many situations, it is possible to collect large amounts of data beforehand to construct an empirical prior distribution for a data set. In this talk, we examine the phenomenon of posterior concentration in high-dimensional data analysis, validating a framework for incorporating this kind of precise a priori information. In our work, we analyze under-sampled linear regression with a Gaussian noise model, and we show that a large class of priors admits a finite-sample posterior concentration bound (with explicit constants) around the true signal in this setting. While most compressed sensing theory shies away from incorporating more precise a priori information because of its focus on convex optimization, our results demonstrate that a priori information can be successfully incorporated in a full Bayesian analysis with similar theoretical guarantees. We discuss present and future directions for exploring and exploiting this framework.

2014 Monroe-Martin Spotlight on Student Research Talks

When: Mon, May 5, 2014 - 2:15pm
Where: Colloquium Room 3206
Speaker: Graduate Students (Department of Mathematics, UMCP)

2014 Monroe-Martin Spotlight on Student Research Talks

When: Wed, May 7, 2014 - 2:15pm
Where: Colloquium Room 3206
Speaker: Graduate Students (Department of Mathematics, UMCP)