Quikr: a Method for Rapid Reconstruction of Bacterial Communities via Compressive Sensing
Oxford Journal of Bioinformatics
(2013) (Under Review)
Abstract
Many metagenomic studies compare hundreds to thousands of environmental and health-related samples by extracting and sequencing their 16S rRNA amplicons and measuring their similarity using beta-diversity metrics. However, one of the first steps, classifying the operational taxonomic units within the sample, can be computationally time-consuming, since most methods rely on computing the taxonomic assignment of each individual read out of tens to hundreds of thousands of reads.
We introduce Quikr: a QUadratic, K-mer based, Iterative, Reconstruction method that computes a vector of taxonomic assignments and their proportions in the sample using an optimization technique motivated by the mathematical theory of compressive sensing. On both simulated and actual biological data, we demonstrate that Quikr is typically more accurate, and typically orders of magnitude faster, than the most commonly utilized taxonomic assignment technique (the Ribosomal Database Project's Naive Bayesian Classifier). Furthermore, the technique is shown to be unaffected by the presence of chimeras, thereby allowing the time-intensive step of chimera filtering to be circumvented. The Quikr computational package (using MATLAB or Octave) for the Linux and Mac platforms is available at http://sourceforge.net/projects/quikr/.
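
To make the idea of reconstructing proportions from k-mer statistics concrete, the following is a minimal Python sketch and not the Quikr package itself. It assumes that the sample's pooled k-mer frequency vector s is approximately A x, where each column of A holds the k-mer frequencies of one reference sequence and x holds the unknown proportions. It then minimizes 0.5*||A x - s||^2 + lam*sum(x) subject to x >= 0 by projected gradient descent. The function names (kmer_frequencies, reconstruct), the penalty weight lam, the choice of solver, and the toy sequences are all illustrative assumptions.

```python
from itertools import product


def kmer_frequencies(seqs, k):
    """Pooled, normalized k-mer frequency vector over all 4**k DNA k-mers."""
    index = {"".join(p): i for i, p in enumerate(product("ACGT", repeat=k))}
    counts = [0.0] * len(index)
    for seq in seqs:
        for i in range(len(seq) - k + 1):
            j = index.get(seq[i:i + k])  # k-mers with ambiguous bases are skipped
            if j is not None:
                counts[j] += 1.0
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def reconstruct(references, reads, k=2, lam=1e-4, iters=2000):
    """Estimate the proportion of each reference in the pooled reads.

    Illustrative solver (an assumption, not the published algorithm):
    projected gradient descent on 0.5*||A x - s||^2 + lam*sum(x), x >= 0.
    """
    cols = [kmer_frequencies([ref], k) for ref in references]  # columns of A
    s = kmer_frequencies(reads, k)                             # sample vector
    n = len(cols)
    # Precompute the Gram matrix A^T A and the vector A^T s.
    gram = [[sum(a * b for a, b in zip(cols[i], cols[j])) for j in range(n)]
            for i in range(n)]
    rhs = [sum(a * b for a, b in zip(cols[i], s)) for i in range(n)]
    # The trace of A^T A bounds its largest eigenvalue, so this step is safe.
    step = 1.0 / sum(gram[i][i] for i in range(n))
    x = [0.0] * n
    for _ in range(iters):
        grad = [sum(gram[i][j] * x[j] for j in range(n)) - rhs[i] + lam
                for i in range(n)]
        x = [max(0.0, xi - step * g) for xi, g in zip(x, grad)]  # keep x >= 0
    total = sum(x)
    return [xi / total for xi in x] if total else x


# Toy usage: three hypothetical references mixed 6:3:1.
references = ["ACGTACGTACGTACGT", "AAAACCCCGGGGTTTT", "ATATATATGCGCGCGC"]
reads = [references[0]] * 6 + [references[1]] * 3 + [references[2]]
proportions = reconstruct(references, reads)  # close to [0.6, 0.3, 0.1]
```

The cost of this sketch depends on the number of references and k-mers rather than on classifying each read separately, which is the intuition behind the speed advantage described above. A real database would need a larger k and a dedicated nonnegative least-squares solver.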