vConTACT2: Leveraging network analytics to classify novel viruses

Benjamin Bolduc (November 9, 2018)

Viruses of bacteria and archaea are likely to be critical to all natural, engineered and human ecosystems, and yet their study is hampered by the lack of a universal or scalable taxonomic framework. Here, we introduce vConTACT 2.0, a network-based application to establish prokaryotic virus taxonomy that scales to thousands of uncultivated virus genomes/fragments, and integrates confidence scores for all taxonomic predictions. Performance tests using vConTACT 2.0 demonstrate near-identical correspondence to the current official viral taxonomy (>85% genus-rank assignments at 96% accuracy) through an integrated distance-based hierarchical approach. Beyond “known viruses”, we used vConTACT 2.0 to automatically assign 1,364 previously unclassified reference viruses to tentative taxa, and scaled it to modern metagenomic datasets for which the reference network was robust to adding 16,000 viral metagenomic contigs. Together these efforts provide a systematic reference network and an accurate, scalable taxonomic analysis tool that is critically needed for the research community.