Name: Sanchita Jamindar
Email: sanchita95616@yahoo.com
Author: Sanchita Jamindar, Shawn W. Polson, and K. Eric Wommack
Author affiliation: University of Delaware, Delaware Biotechnology Inst.
Abstract title: Exploration of the abundant genosphere: Environmental annotation of viral genes in the biosphere
Abstract:
Viruses are not only numerically the most abundant biological entities in the biosphere; they are also highly dynamic, with high production and decay rates. This implies that viral genes are among the most, if not the most, actively expressed and replicated classes of genes within marine environments. Metagene was used to identify approximately 116,000 complete open reading frames (ORFs) within 20 viral metagenomes from marine, soil, and extreme environments totaling 217 Mb of sequence data in >290,000 Sanger-length reads. CD-Hit clustering with a 40% similarity threshold resulted in 59,019 total viral protein clusters, with 52% of ORFs represented in the top 7,738 clusters containing three or more members. Our data revealed numerous viral protein clusters more abundant than DNA polymerase, major capsid protein genes, and other targets expected to occur at high abundance. Notably, the majority of the most common clusters do not display significant homology to known proteins in the GenBank nr database. The occurrence of genes from four large clusters was assessed within archived Chesapeake Bay (CB) and Delaware Bay (DB) viral concentrate samples by cloning and sequencing products amplified with cluster-specific primers. The study indicated that each of the four clusters was broadly distributed across samples from both CB and DB. We further conceptualize an inverse PCR technique utilizing similar targets to assess the "genomic neighborhood" of the sequences comprising each cluster. These data underscore the dearth of knowledge regarding the significance of highly abundant viral-encoded proteins.