Proc Natl Acad Sci U S A. 2014 Apr 1;111(13):4904-9.
doi: 10.1073/pnas.1402564111. Epub 2014 Mar 14.

Tackling soil diversity with the assembly of large, complex metagenomes

Adina Chuang Howe et al. Proc Natl Acad Sci U S A.

Erratum in

  • Proc Natl Acad Sci U S A. 2014 Apr 22;111(16):6115

Abstract

The large volumes of sequencing data required to deeply sample the microbial communities of complex environments pose new challenges to sequence analysis. De novo metagenomic assembly effectively reduces the total amount of data to be analyzed but requires substantial computational resources. We combine two preassembly filtering approaches, digital normalization and partitioning, to generate large metagenome assemblies that were previously intractable. Using a human-gut mock community dataset, we demonstrate that these methods result in assemblies nearly identical to assemblies from unprocessed data. We then assemble two large soil metagenomes totaling 398 billion bp (equivalent to 88,000 Escherichia coli genomes) from matched Iowa corn and native prairie soils. The resulting assembled contigs could be used to identify molecular interactions and reaction networks of known metabolic pathways using the Kyoto Encyclopedia of Genes and Genomes (KEGG) Orthology database. Nonetheless, more than 60% of predicted proteins in the assemblies could not be annotated against known databases. Many of these unknown proteins were abundant in both corn and prairie soils, highlighting the benefits of assembly for the discovery and characterization of novelty in soil biodiversity. Moreover, 80% of the sequencing data could not be assembled because of low coverage, suggesting that considerably more sequencing data are needed to characterize the functional content of soil.
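
The abstract names digital normalization as a preassembly filter but does not describe its mechanics. The following is a minimal sketch of the technique as it is commonly described: a read is discarded once the median abundance of its k-mers, counted over the reads kept so far, reaches a cutoff. The function names, the exact in-memory counter, and the values k=4 and cutoff=2 are illustrative assumptions, not the authors' implementation or parameters; production tools typically use much longer k-mers and a memory-efficient approximate counter.

```python
from collections import Counter
from statistics import median


def kmers(read, k):
    """Return all overlapping k-mers of a read, in order."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def digital_normalize(reads, k=4, cutoff=2):
    """Keep a read only while its median k-mer count is below the cutoff.

    Hypothetical helper for illustration: counts are held in an exact
    Counter, which would not scale to a real metagenome.
    """
    counts = Counter()
    kept = []
    for read in reads:
        words = kmers(read, k)
        if not words:
            continue  # read shorter than k: skipped in this sketch
        if median(counts[w] for w in words) < cutoff:
            kept.append(read)
            counts.update(words)  # only kept reads add to the counts
    return kept


if __name__ == "__main__":
    # Five copies of a high-coverage read plus one low-coverage read.
    reads = ["ACGTTGCA"] * 5 + ["GGATCCAA"]
    print(digital_normalize(reads))
```

With these toy reads, only two of the five redundant copies are kept and the single low-coverage read is retained, which illustrates how the filter shrinks the data while preserving rare sequences.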

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Summary of approaches for large-scale assembly of complex metagenomes presented in this study. Unprocessed (I), normalized (II), and partitioned assemblies (III) were evaluated and compared with the HGMC metagenome. These approaches were used toward the assembly of metagenomes.
Fig. 2.
Coverage (median base pair recovered) distribution of assembled contigs from the Iowa corn soil (Upper) and Iowa prairie soil (Lower) metagenomes.
Fig. 3.
Distribution of most abundant KEGG Orthology groups identified in corn and prairie soil metagenomes.
