A Colorado State University-led team of scientists, including researchers from Lawrence Berkeley National Laboratory (Berkeley Lab), has detailed for the first time both broad and specific information about the presence and function of microorganisms in rivers covering 90% of the watersheds in the continental U.S. The resulting catalog of these rivers’ microbiome is the product of a yearslong participatory science effort and was published Nov. 20, 2024, in the journal Nature.
Berkeley Lab contributed genomic sequencing tools and expertise to this research and made the data widely accessible through public databases. Enabled by the JGI’s Community Science Program, Wrighton and her team set out to study river microbiomes. They sent river water samples to the JGI for genomic sequencing. The JGI sequenced over 1,000 metagenomes, which provided insight into how microbial communities vary across the samples. Researchers also contributed to generating metatranscriptomic information, which revealed the microbes’ gene expression.
Throughout its sequencing efforts, the JGI established standardized sequencing protocols, so researchers could compare results across the different sampling locations. The JGI’s expertise and resources were also crucial for assembling and sorting the vast amount of data into individual microbial genomes, or “bins.” The bins from the automated pipeline, alongside additional binning approaches from the Wrighton Lab, were used to produce GROWdb, the Genome Resolved Open Watershed database.
These datasets were made available on multiple platforms hosted at Berkeley Lab. The National Microbiome Data Collaborative (NMDC) data portal provides access to the GROW project as part of a larger microbiome data integration hub. Researchers built on the JGI analysis by linking the metagenomic and metatranscriptomic data to other relevant datasets, ultimately providing a more comprehensive analysis of river microbial activity.
The published river microbiome datasets are also available through the DOE Systems Biology Knowledgebase (KBase). KBase enables researchers and students to immediately copy data from GROWdb and use it in comparative analyses with their own data on the KBase platform. This saves time: there is no need to download and format data or to load analysis packages and resources, because KBase provides all the computing power. Meanwhile, the dataset for the worldwide river sampling effort, WHONDRS, is hosted on DOE’s environmental systems science data repository, ESS-DIVE.
“GROWdb and the collaborative team behind the effort, along with infrastructure at the JGI, NMDC, ESS-DIVE, and KBase, is a stellar model for the future of microbiome research, with data integration and sharing as the foundation,” said Emiley Eloe-Fadrosh, head of the Metagenome Program at the U.S. Department of Energy (DOE) Joint Genome Institute (JGI), a DOE Office of Science user facility located at Berkeley Lab, and lead PI of the National Microbiome Data Collaborative (NMDC).
Read the full story on the Berkeley Lab news site.