Advances in DNA sequencing have facilitated the accumulation of genomic data for thousands of organisms from diverse environments. A major challenge in genomics, however, is the knowledge gap between reading a gene’s nucleotide sequence and understanding its encoded function. Computational methods have proven limited in their ability to predict a gene’s function based solely on its sequence, and biochemical and genetic methods for functional characterization have traditionally been quite laborious. What’s more, the importance of a given gene may only become apparent under certain conditions.
Dual barcoded shotgun expression library sequencing, or Dub-seq, is a novel high-throughput method for discovering gene function in microbes under various environmental conditions. It was developed by Adam Arkin, Adam Deutschbauer, Vivek Mutalik, and Pavel Novichkov of the Environmental Genomics and Systems Biology (EGSB) Division under the aegis of the Ecosystems and Networks Integrated with Genes and Molecular Assemblies (ENIGMA) program. ENIGMA is a multi-institution consortium funded by DOE through its Scientific Focus Area (SFA) grant program and managed by Berkeley Lab. In a seminal paper published January 18 in Nature Communications, the Dub-seq team presented details of the technique and proof-of-concept work applied to the bacterium E. coli.
Dub-seq works by inserting a DNA fragment from any source between two DNA barcodes, which are random sequences of twenty nucleic acid bases. Once the barcodes have been associated with a specific fragment, they can be used as a proxy for it; this association is made only once for each library. Next, the library is moved into a host platform (a bacterial, fungal, plant, or animal model system), and the function that each barcoded DNA fragment imparts to the host is studied under many different conditions.
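The idea of using a barcode as a proxy for its fragment can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the barcode sequences, fragment names, and the log2 enrichment score are all hypothetical, and for brevity the sketch keys on a single barcode per fragment, whereas Dub-seq flanks each fragment with two.

```python
import math
from collections import Counter

# Hypothetical one-time mapping from a 20-base barcode to the fragment it tags.
# Sequences and fragment names are invented for illustration only.
BARCODE_TO_FRAGMENT = {
    "ACGTACGTACGTACGTACGT": "fragment_A",
    "TTGCATTGCATTGCATTGCA": "fragment_B",
}


def count_fragments(barcode_reads, barcode_to_fragment):
    """Tally reads per fragment, using each barcode as a proxy for its fragment."""
    counts = Counter()
    for barcode in barcode_reads:
        fragment = barcode_to_fragment.get(barcode)
        if fragment is not None:  # ignore barcodes absent from the library mapping
            counts[fragment] += 1
    return counts


def log2_enrichment(before, after, pseudocount=1.0):
    """Log2 ratio of fragment abundance after vs. before a condition.

    An assumed, illustrative score; the pseudocount avoids division by zero.
    """
    fragments = set(before) | set(after)
    return {
        f: math.log2(
            (after.get(f, 0) + pseudocount) / (before.get(f, 0) + pseudocount)
        )
        for f in fragments
    }


# Toy barcode reads sampled before and after growth under some condition.
before = count_fragments(
    ["ACGTACGTACGTACGTACGT", "TTGCATTGCATTGCATTGCA"],
    BARCODE_TO_FRAGMENT,
)
after = count_fragments(
    ["ACGTACGTACGTACGTACGT"] * 7 + ["TTGCATTGCATTGCATTGCA", "G" * 20],
    BARCODE_TO_FRAGMENT,
)
scores = log2_enrichment(before, after)
```

Because the barcode-to-fragment mapping is built once, each later experiment in this sketch only needs to count short barcodes rather than re-identify the fragments themselves.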
The authors demonstrated that this approach is reproducible, economical, and scalable, and that it identifies both known and novel gene functions. “The Dub-seq method is simpler and more cost effective than other currently available overexpression-based technologies and provides complementary information to loss-of-function approaches such as transposon site sequencing or CRISPRi,” said Mutalik, a senior author on the study.
The technique was recognized with a 2017 R&D 100 Award. In addition to the core team members named above, the co-authors of the Nature Communications paper were Arkin lab research scientist Morgan Price and graduate student researcher Sean Carim, along with Deutschbauer lab research associates Trenton Owens and Mark Callaghan.