As instruments at large-scale user facilities become more powerful, the volume and complexity of the data they produce also grow. To leverage these heightened capabilities and accelerate scientific discoveries, a field known as autonomous discovery has emerged. It uses algorithms to learn from a comparatively small amount of input data and determine the best next experimental steps, all with minimal human intervention.
Elliot Perryman, a computer science and physics major at the University of Tennessee, began working with staff scientist Peter Zwart in the Center for Advanced Mathematics for Energy Research Applications (CAMERA) last fall through the Berkeley Lab Undergraduate Research (BLUR) program. Together they developed an algorithm that will extract better structures from low-quality crystallographic diffraction data.
Berkeley Lab researchers, in collaboration with scientists from SLAC National Accelerator Laboratory and the Max Planck Institute, have demonstrated that fluctuation X-ray scattering is capable of capturing the behavior of biological systems in unprecedented detail. Although this technique was first proposed more than four decades ago, its implementation was hindered by the lack of sufficiently powerful X-ray sources and associated detector technology, sample delivery methods, and the means to analyze the data. The team developed a novel mathematical and data analysis framework and applied it to data obtained from DOE’s Linac Coherent Light Source (LCLS) at SLAC. This breakthrough was recently reported in the Proceedings of the National Academy of Sciences (PNAS).
Typical machine learning methods used to analyze experimental imaging data rely on tens or hundreds of thousands of training images. But Daniël Pelt and James Sethian of Berkeley Lab’s Center for Advanced Mathematics for Energy Research Applications (CAMERA) have developed what they call a “Mixed-Scale Dense Convolution Neural Network” (MS-D) that “learns” much more quickly from a remarkably small training set. One promising application of MS-D is in understanding the internal structure and morphology of biological cells to identify, for example, differences between healthy and diseased cells. In one such project in Carolyn Larabell’s lab, the method needed data from just seven cells to determine the cell structure.
As part of an international team, researchers with Berkeley Lab’s Center for Advanced Mathematics for Energy Research Applications (CAMERA) employed their multi-tiered iterative phasing (M-TIP) algorithm to process X-ray free-electron laser (XFEL) data taken from single virus particles and resolve their nanometer-scale structures in 3D. The new approach circumvents several challenges of imaging biomolecules that do not crystallize well, such as the random orientations of particles in solution and the asymmetrical structures of many viruses and proteins. Jeff Donatelli of the Computational Research Division’s Mathematics Group and Peter Zwart and Kanupriya Pande of the Biosciences Area’s Molecular Biophysics and Integrated Bioimaging (MBIB) division contributed to the work, the results of which were published in Physical Review Letters. Read more from the Berkeley Lab News Center.