A Novel Type of Neural Network Comes to the Aid of Big Physics


Suppose you have a thousand-page book, but each page has only a single line of text. You’re supposed to extract the information contained in the book using a scanner, only this particular scanner systematically goes through each and every page, scanning one square inch at a time. It would take you a long time to get through the whole book with that scanner, and most of that time would be wasted scanning empty space. 

Such is the life of many an experimental physicist. In particle experiments, detectors capture and analyze vast amounts of data, even though only a tiny fraction of it contains useful information. “In a photograph of, say, a bird flying in the sky, every pixel can be meaningful,” explained Kazuhiro Terao, a physicist at the SLAC National Accelerator Laboratory. But in the images a physicist looks at, often only a small portion actually matters. In circumstances like that, poring over every detail needlessly consumes time and computational resources.

But that’s starting to change. With a machine learning tool known as a sparse convolutional neural network (SCNN), researchers can focus on the relevant parts of their data and screen out the rest. Researchers have used these networks to vastly accelerate their ability to do real-time data analysis. And they plan to employ SCNNs in upcoming or existing experiments on at least three continents. The switch marks a historic change for the physics community. 

“In physics, we are used to developing our own algorithms and computational approaches,” said Carlos Argüelles-Delgado, a physicist at Harvard University. “We have always been on the forefront of development, but now, on the computational end of things, computer science is often leading the way.” 

Sparse Characters

The work that would lead to SCNNs began in 2012, when Benjamin Graham, then at the University of Warwick, wanted to make a neural network that could recognize Chinese handwriting. 

The premier tools at the time for image-related tasks like this were convolutional neural networks (CNNs). For the Chinese handwriting task, a writer would trace a character on a digital tablet, producing an image of, say, 10,000 pixels. The CNN would then move a 3-by-3 grid called a kernel across the entire image, centering the kernel on each pixel individually. For every placement of the kernel, the network would perform a complicated mathematical calculation called a convolution that looked for distinguishing features.
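The sliding-kernel procedure described above can be sketched in a few lines. This is a toy illustration, not production CNN code: the function name, image, and averaging kernel are all invented for the example, and the point is simply that a dense convolution visits every pixel, blank or not.

```python
import numpy as np

def dense_convolution(image, kernel):
    """Slide a 3x3 kernel across every pixel, computing a weighted
    sum (the convolution) at each placement."""
    h, w = image.shape
    padded = np.pad(image, 1)           # zero-pad so the kernel fits at the edges
    out = np.zeros((h, w))
    for i in range(h):                  # visit every pixel, even the blank ones
        for j in range(w):
            patch = padded[i:i + 3, j:j + 3]
            out[i, j] = np.sum(patch * kernel)
    return out

# A mostly empty "handwriting" image: one short 3-pixel stroke in a 6x6 grid.
image = np.zeros((6, 6))
image[2, 1:4] = 1.0
kernel = np.ones((3, 3)) / 9.0          # simple averaging kernel
result = dense_convolution(image, kernel)
```

Even though 33 of the 36 pixels are blank, the loop performs all 36 convolutions, which is exactly the wasted effort sparse methods avoid.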

CNNs were designed to be used with information-dense images such as photographs. But an image containing a Chinese character is mostly empty; researchers refer to data with this property as sparse. It’s a common feature of anything in the natural world. “To give an example of how sparse the world can be,” Graham said, if the Eiffel Tower were encased in the smallest possible rectangle, that rectangle would consist of “99.98 percent air and just 0.02 percent iron.”

The IceCube Neutrino Observatory at the South Pole. Photograph: Felipe Pedreros/IceCube/NSF/Quanta

Graham tried tweaking the CNN approach so that the kernel would only be placed on 3-by-3 sections of the image that contain at least one pixel that has nonzero value (and is not just blank). In this way, he succeeded in producing a system that could efficiently identify handwritten Chinese. It won a 2013 competition by identifying individual characters with an error rate of only 2.61 percent. (Humans scored 4.81 percent on average.) He next turned his attention to an even bigger problem: three-dimensional-object recognition.

By 2017, Graham had moved to Facebook AI Research and had further refined his technique and published the details for the first SCNN, which centered the kernel only on pixels that had a nonzero value (rather than placing the kernel on any 3-by-3 section that had at least one “nonzero” pixel). It was this general idea that Terao brought to the world of particle physics.

Underground Shots

Terao is involved with experiments at the Fermi National Accelerator Laboratory that probe the nature of neutrinos, among the most elusive known elementary particles. They’re also the most abundant particles in the universe with mass (albeit not much), but they rarely show up inside a detector. As a result, most of the data for neutrino experiments is sparse, and Terao was constantly on the lookout for better approaches to data analysis. He found one in SCNNs.

In 2019, he applied SCNNs to simulations of the data expected from the Deep Underground Neutrino Experiment, or DUNE, which will be the world’s largest neutrino physics experiment when it comes online in 2026. The project will shoot neutrinos from Fermilab, just outside Chicago, through 800 miles of earth to an underground laboratory in South Dakota. Along the way, the particles will “oscillate” between the three known types of neutrinos, and these oscillations may reveal detailed neutrino properties.

The SCNNs analyzed the simulated data faster than ordinary methods, and required significantly less computational power in doing so. The promising results mean that SCNNs will likely be used during the actual experimental run.

In 2021, meanwhile, Terao helped add SCNNs to another neutrino experiment at Fermilab known as MicroBooNE. Here, scientists look at the aftermath of collisions between neutrinos and the nuclei of argon atoms. By examining the tracks created by these interactions, researchers can infer details about the original neutrinos. To do that, they need an algorithm that can look at the pixels (or, technically, their three-dimensional counterparts called voxels) in a three-dimensional representation of the detector and then determine which pixels are associated with which particle trajectories.

Because the data is so sparse—a smattering of tiny lines within a large detector (approximately 170 tons of liquid argon)—SCNNs are almost perfect for this task. With a standard CNN, the image would have to be broken up into 50 pieces, because of all the computation to be done, Terao said. “With a sparse CNN, we analyze the entire image at once—and do it much faster.”

Timely Triggers

One of the researchers who worked on MicroBooNE was an undergraduate intern named Felix Yu. Impressed with the power and efficiency of SCNNs, he brought the tools with him to his next workplace as a graduate student at a Harvard research laboratory formally affiliated with the IceCube Neutrino Observatory at the South Pole.

One of the key goals of the observatory is to intercept the universe’s most energetic neutrinos and trace them back to their sources, most of which lie outside our galaxy. The detector comprises 5,160 optical sensors buried in the Antarctic ice, only a tiny fraction of which light up at any given time. The rest of the array remains dark and is not particularly informative. Worse, many of the “events” that the detectors record are false positives and not useful for neutrino hunting. Only so-called trigger-level events make the cut for further analysis, and instant decisions need to be made as to which ones are worthy of that designation and which will be permanently ignored.

Standard CNNs are too slow for this task, so IceCube scientists have long relied on an algorithm called LineFit to tell them about potentially useful detections. But that algorithm is unreliable, Yu said, “which means we could be missing out on interesting events.” Again, it’s a sparse data environment ideally suited for an SCNN.

Yu—along with Argüelles-Delgado, his doctoral adviser, and Jeff Lazar, a graduate student at the University of Wisconsin, Madison—quantified that advantage, showing in a recent paper that these networks would be about 20 times faster than typical CNNs. “That’s fast enough to run on every event that comes out of the detector,” about 3,000 each second, Lazar said. “That enables us to make better decisions about what to throw out and what to keep.”

IceCube has thousands of sensors buried deep in the Antarctic ice, such as the one at left (signed by researchers and engineers). At any time, only a few of these sensors produce useful data for neutrino hunters, so researchers needed a tool to help them separate out the unwanted data. Photographs: Robert Schwarz/NSF/Quanta

The authors have also successfully employed an SCNN in a simulation using official IceCube data, and the next step is to test their system on a replica of the South Pole computing system. If all goes well, Argüelles-Delgado believes they should get their system installed at the Antarctic observatory next year. But the technology could see even wider use. “We think that [SCNNs could benefit] all neutrino telescopes, not just IceCube,” Argüelles-Delgado said.

Beyond Neutrinos

Philip Harris, a physicist at the Massachusetts Institute of Technology, is hoping SCNNs can help out at the biggest particle collider of them all: the Large Hadron Collider (LHC) at CERN. Harris heard about this kind of neural network from an MIT colleague, the computer scientist Song Han. “Song is an expert on making algorithms fast and efficient,” Harris said—perfect for the LHC, where 40 million collisions occur every second.

When they spoke a couple of years ago, Song told Harris about an autonomous-vehicle project he was pursuing with members of his lab. Song’s team was using SCNNs to analyze 3D laser maps of the space in front of the vehicle, much of which is empty, to see if there were any obstructions ahead.

Harris and his colleagues face similar challenges at the LHC. When two protons collide inside the machine, the crash creates an expanding sphere made of particles. When one of these particles hits the detector, a secondary particle shower occurs. “If you can map out the full extent of this shower,” Harris said, “you can determine the energy of the particle that gave rise to it,” which might be an object of special interest—something like the Higgs boson, which physicists discovered in 2012, or a dark matter particle, which physicists are still searching for.

“The problem we are trying to solve comes down to connecting the dots,” Harris said, just as a self-driving car might connect the dots of a laser map to detect an obstruction.

SCNNs would speed up data analysis at the LHC by at least a factor of 50, Harris said. “Our ultimate goal is to get [SCNNs] into the detector”—a task that will take at least a year of paperwork and additional buy-in from the community. But he and his colleagues are hopeful.

Altogether, it’s increasingly likely that SCNNs—an idea originally conceived in the computer science world—will soon play a role in the biggest experiments ever conducted in neutrino physics (DUNE), neutrino astronomy (IceCube), and high-energy physics (the LHC).

Graham said he was pleasantly surprised to learn that SCNNs had made their way to particle physics, though he was not totally shocked. “In an abstract sense,” he said, “a particle moving in space is a bit like the tip of a pen moving on a piece of paper.”

Original story reprinted with permission from Quanta Magazine, an editorially independent publication of the Simons Foundation whose mission is to enhance public understanding of science by covering research developments and trends in mathematics and the physical and life sciences.
