Analyzing Data from 5 Million SETI@home Users

SETI, ET, Citizen Science
Keith Cooper
Mathea Madsen
Rachel Scheetz
February 1, 2021 11:00 AM UTC (UTC +0)

Seventeen billion wannabe alien signals, and no Arecibo to observe them.

This is the dilemma facing the scientists working on SETI@home. But before they can even worry about the future of Arecibo, they’ve got to conduct some of the most difficult computer analysis ever undertaken, to determine which signals in their catalog are just noise, and which, if any, stand a chance of being the real deal.

At some point during the past 20 years, over five million of us downloaded the SETI@home software, putting our personal computers to work looking for aliens while we went about our daily lives.

Then, on 31 March 2020, SETI@home went quiet. No longer was its army of users receiving raw data from the Arecibo radio telescope. Yet this was far from the end of the project. After twenty years, it’s only just beginning.

Launched on 17 May 1999, SETI@home wasn’t the first volunteer distributed-computing project, but it was certainly the most successful. From its BOINC (Berkeley Open Infrastructure for Network Computing) infrastructure, a large family of similar projects rapidly sprang, encompassing fields as diverse as molecular biology, mathematics, cognitive science, and astronomy.

But it was the hunt for aliens that really caught the public’s imagination.

“We figured that SETI@home would run for maybe five years, not twenty,” says David Anderson, co-founder and former Director of the project at the University of California, Berkeley.

The majority of the data for SETI@home consisted of Arecibo observations, although, following the advent of Breakthrough Listen, it also began taking data from the Green Bank Telescope in West Virginia, and the Parkes telescope in Australia. Since Arecibo Observatory had no Internet connection, every two to three days tapes containing 2.5 terabytes of data each would ship out to Berkeley. Once in California, the data would be divided into work units, each consisting of 107 seconds of observations at a certain frequency. It’s these work units that the computers belonging to the 5.2 million participants all over the world received for processing.

SETI@home’s success was also its worst enemy. The Berkeley team behind the project had been hoping for 50,000 to 100,000 users at best — over five million was beyond their wildest imagination. With such huge amounts of data flowing in, it was impossible for the team to give any observation more than a cursory glance. As a result, it may surprise many of SETI@home’s users that much of the data is still to be fully analyzed. That cursory glance told the team that no stand-out, obvious beacons have been detected, but fainter signals may still be hiding. To find them, the data will need careful analysis.

Therein lies the problem. A few years after the onset of SETI@home, with the project becoming a tremendous success, the team first began to consider further processing their data. They wrote some software for the job, but “it was impossibly slow,” says Anderson. “So for a long time, we just accumulated data, with no way of doing the back-end analysis.”

Anderson began thinking about this problem again in 2016 and wrote new software, called Nebula, to handle the task. It’s a system with two layers – the bottom layer contains various tools that allow the data to be stored and accessed, and made available for parallel computing. The top layer is more specific to the challenges of conducting radio SETI, such as the removal of so-called RFI – radio frequency interference – and identifying and ranking decent candidate signals.

Still, it’s a fiendish task. “It’s probably the most difficult analysis problem I’ve ever worked on in my life,” says Berkeley’s Eric Korpela, who took over the reins from David Anderson as SETI@home’s director in 2015. The lack of personnel only compounded the problem: “Most of the time it was just four people running it,” says Korpela. It took all their spare time to simply keep the show running. That’s why the public’s participation was finally curtailed. There’s plenty of data, and now the team must direct their energy toward analyzing it rather than collecting more.

Over two decades the project collected more than 17 billion narrowband radio signals. All known natural astrophysical phenomena radiate broadband emissions across a wide range of radio frequencies, so SETI instead eavesdrops on narrow wavelength bands because any signal that is confined to a narrow channel is sure to be artificial. Transmitting a narrowband signal is also cost-effective, since the sender isn’t spreading their energy across a wide range of wavelengths, and can therefore focus that energy on boosting the power of a narrowband signal.
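To put rough numbers on that cost argument, here is a minimal Python sketch. It assumes an idealized transmitter whose power is spread evenly across its bandwidth and a receiver that listens in a single channel; the function name and the example values (1 watt, a 1 Hz channel, a 1 MHz broadband signal) are illustrative assumptions, not figures from SETI@home.

```python
def power_in_channel(total_power_w, signal_bandwidth_hz, channel_width_hz):
    """Power (in watts) that lands in one receiver channel centered on the signal.

    Assumes the transmitter spreads its power evenly over its bandwidth.
    """
    if signal_bandwidth_hz <= channel_width_hz:
        # The whole signal fits inside the channel.
        return total_power_w
    # Only the slice of the signal overlapping the channel is received.
    return total_power_w * channel_width_hz / signal_bandwidth_hz


# Same transmitter power, two strategies (illustrative values only).
narrow = power_in_channel(1.0, signal_bandwidth_hz=1.0, channel_width_hz=1.0)
broad = power_in_channel(1.0, signal_bandwidth_hz=1.0e6, channel_width_hz=1.0)
advantage = narrow / broad  # how much stronger the narrowband signal appears
```

Under these assumptions the narrowband sender looks about a million times brighter in that one channel for the same energy bill, which is the trade-off the paragraph above describes.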

Historically, SETI has favored searching around a narrow channel centered on 1,420 MHz, which is the frequency of the radio emission produced by the spin-flip transition of the electron in neutral hydrogen atoms. However, in the past few decades, SETI has broadened the search to cover billions of narrow channels across the radio spectrum.

It would take forever to inspect each of the 17 billion detected signals individually, so the Nebula software is trained to look for telltale signs. These include sudden spikes in power, signals with a slow rise and fall as Arecibo’s field of view passes over them, three power spikes in a row perhaps indicating a pulsing signal, and autocorrelated signals, where a delayed copy of the signal is sent just after the primary signal, as a way of correcting for dispersion.
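Two of those telltale signs, power spikes and runs of three spikes, can be illustrated with a toy Python sketch. This is not Nebula's actual algorithm: the function names, the threshold of five times the median power, and the rule that a triplet means three evenly spaced spikes are all assumptions made for illustration.

```python
from statistics import median


def find_spikes(power, threshold=5.0):
    """Return indices of samples whose power exceeds `threshold` times the median."""
    baseline = median(power)
    return [i for i, p in enumerate(power) if p > threshold * baseline]


def find_triplets(spike_indices):
    """Return (a, b, c) index triples of spikes that are evenly spaced in time."""
    spikes = set(spike_indices)
    ordered = sorted(spikes)
    triplets = []
    for n, a in enumerate(ordered):
        for b in ordered[n + 1:]:
            c = b + (b - a)  # where the third pulse would fall
            if c in spikes:
                triplets.append((a, b, c))
    return triplets


# Synthetic power samples for one frequency channel (made-up data).
power = [1.0] * 30
for i in (4, 12, 20):  # three evenly spaced pulses
    power[i] = 9.0
power[27] = 7.0        # one isolated spike

spikes = find_spikes(power)
triplets = find_triplets(spikes)
```

On this synthetic data the sketch flags four spikes and groups three of them into a single evenly spaced triplet. A real pipeline would also have to cope with noise statistics, Doppler drift, and interference, none of which are modeled here.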

Standing in their way is the RFI. It is quite possible that each of those 17 billion signals will turn out to be radio interference from a terrestrial source — Arecibo was located near a city — but the only way to be sure is to systematically remove the interference.

“Humans make a big mess of the radio spectrum.”

“SETI@home observed in what is called the ‘protected band’ because no one is supposed to be transmitting any radio emissions in that band,” Korpela tells Supercluster. “But somehow they do.”

Korpela and Anderson are hoping to have the results of their analysis complete and published at some point in 2021 — “It would be nice to have a paper and a press release on the anniversary of the day we shut down,” says Korpela — but they still won’t be finished. The next step would be to follow up on interesting candidates that SETI@home detected. But the catastrophic collapse of Arecibo has scuppered their chances.

“Our plan was to re-observe the best candidates with Arecibo, but its loss makes that a whole lot more difficult,” says Korpela. It’s not that other telescopes could not detect these signals, but with the exception of the FAST telescope in China, every other radio telescope is smaller than the 305-meter-diameter Arecibo. Smaller telescopes generally have a larger field of view, which makes pinpointing any candidate source more difficult, and they would also have to listen for longer to detect the same strength of signal.
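Both penalties can be estimated with textbook approximations. The Python sketch below assumes an idealized, fully illuminated circular dish whose beam width is roughly 1.22 times the wavelength divided by the diameter, and it assumes that signal-to-noise scales with collecting area times the square root of observing time, with equal receiver noise on both telescopes. The function names and the 100-meter comparison dish are illustrative assumptions, and real telescopes will deviate from these figures.

```python
import math

SPEED_OF_LIGHT_M_S = 299_792_458.0


def beam_width_deg(dish_diameter_m, frequency_hz):
    """Approximate diffraction-limited beam width (1.22 * wavelength / D), in degrees."""
    wavelength_m = SPEED_OF_LIGHT_M_S / frequency_hz
    return math.degrees(1.22 * wavelength_m / dish_diameter_m)


def integration_time_factor(reference_diameter_m, dish_diameter_m):
    """How many times longer a dish must observe to match the reference dish.

    Signal-to-noise ~ area * sqrt(time) and area ~ D**2, so time ~ D**-4.
    """
    return (reference_diameter_m / dish_diameter_m) ** 4


frequency_hz = 1.42e9  # near the hydrogen line
arecibo_beam = beam_width_deg(305.0, frequency_hz)
smaller_beam = beam_width_deg(100.0, frequency_hz)  # hypothetical 100 m dish
extra_time = integration_time_factor(305.0, 100.0)
```

In this simplified picture, a dish one third the diameter sees a patch of sky about three times wider and needs several dozen times the observing time to reach the same sensitivity.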

What of FAST though? The team at the Berkeley SETI Research Center has been in touch with, and working alongside, Chinese scientists for years, and has recently even built a commensal SETI instrument for use on FAST, so there’s already a relationship there. However, “the procedures for using FAST are fairly regimented,” says Korpela. Every observing project on FAST requires a Chinese lead scientist and, according to Korpela, there is some reluctance on the part of the Chinese authorities to allow raw data from FAST outside of the country. Nevertheless, Korpela hopes that these difficulties can be resolved and that follow-up work can begin on SETI@home’s best candidates.

Who knows what those observations may bring. But members of the public who were emotionally connected with SETI@home eagerly await news of a replacement project that can once again involve them in the search for life beyond Earth.

“We don’t have definite plans right now, but there has been some talk about starting another public participation SETI project,” says Korpela. Such a project would bring its own challenges. The first is the need to find new personnel who can oversee the project for perhaps 20 years and communicate with millions of people who will want to take part. The other problem is funding: absent a grant from NASA, the National Science Foundation or some other funding body or philanthropist, there will need to be public fundraising to get it off the ground.

Another option is to include Chinese scientists in any future successor to SETI@home.

“There’s certainly room for a SETI@home-type project using the FAST telescope in China,” says Anderson. “We’re trying to get them interested in this.” However, as we have seen, this may depend on China’s willingness to distribute the raw data.

Although what comes next is still up in the air, SETI@home was nevertheless an extraordinary success. While ultimately its search for extraterrestrial intelligence may come to nought, its search for people willing to contribute to SETI found its target, proving there is a hunger and a willingness among a scientifically enthusiastic public to support SETI research. The project may never find intelligent life in space, but it found it here at home.
