As part of an effort to identify distant planets hospitable to life, NASA has established a crowd-sourcing project in which volunteers search telescopic images for evidence of debris disks around stars, which are good indicators of exoplanets.
Using the results of that project, researchers at MIT have now trained a machine-learning system to search for debris disks itself.
The scale of the search demands automation: There are nearly 750 million possible light sources in the data accumulated through NASA’s Wide-Field Infrared Survey Explorer (WISE) mission alone.
In tests, the machine-learning system agreed with human identifications of debris disks 97% of the time.
The researchers also trained their system to rate debris disks according to their likelihood of containing detectable exoplanets.
In a paper describing the new work in the journal Astronomy and Computing, the MIT researchers report that their system identified 367 previously unexamined celestial objects as particularly promising candidates for further study.
The work represents an unusual approach to machine learning, which has been championed by one of the paper’s co-authors, Victor Pankratius, a principal research scientist at MIT’s Haystack Observatory.
Typically, a machine-learning system will comb through a wealth of training data, looking for consistent correlations between features of the data and some label applied by a human analyst – in this case, stars circled by debris disks.
But Pankratius argues that in the sciences, machine-learning systems would be more useful if they explicitly incorporated a little bit of scientific understanding, to help guide their searches for correlations or identify deviations from the norm that could be of scientific interest.
“The main vision is to go beyond what AI is focusing on today,” Pankratius says. “Today, we are collecting data, and we are trying to find features in the data. You end up with billions and billions of features.