As part of an effort
to identify distant planets hospitable to life, NASA has established a crowd-sourcing project in which volunteers search telescopic images for evidence
of debris disks around stars, which are good indicators of exoplanets.
Using the results of
that project, researchers at MIT have now trained a machine-learning system to
search for debris disks itself.
The scale of the
search demands automation:
There are nearly 750
million possible light sources in the data accumulated through NASA’s Wide-Field
Infrared Survey Explorer (WISE) mission alone.
In tests, the
machine-learning system agreed with human identifications of debris disks 97%
of the time.
The researchers also
trained their system to rate debris disks according to their likelihood of
containing detectable exoplanets.
In a paper describing
the new work in the journal Astronomy and Computing, the MIT researchers report
that their system identified 367 previously unexamined celestial objects as
particularly promising candidates for further study.
The work represents
an unusual approach to machine learning, which has been championed by one of
the paper’s co-authors, Victor Pankratius, a principal research scientist at
MIT’s Haystack Observatory.
Typically, a
machine-learning system will comb through a wealth of training data, looking
for consistent correlations between features of the data and some label applied
by a human analyst – in this case, stars circled by debris disks.
But Pankratius argues
that in the sciences, machine-learning systems would be more useful if they
explicitly incorporated a little bit of scientific understanding, to help guide
their searches for correlations or identify deviations from the norm that could
be of scientific interest.
“The main vision is
to go beyond what AI is focusing on today,”
Pankratius says. “Today, we are
collecting data, and we are trying to find features in the data. You end up
with billions and billions of features.