Using artificial intelligence to infer stellar radial velocities - project Astro2RV

This project focuses on inferring radial velocities (RV) from stellar astrometric data.

Project duration: October 2023 to June 2025

The Gaia mission and its planned successor, Gaia-NIR, are designed to provide precise astrometric information for over a billion galactic and extragalactic stars. Among other things, the data contain the astrometric properties of stars: their spatial position and the two velocity components perpendicular to the line of sight between Earth and the star. The third velocity component, the line-of-sight or radial velocity (RV), is measured spectroscopically. Due to the Doppler effect, the spectrum of a star moving towards Earth is shifted towards blue, while the spectrum of a star moving away is shifted towards red. The magnitude of this spectral shift gives the star's radial velocity. Spectroscopic measurements, however, are missing for the vast majority of stars, because their light is too faint for the Radial Velocity Spectrometer (RVS) on board the Gaia satellite, and only a limited number of stars can be observed from the ground.
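To make the three velocity components concrete, here is a minimal Python sketch of the two textbook relations involved: the non-relativistic Doppler formula for the radial component and the conversion from proper motion and parallax to the speed perpendicular to the line of sight. The function names and the example wavelengths are illustrative assumptions, not part of the project's code, and the distance is assumed to be simply the inverse of the parallax.

```python
import math

C_KM_S = 299_792.458  # speed of light in km/s
K_KM_S = 4.74047      # km/s for 1 mas/yr of proper motion at 1 mas of parallax


def radial_velocity_from_shift(lambda_observed_nm, lambda_rest_nm):
    """Non-relativistic Doppler estimate of the radial velocity in km/s.

    Positive values mean the star is receding (redshift),
    negative values mean it is approaching (blueshift).
    """
    return C_KM_S * (lambda_observed_nm - lambda_rest_nm) / lambda_rest_nm


def tangential_velocity(pmra_mas_yr, pmdec_mas_yr, parallax_mas):
    """Speed perpendicular to the line of sight in km/s.

    Assumes distance = 1 / parallax, which is only reasonable for
    positive, well-measured parallaxes. The right-ascension proper
    motion is assumed to already include the cos(declination) factor.
    """
    if parallax_mas <= 0:
        raise ValueError("parallax must be positive")
    total_pm = math.hypot(pmra_mas_yr, pmdec_mas_yr)
    return K_KM_S * total_pm / parallax_mas


# Hypothetical example: a spectral line with rest wavelength 854.2 nm
# observed at 854.3 nm implies a receding star (roughly 35 km/s).
v_radial = radial_velocity_from_shift(854.3, 854.2)

# Hypothetical example: 10 mas/yr of total proper motion at 10 mas parallax.
v_tangential = tangential_velocity(6.0, 8.0, 10.0)
```

The Doppler relation needs a spectrum, which is exactly what is missing for faint stars, whereas the tangential speed follows from astrometry alone.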


This study proposes predicting stellar radial velocities using artificial intelligence (AI) techniques and the third data release of the Gaia mission (Gaia DR3). We investigated whether photometric properties such as stellar brightness, color, and metallicity, together with the estimated light extinction in different parts of the galaxy and physical classification labels, lead to more accurate predictions of stellar radial velocities.

A recent study tested predicting RVs from astrometric data alone on the same dataset, and we used its results as the reference point for our project.

Results

Our dataset contained only stars within our galaxy, the Milky Way. Of these, 33.5 million stars have measured radial velocities, while 117 million do not. We tested several modeling approaches: neural networks, Bayesian neural networks, gradient-boosted decision trees, and a localized ensemble of decision trees, in which the galaxy was divided into several sections and a separate ensemble was trained on each section. We chose the Bayesian neural network because it provides not only predictions of radial velocities but also their associated uncertainties, which are always desirable when modeling physical systems. On a randomly selected portion of the dataset with known RVs, this model achieved an average error of 28.1 km/s using astrometric data alone. Adding the other measured quantities improved the error only slightly, to 28 km/s. Both results improve on the reference study, which reported an error of 29 km/s.
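As an illustration of how predictions with uncertainties might be scored on a held-out set of stars with known RVs, the sketch below computes an average error and checks how often the true value falls within the predicted uncertainty. The text does not define "average error", so the mean absolute error is an assumption here, as are the function names and the made-up numbers.

```python
def mean_absolute_error(true_rv, predicted_rv):
    """Average of |true - predicted| in km/s (assumed error metric)."""
    return sum(abs(t - p) for t, p in zip(true_rv, predicted_rv)) / len(true_rv)


def coverage_within_sigma(true_rv, predicted_rv, predicted_sigma, k=1.0):
    """Fraction of stars whose true RV lies within k predicted sigmas."""
    hits = sum(
        abs(t - p) <= k * s
        for t, p, s in zip(true_rv, predicted_rv, predicted_sigma)
    )
    return hits / len(true_rv)


# Made-up numbers purely for illustration (km/s), not project data.
true_rv = [-12.0, 35.0, 4.0, -80.0]
predicted_rv = [-2.0, 20.0, 10.0, -30.0]
predicted_sigma = [15.0, 20.0, 5.0, 30.0]

mae = mean_absolute_error(true_rv, predicted_rv)
coverage = coverage_within_sigma(true_rv, predicted_rv, predicted_sigma)
```

A coverage check of this kind is one simple way to see whether the predicted uncertainties are too narrow or too wide. It complements the average error, which says nothing about the uncertainties themselves.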

The addition of new observables did not significantly improve the accuracy of the predicted radial velocities. However, the predicted radial velocities for the 117 million Milky Way stars without measured velocities are an important result of this project, which researchers can use to identify high-velocity objects within our galaxy.
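One way such a catalogue could be used to shortlist high-velocity candidates is sketched below: a star is kept only if its predicted speed along the line of sight stays above a threshold even after subtracting a multiple of its predicted uncertainty. The threshold of 300 km/s, the factor k = 2, the tuple layout, and the star identifiers are all hypothetical choices made for illustration, not values from the project.

```python
def high_velocity_candidates(stars, threshold_km_s=300.0, k=2.0):
    """Return IDs of stars whose |predicted RV| exceeds the threshold
    even after subtracting k predicted standard deviations.

    `stars` is an iterable of (star_id, predicted_rv, predicted_sigma)
    tuples with velocities in km/s.
    """
    return [
        star_id
        for star_id, rv, sigma in stars
        if abs(rv) - k * sigma > threshold_km_s
    ]


# Made-up predictions purely for illustration (km/s).
stars = [
    ("star_a", 420.0, 40.0),   # 420 - 2 * 40 = 340, above the threshold
    ("star_b", -390.0, 60.0),  # 390 - 2 * 60 = 270, below the threshold
    ("star_c", 25.0, 10.0),    # ordinary velocity
]

candidates = high_velocity_candidates(stars)
```

Because the typical prediction error is tens of km/s, a conservative cut like this only produces candidates, and confirming them would still require spectroscopic follow-up.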

Partners

University of Ljubljana, Faculty of Mathematics and Physics
European Space Agency (ESA)