AIMS-z: astrophysics-informed machine learning for spectroscopic redshifts of quasars

AIMS-z is a multimodal, astrophysics-informed machine-learning framework for determining spectroscopic redshifts of quasars. Its initial application focuses on Euclid slitless spectroscopy and DR1-scale samples, where visual inspection of every source becomes impractical and direct template matching can produce several plausible solutions for noisy, contaminated, or single-line spectra.

AIMS-z grew out of our Euclid Q1 bright quasar analysis, in which visual inspection of Euclid/NISP spectra enabled the identification and redshift determination of nearly 3500 bright quasars. The ambiguities and failure modes we encountered in that sample helped shape a more scalable, quality-aware approach. This is especially relevant to slitless spectroscopic surveys such as Euclid and the Nancy Grace Roman Space Telescope, but the broader framework could also be applied to other spectroscopic datasets, including DESI.

Related resources

Euclid Q1 bright quasars: Paper overview, quasar catalogue, redshift-quality information, and composite spectrum
specbox on GitHub: A simple tool to manipulate and visualize UV/optical/NIR spectra for astronomical research

Why AIMS-z?

The Near-Infrared Spectrometer and Photometer (NISP) on board Euclid provides slitless spectroscopy across a large field of view. In the Euclid Wide Survey, three red-grism observations with different dispersion directions help disentangle overlapping spectra and produce cleaner combined spectra. The red grisms cover approximately 1206–1892 nm with a resolving power above 480 for a compact source. Further instrument and mission details are presented by Euclid Collaboration: Jahnke et al. (2025) and Euclid Collaboration: Mellier et al. (2025).
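To give a feel for what a resolving power of about 480 means in practice, the short sketch below converts it into a resolution element and a velocity resolution using the standard definition R = λ/Δλ. The function names are illustrative, and R = 480 is used as a lower bound taken from the text, so the returned values are approximate upper limits on Δλ rather than measured instrument properties.

```python
# Speed of light in km/s.
C_KM_S = 299_792.458


def resolution_element_nm(wavelength_nm, resolving_power=480.0):
    """Resolution element (nm) from R = lambda / delta_lambda."""
    return wavelength_nm / resolving_power


def velocity_resolution_km_s(resolving_power=480.0):
    """Equivalent velocity resolution c / R in km/s."""
    return C_KM_S / resolving_power


if __name__ == "__main__":
    # Blue and red ends of the approximate red-grism range quoted above.
    for wavelength_nm in (1206.0, 1892.0):
        delta = resolution_element_nm(wavelength_nm)
        print(f"{wavelength_nm:.0f} nm: resolution element ~ {delta:.2f} nm")
    print(f"velocity resolution ~ {velocity_resolution_km_s():.0f} km/s")
```

At this resolution, broad quasar emission lines are resolved, while narrow features are blended with their neighbours.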

The Euclid Q1 study demonstrated this capability by identifying nearly 3500 quasars from Euclid spectra. Nevertheless, a quasar spectrum may contain only a few prominent features and, in some cases, a single dominant emission line. Direct template matching can therefore return several plausible redshift solutions. Visual inspection resolves many of these ambiguities, but it does not scale to the much larger candidate samples expected from future data releases.
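The single-line degeneracy can be made concrete with a small sketch. Given one observed emission line, each possible identification implies a different redshift through λ_obs = λ_rest (1 + z). The line list below uses approximate rest-frame vacuum wavelengths, and the function name and the example wavelength of 1500 nm are illustrative assumptions, not part of AIMS-z.

```python
# Approximate rest-frame vacuum wavelengths in nm (rounded values).
REST_LINES_NM = {
    "H-alpha": 656.46,
    "[O III] 5008": 500.82,
    "H-beta": 486.27,
    "Mg II": 279.9,
    "C III]": 190.9,
    "C IV": 154.9,
}


def candidate_redshifts(observed_nm, rest_lines=REST_LINES_NM):
    """Return {line name: z} for every identification with z >= 0."""
    candidates = {}
    for name, rest_nm in rest_lines.items():
        z = observed_nm / rest_nm - 1.0
        if z >= 0.0:
            candidates[name] = z
    return candidates


if __name__ == "__main__":
    # One line detected at 1500 nm: every identification is a candidate.
    for name, z in candidate_redshifts(1500.0).items():
        print(f"{name:>14s}: z = {z:.3f}")
```

Without a second feature, photometry, or another constraint, nothing in the wavelength of the line alone favours one of these candidates over the others.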

Figure: Quasar redshift determination with Euclid/NISP. Strong emission lines move into and out of the observed NISP wavelength range as redshift changes. Some intervals contain several useful features, while others offer only one prominent line, creating degeneracies between otherwise plausible redshift solutions. Figure adapted from the Euclid Q1 bright quasar analysis.
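The way lines enter and leave the spectral window can be sketched numerically. The example below computes, for a short list of strong lines with approximate rest-frame vacuum wavelengths, the redshift interval over which each line falls inside the approximate 1206–1892 nm red-grism range, and lists which lines are visible at a given redshift. The line selection and function names are illustrative assumptions, and a real quasar spectrum contains more features than this short list.

```python
# Approximate red-grism wavelength range quoted in the text (nm).
LAM_MIN_NM = 1206.0
LAM_MAX_NM = 1892.0

# Approximate rest-frame vacuum wavelengths in nm (rounded values).
REST_LINES_NM = {
    "H-alpha": 656.46,
    "[O III] 5008": 500.82,
    "H-beta": 486.27,
    "Mg II": 279.9,
    "C III]": 190.9,
    "C IV": 154.9,
}


def visibility_window(rest_nm, lam_min=LAM_MIN_NM, lam_max=LAM_MAX_NM):
    """Redshift interval (z_lo, z_hi) over which a line lies in the range."""
    z_lo = max(lam_min / rest_nm - 1.0, 0.0)
    z_hi = lam_max / rest_nm - 1.0
    return z_lo, z_hi


def visible_lines(z, rest_lines=REST_LINES_NM,
                  lam_min=LAM_MIN_NM, lam_max=LAM_MAX_NM):
    """Names of the lines whose observed wavelength lies in the range at z."""
    return [name for name, rest_nm in rest_lines.items()
            if lam_min <= rest_nm * (1.0 + z) <= lam_max]


if __name__ == "__main__":
    for name, rest_nm in REST_LINES_NM.items():
        z_lo, z_hi = visibility_window(rest_nm)
        print(f"{name:>14s}: {z_lo:.2f} < z < {z_hi:.2f}")
    for z in (1.5, 4.0):
        print(f"z = {z}: {visible_lines(z)}")
```

With this short list, a redshift of 1.5 places H-alpha, [O III], and H-beta in the window together, whereas at a redshift of 4.0 only Mg II remains, which is the single-line regime described above.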

A data-driven regressor can reduce the visual-inspection workload, but without physical constraints it may become overconfident for noisy, contaminated, or partially unusable spectra. AIMS-z is designed to combine machine learning with astrophysical information so that it can estimate a redshift while also identifying when that estimate is likely to be reliable.

How AIMS-z works

AIMS-z brings together several complementary sources of information:

Spectra, initially Euclid slitless spectra, represented with a masked autoencoder that learns useful spectral features.
External photometry and colours, which help distinguish otherwise degenerate redshift solutions.
Template and Pearson-correlation information, which captures physically meaningful emission-line matches.
A hybrid redshift step, combining data-driven predictions with prior-guided spectral constraints.
Human-in-the-loop quality assessment, using selected visually inspected spectra to improve redshift-reliability estimates.
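As a rough illustration of the template and Pearson-correlation idea only, the sketch below shifts a toy rest-frame template across a redshift grid and records the Pearson correlation coefficient between the shifted template and a synthetic observed spectrum. Everything here is an assumption made for the example: the three-line Gaussian template, the noise level, the grid, and the function names. It is not the AIMS-z implementation, whose algorithm description is still in preparation.

```python
import numpy as np

# Toy rest-frame line list: (centre in nm, amplitude, Gaussian sigma in nm).
# Illustrative values only, not a real quasar template.
TOY_LINES = [(656.46, 3.0, 4.0), (486.27, 1.0, 3.0), (279.9, 1.5, 2.0)]


def toy_template(rest_wave_nm, lines=TOY_LINES):
    """Flat continuum plus Gaussian emission lines, in the rest frame."""
    flux = np.ones_like(rest_wave_nm, dtype=float)
    for centre, amplitude, sigma in lines:
        flux += amplitude * np.exp(-0.5 * ((rest_wave_nm - centre) / sigma) ** 2)
    return flux


def pearson_scan(obs_wave_nm, obs_flux, z_grid, lines=TOY_LINES):
    """Pearson r between the spectrum and the template shifted to each z."""
    r = np.zeros(len(z_grid))
    for i, z in enumerate(z_grid):
        model = toy_template(obs_wave_nm / (1.0 + z), lines)
        # Leave r = 0 when no template feature falls inside the window.
        if np.ptp(model) < 1e-3:
            continue
        r[i] = np.corrcoef(obs_flux, model)[0, 1]
    return r


def run_toy_example(z_true=1.6, noise_sigma=0.05, seed=0):
    """Simulate a noisy spectrum at z_true and return (best z, best r)."""
    rng = np.random.default_rng(seed)
    wave = np.linspace(1206.0, 1892.0, 500)
    flux = toy_template(wave / (1.0 + z_true))
    flux = flux + rng.normal(0.0, noise_sigma, wave.size)
    z_grid = np.arange(0.5, 5.0, 0.002)
    r = pearson_scan(wave, flux, z_grid)
    best = int(np.argmax(r))
    return z_grid[best], r[best]


if __name__ == "__main__":
    z_best, r_best = run_toy_example()
    print(f"best z = {z_best:.3f}, Pearson r = {r_best:.3f}")
```

Because the Pearson coefficient is insensitive to overall flux scaling, a different template line landing on the strongest observed feature can also score well, so secondary peaks in the scan are expected. That behaviour is one reason the list above pairs correlation information with photometry and other constraints.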

The full algorithm description and validation are in preparation.

Human-in-the-loop quality assessment

Visual inspection remains important for calibrating the reliability of AIMS-z. For selected subsets, we use specbox and related visual-inspection tools to review predicted redshifts, inspect Euclid image cutouts, classify sources, and record whether the spectrum and machine-learning prediction are reliable.

These labels are intended to support a reliability-selection model. The longer-term goal is to identify high-confidence redshifts while prioritising ambiguous or failure-prone cases for further inspection.
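The triage idea in this paragraph can be sketched as follows, assuming a hypothetical reliability score between 0 and 1 for each source. The score, the threshold value, the ordering of the inspection queue, and the function name are all assumptions made for illustration; the actual reliability-selection model is not described on this page.

```python
def triage(sources, accept_threshold=0.9):
    """Split sources into accepted redshifts and a visual-inspection queue.

    sources: list of (source_id, reliability_score) pairs, with the
    hypothetical score in [0, 1]. Returns (accepted_ids, queue_ids), where
    the queue is ordered from least to most reliable.
    """
    accepted = [sid for sid, score in sources if score >= accept_threshold]
    flagged = [(sid, score) for sid, score in sources
               if score < accept_threshold]
    flagged.sort(key=lambda item: item[1])  # least reliable first
    return accepted, [sid for sid, _ in flagged]


if __name__ == "__main__":
    example = [("a", 0.95), ("b", 0.40), ("c", 0.75), ("d", 0.99)]
    accepted, queue = triage(example)
    print("accepted:", accepted)
    print("inspect first:", queue)
```

In practice the threshold would be chosen from visually inspected labels, for example by requiring a target purity among accepted redshifts.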

Connection to Euclid Q1, DR1, and other surveys

The Euclid Q1 bright quasar study demonstrated the redshift challenges in a visually inspected sample and produced a quasar catalogue, redshift-quality information, and a telluric-free Euclid quasar composite spectrum. AIMS-z takes the next step by turning those lessons into a scalable redshift framework for larger Euclid spectroscopic samples.

Slitless surveys such as Euclid and Roman particularly benefit from a workflow that can combine sparse spectral features, photometry, contamination information, and reliability assessment. Because the framework is not intended to depend on one instrument alone, future work may explore its use with other spectroscopic surveys, including DESI.

The full algorithm description, validation, and Euclid DR1-scale application are in preparation. This page does not present a final DR1 catalogue or released AIMS-z model products.

Status

AIMS-z is under active development. This page will be updated with the paper, algorithm details, validation figures, and data products as the project progresses.

Yuming Fu's Homepage
