The CatNorth Catalog: Over 1.5 Million Reliable Quasar Candidates from Gaia DR3

Reposted from KIAA-PKU

The Kavli Institute for Astronomy and Astrophysics at Peking University announces an advancement in cosmic exploration with the release of the CatNorth quasar candidate catalog, as recently published in The Astrophysical Journal Supplement Series. The international collaborative study, led by Dr. Yuming Fu and Prof. Xue-Bing Wu, presents an enhanced set of over 1.5 million reliable quasar candidates, building upon the initial Gaia DR3 quasar candidate catalog noted for its substantial size but low purity.

Figure 1. The sky density map of the CatNorth quasar candidate catalog in Galactic coordinates.

Quasars, the luminous active galactic nuclei powered by supermassive black holes, play an essential role in the cosmic landscape. As some of the brightest objects in the universe, quasars are indispensable for providing insights into the formation and evolution of galaxies and supermassive black holes, and for probing the large-scale structure of the universe. The original Gaia DR3 catalog featured over 6.6 million potential quasars but was hindered by an estimated purity (the fraction of candidates that are real quasars) of about 52%, a level inadequate for detailed quasar and cosmological studies.

By combining Pan-STARRS1 optical and CatWISE2020 infrared data with Gaia DR3 data, PKU astronomers have constructed the CatNorth quasar candidate catalog, comprising over 1.5 million sources, achieving a purity of approximately 90% while maintaining high completeness. Utilizing multi-band photometric data, Gaia proper motion data, and the XGBoost algorithm, the researchers have established a robust and unbiased astronomical classification model.
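Purity and completeness, the two figures of merit quoted above, can be made concrete with a minimal Python sketch. The function names and the example counts are hypothetical and chosen only for illustration; they are not taken from the CatNorth paper.

```python
def purity(true_quasars_selected, contaminants_selected):
    """Fraction of selected candidates that are real quasars."""
    selected = true_quasars_selected + contaminants_selected
    if selected == 0:
        raise ValueError("no candidates were selected")
    return true_quasars_selected / selected


def completeness(true_quasars_selected, true_quasars_missed):
    """Fraction of all real quasars that made it into the selection."""
    total = true_quasars_selected + true_quasars_missed
    if total == 0:
        raise ValueError("the reference sample contains no quasars")
    return true_quasars_selected / total


# Hypothetical validation sample: 900 real quasars and 100 stars or
# galaxies are selected, while 300 real quasars are missed.
print(purity(900, 100))        # 0.9
print(completeness(900, 300))  # 0.75

# Rough arithmetic from the figures quoted in the article: about
# 6.6 million Gaia DR3 candidates at a purity of about 52% correspond
# to roughly 3.4 million real quasars.
print(round(6.6e6 * 0.52))
```

A catalog can trade one quantity against the other: tighter selection usually raises purity at the cost of completeness, which is why the article quotes both.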

“By weaving together nearly 30 stellar samples from a variety of databases, including Gaia DR3 and Chinese LAMOST, and incorporating checks on spectroscopically confirmed quasars and galaxies from the Sloan Digital Sky Survey, we ensured a comprehensive and reliable training sample for the classification model,” explains Dr. Fu. “We also built an ensemble regression model with TabNet, FT-Transformer, and XGBoost, to obtain accurate photometric redshifts for all the quasar candidates.”

Figure 2. Histograms of the logarithmic redshift errors of different quasar candidate catalogs as compared to SDSS. CatNorth has the lowest fraction of misidentification of emission lines among the three catalogs.
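A short Python sketch shows why misidentified emission lines stand out in a histogram of logarithmic redshift errors. The exact error definition used in the paper is not given in this article; the sketch assumes log10((1 + z_est) / (1 + z_ref)). The function names are hypothetical, and the rest-frame wavelengths are approximate textbook values.

```python
import math

# Approximate rest-frame wavelengths in angstroms (illustrative values).
MG_II = 2798.0
C_III = 1909.0


def log_redshift_error(z_est, z_ref):
    """Assumed definition: log10 of the ratio of (1 + z) factors."""
    return math.log10((1.0 + z_est) / (1.0 + z_ref))


def misidentified_redshift(z_true, rest_true, rest_assumed):
    """Redshift inferred when a line emitted at rest_true is
    mistaken for a line with rest wavelength rest_assumed."""
    observed = rest_true * (1.0 + z_true)
    return observed / rest_assumed - 1.0


# Mg II mistaken for C III] at several true redshifts: the logarithmic
# error is the same each time, log10(MG_II / C_III), so such failures
# pile up in a narrow peak away from zero.
for z_true in (0.5, 1.0, 2.0):
    z_bad = misidentified_redshift(z_true, MG_II, C_III)
    print(z_true, log_redshift_error(z_bad, z_true))
```

Under this assumed definition, each pair of confused lines produces a peak at a fixed offset, so a lower misidentification fraction shows up as weaker secondary peaks in the histogram.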

Prof. Wu further elaborates on the methodology, “The color indices from multiband photometric data, coupled with source extent and proper motion from Gaia, have been pivotal in identifying these cosmic beacons. The refined approach has largely increased the fidelity of our quest for quasars.”

Figure 3. CatNorth sources are well matched to SDSS DR16Q quasars, with a low level of contamination from stars (red density plots) on the color-color diagrams.

“With the HCT telescope, we have identified 8 new quasars out of 10 CatNorth candidates that were missed by another catalog mainly built with color / proper motion cuts, achieving a very high success rate in recognizing true quasars,” Dr. Fu reveals. “This progress not only refines our understanding of the celestial canvas but also demonstrates the incredible synergy of combining multiple astrophysical datasets with advanced machine learning techniques.”

Figure 4. The HCT spectra of 10 randomly selected CatNorth quasar candidates that are not in Quaia. Eight out of ten objects are identified as new quasars.

The CatNorth catalog is a testament to international collaborative astrophysical research, and it serves as the main input catalog for the LAMOST phase III quasar survey. Prof. Wu emphasizes the catalog’s long-term impact on the field: “The CatNorth catalog will not only be a key resource for current observational efforts, but also stands as a critical asset for space missions including Euclid and CSST. It will make an important contribution to our future efforts to explore the structure of the distant universe.”

The full text of the paper is available online at:

The CatNorth quasar candidate catalog can be accessed through the database of the National Astronomical Data Center of China: