Retrieve the number of species represented by a single genome

Hello GTDB team,

I wonder if there is a way to retrieve the total number of GTDB species in the current release of the database that are represented by a single genome.

Thank you

Hi Kafka,

You can obtain this information from the species cluster file that is produced with each release. The latest version is at: https://data.gtdb.ecogenomic.org/releases/latest/auxillary_files/sp_clusters.tsv. This file has one row for each GTDB species cluster, and the “No. clustered genomes” column gives the number of genomes that comprise each cluster. Counting the rows where that column is 1 gives the number of species represented by a single genome.
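For anyone who wants to script the count, here is a minimal Python sketch. It assumes the file is tab-separated with a header row, and that a value of 1 in “No. clustered genomes” means the species consists of its representative genome only. The function name and the sample rows (including the other column names) are made up for illustration; only the “No. clustered genomes” column name comes from the file description above.

```python
import csv
import io


def count_single_genome_species(handle, column="No. clustered genomes"):
    """Count rows whose cluster-size column equals 1.

    `handle` is any open text file or file-like object containing
    tab-separated data with a header row.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return sum(1 for row in reader if int(row[column]) == 1)


# Tiny made-up sample standing in for sp_clusters.tsv (hypothetical values).
sample = io.StringIO(
    "Representative genome\tGTDB species\tNo. clustered genomes\n"
    "GENOME_A\ts__Example one\t1\n"
    "GENOME_B\ts__Example two\t12\n"
    "GENOME_C\ts__Example three\t1\n"
)
print(count_single_genome_species(sample))  # prints 2
```

To run it on the real file, download sp_clusters.tsv and pass `open("sp_clusters.tsv", newline="")` instead of the sample.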

Cheers,
Donovan

Thank you, Donovan!