Hello GTDB team,
I wonder if there is a way to retrieve the total number of GTDB species in the current release of the database that are represented by a single genome.
Thank you
Hi Kafka,
You can obtain this information from the species cluster file that is produced with each release. The latest version is at: https://data.gtdb.ecogenomic.org/releases/latest/auxillary_files/sp_clusters.tsv. This file has one row for each GTDB species cluster, and the “No. clustered genomes” column gives the number of genomes in that cluster. Counting the rows where this value is 1 gives the number of species represented by a single genome.
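As a rough sketch of that count, assuming the file is tab-separated with a header row and that the column is named exactly “No. clustered genomes” as quoted above, something like the following should work. The function name and the sample rows (including the other two column names) are made up for illustration and are not real GTDB data.

```python
import csv
import io

# Column name as quoted above; adjust if the header in your release differs.
COLUMN = "No. clustered genomes"


def count_single_genome_species(handle, column=COLUMN):
    """Count species clusters (rows) whose genome count equals 1.

    `handle` is any open text file or iterable of lines in TSV format.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return sum(1 for row in reader if int(row[column]) == 1)


# Tiny invented sample, only to show the assumed layout of the file.
sample = io.StringIO(
    "Representative genome\tGTDB species\tNo. clustered genomes\n"
    "genome_A\ts__Example one\t1\n"
    "genome_B\ts__Example two\t12\n"
    "genome_C\ts__Example three\t1\n"
)
print(count_single_genome_species(sample))  # prints 2
```

For the real file, download sp_clusters.tsv first and pass an open handle, e.g. `with open("sp_clusters.tsv", newline="") as fh: print(count_single_genome_species(fh))`.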
Cheers,
Donovan
Thank you, Donovan!