GTDB Forum

194,600 genomes r95 download

Are all 194,600 genomes that went into the making of R95 available for download somewhere?

For R95, I believe they are all available through GenBank/RefSeq.

(For R89, some of them were not available in the archives, but I believe that all of those were/are available through the GTDB-Tk database download.)


(For large-scale downloads from NCBI, we've been trying out genome_updater; our experiences so far have been mostly positive.)


Hi,

All genomes are available from NCBI so we have elected not to replicate the data on the GTDB. The assembly report files provided on the NCBI FTP site give the URL for the root directory of the genome data at NCBI:
ftp://ftp.ncbi.nlm.nih.gov/genomes/ASSEMBLY_REPORTS/assembly_summary_genbank.txt
ftp://ftp.ncbi.nlm.nih.gov/genomes/ASSEMBLY_REPORTS/assembly_summary_refseq.txt

These are the files we use to download and rsync data with NCBI.
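
As a rough illustration of how these summary files can be turned into download locations, here is a minimal Python sketch. It assumes the file is tab-separated, that a `#`-prefixed header line names the `assembly_accession` and `ftp_path` columns, and that the genomic FASTA in each directory is named `<directory name>_genomic.fna.gz`. The function name and the sample rows are made up for illustration (the accession and path are not real records), so check the output against the actual NCBI files before relying on it.

```python
def genomic_fasta_urls(summary_text):
    """Yield (accession, url) pairs from the text of an assembly_summary file.

    Assumptions (not taken from the thread): tab-separated columns, a
    '#'-prefixed header line containing 'ftp_path', and FASTA files named
    '<directory name>_genomic.fna.gz' inside each genome directory.
    """
    header = None
    for line in summary_text.splitlines():
        if line.startswith("#"):
            # Comment lines: the one that lists 'ftp_path' is the column header.
            fields = line.lstrip("# ").split("\t")
            if "ftp_path" in fields:
                header = fields
            continue
        if header is None or not line.strip():
            continue
        row = dict(zip(header, line.split("\t")))
        ftp_path = row.get("ftp_path", "na")
        if ftp_path in ("", "na"):
            # No genome directory is listed for this assembly.
            continue
        basename = ftp_path.rstrip("/").rsplit("/", 1)[-1]
        yield row["assembly_accession"], f"{ftp_path}/{basename}_genomic.fna.gz"


# Hypothetical three-column excerpt; the real files have many more columns.
SAMPLE = (
    "#   See the README for a description of the columns\n"
    "# assembly_accession\torganism_name\tftp_path\n"
    "GCA_000000000.1\tExample sp.\t"
    "ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/000/000/GCA_000000000.1_ASM0v1\n"
    "GCA_000000001.1\tOther sp.\tna\n"
)

for accession, url in genomic_fasta_urls(SAMPLE):
    print(accession, url)
```

Looking up the `ftp_path` column by name rather than by position keeps the sketch working if columns are added to the summary files. The resulting URLs can then be passed to whichever download tool you prefer.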

Cheers,
Donovan


Thanks, Titus, most helpful!

Thanks, Donovan, this is very helpful.