How to obtain reference genome sequences from the GTDB database

kangkwok · November 24, 2023, 2:16pm

Dear all, I have many lists of GTDB taxonomy names for microbial species that I am interested in (e.g. d__Archaea;p__Methanobacteriota;c__Methanobacteria;o__Methanobacteriales;f__Methanobacteriaceae;g__Methanobrevibacter;s__Methanobrevibacter sp900314635), and every list contains at least 20 species. How can I download the corresponding reference genome sequences in bulk, based on these lists of GTDB taxonomy names?

donovan.parks · November 25, 2023, 4:46am

You can download all 85,205 GTDB species representative genomes here (~75 GB):
https://data.gtdb.ecogenomic.org/releases/release214/214.1/genomic_files_reps/gtdb_genomes_reps_r214.tar.gz

You can also use the Advanced Search feature to search for a given GTDB taxon, then use the “Genomes” button (marked with a down arrow) to download a script that pulls the genome assembly files from NCBI, e.g. for Archaea.

Cheers,
Donovan
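
For readers who go the bulk-archive route, here is a minimal Python sketch of how a list of GTDB lineage strings could be matched to files inside the archive. It rests on assumptions that are not stated in this thread: that you have a two-column, tab-separated taxonomy table (accession, then lineage) such as the taxonomy files published alongside each GTDB release, that accessions in that table may carry an "RS_" or "GB_" prefix, and that file names inside the archive begin with the NCBI assembly accession. The function names are hypothetical; check all of this against your own download before relying on it.

```python
import re
import shutil
import tarfile
from pathlib import PurePosixPath, Path

# Assumption: NCBI-style assembly accessions, e.g. GCF_000000000.1
ACCESSION = re.compile(r"GC[AF]_\d+\.\d+")


def accessions_for_lineages(taxonomy_lines, wanted_lineages):
    """Map each wanted lineage string to the accessions listed for it.

    taxonomy_lines: iterable of "<accession>\t<lineage>" lines (assumed format).
    """
    wanted = {w.strip() for w in wanted_lineages if w.strip()}
    found = {lineage: [] for lineage in wanted}
    for line in taxonomy_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2 or fields[1].strip() not in wanted:
            continue
        match = ACCESSION.search(fields[0])  # ignores any "RS_"/"GB_" prefix
        if match:
            found[fields[1].strip()].append(match.group(0))
    return found


def accession_of(member_name):
    """Return the accession a file name starts with, or None."""
    match = ACCESSION.match(PurePosixPath(member_name).name)
    return match.group(0) if match else None


def extract_genomes(archive_path, accessions, out_dir):
    """Copy archive members whose file name starts with a wanted accession."""
    wanted = set(accessions)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    with tarfile.open(archive_path, "r:gz") as archive:
        for member in archive:  # read sequentially, one member at a time
            if not member.isfile() or accession_of(member.name) not in wanted:
                continue
            target = out / PurePosixPath(member.name).name
            with archive.extractfile(member) as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            written.append(target)
    return written
```

To use it, you would open your taxonomy table, pass the file object and your list of lineage strings to accessions_for_lineages, and then hand the collected accessions to extract_genomes together with the path to gtdb_genomes_reps_r214.tar.gz. Because the archive holds only species representatives, accessions of non-representative genomes simply will not match anything. Expect the pass over a ~75 GB archive to be slow.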

kangkwok · November 25, 2023, 7:09am

Thank you for your answer! Just to confirm: if I need the GTDB representative sequences, I should download the full archive and pick out the matches, and if I need genome assemblies from NCBI, I should use the Advanced Search to retrieve them, right?

donovan.parks · November 26, 2023, 4:49pm

You can use the Advanced Search and the Genomes download button to get different types of files from NCBI (e.g. genome assembly, GFF, CDS). This is the most flexible way to obtain genomes and associated data.

Raaj · December 9, 2023, 4:04am

Thanks for this crucial discussion. I used the Advanced Search tool to download a text file with information on my genomes of interest. How do I use this file to download the genomes (.fa files)?

donovan.parks · December 12, 2023, 6:07pm

Hi.

The file generated by the Genomes download button is a shell script. You can run it with ./gtdb-adv-search-genomes.sh, which will download the data you requested from NCBI.

Cheers,
Donovan
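
If the downloaded file will not run with ./ (for example because it lacks execute permission), one option is to hand it to the shell interpreter explicitly. The sketch below does that from Python. It assumes bash is installed and that the script is bash-compatible, neither of which is confirmed in this thread; run_download_script is a hypothetical helper name.

```python
import os
import subprocess


def run_download_script(script_path, workdir="."):
    """Run a downloaded shell script with bash inside workdir.

    Passing the script to bash explicitly means the file does not need
    execute permission. Raises CalledProcessError if the script fails.
    """
    script_path = os.path.abspath(script_path)
    result = subprocess.run(["bash", script_path], cwd=workdir, check=True)
    return result.returncode
```

Calling run_download_script("gtdb-adv-search-genomes.sh", workdir="genomes") would run the script inside an existing genomes directory, so anything it writes to the current directory ends up there.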

Raaj · December 14, 2023, 2:26am

I appreciate your kind response. Thank you.