How to obtain the isolation source of a large number of genomes?

Hello, everyone. Recently, I have been looking at the distribution of some species across different habitats around the world. Is there a simple way to obtain the isolation environments of a set of genomes? Checking every genome manually on the website is impractical, especially when there are many target genomes. Looking forward to your reply, thanks a lot.

Hi,

The GTDB genome metadata files include the NCBI isolation source for each genome in the ncbi_isolation_source field.
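
For a long list of genomes, the lookup in the metadata file can be scripted. Below is a minimal Python sketch, not an official GTDB tool. It assumes the metadata file is tab-separated with a header row containing the columns accession and ncbi_isolation_source, and that accessions carry the RS_ or GB_ prefix used by GTDB. The function names and the sample rows are made up for illustration; the accessions in the sample are placeholders, not real genomes.

```python
import csv
import gzip
import io


def open_metadata(path):
    """Open a metadata TSV as text, whether plain or gzip-compressed."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt", newline="")
    return open(path, newline="")


def isolation_sources(handle, accessions):
    """Return {accession: ncbi_isolation_source} for the requested accessions.

    `handle` is any open text stream holding the tab-separated metadata.
    Accessions that are absent from the file are simply not in the result.
    """
    wanted = set(accessions)
    # QUOTE_NONE: treat quote characters in free-text fields as ordinary text.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    found = {}
    for row in reader:
        if row["accession"] in wanted:
            found[row["accession"]] = row["ncbi_isolation_source"]
    return found


# Tiny in-memory stand-in for a metadata file (fabricated rows, for illustration only).
sample = (
    "accession\tgtdb_taxonomy\tncbi_isolation_source\n"
    "RS_GCF_000000001.1\td__Bacteria\tsoil\n"
    "GB_GCA_000000002.1\td__Bacteria\tmarine sediment\n"
    "RS_GCF_000000003.1\td__Bacteria\tnone\n"
)

wanted = ["RS_GCF_000000001.1", "GB_GCA_000000002.1", "RS_GCF_999999999.1"]
result = isolation_sources(io.StringIO(sample), wanted)

# Accessions that were requested but not present in the file.
missing = sorted(set(wanted) - result.keys())

for accession, source in result.items():
    print(accession, source, sep="\t")
print("not found:", missing)
```

With a downloaded file, the same function could be called as `isolation_sources(open_metadata(path), wanted)`, where `path` points to whichever metadata file you downloaded. Reading row by row keeps memory use low even though the full metadata table is large.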

Depending on your questions, you might also find Sandpiper useful. It contains inferred information about the environmental distribution for GTDB taxa.

Cheers,
Donovan

Thanks a lot for your kind reply. Your suggestions are very helpful to me.

Hi,

You can find this information in the ncbi_isolation_source field of the GTDB metadata files.

Cheers,
Donovan