GTDB taxonomy of BV-BRC (formerly PATRIC) genomes

Hey everyone,

I have downloaded all genomes in the BV-BRC database, roughly 260,000 in total. BV-BRC provides the NCBI taxonomy, but I would like the GTDB taxonomy because it's more correct and easier to connect to some amplicon data I have. I could run GTDB-Tk on all of these genomes, but that seems like a waste of computation if it has been done already. Does anyone know whether this has already been undertaken? If so, could you point me to the data?

Kind regards,


Hi Bram

I’m not aware of anyone having done this, but it sounds like it would be worthwhile. I will discuss it with the team.

Bw, Phil

Hey, is there any update on this? Also, I am now using the mapping file bac120_metadata.tsv to convert NCBI taxonomies to GTDB taxonomies. How was this mapping produced? As I understand it, it is not easy (or perhaps not possible) to map NCBI taxonomy to GTDB taxonomy one-to-one.
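
For anyone following along, a lookup of this kind can be sketched roughly as below. This is only a minimal sketch under assumptions: it assumes the metadata file has columns named `accession` and `gtdb_taxonomy`, and that accessions carry an `RS_` or `GB_` prefix in front of the assembly accession. The function name `load_gtdb_taxonomy` and the example row are made up for illustration. Note that this matches genomes by assembly accession rather than by NCBI taxonomy string, so it only finds genomes that are present in the GTDB release.

```python
import csv
import io


def load_gtdb_taxonomy(handle):
    """Map assembly accession (e.g. GCF_...) to its GTDB taxonomy string.

    Assumes a tab-separated table with 'accession' and 'gtdb_taxonomy' columns.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    mapping = {}
    for row in reader:
        accession = row["accession"]
        # Assumed convention: RS_ (RefSeq) or GB_ (GenBank) prefix; strip it.
        if accession.startswith(("RS_", "GB_")):
            accession = accession[3:]
        mapping[accession] = row["gtdb_taxonomy"]
    return mapping


# Tiny made-up table standing in for bac120_metadata.tsv (not real data).
example = io.StringIO(
    "accession\tgtdb_taxonomy\tncbi_taxonomy\n"
    "RS_GCF_000000001.1\td__Bacteria;p__Examplota\td__Bacteria;p__Exampleota\n"
)
gtdb = load_gtdb_taxonomy(example)

# Look up genomes by assembly accession; absent genomes need another route.
print(gtdb.get("GCF_000000001.1"))
print(gtdb.get("GCF_999999999.1", "not in this GTDB release"))
```

For the real file, the `io.StringIO` object would be replaced by `open("bac120_metadata.tsv", newline="")`. Genomes that come back as missing would presumably still need to be classified some other way, for example with GTDB-Tk.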