Stable gene annotation dataset

Hello everyone,

Is there a database of gene annotations (KEGG, COG or similar) for all the representative genomes that keeps up with new releases in a stable manner? So far I have used the annotated database from AnnoTree (http://annotree.uwaterloo.ca/), but they have discontinued the releases, and I was curious whether anyone knows of a similar dataset.

I sometimes need the distribution of a gene of interest across genomes. Although I could recompute its presence in every genome each time, it would be more computationally efficient and environmentally friendly to have a single database that all of us can search :slight_smile:
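
To make the use case concrete, here is a minimal sketch of the kind of lookup meant above. It assumes a tab-separated annotation table with hypothetical columns `genome_id`, `taxon` and `annotation` (one row per genome/annotation pair). The column names, the function name `gene_distribution` and the toy rows are made up for illustration and are not the format of AnnoTree or any other existing resource.

```python
import csv
import io
from collections import defaultdict

# Toy table standing in for a precomputed annotation database.
# Column names and rows are hypothetical, for illustration only.
TOY_TABLE = (
    "genome_id\ttaxon\tannotation\n"
    "genome_A\tp__Proteobacteria\tK00001\n"
    "genome_A\tp__Proteobacteria\tK00002\n"
    "genome_B\tp__Proteobacteria\tK00002\n"
    "genome_C\tp__Firmicutes\tK00001\n"
)


def gene_distribution(rows, gene_id):
    """Return {taxon: (genomes_with_gene, total_genomes)} for one annotation ID."""
    genomes_by_taxon = defaultdict(set)
    hits_by_taxon = defaultdict(set)
    for row in rows:
        taxon = row["taxon"]
        genome = row["genome_id"]
        # Track every genome seen in the table, per taxon.
        genomes_by_taxon[taxon].add(genome)
        # Track the genomes that carry the annotation of interest.
        if row["annotation"] == gene_id:
            hits_by_taxon[taxon].add(genome)
    return {
        taxon: (len(hits_by_taxon[taxon]), len(genomes))
        for taxon, genomes in genomes_by_taxon.items()
    }


if __name__ == "__main__":
    reader = csv.DictReader(io.StringIO(TOY_TABLE), delimiter="\t")
    for taxon, (hits, total) in gene_distribution(reader, "K00001").items():
        print(f"{taxon}: {hits}/{total} genomes")
```

Note that in this layout a genome with no annotations at all never appears in the table, so it is not counted in the totals. A real database would need a separate genome list to get the denominators right.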

I have the same question now. I need functional annotations for the GTDB genomes in the latest release. What is the easiest way to get them?

Hi Jose,

I ended up building a home-made solution to my problem; you can find it here:

I hope it helps.

Thanks a lot, that’s quite useful. However, I was wondering if there’s a direct way to get functional annotations for all genes in genomes from GTDB (KEGG, COG, Pfam).