When should I use classify_wf instead of de_novo_wf?

I’ve always used the classify workflow for all my taxonomic classifications but now I’m reading that I should be using the de novo workflow. When would I use classify over the de novo workflow?

Can the de novo workflow automatically detect whether a genome is bacterial or archaeal, as the classify workflow does? If not, would it be possible to provide a table stating whether each genome is bacterial or archaeal?

Hi,

The classify_wf and de_novo_wf are fundamentally different. The classify workflow provides a GTDB taxonomic classification for each of your input genomes. The de novo workflow infers domain-specific trees containing your genomes and, generally, the set of GTDB representative genomes. You are free to use these trees as you see fit. Notably, the de novo workflow doesn’t provide you with taxonomic classifications, though you could inspect the trees and establish this information yourself. This can be non-trivial, however, as it is possible (indeed likely) that at least some GTDB taxa will no longer be monophyletic in these newly inferred trees.
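On the question of a per-genome domain table: since the classify workflow assigns each genome a GTDB classification, one option is to derive the table from its output. Below is a minimal Python sketch, not an official GTDB-Tk feature. It assumes the classify_wf summary files are tab-separated with `user_genome` and `classification` columns, and that each classification string starts with a `d__` domain rank. The function name `domain_table` and the example rows are made up for illustration.

```python
import csv
import io


def domain_table(summary_tsv_text):
    """Map each genome ID to its domain, read from the d__ rank.

    Assumes a tab-separated summary with 'user_genome' and
    'classification' columns (an assumption about classify_wf output).
    """
    table = {}
    reader = csv.DictReader(io.StringIO(summary_tsv_text), delimiter="\t")
    for row in reader:
        # Classification strings are assumed to look like "d__Bacteria;p__...".
        first_rank = row["classification"].split(";")[0]
        if first_rank.startswith("d__"):
            domain = first_rank[3:]
        else:
            # Anything without a d__ prefix is reported as unclassified.
            domain = "Unclassified"
        table[row["user_genome"]] = domain
    return table


# Illustrative rows only; these are not real results.
example = (
    "user_genome\tclassification\n"
    "genome_1\td__Bacteria;p__Pseudomonadota\n"
    "genome_2\td__Archaea;p__Halobacteriota\n"
)

for genome, domain in domain_table(example).items():
    print(f"{genome}\t{domain}")
```

In practice you would read the text of each summary file produced by a classify_wf run and pass it to the function, then merge the resulting dictionaries.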

https://ecogenomics.github.io/GTDBTk/commands/classify_wf.html
https://ecogenomics.github.io/GTDBTk/commands/de_novo_wf.html

Cheers,
Donovan