Filtration of results based on CheckM


Sorry if I’m missing an obvious flag here, but is there a way to filter genomes out of the classify_wf output tree based on their CheckM values? Or is there a reference table, outside of the online metadata, that has a column for CheckM values alongside the other metadata?

Celeste Lanclos


GTDB-Tk does not calculate CheckM completeness and contamination estimates. GTDB-Tk may look similar to CheckM, since both place a genome in a reference tree based on marker genes, but they are very different under the hood.

If you are looking for CheckM estimates for genomes in the GTDB, these can be found in the GTDB metadata files at:
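Once a metadata file is downloaded, filtering on the CheckM columns could look roughly like the sketch below. It assumes a tab-separated file with columns named `accession`, `checkm_completeness` and `checkm_contamination`; please check the header of the release you are using, since column names may differ. The function name `passing_accessions` and the 90 / 5 thresholds are only illustrative, not GTDB or GTDB-Tk defaults.

```python
import csv
import io


def passing_accessions(metadata_tsv, min_completeness=90.0, max_contamination=5.0):
    """Return the set of accessions whose CheckM estimates meet both thresholds.

    metadata_tsv is any open text file (or file-like object) holding
    tab-separated metadata. The column names used here are assumptions.
    """
    keep = set()
    reader = csv.DictReader(metadata_tsv, delimiter="\t")
    for row in reader:
        try:
            completeness = float(row["checkm_completeness"])
            contamination = float(row["checkm_contamination"])
        except (ValueError, TypeError):
            # Skip rows where a value is missing or not numeric.
            continue
        if completeness >= min_completeness and contamination <= max_contamination:
            keep.add(row["accession"])
    return keep


# Tiny in-memory table standing in for a downloaded metadata file.
example = io.StringIO(
    "accession\tcheckm_completeness\tcheckm_contamination\n"
    "genome_A\t98.5\t0.7\n"
    "genome_B\t72.1\t1.2\n"
    "genome_C\t95.0\t8.4\n"
)
print(sorted(passing_accessions(example)))  # ['genome_A']
```

With a real file you would pass `open(path, newline="")` instead of the in-memory example. The resulting set of accessions can then be used to decide which reference leaves to keep when pruning the tree with a tree library of your choice; that step is not shown here.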

Thanks, that first link is exactly what I needed!