GTDB Forum

Filtration of results based on CheckM

Hello,

Sorry if I’m missing an obvious flag here, but is there a way to filter genomes out of the classify_wf output tree based on their CheckM values? Or is there a reference table, other than the online metadata, that has a column for CheckM values alongside the other metadata?

Thanks,
Celeste Lanclos

Hi.

GTDB-Tk does not calculate CheckM completeness and contamination estimates. While GTDB-Tk may appear similar to CheckM in terms of placing a genome in a reference tree based on marker genes, they are very different under the hood.

If you are looking for CheckM estimates for genomes in the GTDB, these can be found in the GTDB metadata files at:
https://data.gtdb.ecogenomic.org/releases/latest/

See also:
https://data.gtdb.ecogenomic.org/releases/latest/auxillary_files/metadata_field_desc.tsv
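
For anyone who wants to filter on these values once a metadata file is downloaded, here is a minimal sketch rather than an official recipe. It assumes the file is tab-separated and has columns named `accession`, `checkm_completeness` and `checkm_contamination`; please check those names against metadata_field_desc.tsv for your release. The function name `filter_by_checkm` and the 90/5 thresholds are just illustrative choices, and the demo rows are made up.

```python
import csv
import io


def filter_by_checkm(tsv_handle, min_completeness=90.0, max_contamination=5.0):
    """Return accessions whose CheckM estimates pass both thresholds.

    Assumes a tab-separated table with 'accession', 'checkm_completeness'
    and 'checkm_contamination' columns (assumed names, verify for your release).
    """
    reader = csv.DictReader(tsv_handle, delimiter="\t")
    kept = []
    for row in reader:
        try:
            completeness = float(row["checkm_completeness"])
            contamination = float(row["checkm_contamination"])
        except (ValueError, TypeError):
            # Skip rows where an estimate is missing or not numeric.
            continue
        if completeness >= min_completeness and contamination <= max_contamination:
            kept.append(row["accession"])
    return kept


# Made-up rows standing in for a real metadata file.
demo = io.StringIO(
    "accession\tcheckm_completeness\tcheckm_contamination\n"
    "GENOME_A\t98.5\t0.7\n"
    "GENOME_B\t72.1\t1.2\n"
    "GENOME_C\t95.0\t8.3\n"
)
print(filter_by_checkm(demo))  # ['GENOME_A']
```

With a real file you would pass an open handle instead of the demo table, e.g. `with open(path, newline="") as fh: filter_by_checkm(fh)`. Note that this only covers genomes already in the GTDB; as said above, GTDB-Tk does not compute CheckM estimates for your own genomes.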

Cheers,
Donovan

Thanks, that first link is exactly what I needed!

-Celeste