GTDB Forum

CheckM marker set lineage?


Is it possible to find which marker set lineage was used for the CheckM completeness estimates provided on GTDB? I am studying a clade with lineage-specific gene loss, and want to gauge whether the low CheckM completeness is estimated from a lineage-specific gene set that takes this into account.

In general, do newer versions of CheckM use a more comprehensive reference tree that includes representatives from uncultivated clades?

Thank you!


You can find the CheckM marker set used for each genome in the GTDB metadata files:

All CheckM related fields start with “checkm_”.
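
If it helps, here is a minimal Python sketch for pulling just those fields out of a metadata table. It assumes the file is tab-separated with a header row and that genomes are identified by a column named "accession"; check both against the file you actually download. The function name `checkm_fields` and the two sample rows are made up for illustration and are not real GTDB data.

```python
import csv
import io


def checkm_fields(handle, id_column="accession"):
    """Map each genome ID to the columns whose names start with 'checkm_'.

    `handle` is any open text file or iterable of lines holding a
    tab-separated table with a header row. `id_column` is an assumed
    name for the genome identifier column.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    # Keep only the CheckM-related columns, in their original order.
    checkm_cols = [c for c in (reader.fieldnames or []) if c.startswith("checkm_")]
    return {row[id_column]: {c: row[c] for c in checkm_cols} for row in reader}


# Made-up two-row table standing in for a real metadata file.
sample = io.StringIO(
    "accession\tcheckm_completeness\tcheckm_marker_lineage\tgc_percentage\n"
    "genome_A\t71.2\tk__Bacteria (UID203)\t41.0\n"
    "genome_B\t98.5\to__Clostridiales (UID1212)\t35.5\n"
)
result = checkm_fields(sample)
print(result["genome_A"])
```

For a real file, pass `open(path, newline="")` instead of the in-memory sample. If the download is compressed, decompress it first or open it with `gzip.open(path, "rt", newline="")`.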

The CheckM reference tree has not been updated since its original release and does not contain any uncultivated clades.