GTDB Forum

CheckM marker set lineage?

Hello,

Is it possible to find which marker set lineage was used for the CheckM completeness estimates provided on GTDB? I am studying a clade with lineage-specific gene loss, and want to gauge whether the low CheckM completeness is estimated from a lineage-specific gene set that takes this into account.

More generally, do newer versions of CheckM include a more comprehensive reference tree with representatives from uncultivated clades?

Thank you!

Hi,

You can find the CheckM marker set used for each genome in the GTDB metadata files:
https://data.ace.uq.edu.au/public/gtdb/data/releases/release95/95.0/ar122_metadata_r95.tar.gz
https://data.ace.uq.edu.au/public/gtdb/data/releases/release95/95.0/bac120_metadata_r95.tar.gz
https://data.ace.uq.edu.au/public/gtdb/data/releases/release95/95.0/auxillary_files/metadata_field_desc.tsv

All CheckM related fields start with “checkm_”.
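
As a minimal sketch of pulling those fields out for one genome, something like the following should work once the archive is extracted. It assumes the metadata file is tab-separated with a header row and that genomes are identified by a column named "accession". That column name, and the "checkm_marker_lineage" field in the toy data, are assumptions to check against metadata_field_desc.tsv. The sample rows are invented purely for illustration.

```python
import csv
import io


def checkm_fields(tsv_handle, accession, id_column="accession"):
    """Return the checkm_* fields for one genome, or None if it is not found.

    id_column is an assumed name for the genome identifier column;
    adjust it if the metadata file uses a different header.
    """
    reader = csv.DictReader(tsv_handle, delimiter="\t")
    for row in reader:
        if row.get(id_column) == accession:
            # Keep only the CheckM-related columns (prefix "checkm_").
            return {k: v for k, v in row.items() if k and k.startswith("checkm_")}
    return None


# Invented two-line sample standing in for a real metadata file.
sample = (
    "accession\tcheckm_completeness\tcheckm_marker_lineage\tgenome_size\n"
    "GENOME_A\t71.5\tk__Bacteria (UID203)\t1500000\n"
)

result = checkm_fields(io.StringIO(sample), "GENOME_A")
print(result)
```

For a real file, replace the io.StringIO(sample) argument with an open file handle, for example open(path, newline="", encoding="utf-8"), where path points to the extracted TSV.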

The CheckM reference tree has not been updated since its original release and does not contain any uncultivated clades.

Cheers,
Donovan