Announcing GTDB R10-RS226

GTDB release R10-RS226 comprises 732,475 genomes (22% increase) organized into 143,614 species clusters (37% increase). Additional statistics for this release are available on the GTDB Statistics page.

Release notes

  • After the curation cycle, we identified an updated spelling for one taxon and a valid name for one placeholder:

    • g__Prometheoarchaeum (updated name: Promethearchaeum)
    • f__MK-D1 (updated name: Promethearchaeaceae)
      Note that the LPSN linkouts point to the correct updated names. We encourage users to use the updated names as these will appear in the next release.
  • The QC criteria for GTDB were modified to consider CheckM v1 and v2 completeness
    and contamination estimates. To pass QC, a genome must have completeness
    >=50%, contamination <5%, and quality (completeness - 5*contamination)
    >=50% using both the CheckM v1 and v2 estimates. The exception is that a genome
    comprised of <10 contigs passes QC if these criteria are met by either CheckM v1 or v2.

  • Mash is no longer used as a prefilter for establishing GTDB species clusters
    as this was found to be unnecessary with the prefiltering provided internally
    by skani (Shaw et al., Nat Methods, 2023).

  • The 20% most heterogeneous sites were removed from the archaeal MSA using alignment_pruner.pl (from the novigit/broCode repository on GitHub).

  • The GTDB taxonomy tree now provides links to Sandpiper (https://sandpiper.qut.edu.au) results, which provide information about the geographic and environmental distribution of a taxon.

  • We thank Jan Mares for his assistance in curating the class Cyanobacteriia,
    Peter Golyshin for bringing Ferroplasma acidiphilum strain Y (GCF_002078355.1) to our attention, and Brian Kemish for providing IT support to the project.
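
For anyone who wants to pre-screen their own genomes, the QC rule described in the release notes above can be sketched roughly as follows. This is only an illustration of the stated thresholds, not the GTDB pipeline code: the names `CheckMEstimate`, `meets_thresholds`, and `passes_qc` are hypothetical, and the sketch assumes completeness and contamination are given as percentages and that the contig count is already known.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckMEstimate:
    """Hypothetical container for one CheckM result (values in percent)."""
    completeness: float
    contamination: float


def meets_thresholds(est: CheckMEstimate) -> bool:
    """Apply the three thresholds from the release notes to one estimate."""
    quality = est.completeness - 5 * est.contamination
    return (
        est.completeness >= 50
        and est.contamination < 5
        and quality >= 50
    )


def passes_qc(v1: CheckMEstimate, v2: CheckMEstimate, num_contigs: int) -> bool:
    """Combine the CheckM v1 and v2 results as described in the notes.

    Genomes with fewer than 10 contigs pass if either estimate meets the
    thresholds; all other genomes must meet them under both estimates.
    """
    ok_v1 = meets_thresholds(v1)
    ok_v2 = meets_thresholds(v2)
    if num_contigs < 10:
        return ok_v1 or ok_v2
    return ok_v1 and ok_v2


if __name__ == "__main__":
    v1 = CheckMEstimate(completeness=95.0, contamination=1.0)
    v2 = CheckMEstimate(completeness=60.0, contamination=4.0)
    # v2 has quality 60 - 5*4 = 40, so it fails on its own.
    print(passes_qc(v1, v2, num_contigs=50))
    print(passes_qc(v1, v2, num_contigs=5))
```

In this example the fragmented assembly (50 contigs) is rejected because the v2 estimate fails the quality threshold, while the same estimates on a genome with 5 contigs pass under the exception.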


I was wondering if there is a regular release schedule for GTDB releases. Since GTDB-Tk and GTDB are critical parts of VEBA, I want to make sure that I’m planning my dev schedule around your work if possible.

Hi. GTDB is updated annually in April. The next release will be coming out in the next week or two.

Excellent! Thanks for getting back to me. Good luck with the release. I can structure the new major release around this. I wonder how long it takes for GlobDB to update following GTDB?