Different species share ANI above the 95% or even 97% threshold

Recently, I performed a metagenomic analysis using AOA genomes downloaded from GTDB. However, I found that some species share ANI higher than 97%. According to the ‘Q&A’ (GTDB - FAQ), the threshold for distinguishing species is 95% ANI, and a maximum of up to 97% ANI is also allowed (if I understand correctly). So why are some species that share ANI even higher than 99% still assigned to different species? Here are some examples (not all):


These ANI plots were produced by dRep (97% threshold) based on all representative genomes of Nitrososphaeria. In other words, each genome represents a species in GTDB.

Did you also look at the alignment fraction? I'm not sure whether dRep provides that information, but skani does.

Hi. This does indeed appear to be related to GTDB using a 50% alignment fraction (AF) for defining species clusters. The following results are from FastANI, whereas we are now using skani, but they do indicate that the AF between some of these genomes is very low:

This is a challenging situation, since the low AF may simply be due to relatively poor-quality genome assemblies, so it is unclear whether GCA_021786575.1 and GCA_027339745.1 really are different species.
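
For anyone following along, here is a minimal Python sketch of how a combined ANI and AF criterion can keep two genomes in separate species clusters even when their ANI is above 99%. It is only an illustration of the 95% ANI / 50% AF rule mentioned above, not GTDB's actual implementation. The function names, the genome labels, and all numbers are hypothetical. AF is approximated here the way it is commonly derived from FastANI output (mapped fragments divided by total query fragments), and treating both thresholds as inclusive is an assumption.

```python
def alignment_fraction(mapped_fragments: int, total_fragments: int) -> float:
    """Approximate AF as the share of query fragments that aligned.

    Assumption: mirrors the two fragment counts FastANI reports per pair.
    """
    if total_fragments <= 0:
        raise ValueError("total_fragments must be positive")
    return mapped_fragments / total_fragments


def same_species_cluster(ani: float, af: float,
                         ani_min: float = 95.0, af_min: float = 0.5) -> bool:
    """Return True only if BOTH the ANI and the AF criteria are met.

    Assumption: thresholds are inclusive; ani_min can be raised (e.g. to 97.0).
    """
    return ani >= ani_min and af >= af_min


# Hypothetical pairs: (query, reference, ANI %, mapped fragments, total fragments)
pairs = [
    ("genome_A", "genome_B", 99.2, 310, 1000),  # high ANI, low AF
    ("genome_C", "genome_D", 98.1, 820, 1000),  # high ANI, high AF
]

for query, ref, ani, mapped, total in pairs:
    af = alignment_fraction(mapped, total)
    verdict = "same cluster" if same_species_cluster(ani, af) else "separate clusters"
    print(f"{query} vs {ref}: ANI={ani:.1f}%, AF={af:.2f} -> {verdict}")
```

In this made-up example the first pair stays in separate clusters despite 99.2% ANI, because only 31% of the fragments aligned. A tool that clusters on ANI alone would group the same pair together.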