Hi all,
I was wondering how the quality check for ambiguous nucleotides in the scaffolds of genomes included in the GTDB is applied, and what could cause a genome to slip past it. Specifically, this entry contains 1.7 million Ns: GTDB - GCA_003029985.1
I found out when the genome broke some work I was doing on the representative species set using anvi'o, and thought it'd be worth mentioning. The number of ambiguous bases listed for the entry is N/A, so might that be why it slipped through the threshold of <100,000 ambiguous bases?
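
In case it helps anyone reproduce the count, here is a minimal sketch of how the ambiguous bases in a FASTA file could be tallied and compared against the threshold. The function names and the `AMBIGUOUS_BASE_LIMIT` constant are my own illustrative choices, not GTDB's actual code, and anything other than A, C, G or T (including gap characters) is counted as ambiguous.

```python
from typing import Iterable

# Threshold mentioned above; hard-coded here as an assumption for illustration.
AMBIGUOUS_BASE_LIMIT = 100_000


def count_ambiguous_bases(fasta_lines: Iterable[str]) -> int:
    """Count sequence characters that are not A, C, G or T (case-insensitive)."""
    unambiguous = set("ACGT")
    total = 0
    for line in fasta_lines:
        line = line.strip()
        # Skip blank lines and FASTA header lines.
        if not line or line.startswith(">"):
            continue
        total += sum(1 for base in line.upper() if base not in unambiguous)
    return total


def passes_ambiguity_filter(fasta_lines: Iterable[str],
                            limit: int = AMBIGUOUS_BASE_LIMIT) -> bool:
    """Return True if the ambiguous-base count is strictly below the limit."""
    return count_ambiguous_bases(fasta_lines) < limit


# Example on an in-memory record; for a real genome, pass an open file handle,
# e.g. `with open("genome.fna") as handle: count_ambiguous_bases(handle)`.
example = [">scaffold_1", "ACGTNNNN", "nnAC"]
print(count_ambiguous_bases(example))
```

Because the count is computed directly from the sequence, a check like this would not depend on a metadata field that might be missing or listed as N/A.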
Thanks!
Daan