Hi, I just want to report what looks like contamination in this genome: GTDB - Loading... The genome has perfect matches to 16 different RNA sequences in the ERCC RNA spike-in mix. The mix is artificially designed for quantification in RNA-seq, so it shouldn’t match any bacterial genome at all, let alone with 16 different perfect matches. This is almost certainly a contamination issue. I came across it while analyzing the bacterial content of an RNA-seq sample that included the ERCC spike-in.
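A check of this kind could be sketched roughly as follows. This is only an illustration, not the procedure actually used for the report above: it assumes exact substring matching on both strands, and the function names and the file names in the usage comment are hypothetical. A real analysis would more likely use an aligner.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    table = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(table)[::-1]


def read_fasta(path):
    """Parse a FASTA file into a dict of {record_id: uppercase sequence}."""
    records = {}
    name, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if name is not None:
                    records[name] = "".join(chunks).upper()
                # Use the first whitespace-delimited token as the record ID.
                name, chunks = line[1:].split()[0], []
            else:
                chunks.append(line)
    if name is not None:
        records[name] = "".join(chunks).upper()
    return records


def find_perfect_matches(spike_ins, contigs):
    """List (spike_in_id, contig_id) pairs where the full spike-in sequence
    occurs exactly in the contig, on either strand."""
    hits = []
    for spike_id, spike_seq in spike_ins.items():
        spike_rc = reverse_complement(spike_seq)
        for contig_id, contig_seq in contigs.items():
            if spike_seq in contig_seq or spike_rc in contig_seq:
                hits.append((spike_id, contig_id))
    return hits


# Hypothetical usage (file names are placeholders, not real paths):
#   spike_ins = read_fasta("ercc_spike_ins.fasta")
#   contigs = read_fasta("genome_assembly.fasta")
#   for spike_id, contig_id in find_perfect_matches(spike_ins, contigs):
#       print(spike_id, "found in", contig_id)
```

Any full-length exact hit from a check like this would be suspicious, since the spike-in sequences are synthetic.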
I know that you guys are not responsible for cleaning up genomes, but maybe consider not using this genome as the representative for its species/strain, or not including it in your next release at all?
Thanks.