Hi,
I want to build an MSA of all o__Bacillales genomes plus my genomes of interest. I ran:
gtdbtk align --identify_dir GTDB_identify --out_dir GTDB_align --taxa_filter o__Bacillales
using GTDB-Tk v2.4.0 and R220. According to the advanced search on the GTDB homepage, there should be 10,825 genomes belonging to Bacillales; however, only 282 taxa remained in the alignment after filtering on assigned taxonomy. Is this due to similarity to my genomes of interest, or is there another reason only 282 taxa were included in the analysis? Thank you for the clarification.
Hello,
GTDB-Tk only contains representative genomes (one per species). When you run the align command in GTDB-Tk, you are not aligning all 10,825 Bacillales genomes, only the 282 representative genomes in this order.
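If you want to compare the two numbers yourself, here is a rough sketch that counts all genomes versus species representatives for one order in a GTDB metadata table. It assumes a tab-separated file with a header row containing columns named `gtdb_taxonomy` and `gtdb_representative` (with `t` marking representatives); the function name `count_order` and the file name in the usage line are only illustrative, so adjust them to whatever you downloaded.

```shell
# Sketch only: count genomes and species representatives for one GTDB order.
# Assumed input: TSV with header columns "gtdb_taxonomy" and "gtdb_representative".
# Usage: count_order <metadata.tsv> <order, e.g. o__Bacillales>
count_order() {
  awk -F'\t' -v order="$2" '
    # Map header names to column indices so column order does not matter.
    NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
    # Match the exact rank, so o__Bacillales does not also match o__Bacillales_A.
    index($(col["gtdb_taxonomy"]), ";" order ";") {
      total++
      if ($(col["gtdb_representative"]) == "t") reps++
    }
    END { printf "genomes=%d representatives=%d\n", total, reps }
  ' "$1"
}
```

For example, `count_order bac120_metadata_r220.tsv o__Bacillales` (file name assumed) would print the total genome count next to the representative count, and only the representatives are what GTDB-Tk can place in the alignment.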
Regards,
Pierre