GTDB Forum

Compiling a directory of all Escherichia .faa files from ‘gtdb_proteins_aa_reps’

I'm attempting to compile a small database of all the Escherichia _protein.faa files from the aa_reps database. However, after collecting all the accession codes from the bac120_taxonomy file into a list and then searching the aa_reps database for them, I am only able to extract 19 .faa files out of 23713 possible files. Are the rest missing from the aa_reps database?
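For reference, the lookup described above can be sketched roughly as follows. This is a minimal sketch, not the exact script used. It assumes the taxonomy file is tab-separated (accession, then a semicolon-delimited taxonomy string), that accessions may carry an `RS_` or `GB_` prefix, and that the protein files are named `<accession>_protein.faa`, optionally gzipped. The function names are hypothetical. If the aa_reps archive only holds species representative genomes, a small number of hits relative to the full taxonomy list would be expected.

```python
import re
from pathlib import Path


def strip_prefix(accession):
    """Drop a leading RS_ or GB_ source prefix, if present (assumed convention)."""
    return re.sub(r"^(RS_|GB_)", "", accession.strip())


def genus_accessions(taxonomy_lines, genus="Escherichia"):
    """Collect accessions whose genus rank is exactly g__<genus>."""
    wanted = set()
    for line in taxonomy_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        accession, _, taxonomy = line.partition("\t")
        ranks = [rank.strip() for rank in taxonomy.split(";")]
        # Exact rank match, so suffixed genera such as g__Escherichia_A are excluded.
        if f"g__{genus}" in ranks:
            wanted.add(strip_prefix(accession))
    return wanted


def match_files(filenames, accessions, suffix="_protein.faa"):
    """Return the filenames whose accession part is in the wanted set."""
    hits = []
    for filename in filenames:
        name = Path(filename).name
        if name.endswith(".gz"):
            name = name[: -len(".gz")]
        if not name.endswith(suffix):
            continue
        accession = strip_prefix(name[: -len(suffix)])
        if accession in accessions:
            hits.append(filename)
    return hits
```

In practice, `taxonomy_lines` would be an open file handle on the bac120_taxonomy file, and `filenames` could come from `Path(...).rglob("*_protein.faa*")` over the extracted aa_reps directory. Comparing `len(hits)` with `len(accessions)` shows how many of the listed genomes actually have a file in the archive.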