Missing ncbi_translation_table values in metadata files

Hi,
I’m using the bac120_metadata_r226.tsv and ar53_metadata_r226.tsv files and noticed that some genomes have “none” for ncbi_translation_table even though ncbi_taxid is present.
Example: JBFJMG01_sp041634015 (taxid 3121703) shows “none” for the translation table.
Even stranger: for Akkermansia muciniphila_A (reference RS_GCF_030848305.1), all genomes have “none” except the uncultured strains.
Is this because the data is missing from NCBI’s original records, or is there another reason?
Thanks!

Hi,

As you suspected, the ncbi_translation_table field is set to none if we are unable to automatically parse this information from the NCBI data files. We do this using the NCBI metadata files for each genome, not by consulting information for the specific NCBI taxid. Unfortunately, this results in missing data for some genomes where the value could be inferred indirectly from other information (i.e., the assigned taxid, together with the assumption that all genomes with this taxid use the same translation table).
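If you want to fill these gaps yourself, the indirect inference described above could be sketched roughly as follows. This is not something GTDB does. It is only a minimal illustration, and it assumes that each row of the metadata file has been loaded as a dict with accession, ncbi_taxid and ncbi_translation_table keys. The function name and the toy rows are hypothetical.

```python
from collections import Counter, defaultdict


def infer_translation_tables(rows):
    """Return {accession: table} for rows whose table is 'none' but whose
    taxid has exactly one table reported among other genomes.

    `rows` must be a list (it is iterated twice) of dicts with the keys
    'accession', 'ncbi_taxid' and 'ncbi_translation_table' (assumed names).
    """
    # First pass: collect the tables actually reported for each taxid.
    tables_by_taxid = defaultdict(Counter)
    for row in rows:
        table = row["ncbi_translation_table"]
        if table != "none":
            tables_by_taxid[row["ncbi_taxid"]][table] += 1

    # Second pass: fill in a value only when the taxid is unambiguous.
    inferred = {}
    for row in rows:
        if row["ncbi_translation_table"] == "none":
            seen = tables_by_taxid.get(row["ncbi_taxid"])
            if seen and len(seen) == 1:
                inferred[row["accession"]] = next(iter(seen))
    return inferred


# Toy data with made-up accessions and taxids, for illustration only.
demo_rows = [
    {"accession": "genome_A", "ncbi_taxid": "100", "ncbi_translation_table": "11"},
    {"accession": "genome_B", "ncbi_taxid": "100", "ncbi_translation_table": "none"},
    {"accession": "genome_C", "ncbi_taxid": "200", "ncbi_translation_table": "none"},
]
print(infer_translation_tables(demo_rows))  # {'genome_B': '11'}
```

Genomes whose taxid has no reported table, or more than one, are deliberately left alone, since the "same taxid, same table" assumption cannot be checked for them.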

That said, I think you may have a parsing or sorting issue. I took a look at bac120_metadata_r226.tsv, and it lists the ncbi_translation_table for RS_GCF_030848305.1 as table 11. There are, however, a large number of genomes classified as A. muciniphila where ncbi_translation_table is none, because this information is missing for those specific genomes.
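To rule out a parsing or sorting problem on your side, one option is to look the accession up directly by column name rather than by position. The sketch below uses only the Python standard library. It assumes the file is tab-separated with a header row containing accession and ncbi_translation_table columns; the helper name and the toy rows are hypothetical.

```python
import csv
import io


def translation_table_for(handle, accession):
    """Return the ncbi_translation_table value for `accession`, or None
    if the accession is not present in the open TSV `handle`."""
    # QUOTE_NONE keeps stray quote characters in a field from merging columns.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if row["accession"] == accession:
            return row["ncbi_translation_table"]
    return None


# Toy file contents with made-up accessions, for illustration only.
demo_tsv = (
    "accession\tncbi_taxid\tncbi_translation_table\n"
    "genome_A\t100\t11\n"
    "genome_B\t100\tnone\n"
)
print(translation_table_for(io.StringIO(demo_tsv), "genome_A"))  # 11
print(translation_table_for(io.StringIO(demo_tsv), "genome_B"))  # none
```

Against the real file, this would be called with an open handle, for example `with open("bac120_metadata_r226.tsv", newline="") as fh:` followed by `translation_table_for(fh, "RS_GCF_030848305.1")`.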

Cheers,
Donovan