`gtdbtk` meaning of numbers in internal node names in the output tree


Thank you for developing and maintaining such awesome software.
I am a beginner bioinformatician and I am tackling my first phylogenetic tree analysis task.
What do the numbers in the quoted internal node names mean?
For example: ((RS_GCF_000199675.1:0.138452,RS_GCF_001050195.2:0.122911)'0.999:g__Anaerolinea':0.044629
Most software reads them as part of the node label rather than as branch lengths. What do they mean, then?

Yours sincerely,

Hi Valentyn,

The 0.999 is the non-parametric bootstrap support for the node. EBI has a nice explanation of these values on its "Confidence | Phylogenetics" page.
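If it helps, here is a minimal Python sketch that pulls the support value and the taxon name out of such quoted labels. It assumes the labels always look like the one in the question (`'support:taxon'` in single quotes); the function name `split_internal_labels` and the regular expression are illustrative only and are not part of GTDB-Tk.

```python
import re

# Matches a quoted internal-node label such as '0.999:g__Anaerolinea'
# (support value, colon, taxon name). The pattern is an assumption based
# on the single example in this thread, not an official GTDB-Tk grammar.
LABEL_RE = re.compile(r"'(?P<support>\d+(?:\.\d+)?):(?P<taxon>[^']+)'")


def split_internal_labels(newick: str) -> list[tuple[float, str]]:
    """Return (support, taxon) pairs for every quoted 'support:taxon' label."""
    return [(float(m["support"]), m["taxon"]) for m in LABEL_RE.finditer(newick)]


example = (
    "((RS_GCF_000199675.1:0.138452,RS_GCF_001050195.2:0.122911)"
    "'0.999:g__Anaerolinea':0.044629"
)
pairs = split_internal_labels(example)
print(pairs)
```

Internal nodes that carry only a support value, or no label at all, are simply skipped by this pattern.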



Hi Valentyn,

Just to follow up on Donovan’s comment: one issue is that the Newick format allows different ways to store bootstrap values and internal node names.
For example, the ARB software environment does it like this:


whereas the online tree display tool iTOL has opted for this format:

  A tree with internal node IDs:


    A, B, C    : leaf names
    INT1, INT2 : internal node IDs
    0.1, 0.3   : branch lengths
    90, 98     : bootstrap values



The next version of GTDB-Tk (should be released next week) will have a helper method to convert the default GTDB-Tk (ARB-style) Newick trees into the format required by iTOL.

