Hi all!
I installed the new version of GTDB-Tk (v1.3.0) with database release 95. I tried to run it on 4 genomes (I have 130 genomes in total) and received the following message:
WARNING: pplacer requires ~152 GB of RAM to fully load the bacterial tree into memory. However, 131.97GB was detected. This may affect pplacer performance, or fail if there is insufficient scratch space.
I would like to know how this warning affects my analysis.
Because your server has less than the required RAM (131.97 GB detected vs. ~152 GB needed), GTDB-Tk will need to use scratch space. If the program can't access the scratch space, it will crash and not return any results for the classify step.
Having less than the recommended amount of RAM doesn't affect the accuracy of the tool; it only increases the risk of it crashing.
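If you want to check this before launching a run, here is a minimal Python sketch of the comparison the warning makes, plus a free-space check on a scratch location. The function names are hypothetical, and the thread does not say how much scratch space pplacer actually needs, so the `shortfall_gb` threshold is only an assumption for illustration.

```python
import shutil


def needs_scratch(detected_gb: float, required_gb: float) -> bool:
    """Return True when detected RAM is below what pplacer asks for."""
    return detected_gb < required_gb


def scratch_has_room(path: str, shortfall_gb: float) -> bool:
    """Check free space on the filesystem holding `path` (path must exist).

    Assumption: we only test for at least `shortfall_gb` of free space;
    the real requirement of pplacer is not stated in this thread.
    """
    free_gb = shutil.disk_usage(path).free / 1024**3
    return free_gb >= shortfall_gb


# Numbers taken from the warning message above.
detected, required = 131.97, 152.0
if needs_scratch(detected, required):
    shortfall = required - detected
    print(f"RAM is short by about {shortfall:.2f} GB; scratch space will be used.")
```

Call `scratch_has_room` with the directory you plan to use as scratch to see whether it has at least that much free space.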
To reduce the memory usage, you can use the --scratch_dir flag (see https://ecogenomics.github.io/GTDBTk/commands/index.html).
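As a rough illustration, the Python sketch below assembles a classify_wf command line that includes --scratch_dir. Only --scratch_dir comes from this thread; the other flags (--genome_dir, --out_dir, --cpus) are what I recall from the GTDB-Tk command reference, and all paths are hypothetical placeholders, so please confirm with `gtdbtk classify_wf -h` before relying on it.

```python
import shlex


def build_classify_cmd(genome_dir, out_dir, scratch_dir, cpus=1):
    """Assemble (but do not run) a GTDB-Tk classify_wf command.

    Assumption: flag names other than --scratch_dir are taken from memory
    of the GTDB-Tk documentation and should be verified locally.
    """
    return [
        "gtdbtk", "classify_wf",
        "--genome_dir", genome_dir,
        "--out_dir", out_dir,
        "--scratch_dir", scratch_dir,
        "--cpus", str(cpus),
    ]


# Hypothetical paths, for illustration only.
cmd = build_classify_cmd("genomes/", "gtdbtk_out/", "/tmp/gtdbtk_scratch")
print(shlex.join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` once the paths point at real directories.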
Hello @Kalonji_Abondance,
The --scratch_dir flag is used to create a directory where pplacer will write an mmap file (by default gtdbtk.pplacer.scratch). This is equivalent to the --mmap-file option in pplacer. From the tests we have run so far on our servers, the difference in speed with or without --scratch_dir is minimal.
From the pplacer documentation:
In cases when there isn't enough memory for pplacer to use for internal nodes, or it's otherwise disadvantageous to use physical memory, it's possible to instead tell pplacer to mmap a file instead using the --mmap-file flag. This will, very roughly, perform disk IO instead of using physical memory.
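To make the idea of an mmap file a bit more concrete, here is a small Python sketch using the standard `mmap` module. It is not how pplacer is implemented; it only shows the general technique of backing a buffer with a file on disk instead of ordinary memory. The file name `demo.scratch` and the 1 MiB size are arbitrary choices for the example.

```python
import mmap
import os
import tempfile

size = 1024 * 1024  # 1 MiB backing file, arbitrary for this demo

with tempfile.TemporaryDirectory() as scratch_dir:
    path = os.path.join(scratch_dir, "demo.scratch")

    # Create a file of the desired size to back the mapping.
    with open(path, "wb") as fh:
        fh.truncate(size)

    # Map the file; reads and writes on `buf` go through the file on disk.
    with open(path, "r+b") as fh:
        with mmap.mmap(fh.fileno(), size) as buf:
            buf[0:5] = b"hello"
            buf.flush()  # push the change to the backing file
            first = bytes(buf[0:5])

    # The written bytes are visible when the file is read normally.
    with open(path, "rb") as fh:
        on_disk = fh.read(5)

print(first, on_disk)
```

The operating system pages data in and out of the file as needed, which is why this approach trades physical memory for disk IO.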