Error in GTDB-Tk v2.4.0 (gtdbtk classify_wf): "ERROR: Error generating Mash sketch"

Hi, please help me resolve the following error:

[2024-06-25 15:03:32] INFO: GTDB-Tk v2.4.0
[2024-06-25 15:03:32] INFO: gtdbtk classify_wf --genome_dir /Volumes/Ext.HD-NRLC_Old/genomes --out_dir /Volumes/Ext.HD-NRLC_Old/gtdbtk_output_new --mash_db /Volumes/Ext.HD-NRLC_Old/gtdbtk_mash_sketch.msh --extension fasta --cpus 4 --tmpdir /Volumes/Ext.HD-NRLC_Old/tmp
[2024-06-25 15:03:32] INFO: Using GTDB-Tk reference data version r220: /Volumes/Ext.HD-NRLC_Old/gtdbtk_data/release220
[2024-06-25 15:03:32] INFO: Loading reference genomes.
[2024-06-25 15:03:32] INFO: Using Mash version 2.3
[2024-06-25 15:03:32] INFO: Loading data from existing Mash sketch file: /Volumes/Ext.HD-NRLC_Old/gtdbtk_output_new/classify/ani_screen/intermediate_results/mash/gtdbtk.user_query_sketch.msh
[2024-06-25 15:03:32] INFO: Creating Mash sketch file: /Volumes/Ext.HD-NRLC_Old/gtdbtk_mash_sketch.msh
[2024-06-25 15:53:29] INFO: Completed 113,104 genomes in 49.94 minutes (2,264.78 genomes/minute).
[2024-06-25 15:53:29] ERROR: Error generating Mash sketch:
Sketching /Volumes/Ext.HD-NRLC_Old/gtdbtk_data/release220/skani/database/GCA/000/008/085/GCA_000008085.1_genomic.fna.gz…
Sketching /Volumes/Ext.HD-NRLC_Old/gtdbtk_data/release220/skani/database/GCA/000/008/885/GCA_000008885.1_genomic.fna.gz…
Sketching /Volumes/Ext.HD-NRLC_Old/gtdbtk_data/release220/skani/database/GCA/000/009/845/GCA_000009845.1_genomic.fna.gz…
Sketching /Volumes/Ext.HD-NRLC_Old/gtdbtk_data/release220/skani/database/GCA/000/010/565/GCA_000010565.1_genomic.fna.gz…
Sketching /Volumes/Ext.HD-NRLC_Old/gtdbtk_data/release220/skani/database/GCA/000/011/445/GCA_000011445.1_genomic.fna.gz…



(intermediate "Sketching" lines omitted)



Sketching /Volumes/Ext.HD-NRLC_Old/gtdbtk_data/release220/skani/database/GCF/963/378/075/GCF_963378075.1_genomic.fna.gz…
Sketching /Volumes/Ext.HD-NRLC_Old/gtdbtk_data/release220/skani/database/GCF/963/378/095/GCF_963378095.1_genomic.fna.gz…
Sketching /Volumes/Ext.HD-NRLC_Old/gtdbtk_data/release220/skani/database/GCF/963/378/105/GCF_963378105.1_genomic.fna.gz…
Writing to /Volumes/Ext.HD-NRLC_Old/gtdbtk_mash_sketch.msh…
libc++abi: terminating with uncaught exception of type kj::ExceptionImpl: kj/io.c++:405: failed: ::writev(fd, current, iov.end() - current): Invalid argument; fd = 3
stack: 104b2c679 104b2c97a 104b03884 104af375c 104aba5aa 104af4d14 104afa9c4 104ab07a3

[2024-06-25 15:53:29] ERROR: Controlled exit resulting from an unrecoverable error or warning.

The error appears to have occurred while the Mash sketch file (gtdbtk_mash_sketch.msh) was being written (kj::ExceptionImpl). I am using an external HD (1.77 TB available of 2 TB) so that there is enough space to write to. Would it be better to use my home directory (only 396.4 GB available of 1 TB)? Or are there any other solutions?

Hi. The error looks to be with Mash itself, a third-party program we use internally in GTDB-Tk. I would try writing this to your local disk. Disk space shouldn’t be an issue here, but it might be an I/O issue if Mash is producing results faster than they can be written to an external disk.
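
For anyone wanting to compare the two disks before rerunning, here is a minimal Python sketch that reports the free space and a rough sequential write speed for a directory. It is not part of GTDB-Tk or Mash; the function name `check_write_target` and the 64 MiB test size are made up for illustration, and a short test write is only a crude indicator of how a disk behaves under a long job.

```python
import os
import shutil
import tempfile
import time


def check_write_target(directory, test_mb=64):
    """Return (free space in GB, rough write speed in MiB/s) for `directory`.

    Hypothetical helper for illustration; not part of GTDB-Tk or Mash.
    """
    usage = shutil.disk_usage(directory)
    free_gb = usage.free / 1e9

    chunk = b"\0" * (1024 * 1024)  # 1 MiB of zeros
    start = time.perf_counter()
    # The temporary file is deleted automatically when the block exits.
    with tempfile.NamedTemporaryFile(dir=directory) as handle:
        for _ in range(test_mb):
            handle.write(chunk)
        handle.flush()
        os.fsync(handle.fileno())  # force the data out to the device
    elapsed = time.perf_counter() - start

    speed = (test_mb / elapsed) if elapsed > 0 else float("inf")
    return free_gb, speed


# Example usage (path taken from the log above):
# free_gb, speed = check_write_target("/Volumes/Ext.HD-NRLC_Old")
# print(f"{free_gb:.1f} GB free, ~{speed:.0f} MiB/s")
```

Running it once against the external disk and once against the home directory gives a quick side-by-side comparison of the two write targets.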

Thank you very much [donovan.parks]. I will try again on my local disk.

I ran GTDB-Tk v2.4.0 on my local disk instead of the external HD.
However, the same error occurred again, although the program seemed to run.

[2024-07-03 11:39:08] INFO: GTDB-Tk v2.4.0
[2024-07-03 11:39:08] INFO: gtdbtk classify_wf --genome_dir /Users/us009/gtdbtk_data/genomes/batch_1 --out_dir /Users/us009/gtdbtk_data/gtdbtk_output_new/batch_1 --mash_db /Users/us009/gtdbtk_data/gtdbtk_mash_sketch.msh --extension fasta --cpus 8 --tmpdir /Users/us009/gtdbtk_data/tmp
[2024-07-03 11:39:08] INFO: Using GTDB-Tk reference data version r220: /Users/us009/gtdbtk_data/release220
[2024-07-03 11:39:08] INFO: Loading reference genomes.
[2024-07-03 11:39:08] INFO: Using Mash version 2.3
[2024-07-03 11:39:08] INFO: Creating Mash sketch file: /Users/us009/gtdbtk_data/gtdbtk_output_new/batch_1/classify/ani_screen/intermediate_results/mash/gtdbtk.user_query_sketch.msh
[2024-07-03 11:39:08] INFO: Completed 1 genome in 0.06 seconds (16.72 genomes/second).
[2024-07-03 11:39:08] INFO: Creating Mash sketch file: /Users/us009/gtdbtk_data/gtdbtk_mash_sketch.msh
[2024-07-03 12:01:23] INFO: Completed 113,104 genomes in 22.26 minutes (5,082.09 genomes/minute).
[2024-07-03 12:01:23] ERROR: Error generating Mash sketch:
Sketching /Users/us009/gtdbtk_data/release220/skani/database/GCA/000/008/085/GCA_000008085.1_genomic.fna.gz…
Sketching /Users/us009/gtdbtk_data/release220/skani/database/GCA/000/008/885/GCA_000008885.1_genomic.fna.gz…
Sketching /Users/us009/gtdbtk_data/release220/skani/database/GCA/000/009/845/GCA_000009845.1_genomic.fna.gz…




Sketching /Users/us009/gtdbtk_data/release220/skani/database/GCF/963/378/095/GCF_963378095.1_genomic.fna.gz…
Sketching /Users/us009/gtdbtk_data/release220/skani/database/GCF/963/378/105/GCF_963378105.1_genomic.fna.gz…
Writing to /Users/us009/gtdbtk_data/gtdbtk_mash_sketch.msh…
libc++abi: terminating due to uncaught exception of type kj::ExceptionImpl: kj/io.c++:405: failed: ::writev(fd, current, iov.end() - current): Invalid argument; fd = 3
stack: 1007a1679 1007a197a 100778884 10076875c 10072f5aa 100769d14 10076f9c4 1007257a3
[2024-07-03 12:01:24] ERROR: Controlled exit resulting from an unrecoverable error or warning.

==> Processed 0/1 genomes (0%) | | [?genome/s, ETA ?]

==> Processed 1/1 genomes (100%) |███████████████| [16.67genome/s, ETA 00:00]

==> Processed 0/113104 genomes (0%) | | [?genome/s, ETA ?]
==> Processed 12/113104 genomes (0%) | | [116.40genome/s, ETA 16:11]
==> Processed 24/113104 genomes (0%) | | [87.77genome/s, ETA 21:28]
==> Processed 34/113104 genomes (0%) | | [91.23genome/s, ETA 20:39]
==> Processed 44/113104 genomes (0%) | | [84.14genome/s, ETA 22:23]




==> Processed 113094/113104 genomes (100%) |██████████████▉| [72.50genome/s, ETA 00:00]
==> Processed 113103/113104 genomes (100%) |██████████████▉| [72.97genome/s, ETA 00:00]

==> Processed 113104/113104 genomes (100%) |███████████████| [72.97genome/s, ETA 00:00]

I am now trying to figure out what is going on, but I have no clues as to how to solve this problem.
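
One way to narrow this down might be to run Mash directly, outside GTDB-Tk, on a small subset of the reference genomes and see whether the final write still fails. The Python sketch below only prepares such a test: it writes a list of genome paths and builds a `mash sketch` command using the `-l` (list input), `-o` (output prefix) and `-p` (threads) options. The helper names are hypothetical, the `*_genomic.fna.gz` pattern is taken from the log above, and the Mash parameters GTDB-Tk passes are not shown in the log, so this uses Mash defaults and the output size will not match the real run.

```python
import random
from pathlib import Path


def write_subset_list(reference_dir, list_path, n_genomes, seed=0):
    """Write up to `n_genomes` randomly chosen genome paths, one per line.

    Hypothetical helper; the file pattern is assumed from the log above.
    """
    genomes = sorted(str(p) for p in Path(reference_dir).rglob("*_genomic.fna.gz"))
    random.Random(seed).shuffle(genomes)  # fixed seed for a repeatable subset
    subset = genomes[:n_genomes]
    Path(list_path).write_text("".join(path + "\n" for path in subset))
    return len(subset)


def build_mash_sketch_cmd(list_path, out_prefix, cpus=1):
    """Build a `mash sketch` command that reads its inputs from a list file."""
    return ["mash", "sketch", "-l", str(list_path), "-o", str(out_prefix), "-p", str(cpus)]


# Example usage (paths taken from the log above; the run itself is left commented out):
# import subprocess
# n = write_subset_list("/Users/us009/gtdbtk_data/release220/skani/database",
#                       "subset.txt", n_genomes=1000)
# subprocess.run(build_mash_sketch_cmd("subset.txt", "subset_test", cpus=8), check=True)
```

If a small subset writes without error but larger subsets fail, that would suggest the problem depends on the size of the sketch being written rather than on which disk is used.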