Hello GTDB Team and Community,
I’m encountering issues with the Release 226 database when running GTDB-Tk v2.3.2. The database fails the integrity check and causes errors during classification.
After downloading the Release 226 database, I’m facing two main issues:
1. Database Integrity Check Failure:
gtdbtk check_install
Returns multiple HASH MISMATCH errors for various components.
2. Missing FastANI Directory:
The extracted release226/ directory is missing the fastani/ folder entirely, which causes runtime errors.
Error Details
When running classification:
gtdbtk classify_wf --genome_dir genomes/ --extension fasta --out_dir gtdbtk_analysis --cpus 32 --prefix test --skip_ani_screen
I get the following error:
TASK: Traversing tree to determine classification method.
INFO: Completed 36 genomes in 0.01 seconds (6,272.90 genomes/second).
ERROR: Reference genome missing from FastANI database: gtdbtk/release226/fastani/database/GCF/003/697/165/GCF_003697165.2_genomic.fna.gz
ERROR: Controlled exit resulting from an unrecoverable error or warning.
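To narrow down which part of the reported path is actually absent, a small sketch like the one below can pull the path out of the error line and walk up its parents. The regular expression is written against the exact error message quoted above; the function name is only illustrative and is not part of GTDB-Tk.

```python
import re
from pathlib import Path

# Matches the error line quoted above; the capture group is the reported path.
MISSING_REF = re.compile(
    r"ERROR: Reference genome missing from FastANI database: (\S+)"
)

def missing_reference_paths(log_text):
    """Return every reference path reported as missing in the log text."""
    return [Path(p) for p in MISSING_REF.findall(log_text)]

log = (
    "ERROR: Reference genome missing from FastANI database: "
    "gtdbtk/release226/fastani/database/GCF/003/697/165/"
    "GCF_003697165.2_genomic.fna.gz\n"
)

for path in missing_reference_paths(log):
    # Walk up from the reported file to the deepest directory that exists,
    # which shows where the expected layout stops matching the disk.
    existing = next((p for p in path.parents if p.exists()), None)
    print(f"{path.name}: deepest existing parent = {existing}")
```

If the deepest existing parent turns out to be release226/ itself, the whole fastani/ tree is absent rather than a single genome file.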
What I’ve Tried
Database Verification:
- Downloaded Release 226 multiple times
- Verified file integrity using checksums
- Tried both full and lightweight versions
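For anyone wanting to repeat the checksum step, here is a minimal sketch of hashing the downloaded archive in chunks. It assumes the published checksum is MD5 and that the archive sits in the current directory under the name from the download URL; both are assumptions, so adjust the algorithm and path to whatever the download page actually lists.

```python
import hashlib
from pathlib import Path

def file_digest(path, algorithm="md5", chunk_size=1 << 20):
    """Hash a file in fixed-size chunks so large archives fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Assumed local file name, taken from the download URL in this post.
archive = Path("gtdbtk_r226_data.tar.gz")
if archive.exists():
    print(archive.name, file_digest(archive))
else:
    print(f"{archive} not found; pass the real path to file_digest()")
```

The printed value can then be compared by hand with the checksum published alongside the archive. Note that this checks the tarball only, not the extracted files that gtdbtk check_install inspects.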
Directory Structure:
The release226/ directory contains:
markers/
masks/
metadata/
mrca_red/
msa/
pplacer/
radii/
skani/
split/
taxonomy/
However, the fastani/ directory is missing.
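The directory comparison can be reproduced with a short sketch like this one. The expected list is simply the subdirectories shown in this post plus fastani/, which GTDB-Tk v2.3.2 asked for in the error; it is not an official manifest, and the release226 path is assumed to be relative to the current directory.

```python
from pathlib import Path

# Subdirectories observed in the extracted release226/ directory (from this post).
OBSERVED = [
    "markers", "masks", "metadata", "mrca_red", "msa",
    "pplacer", "radii", "skani", "split", "taxonomy",
]
# fastani/ is added because the GTDB-Tk v2.3.2 error message refers to it.
EXPECTED = OBSERVED + ["fastani"]

def missing_subdirs(db_root, expected):
    """Return the expected subdirectory names that are absent under db_root."""
    root = Path(db_root)
    return sorted(name for name in expected if not (root / name).is_dir())

# Assumed location of the extracted database; adjust as needed.
print("Missing:", missing_subdirs("release226", EXPECTED))
```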
Alternative Versions:
Release 214 works correctly
The issue is specific to Release 226
Environment Details
GTDB-Tk Version: 2.3.2
Database Version: Release 226
Download Source: https://data.gtdb.aau.ecogenomic.org/releases/release226/226.0/auxillary_files/gtdbtk_package/full_package/gtdbtk_r226_data.tar.gz
Operating System: Linux
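One way to tell whether the archive from the download source above ever contained a fastani/ directory, as opposed to it being lost during extraction, is to list the archive's entries without unpacking it. The sketch below assumes the tarball nests everything under a single top-level directory (hence depth=2) and that it is stored locally under the name from the URL. Streaming through an archive of this size will take a while.

```python
import tarfile
from pathlib import Path, PurePosixPath

def entries_at_depth(archive_path, depth=2):
    """Collect the distinct leading path prefixes of the given depth."""
    seen = set()
    # "r|*" reads the archive as a forward-only stream with
    # transparent compression detection, so nothing is extracted.
    with tarfile.open(archive_path, mode="r|*") as tar:
        for member in tar:
            parts = PurePosixPath(member.name).parts
            if len(parts) >= depth:
                seen.add("/".join(parts[:depth]))
    return sorted(seen)

# Assumed local file name, taken from the download URL in this post.
archive = Path("gtdbtk_r226_data.tar.gz")
if archive.exists():
    for entry in entries_at_depth(archive):
        print(entry)
else:
    print(f"{archive} not found; pass the real path to entries_at_depth()")
```

If no entry ending in fastani appears in the output, the directory was never part of the package, which would point to a packaging or version-compatibility question rather than a corrupted download.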
Questions
1. Is there a known issue with the Release 226 database distribution?
2. Are there additional download steps required for the FastANI component?
3. Should the FastANI database be downloaded separately?
4. Is Release 226 fully compatible with GTDB-Tk v2.3.2?
Temporary Workaround
Currently, I’m using Release 214, which works without issues, but I’d like to use Release 226 for the most up-to-date taxonomy.
Any guidance on resolving this issue would be greatly appreciated. Thank you for your help and for maintaining this excellent tool!
Best regards,
Yingying Qiu