GTDB Forum

Main download server returns Last-Modified headers that are unparseable by Groovy

Hi,

We’ve had problems downloading files using Nextflow, which is written in Groovy. The download fails with a message like ‘Unparseable date “Wednesday, 10-Feb-2021 12:05:11 GMT”’. The same file downloads fine from the mirror site.

I’ve noticed that the main server returns Last-Modified headers with hyphens in the date part of the timestamp, e.g. “Last-Modified: Friday, 12-Feb-2021 06:34:13 GMT”, while the mirror does not: “Last-Modified: Thu, 16 Jul 2020 22:39:52 GMT”.
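
To illustrate the difference, here is a minimal sketch using the JVM's `SimpleDateFormat`, which Groovy can call directly. It is not taken from Nextflow; the class and method names (`HttpDateFormats`, `tryParse`, `parseHttpDate`) and the two patterns are my own assumptions, chosen to match the two header values above. A parser that only knows the preferred format rejects the hyphenated value, while one with a fallback pattern accepts both.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper for illustration; not taken from Nextflow.
class HttpDateFormats {
    // Preferred HTTP date form, as sent by the mirror.
    static final String RFC_1123 = "EEE, dd MMM yyyy HH:mm:ss zzz";
    // Obsolete hyphenated form, as sent by the main server (four-digit year, as observed).
    static final String HYPHENATED = "EEEE, dd-MMM-yyyy HH:mm:ss zzz";

    // Returns the parsed date, or null if the value does not match the pattern.
    static Date tryParse(String pattern, String value) {
        // Locale.US pins the English day and month names used in HTTP headers.
        SimpleDateFormat fmt = new SimpleDateFormat(pattern, Locale.US);
        fmt.setTimeZone(TimeZone.getTimeZone("GMT"));
        try {
            return fmt.parse(value);
        } catch (ParseException e) {
            return null;
        }
    }

    // Tries the preferred format first, then falls back to the hyphenated one.
    static Date parseHttpDate(String value) {
        Date parsed = tryParse(RFC_1123, value);
        return parsed != null ? parsed : tryParse(HYPHENATED, value);
    }

    public static void main(String[] args) {
        String mirror = "Thu, 16 Jul 2020 22:39:52 GMT";
        String mainServer = "Friday, 12-Feb-2021 06:34:13 GMT";
        System.out.println("mirror, preferred pattern only: " + (tryParse(RFC_1123, mirror) != null));
        System.out.println("main, preferred pattern only:   " + (tryParse(RFC_1123, mainServer) != null));
        System.out.println("main, with fallback:            " + (parseHttpDate(mainServer) != null));
    }
}
```

Whether the client that fails for us parses the header this way is something I have not checked; the sketch only shows that the hyphens alone are enough to break a strict parser.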

URLs tried:

Main server: https://data.gtdb.ecogenomic.org/releases/latest/genomic_files_reps/bac120_ssu_reps.tar.gz

Mirror: https://data.ace.uq.edu.au/public/gtdb/data/releases/latest/genomic_files_reps/bac120_ssu_reps.tar.gz

We would of course prefer to use the main server.

/Daniel

Hello,

Thanks for your email.
We have moved our repository to an nginx server, and the ‘$date_gmt’ variable used for the Last-Modified header produces a date/time format that is now obsolete (with hyphens).
We are currently working on changing the format to the preferred one (with no hyphens).

In the meantime, the erroneous Last-Modified header has been switched off.

You should now be able to download the GTDB files.

Regards,
Pierre

Thanks Pierre,

Unfortunately we still can’t download the files and get the same error message. I have verified that the Last-Modified header is no longer sent and that no other timestamp headers differ from those of the mirror site, so I’m puzzled. I’ll continue looking into this, and in the meantime we’re getting files from the GTDB mirror.

/Daniel

Hi Daniel,

I’m not able to replicate this issue. I suspect it might be a locale issue. It might be worth raising it with the Nextflow community.

groovy> def current = file('https://data.gtdb.ecogenomic.org/releases/latest/genomic_files_reps/bac120_ssu_reps.tar.gz') 
groovy> println nextflow.util.CacheHelper.hasher(current).hash() 
groovy> def mirror = file('https://data.ace.uq.edu.au/public/gtdb/data/releases/latest/genomic_files_reps/bac120_ssu_reps.tar.gz') 
groovy> println nextflow.util.CacheHelper.hasher(mirror).hash() 
 
bef7f5e00430246426ee603c1ab9999a
dc0395aec57ee4d7b5f2efebfcb5a5a1
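
On the locale suspicion: the sketch below shows how a JVM date parser that reads day and month names in a non-English locale can reject a perfectly valid HTTP date. This is only an assumption about the kind of failure involved, not a statement about what Nextflow does internally. The class and method names (`LocaleDateParsing`, `parsesIn`) are hypothetical, and German is picked arbitrarily as the non-English locale.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;

// Hypothetical helper for illustration; not taken from Nextflow.
class LocaleDateParsing {
    static final String PATTERN = "EEE, dd MMM yyyy HH:mm:ss zzz";

    // True if the value parses when day and month names are read in the given locale.
    static boolean parsesIn(Locale locale, String value) {
        SimpleDateFormat fmt = new SimpleDateFormat(PATTERN, locale);
        try {
            fmt.parse(value);
            return true;
        } catch (ParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String header = "Thu, 16 Jul 2020 22:39:52 GMT";
        System.out.println("Locale.US:      " + parsesIn(Locale.US, header));
        System.out.println("Locale.GERMANY: " + parsesIn(Locale.GERMANY, header));
        // What this machine would do if no locale were passed explicitly.
        System.out.println("JVM default (" + Locale.getDefault() + "): "
                + parsesIn(Locale.getDefault(), header));
    }
}
```

If the last line reports a failure on the affected machine, starting the JVM with `-Duser.language=en -Duser.country=US` would be one way to check whether the locale is really the cause.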

Aaron

Thanks Aaron,

It works for me too from the Nextflow console. Sorry for not checking this properly myself. I’ll get in touch with the colleague who originally reported the problem to me and see if she can reproduce it with different locale settings. In any case, you can consider this closed.

/Daniel