Description
Before opening an issue, please:
- Make sure you are using the latest version using datasets --version
- Review our documentation
Describe the bug
Taxonomy queries return results inconsistently when --rank is specified: the same command often prints nothing and only occasionally returns the expected record.
To Reproduce
Here is an example using a while loop where it took 13 attempts to retrieve the expected metadata:
while true; do echo "attempting"; datasets summary taxonomy taxon 138948 --rank species --as-json-lines; done
attempting
attempting
attempting
attempting
attempting
attempting
attempting
attempting
attempting
attempting
attempting
attempting
attempting
{"query":["138948"],"taxonomy":{"children":[138948],"classification":{"acellular_root":{"id":10239,"name":"Viruses"},"class":{"id":2732506,"name":"Pisoniviricetes"},"family":{"id":12058,"name":"Picornaviridae"},"genus":{"id":12059,"name":"Enterovirus"},"kingdom":{"id":2732396,"name":"Orthornavirae"},"order":{"id":464095,"name":"Picornavirales"},"phylum":{"id":2732408,"name":"Pisuviricota"},"realm":{"id":2559587,"name":"Riboviria"},"species":{"id":3428500,"name":"Enterovirus alphacoxsackie"}},"counts":[{"count":43,"type":"COUNT_TYPE_ASSEMBLY"},{"count":3,"type":"COUNT_TYPE_GENE"},{"count":3,"type":"COUNT_TYPE_PROTEIN_CODING"}],"current_scientific_name":{"name":"Enterovirus alphacoxsackie"},"genomic_moltype":"ssRNA(+)","group_name":"viruses","parents":[1,10239,2559587,2732396,2732408,2732506,464095,12058,2946630,2960224,12059],"rank":"SPECIES","tax_id":3428500}}
attempting
This behavior does NOT occur without --rank:
attempting
{"query":["138948"],"taxonomy":{"children":[1530249,469959,306587,297248,306586,306588,39054,2884179,2870395,2870394,2870393,2487724,1435148,1530250,2760809,42788,42787,86107,42786,42785,42784,33757,31704,42773,42771,42769,172022],"classification":{"acellular_root":{"id":10239,"name":"Viruses"},"class":{"id":2732506,"name":"Pisoniviricetes"},"family":{"id":12058,"name":"Picornaviridae"},"genus":{"id":12059,"name":"Enterovirus"},"kingdom":{"id":2732396,"name":"Orthornavirae"},"order":{"id":464095,"name":"Picornavirales"},"phylum":{"id":2732408,"name":"Pisuviricota"},"realm":{"id":2559587,"name":"Riboviria"},"species":{"id":3428500,"name":"Enterovirus alphacoxsackie"}},"counts":[{"count":43,"type":"COUNT_TYPE_ASSEMBLY"},{"count":3,"type":"COUNT_TYPE_GENE"},{"count":3,"type":"COUNT_TYPE_PROTEIN_CODING"}],"current_scientific_name":{"name":"Enterovirus A"},"genomic_moltype":"ssRNA(+)","group_name":"viruses","parents":[1,10239,2559587,2732396,2732408,2732506,464095,12058,2946630,2960224,12059,3428500],"secondary_tax_ids":[29269],"tax_id":138948}}
attempting
{"query":["138948"],"taxonomy":{"children":[1530249,469959,306587,297248,306586,306588,39054,2884179,2870395,2870394,2870393,2487724,1435148,1530250,2760809,42788,42787,86107,42786,42785,42784,33757,31704,42773,42771,42769,172022],"classification":{"acellular_root":{"id":10239,"name":"Viruses"},"class":{"id":2732506,"name":"Pisoniviricetes"},"family":{"id":12058,"name":"Picornaviridae"},"genus":{"id":12059,"name":"Enterovirus"},"kingdom":{"id":2732396,"name":"Orthornavirae"},"order":{"id":464095,"name":"Picornavirales"},"phylum":{"id":2732408,"name":"Pisuviricota"},"realm":{"id":2559587,"name":"Riboviria"},"species":{"id":3428500,"name":"Enterovirus alphacoxsackie"}},"counts":[{"count":43,"type":"COUNT_TYPE_ASSEMBLY"},{"count":3,"type":"COUNT_TYPE_GENE"},{"count":3,"type":"COUNT_TYPE_PROTEIN_CODING"}],"current_scientific_name":{"name":"Enterovirus A"},"genomic_moltype":"ssRNA(+)","group_name":"viruses","parents":[1,10239,2559587,2732396,2732408,2732506,464095,12058,2946630,2960224,12059,3428500],"secondary_tax_ids":[29269],"tax_id":138948}}
attempting
{"query":["138948"],"taxonomy":{"children":[1530249,469959,306587,297248,306586,306588,39054,2884179,2870395,2870394,2870393,2487724,1435148,1530250,2760809,42788,42787,86107,42786,42785,42784,33757,31704,42773,42771,42769,172022],"classification":{"acellular_root":{"id":10239,"name":"Viruses"},"class":{"id":2732506,"name":"Pisoniviricetes"},"family":{"id":12058,"name":"Picornaviridae"},"genus":{"id":12059,"name":"Enterovirus"},"kingdom":{"id":2732396,"name":"Orthornavirae"},"order":{"id":464095,"name":"Picornavirales"},"phylum":{"id":2732408,"name":"Pisuviricota"},"realm":{"id":2559587,"name":"Riboviria"},"species":{"id":3428500,"name":"Enterovirus alphacoxsackie"}},"counts":[{"count":43,"type":"COUNT_TYPE_ASSEMBLY"},{"count":3,"type":"COUNT_TYPE_GENE"},{"count":3,"type":"COUNT_TYPE_PROTEIN_CODING"}],"current_scientific_name":{"name":"Enterovirus A"},"genomic_moltype":"ssRNA(+)","group_name":"viruses","parents":[1,10239,2559587,2732396,2732408,2732506,464095,12058,2946630,2960224,12059,3428500],"secondary_tax_ids":[29269],"tax_id":138948}}
Steps to reproduce the behavior:
Run datasets with --rank specified. These runs were started with no prior queries, so the initial failures are unlikely to be caused by querying the API too often.
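To make the failure rate easier to measure than with an unbounded while loop, here is a minimal bash sketch that counts attempts until a command prints something to stdout. It assumes a failed attempt prints nothing to stdout, as the transcript above suggests. retry_until_output and RETRY_DELAY are hypothetical names of my own, not part of the datasets CLI.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the datasets CLI): run a command until it
# prints something to stdout, up to a maximum number of attempts.
# Assumption: a failed query prints nothing to stdout.
retry_until_output() {
  local max_attempts=$1
  shift
  local attempt output
  for ((attempt = 1; attempt <= max_attempts; attempt++)); do
    # Capture stdout only; stderr passes through so error messages stay visible.
    output=$("$@") || true
    if [[ -n $output ]]; then
      printf 'attempt %d returned output\n' "$attempt" >&2
      printf '%s\n' "$output"
      return 0
    fi
    # Pause between attempts (seconds); RETRY_DELAY is an assumed knob, default 1.
    sleep "${RETRY_DELAY:-1}"
  done
  printf 'no output after %d attempts\n' "$max_attempts" >&2
  return 1
}

# Usage with the command from this report (left commented out here):
# retry_until_output 20 datasets summary taxonomy taxon 138948 --rank species --as-json-lines
```

The attempt count goes to stderr and the JSON line to stdout, so the result can still be piped or redirected to a file while the number of attempts is logged separately.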
Expected behavior
Queries should be successful and consistent. Does this have to do with the February 2026 deprecation schedule being implemented early on some servers?