datasets download virus genome taxon panics with segmentation fault for all taxa, starting around 2025-11-24 9am UTC #539

@corneliusroemer

Describe the bug
Since around 9am UTC today (2025-11-24), all datasets download virus genome taxon requests fail with:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x2 addr=0x18 pc=0x1026d67f0]

At Loculus, we mirror certain taxa every 2 hours, so we can pin down when this started to within 2 hours:

https://github.com/loculus-project/loculus/actions/workflows/datasets-mirror-priority-1.yml

To Reproduce

datasets download virus genome taxon 11234 --debug
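
To check whether the taxonomy service itself is misbehaving, independent of the CLI, the last request in the debug log below can be replayed directly. This is a minimal Go sketch, not part of the CLI. The URL, method, headers, and body are copied from the log; the helper name newTaxonomyRequest is hypothetical, and the sketch assumes the endpoint answers without an API key, as the logged headers suggest.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"strings"
	"time"
)

// taxonomyURL and taxonomyBody are copied from the --debug log in this issue.
const (
	taxonomyURL  = "https://api.ncbi.nlm.nih.gov/datasets/v2/taxonomy"
	taxonomyBody = `{"returned_content":"METADATA","taxons":["11234"]}`
)

// newTaxonomyRequest (hypothetical helper) rebuilds the POST that the CLI
// sent just before it panicked.
func newTaxonomyRequest() (*http.Request, error) {
	req, err := http.NewRequest(http.MethodPost, taxonomyURL, strings.NewReader(taxonomyBody))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Accept", "application/json")
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	req, err := newTaxonomyRequest()
	if err != nil {
		fmt.Println("building request:", err)
		return
	}

	client := &http.Client{Timeout: 30 * time.Second}
	resp, err := client.Do(req)
	if err != nil {
		fmt.Println("sending request:", err)
		return
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		fmt.Println("reading response:", err)
		return
	}

	// Print the status and raw body so an empty or malformed payload is visible.
	fmt.Println(resp.Status)
	fmt.Println(string(body))
}
```

If the printed body is empty or lacks the metadata the CLI expects, that would point at the service rather than the client.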

Expected behavior
No panic

Logs

The issue seems to be related to the taxonomy service. The panic occurs in GetMetadata (ResolveTaxons.go:143), immediately after the CLI sends the POST to /datasets/v2/taxonomy:

$ datasets download virus genome taxon 11234 --filename 11234.zip --debug
2025/11/24 13:26:24
POST /datasets/v2/taxonomy/taxon_suggest HTTP/1.1
Host: api.ncbi.nlm.nih.gov
User-Agent: OpenAPI-Generator/1.0.0/go
Content-Length: 128
Accept: application/json
Content-Type: application/json
Ncbi-Phid: 71E4B0D51DB8E7C53AA5BB51
X-Datasets-Client: datasets-cli
X-Datasets-Client-Arch: arm64
X-Datasets-Client-Cmd: download virus genome taxon 11234 --filename 11234.zip --debug
X-Datasets-Client-Os: darwin
X-Datasets-Client-Version: 18.9.0
Accept-Encoding: gzip

{"exact_match":true,"tax_rank_filter":"higher_taxon","taxon_query":"11234","taxon_resource_filter":"TAXON_RESOURCE_FILTER_ALL"}

2025/11/24 13:26:24
HTTP/2.0 200 OK
Access-Control-Expose-Headers: X-RateLimit-Limit,X-RateLimit-Remaining
Content-Security-Policy: upgrade-insecure-requests
Content-Type: application/json
Date: Mon, 24 Nov 2025 12:26:23 GMT
Grpc-Metadata-Via: h2 linkerd
Ncbi-Phid: 71E4B0D51DB8E7C53AA5BB51.1.1.1
Server: Finatra
Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
X-Datasets-Version: 18.9.1
X-Ratelimit-Limit: 5
X-Ratelimit-Remaining: 4
X-Ua-Compatible: IE=Edge
X-Xss-Protection: 1; mode=block


2025/11/24 13:26:24
POST /datasets/v2/taxonomy HTTP/1.1
Host: api.ncbi.nlm.nih.gov
User-Agent: OpenAPI-Generator/1.0.0/go
Content-Length: 51
Accept: application/json
Content-Type: application/json
Ncbi-Phid: 71E4B0D51DB8E7C53AA5BB51
X-Datasets-Client: datasets-cli
X-Datasets-Client-Arch: arm64
X-Datasets-Client-Cmd: download virus genome taxon 11234 --filename 11234.zip --debug
X-Datasets-Client-Os: darwin
X-Datasets-Client-Version: 18.9.0
Accept-Encoding: gzip

{"returned_content":"METADATA","taxons":["11234"]}

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x2 addr=0x18 pc=0x1026d67f0]

goroutine 1 [running]:
datasets_cli/v2/datasets.(*taxonAutosuggestApi).GetMetadata(0x14000049aa0, {0x16dc95bbd, 0x5}, {0x1026fa873, 0x8})
	apps/public/Datasets/v2/datasets/ResolveTaxons.go:143 +0x150
datasets_cli/v2/datasets.(*taxonAutosuggestApi).CheckLineage(0x16dc95bbd?, {0x16dc95bbd, 0x5}, 0x27ff)
	apps/public/Datasets/v2/datasets/ResolveTaxons.go:148 +0x34
datasets_cli/v2/datasets.(*taxonAutosuggestApi).GetOrganisms(0x14000049aa0, {0x16dc95bbd?, 0x0?}, 0x1, {0x102705507, 0x19}, {0x1026f924a, 0x5}, 0xa, {0x14000049b64, ...})
	apps/public/Datasets/v2/datasets/ResolveTaxons.go:415 +0x114
datasets_cli/v2/datasets.RetrieveTaxIdsForTaxons(0x1400022ef08, {0x14000113040, 0x1, 0x16dc95bbd?}, 0x1, {0x102705507, 0x19}, {0x1026f924a, 0x5}, {0x14000049b64, ...})
	apps/public/Datasets/v2/datasets/ResolveTaxons.go:112 +0xf4
datasets_cli/v2/datasets.createDownloadVirusGenomeTaxonCmd.func1(0x1400022ef08, {0x14000131200?, 0x1?, 0x4?})
	apps/public/Datasets/v2/datasets/DownloadVirusGenomeTaxon.go:43 +0x14c
github.com/spf13/cobra.(*Command).execute(0x1400022ef08, {0x140001311c0, 0x4, 0x4})
	external/gazelle~~go_deps~com_github_spf13_cobra/command.go:985 +0x834
github.com/spf13/cobra.(*Command).ExecuteC(0x102dcdbe0)
	external/gazelle~~go_deps~com_github_spf13_cobra/command.go:1117 +0x344
github.com/spf13/cobra.(*Command).Execute(...)
	external/gazelle~~go_deps~com_github_spf13_cobra/command.go:1041
datasets_cli/v2/datasets.Execute()
	apps/public/Datasets/v2/datasets/root.go:422 +0x24
main.main()
	apps/public/Datasets/v2/cmd/datasets/main.go:10 +0x1c
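
The faulting address addr=0x18 is consistent with a Go program reading a field at a small offset from a nil struct pointer, which would fit GetMetadata using a taxonomy response (or an object nested in it) that was never populated. The third argument to CheckLineage, 0x27ff, is 10239 in decimal, the NCBI taxonomy ID for Viruses, so that call presumably verifies that the requested taxon lies within Viruses. The CLI source is not shown here, so the following is only a sketch of the kind of guard that would turn this panic into an error message. All type names, field names, function names, and the JSON shape are hypothetical, and the lineage in the first payload is made up for illustration.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Hypothetical response shape; the real generated client types may differ.
type taxonomyNode struct {
	TaxID   int   `json:"tax_id"`
	Lineage []int `json:"lineage"`
}

type taxonomyMatch struct {
	Taxonomy *taxonomyNode `json:"taxonomy"` // stays nil when the field is absent
}

type taxonomyResponse struct {
	Nodes []*taxonomyMatch `json:"taxonomy_nodes"`
}

var errNoMetadata = errors.New("taxonomy service returned no metadata")

// firstNode returns the first taxonomy node, or an error instead of
// dereferencing a nil pointer when the response is empty or incomplete.
func firstNode(resp *taxonomyResponse) (*taxonomyNode, error) {
	if resp == nil || len(resp.Nodes) == 0 ||
		resp.Nodes[0] == nil || resp.Nodes[0].Taxonomy == nil {
		return nil, errNoMetadata
	}
	return resp.Nodes[0].Taxonomy, nil
}

// inLineage reports whether ancestor is the taxon itself or one of its ancestors.
func inLineage(resp *taxonomyResponse, ancestor int) (bool, error) {
	node, err := firstNode(resp)
	if err != nil {
		return false, err
	}
	if node.TaxID == ancestor {
		return true, nil
	}
	for _, id := range node.Lineage {
		if id == ancestor {
			return true, nil
		}
	}
	return false, nil
}

// parse decodes a raw JSON payload into the hypothetical response type.
func parse(raw string) (*taxonomyResponse, error) {
	var resp taxonomyResponse
	if err := json.Unmarshal([]byte(raw), &resp); err != nil {
		return nil, err
	}
	return &resp, nil
}

func main() {
	const viruses = 10239 // 0x27ff, the value passed to CheckLineage in the trace

	payloads := []string{
		`{"taxonomy_nodes":[{"taxonomy":{"tax_id":11234,"lineage":[1,10239]}}]}`, // illustrative lineage
		`{"taxonomy_nodes":[{}]}`, // node without a "taxonomy" object
		`{}`,                      // empty response
	}
	for _, raw := range payloads {
		resp, err := parse(raw)
		if err != nil {
			fmt.Println("decode error:", err)
			continue
		}
		ok, err := inLineage(resp, viruses)
		fmt.Println(ok, err)
	}
}
```

With a guard of this kind, an incomplete response would surface as a readable error rather than a SIGSEGV, which would also make the failing step visible in mirroring jobs such as the Loculus workflow linked above.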

Labels: bug
