Server datasets missing, track hub parse failing for chm13v2/hs1 reference genome #1643

Open
kpalin opened this issue Jan 24, 2025 · 2 comments

kpalin commented Jan 24, 2025

I'm running IGV version 2.19.1 and viewing the T2T reference chm13v2, a.k.a. hs1. I'm trying to load extra annotations (such as RepeatMasker) from online sources.

  1. After File->Load from Server I get "No datasets are available for the current genome (hs1)."
  2. There is no Genomes -> Load Track Hub menu option. Instead there is "Load genome from UCSC GenArk" which does not have hs1.
  3. After Genomes -> Load genome from URL -> https://hgdownload.soe.ucsc.edu/gbdb/hs1/hubs/public/hub.txt I get: Error loading genome: For input string: "3.4". That does not happen with, for example, https://hgdownload.soe.ucsc.edu/hubs/GCF/000/002/285/GCF_000002285.5/hub.txt, which works fine. (Note: the hs1 hub.txt does not contain the string "3.4".)

I think the track hub feature is a great improvement for IGV, if only...

@jrobinso
Contributor

Not all genomes have extra annotations under "File -> Load from Server". In fact most don't.

I will look into the issue with https://hgdownload.soe.ucsc.edu/gbdb/hs1/hubs/public/hub.txt and see if it can be resolved.

jrobinso commented Jan 24, 2025

OK, I've reproduced the issue with the hs1 hub. This is obviously an important hub, so I am going to prioritize fixing this for the next release. The problem goes a little deeper than the parsing error (IGV was not expecting floating point numbers here): the auto selection for the initial load results in way too many tracks. The hub support was designed for and tested against GenArk hubs; this one is going to require some different rules.
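For readers wondering about the error text: `For input string: "3.4"` has the shape of the message Java produces when an integer parse is handed a non-integer string. The snippet below is only a rough Python illustration of the difference between a strict integer parse and a tolerant one. It is not IGV's code (IGV is written in Java), and `parse_priority` and its default value are hypothetical names chosen for this sketch.

```python
def parse_priority(raw, default=0.0):
    """Parse a hub priority value that may be an integer or a decimal.

    Hypothetical helper: falls back to `default` when the value
    is missing or not numeric.
    """
    try:
        return float(raw.strip())
    except (ValueError, AttributeError):
        return default


# A strict integer parse rejects "3.4", which mirrors the reported failure.
try:
    int("3.4")
    strict_ok = True
except ValueError:
    strict_ok = False

print(strict_ok)               # strict parse fails on a decimal string
print(parse_priority("3.4"))   # tolerant parse accepts it
```

Parsing as a float keeps integer priorities working while also accepting decimal ones.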

In the meantime, you can load individual tracks from this hub by URL with this admittedly cumbersome process.

(1) Find the track you are looking for in the text file https://hgdownload.soe.ucsc.edu/gbdb/hs1/hubs/public/hub.txt, for example:

track rnaseq_k100-dual-21mer
type bigWig
visibility dense
shortLabel RNA-Seq k100 dual 21mer
negateValues off
bigDataUrl /gbdb/hs1/rnaseq/RNAseq_k100_dual_21mer.bw
parent rnaseq
longLabel RNA-Seq k100 dual 21mer filtering

(2) Prepend the host "https://hgdownload.soe.ucsc.edu" to the value of bigDataUrl (assuming bigDataUrl starts with a forward slash). For example:

https://hgdownload.soe.ucsc.edu/gbdb/hs1/rnaseq/RNAseq_k100_dual_21mer.bw

(3) Use "File > Load from URL" to load the tracks
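Steps (1) and (2) can also be scripted. The following is a minimal sketch, assuming the stanza has already been copied out of hub.txt as plain "key value" lines. The function names `parse_stanza` and `track_url` are made up for this example, and the stanza is shortened from the one shown above.

```python
from urllib.parse import urljoin

HOST = "https://hgdownload.soe.ucsc.edu"


def parse_stanza(text):
    """Turn one 'key value' track stanza into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition(" ")
        fields[key] = value.strip()
    return fields


def track_url(fields, host=HOST):
    """Resolve bigDataUrl against the host; a full URL passes through unchanged."""
    return urljoin(host, fields["bigDataUrl"])


stanza = """
track rnaseq_k100-dual-21mer
type bigWig
bigDataUrl /gbdb/hs1/rnaseq/RNAseq_k100_dual_21mer.bw
"""

# Prints the URL to paste into "File > Load from URL".
print(track_url(parse_stanza(stanza)))
```

Using `urljoin` rather than string concatenation means a bigDataUrl that is already a complete URL is left as it is.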

@jrobinso jrobinso self-assigned this Jan 24, 2025
@jrobinso jrobinso added this to the 2.19.2 milestone Jan 24, 2025
jrobinso added a commit that referenced this issue Jan 26, 2025
* Bug - fix parsing of non integer priorities
* Bug fix - handle absolute data URLs in track hub file (see issue #1643)
* Track hubs: restrict initial track selection list to Gene group if total track count is > 20
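The last commit note describes a rule for limiting what gets selected on the initial load. A rough sketch of how such a rule might look is given below. This is an illustration only, not the code from the commit: the track representation (dicts with a "group" key), the group name "genes", and the function name `initial_selection` are all assumptions. Only the threshold of 20 comes from the commit message.

```python
MAX_AUTO_TRACKS = 20  # threshold taken from the commit message


def initial_selection(tracks, gene_group="genes"):
    """Return the tracks to offer in the initial selection list.

    Small hubs keep every track; hubs with more than MAX_AUTO_TRACKS
    tracks are narrowed to the (assumed) gene group.
    """
    if len(tracks) > MAX_AUTO_TRACKS:
        return [t for t in tracks if t.get("group") == gene_group]
    return list(tracks)


# Hypothetical hub: 2 gene tracks and 23 RNA-seq tracks.
big_hub = [{"name": f"g{i}", "group": "genes"} for i in range(2)]
big_hub += [{"name": f"r{i}", "group": "rnaseq"} for i in range(23)]

print(len(initial_selection(big_hub)))      # narrowed to the gene tracks
print(len(initial_selection(big_hub[:5])))  # small hub is left untouched
```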