positions with units and/or geneticposition #24

Open
nlwashington opened this issue May 30, 2015 · 12 comments

Comments

@nlwashington

there is an implicit assumption that positional coordinates are made using sequencing maps (and therefore the unit of measure is a nucleotide/basepair or amino acid). however, this same model could be used with other kinds of maps and units of measure, such as centimorgans, centirays, for integration with genetic maps.

some things that could be done to address this:

  1. add a new subclass of position to be something like "GeneticPosition"
  2. add a new kind of data_property for a :genetic_position, where the unit is implicit (though i dislike this)
  3. add a new property to Position classes for a :unit that can be specified using something like the unit ontology.

@nlwashington
Author

@mbrush and @cmungall may be interested

@selewis

selewis commented May 30, 2015

The math-y representation for this is Allen Interval Calculus. Good for sequence and for time, anything 1-D. Chris knows all about it. I have a figure from an old proposal that lays it all out. Math-wise I could see it being used on paths through a graph.

Also need to deal with +/- for positions that are uncertain.
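
One way to read the +/- requirement is as a value with a symmetric error bound. The sketch below is a hypothetical illustration, not FALDO's model: `UncertainPosition` and `order` are invented names, and both positions are assumed to sit on the same map in the same unit.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class UncertainPosition:
    """A position known only to within +/- `uncertainty` map units (hypothetical type)."""
    value: Decimal
    uncertainty: Decimal = Decimal("0")

    @property
    def low(self) -> Decimal:
        return self.value - self.uncertainty

    @property
    def high(self) -> Decimal:
        return self.value + self.uncertainty


def order(a: UncertainPosition, b: UncertainPosition) -> str:
    """Return "before", "after", or "undetermined" if the error ranges share a point."""
    if a.high < b.low:
        return "before"
    if b.high < a.low:
        return "after"
    return "undetermined"


# Illustrative values only (e.g. centimorgans).
a = UncertainPosition(Decimal("10.0"), Decimal("0.5"))
b = UncertainPosition(Decimal("12.0"), Decimal("1.0"))
c = UncertainPosition(Decimal("11.2"), Decimal("0.5"))
print(order(a, b), order(b, c))
```

Once error ranges touch, the order of two markers can no longer be decided from the numbers alone, which is why the third outcome is kept explicit.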

@cmungall
Contributor

There are a lot of built-in assumptions in FALDO, so generalizing classes or properties after the fact could lead to some oddities. Another approach would be to create a more general framework and retrospectively slot FALDO in as a specific instance of it. There's no reason why this shouldn't be general enough to encompass other kinds of entities on all kinds of other maps too.

@cmungall
Contributor

Allen Interval Algebra applied to genomic intervals: http://dx.doi.org/10.1101/006650
...a case of a formalism in search of a use case though. If you have number values for positions (as you almost always do), the 'reasoning' is trivial.
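
With numeric start and end values, classifying a pair of intervals into one of Allen's thirteen relations comes down to a handful of comparisons. The following is a minimal sketch of that point; the function name `allen_relation` and the relation labels are illustrative choices, and both intervals are assumed to lie on the same map, in the same unit, with start < end.

```python
from decimal import Decimal


def allen_relation(a_start, a_end, b_start, b_end) -> str:
    """Return the Allen relation of interval A to interval B.

    Works for any mutually comparable numbers (int, Decimal, ...),
    so non-integer map units such as centimorgans are fine.
    """
    if not (a_start < a_end and b_start < b_end):
        raise ValueError("intervals must satisfy start < end")
    # Disjoint or touching at a single point.
    if a_end < b_start:
        return "before"
    if a_end == b_start:
        return "meets"
    if b_end < a_start:
        return "after"
    if b_end == a_start:
        return "met-by"
    # Shared endpoints.
    if a_start == b_start and a_end == b_end:
        return "equals"
    if a_start == b_start:
        return "starts" if a_end < b_end else "started-by"
    if a_end == b_end:
        return "finishes" if a_start > b_start else "finished-by"
    # Strict containment.
    if b_start < a_start and a_end < b_end:
        return "during"
    if a_start < b_start and b_end < a_end:
        return "contains"
    # Only the two partial-overlap cases remain.
    return "overlaps" if a_start < b_start else "overlapped-by"


print(allen_relation(100, 200, 300, 400))
print(allen_relation(Decimal("12.5"), Decimal("14.0"), Decimal("14.0"), Decimal("20.1")))
```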

@JervenBolleman
Collaborator

Option 3 would break all existing queries so I dislike that. While breaking queries is tolerable if they can be fixed automatically, in this case I don't think they can. The FALDO use case was aimed at sequence databases but it can be extended to deal with genetic maps.

I think options 1 and 2 combined are the direction we should go, e.g. something like this:

@prefix faldo: <http://biohackathon.org/resource/faldo#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

faldo:CentiRayPosition rdfs:subClassOf faldo:FuzzyPosition .

faldo:centiRayReference a owl:ObjectProperty ;
         rdfs:domain faldo:RadiationHybridMap ;
         rdfs:range faldo:CentiRayPosition .

# could be hasCentiRayPosition depending on pull #23
faldo:centiRayPosition a owl:DatatypeProperty ;
         rdfs:domain faldo:CentiRayPosition .

faldo:RadiationHybridMap a owl:Class .

# dose in rads used to generate the map
faldo:radiationDoseUsedToGenerateMapInRads a owl:DatatypeProperty ;
         rdfs:domain faldo:RadiationHybridMap .

A similar solution would be required for Cytogenetic Maps.

@nlwashington
Author

generally it seems fine.... but dosing property here would still require units.
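
A value stored together with its unit is one way to keep the dose unambiguous. The sketch below is only an illustration under stated assumptions: `Dose` and `TO_GRAY` are hypothetical names, only absorbed-dose units are covered (1 rad = 0.01 Gy), and the 3000 rad figure is an example value, not taken from any particular map.

```python
from dataclasses import dataclass
from decimal import Decimal

# Factors to gray (Gy), the SI unit of absorbed dose: 1 rad = 0.01 Gy.
TO_GRAY = {"Gy": Decimal("1"), "rad": Decimal("0.01")}


@dataclass(frozen=True)
class Dose:
    """An absorbed dose stored with its unit (hypothetical helper type)."""
    value: Decimal
    unit: str

    def to(self, unit: str) -> "Dose":
        """Return the same dose expressed in another supported unit."""
        if self.unit not in TO_GRAY or unit not in TO_GRAY:
            raise ValueError(f"unsupported unit: {self.unit!r} or {unit!r}")
        in_gray = self.value * TO_GRAY[self.unit]
        return Dose(in_gray / TO_GRAY[unit], unit)


# Illustrative value only.
map_dose = Dose(Decimal("3000"), "rad")
print(map_dose.to("Gy"))
```

Keeping the unit next to the number means a consumer never has to infer it from a property name.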

@nlwashington
Author

any updates to this? Don't forget about centiMorgan positions.

@nlwashington
Author

also, position values need to be able to be floats/ non-integer numbers.

@JervenBolleman
Collaborator

@nlwashington do you have any modelling suggestions as to which unit ontology to use for the dosing property/values? e.g. http://www.qudt.org/qudt/owl/1.0.0/quantity/Instances.html#DoseEquivalent
The unit would be the sievert, which is annoying in the field as that is 100 times larger than a rad.

faldo:centiRayPosition a owl:DatatypeProperty ;
          rdfs:range xsd:decimal .

This would give us a wide range of position values.
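
To illustrate what an `xsd:decimal` range buys a consumer, here is a minimal sketch that reads decimal lexical forms into Python's `decimal.Decimal` and orders markers by position. The marker names and values are made up, `parse_position` is a hypothetical helper, and `Decimal` accepts a slightly wider syntax than `xsd:decimal` (exponents, for instance), so the check is looser than a real validator.

```python
from decimal import Decimal


def parse_position(lexical: str) -> Decimal:
    """Parse a decimal lexical form such as "12.5" into an exact Decimal."""
    value = Decimal(lexical)
    if not value.is_finite():
        # NaN and infinities are not valid xsd:decimal values.
        raise ValueError(f"not a finite decimal: {lexical!r}")
    return value


# Hypothetical markers with positions as they might appear in RDF literals.
markers = {"markerA": "12.5", "markerB": "3", "markerC": "12.45"}
ordered = sorted(markers, key=lambda name: parse_position(markers[name]))
print(ordered)

# Decimal arithmetic is exact where binary floats are not.
print(parse_position("0.1") + parse_position("0.2") == parse_position("0.3"))
print(0.1 + 0.2 == 0.3)
```

Integer positions remain valid decimals, so sequence coordinates and fractional map positions can be compared with the same code.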

@JervenBolleman
Collaborator

@prefix faldo: <http://biohackathon.org/resource/faldo#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

faldo:CentiMorganPosition rdfs:subClassOf faldo:FuzzyPosition .

faldo:centiMorganReference a owl:ObjectProperty ;
         rdfs:domain faldo:LinkageDistanceMap ;
         rdfs:range faldo:CentiMorganPosition .

faldo:centiMorganPosition a owl:DatatypeProperty ;
         rdfs:domain faldo:CentiMorganPosition .

# please think of a better name
faldo:LinkageDistanceMap a owl:Class .

@nlwashington could you try sharing some data snippets encoded in this way and tell us if this works for you?

@nlwashington
Author

ok, let me prepare something...

@jmcmurry

jmcmurry commented Mar 9, 2017

Came across this thread. Nicole is no longer on the Monarch project so I'm copying in @kshefchek and @mbrush
