content/reference-strategy/index.md
Outdated
| 1. Compose a record set URI: `https://country.register.gov.uk/records/` | ||
| 1. Concatenate the URI with the value: `https://country.register.gov.uk/records/GB` | ||
| There is a special case here were a foreign key refers to a register register |
content/reference-strategy/index.md
Outdated
| TODO: Where does the context live? It can't be part of the register API | ||
| because a Register could be part of multiple environments. |
There was a problem hiding this comment.
What is an example of a register being part of multiple environments?
There was a problem hiding this comment.
The immediate situation would be a Register being part of the register.gov.uk environment and being part of a local/test environment.
Another case, artificial for now, would be a register from the NHS that is part of both the NHS environment and ours.
There was a problem hiding this comment.
Does that mean the CURIE doesn't resolve to a "single source of truth" but instead just one of many places where that truth is published?
There was a problem hiding this comment.
It resolves to the source of truth you decide on, which most of the time is our environment.
There was a problem hiding this comment.
@michaelabenyohai and I had a chat about another topic that led to a discussion about the possibility of encoding more specific data instead of the CURIE (as per the spec) so it could be proved correct. But right now it is just a thought.
content/reference-strategy/index.md
Outdated
| * `example:32` -> `https://example.org/32` | ||
| TODO: Where does the context live? It can't be part of the register API |
There was a problem hiding this comment.
If someone tampers with the context, then a CURIE would resolve to the wrong data.
There was a problem hiding this comment.
Ah yes, it's something to keep an eye on.
Force-pushed from bffede7 to dbc76ee.
michaelabenyohai left a comment
There was a problem hiding this comment.
I've added a few thoughts that came to mind. I'm not disagreeing with your points, just mentioning things we might want to note that we have considered.
On a more general note, do we want to consider how this affects our current field naming rules and conventions and the difficulties we have had with this recently?
Specifically, we still have the rule that the key field of a register must have the same name as the register. This guidance instructs that these fields must no longer be created as a foreign key (which they always have previously). This in turn means that this field name can never be reused as a link from another register (which we currently do a lot). Instead, we will have to create a new field with a slightly different name, so could end up proliferating fields that come in "pairs" with similar-but-not-quite-the-same name.
I feel we could treat it this way: the key field should describe "the thing", whereas the curie field in the "other thing" register should describe the relationship of "the thing" to "other thing". In practice I think this could be hard though.
We've previously considered removing the rule that the key field always has the same name as the register and starting to call it id (although there's no reason why it has to be the same in every register, since it is the key property of the entry that is actually mandatory to name correctly). The crux of my point is whether we need to accelerate this change to make it feasible to remove foreign keys in practice.
content/reference-strategy/index.md
Outdated
| ### Foreign Key | ||
| A foreign key is a regular string that _happens_ to be an identifier in |
There was a problem hiding this comment.
It doesn't always have to be a string. It could also be any other datatype. Not sure if we care about that detail here.
But on that note. Is there ever any benefit in being able to say a link also has a datatype like integer?
There was a problem hiding this comment.
Thinking about this, it connects with another discussion about expectations around primary key values: #18
In there I was only considering strings but your comment opens the door to more (potential) complexity.
I think we care about how flexible we must be with the identifiers that can be used in CURIEs (and, for historical reasons, in foreign keys). In particular with CURIEs, the value needs to be encoded as a URL path, so `foo:1` would be the same whether the identifier is a string or an integer. I'm leaning towards exploring the issues that would arise from restricting identifiers to a subset of UTF-8.
content/reference-strategy/index.md
Outdated
| when found in another register it acts as a foreign key. To get the URI of the | ||
| referenced resource you have to: | ||
| 1. Given a fieldname `country` |
There was a problem hiding this comment.
This should actually be:
Given a field with name `x`, where the record for `x` in the Field register has the `register` field populated with value `country`
I.e. to be clear, the field name does not always have the same name as the foreign register to be a foreign key, though it often does.
There was a problem hiding this comment.
Do you know of any cases where this is not the case?
There was a problem hiding this comment.
The field allergen-group is one and the field fields is another.
There was a problem hiding this comment.
I think this point still needs addressing.
There was a problem hiding this comment.
Ok, changed with your suggestion.
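For reference, a minimal Python sketch of the foreign key resolution as amended in this thread. It assumes the Field register records are available as a plain dict; `FIELD_RECORDS` is made-up sample data (including the hypothetical `birth-country` field), `resolve_foreign_key` is an illustrative helper rather than part of any register client, and percent-encoding the value is an assumption the RFC text does not state.

```python
from typing import Optional
from urllib.parse import quote

# Hypothetical sample of Field register records, keyed by field name.
# Only the "register" attribute matters for foreign key resolution.
FIELD_RECORDS = {
    "country": {"register": "country"},
    "birth-country": {"register": "country"},  # made-up field whose name differs from the register
    "name": {},  # plain field: no "register" attribute, so not a foreign key
}


def resolve_foreign_key(field_name: str, value: str) -> Optional[str]:
    """Return the record URI a foreign key value points at, or None."""
    # 1. Look up the field definition and read its "register" attribute.
    register = FIELD_RECORDS.get(field_name, {}).get("register")
    if register is None:
        return None  # the field is not a foreign key
    # 2. Compose the record set URI for the referenced register.
    record_set_uri = f"https://{register}.register.gov.uk/records/"
    # 3. Concatenate the URI with the (percent-encoded) value.
    return record_set_uri + quote(value, safe="")


print(resolve_foreign_key("birth-country", "GB"))
```

The point of the lookup in step 1 is that the field name alone is not enough: both `country` and the made-up `birth-country` resolve against the country register here.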
content/reference-strategy/index.md
Outdated
| 1. Compose a record set URI: `https://country.register.gov.uk/records/` | ||
| 1. Concatenate the URI with the value: `https://country.register.gov.uk/records/GB` | ||
| There is a special case here where a foreign key refers to a register register |
There was a problem hiding this comment.
I would counter that this is not what it actually means. In the example below it still just resolves to https://register.register.gov.uk/country - but we hoped that people might then follow that to the contents of https://country.register.gov.uk/records. Though that is a problem in itself. We'd have the same problem with curies if we hadn't done the URI consolidation work.
There was a problem hiding this comment.
I think I'll need a chat with you on this one.
| * To know a field contains a foreign key, you need to look up the register field | ||
| in the field definition. If it is informed, the datatype is not “string” but | ||
| “key”. | ||
| * Only able to reference one register per field. |
There was a problem hiding this comment.
Is this always a problem? Or does it help consumers to know that the thing in a field will always be in the same place? Are there performance benefits knowing you don't have to check the location of every single record?
There was a problem hiding this comment.
You are right that it is not a problem in itself, just a restriction. The problem derived from this restriction is the need to have multiple fields for similar things when the situation arises. I'll think of a better way to phrase this.
content/reference-strategy/index.md
Outdated
| in the field definition. If it is informed, the datatype is not “string” but | ||
| “key”. | ||
| * Only able to reference one register per field. | ||
| * Linking a full set of records is a special case. |
There was a problem hiding this comment.
I'd say linking a full set of records is not possible (see above).
content/reference-strategy/index.md
Outdated
| #### Problems | ||
| * To know a field contains a foreign key, you need to look up the register field |
There was a problem hiding this comment.
You have the same problem with CURIEs that we don't mention below. That is you still need to look up the field definition to find out whether it is a curie or a string. "country:GB" is a valid string as well as a valid curie and they mean very different things.
Or is the point you are trying to get at that you have to check a field other than the datatype field in the field definition?
There was a problem hiding this comment.
Yes, my point was around not being able to know by the datatype.
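To illustrate the distinction discussed in this thread, here is a small Python sketch. It assumes field definitions expose a `datatype` of either `string` or `curie`; the `FIELD_DATATYPES` table, its field names, and the `interpret` helper are all hypothetical.

```python
# Hypothetical field definitions: field name -> datatype.
FIELD_DATATYPES = {
    "notes": "string",
    "organisation": "curie",
}


def interpret(field_name: str, value: str) -> tuple:
    """Interpret a value according to the datatype of its field."""
    datatype = FIELD_DATATYPES[field_name]
    if datatype == "curie":
        # Split on the first colon only: prefix before, reference after.
        prefix, _, reference = value.partition(":")
        return ("curie", prefix, reference)
    # For any other datatype the value is taken literally.
    return ("string", value)


print(interpret("notes", "country:GB"))
print(interpret("organisation", "country:GB"))
```

The same value `country:GB` is a literal string in one field and a link in the other, and only the datatype in the field definition tells them apart.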
content/reference-strategy/index.md
Outdated
| This RFC accepts CURIE as the only mechanism for linking between register from | ||
| now on given they offer the flexibility required by registers. | ||
| Existing registers that use foreign keys will be maintained to avoid |
There was a problem hiding this comment.
Will we make a plan to evolve them to use curies?
There was a problem hiding this comment.
To clarify, this plan is out of scope for this RFC.
Very good point, I'll address it in the RFC.
Force-pushed from e3ecace to 7afe483.
@michaelabenyohai I've changed a few bits to address some of your comments, please have another read and see if it's better now.
The proposal makes sense to me based on what I know about how things work now. Am I right in thinking that to actually replace the existing foreign keys with curies we first need to create the metadata log? Also, even if we're not adding any more foreign keys, will we retain any information in the spec about the semantics of them, or will it become purely an implementation detail of specific ORJ registers? I'm just wondering whether a spec-compliant client would still be expected to resolve these kinds of links.
Yes, in particular the schema evolution (adding fields) #5
Good point, I think the spec should, at best, mention that at some point a register had this assumption, in some sort of informative section. But a client shouldn't be expected to implement legacy assumptions. Moreover, the current spec doesn't define foreign keys at all, so in a way we are addressing an assumption we have encoded in our implementation and clients. I opened an issue in the spec repo to address this: openregister/specification#84
There was a problem hiding this comment.
The limitation of not knowing how to resolve a CURIE to a URL could be solved by including the context within the register resource.
The expansion mechanism would then be
- Given a CURIE `country:GB`
- And a register resource
- Take the curie's prefix: `country`
- Look up the prefix in the context object of the register resource to get a URL: `https://country.register.gov.uk`
- Add `/records/` to it: `https://country.register.gov.uk/records/`
- Add the CURIE value to it: `https://country.register.gov.uk/records/GB`
In theory we could provide this information through the API without storing the data in the register itself, for example by making all registers on .register.gov.uk have a context that resolves all CURIEs to .register.gov.uk.
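The expansion steps above can be sketched in Python as follows. This is only an illustration under assumptions: the context is modelled as a plain dict from prefix to base URL, `expand_curie` and both context objects are hypothetical names, the local URL is invented, and percent-encoding the CURIE value is my assumption rather than something the RFC states.

```python
from urllib.parse import quote


def expand_curie(curie: str, context: dict) -> str:
    """Expand a CURIE such as 'country:GB' into a record URL."""
    # Take the CURIE's prefix (everything before the first colon).
    prefix, separator, reference = curie.partition(":")
    if not separator or not prefix or not reference:
        raise ValueError(f"not a CURIE: {curie!r}")
    # Look up the prefix in the context to get the register's base URL.
    if prefix not in context:
        raise KeyError(f"prefix {prefix!r} is not defined in the context")
    base_url = context[prefix].rstrip("/")
    # Add /records/ and then the CURIE value.
    return f"{base_url}/records/{quote(reference, safe='')}"


# Hypothetical contexts for two environments holding the same register.
PUBLIC_CONTEXT = {"country": "https://country.register.gov.uk"}
LOCAL_CONTEXT = {"country": "http://localhost:8080"}  # invented test environment

print(expand_curie("country:GB", PUBLIC_CONTEXT))
print(expand_curie("country:GB", LOCAL_CONTEXT))
```

The two contexts show the earlier point about environments: the same CURIE resolves to whichever source the chosen context names, which is also why a tampered context would silently redirect every link.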
content/reference-strategy/index.md
Outdated
| Existing registers that use foreign keys will be maintained to avoid | ||
| disruption in the registers ecosystem. An independent task will define and | ||
| execute the migration from foreing keys to CURIEs. |
There was a problem hiding this comment.
Actually, you could just drop this whole sentence as it doesn't say when this will happen or how.
content/reference-strategy/index.md
Outdated
| to a register (set of records). It can be seen as the natural consequence of | ||
| [RFC 0002: URI Consolidation](https://github.com/openregister/registers-rfcs/blob/master/content/uri-consolidation/index.md). | ||
| Note that other types of references (e.g. subset of records) are out of the |
There was a problem hiding this comment.
"out of the scope" should be "out of scope"
This is indeed the current thinking for how to handle reference resolution. It requires schema evolution to add this information to the catalog (register register). But it's not for this RFC to say how the catalog works. There are more twists on this, happy to discuss it further.
content/reference-strategy/index.md
Outdated
| * Only able to reference one register per field. | ||
| * Linking a full set of records is a special case. | ||
| * You need a field for each register even if they have the same type of | ||
| relationship with the record (e.g. owned by, managed by). |
There was a problem hiding this comment.
I don't understand this point.
There was a problem hiding this comment.
At all or part of it? Let me try again:
With foreign keys, the field has a local identifier so the field itself is dedicated to a single type of link (e.g. local-authority-eng). If you need to provide links to e.g. local-authority-nir as well, you need another column. Even though both have the same intention (relationship wise).
There was a problem hiding this comment.
Yes that makes sense now. I think it would be good to clarify that you "need a field for each register linked to"
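A rough Python sketch of the difference described in this thread, for illustration only. It assumes each foreign key field is named after the register it links to (which, as noted elsewhere in this review, is not always true); the record values and the `collapse_to_curies` helper are made up.

```python
# Foreign key fields from the example above: one field per linked register.
FOREIGN_KEY_FIELDS = ["local-authority-eng", "local-authority-nir"]


def collapse_to_curies(record: dict, fields=FOREIGN_KEY_FIELDS) -> list:
    """Turn the populated foreign key fields of a record into CURIEs.

    Assumes the field name doubles as the CURIE prefix.
    """
    return [f"{field}:{record[field]}" for field in fields if record.get(field)]


# Made-up record: two columns are needed even though only one link exists.
record = {"local-authority-eng": "X1", "local-authority-nir": ""}

# With CURIEs the same relationship fits in a single field.
print(collapse_to_curies(record))
```

With foreign keys the number of columns grows with the number of registers linked to; with CURIEs the target register travels inside the value, so one field can cover all of them.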
RFC describing the two mechanisms for linking registers and the proposal to use CURIEs from now on. Signed-off-by: Arnau Siches <arnau.siches@digital.cabinet-office.gov.uk>
Signed-off-by: Arnau Siches <arnau.siches@digital.cabinet-office.gov.uk>
Signed-off-by: Arnau Siches <arnau.siches@digital.cabinet-office.gov.uk>
Force-pushed from 9d7ea58 to 159b93a.
Signed-off-by: Arnau Siches <arnau.siches@digital.cabinet-office.gov.uk>
Context
This RFC proposes consolidating references in Registers to CURIEs only.
Guidance to review