From a38bccd8f30ad3c37609545ff1370454e6c317f6 Mon Sep 17 00:00:00 2001 From: Arnau Siches Date: Thu, 14 Jun 2018 15:54:26 +0100 Subject: [PATCH 01/35] Add summary and motivation for the metadata log RFC Signed-off-by: Arnau Siches --- content/metadata-log/index.md | 41 +++++++++++++++++++++++++++++++++++ 1 file changed, 41 insertions(+) create mode 100644 content/metadata-log/index.md diff --git a/content/metadata-log/index.md b/content/metadata-log/index.md new file mode 100644 index 0000000..7300e8b --- /dev/null +++ b/content/metadata-log/index.md @@ -0,0 +1,41 @@ +--- +rfc: +start_date: 2018-06-14 +pr: +status: draft +--- + +# Metadata log + +## Summary + +This RFC proposes a complementary log to record metadata changes without +affecting the current data log. + + +## Motivation + +Registers started as pure data, and slowly they added different bits of +metadata. The reference implementation has a few bits of metadata (e.g. +description, name, fields) but the specification offers no way to consume +them. + +This RFC aims to keep backwards compatibility by creating a new metadata log +to encode metadata changes with references to the data log to keep +coordination with the original data log. + + +## Explanation + +A metadata log is a list of **changesets** where each changeset has: + +* `target`: A reference to the data log entry hash it applies. +* `parent`: A reference to the previous changeset hash. +* `delta`: A list of pairs (`key`, `blob`) describing the **delta** of changes where: + * `key`: Name of the piece of data (e.g. "name", "description", + "field:country"). + * `blob`: A reference to the relevant **blob** hash. +* `timestamp`: The datetime where this changeset was created. 
+ + + From 8bc33212019ea502c5ee24851a0e511c257b8411 Mon Sep 17 00:00:00 2001 From: Arnau Siches Date: Thu, 14 Jun 2018 18:30:35 +0100 Subject: [PATCH 02/35] Describe delta Signed-off-by: Arnau Siches --- content/metadata-log/index.md | 45 +++++++++++++++++++++++++++++++++++ 1 file changed, 45 insertions(+) diff --git a/content/metadata-log/index.md b/content/metadata-log/index.md index 7300e8b..fa05a4c 100644 --- a/content/metadata-log/index.md +++ b/content/metadata-log/index.md @@ -38,4 +38,49 @@ A metadata log is a list of **changesets** where each changeset has: * `timestamp`: The datetime where this changeset was created. +### Target +The `target` property is crucial to keep a connection between the two logs. It +uses the data entry hash to prevent unexpected replacements of data that could +occur in the data log. + +### Parent + +The `parent` property works in a similar way as in Git's commits. The +intention is to explore a linked list structure instead of the ordered list +implemented for the data log. + +### Delta + +The `delta` property keeps the data to apply on top of the previous metadata +state. A delta allows mutliple bits of data so it can describe an update for +multiple keys at the same time. 
For example: + +```elm +a0 : Delta +a0 = + [ ("id", "aff64e4fd520bd185cb01adab98d2d20060f621c62d5cad5204712cfa2294ef7") + , ("name", "701d021d08c54579f23343581e45b65ffb1150b2c99f94352fdac4b7036dbbd5") + , ("field:country", "473287f8298dba7163a897908958f7c0eae733e25d2e027992ea2edc9bed2fa8") + ] +``` + +This delta applied to an empty state yields the same state: + +```elm +apply : Delta -> State -> State + +m0 : State +m0 = + [] + +m1 : State +m1 = + apply a0 m0 + +m1 == a0 +-- [ ("id", "aff64e4fd520bd185cb01adab98d2d20060f621c62d5cad5204712cfa2294ef7") +-- , ("name", "701d021d08c54579f23343581e45b65ffb1150b2c99f94352fdac4b7036dbbd5") +-- , ("field:country", "473287f8298dba7163a897908958f7c0eae733e25d2e027992ea2edc9bed2fa8") +-- ] +``` From c84dabd51c004b19cb68d885c1285a0785a705c2 Mon Sep 17 00:00:00 2001 From: Arnau Siches Date: Fri, 15 Jun 2018 06:55:18 +0100 Subject: [PATCH 03/35] Extend the delta example with a second patch Signed-off-by: Arnau Siches --- content/metadata-log/index.md | 36 ++++++++++++++++++++++++++++++----- 1 file changed, 31 insertions(+), 5 deletions(-) diff --git a/content/metadata-log/index.md b/content/metadata-log/index.md index fa05a4c..25d4cbc 100644 --- a/content/metadata-log/index.md +++ b/content/metadata-log/index.md @@ -52,6 +52,9 @@ implemented for the data log. ### Delta +TODO: Ensure fields express everything they need to express (e.g. datatype, +cardinality) + The `delta` property keeps the data to apply on top of the previous metadata state. A delta allows mutliple bits of data so it can describe an update for multiple keys at the same time. 
For example: @@ -78,9 +81,32 @@ m1 : State m1 = apply a0 m0 -m1 == a0 --- [ ("id", "aff64e4fd520bd185cb01adab98d2d20060f621c62d5cad5204712cfa2294ef7") --- , ("name", "701d021d08c54579f23343581e45b65ffb1150b2c99f94352fdac4b7036dbbd5") --- , ("field:country", "473287f8298dba7163a897908958f7c0eae733e25d2e027992ea2edc9bed2fa8") --- ] +m1 == a0 == [ ("id", "aff64e4fd520bd185cb01adab98d2d20060f621c62d5cad5204712cfa2294ef7") + , ("name", "701d021d08c54579f23343581e45b65ffb1150b2c99f94352fdac4b7036dbbd5") + , ("field:country", "473287f8298dba7163a897908958f7c0eae733e25d2e027992ea2edc9bed2fa8") + ] +``` + +A second delta `a1` such as + +```elm +a1 : Delta +a1 = + [ ("field:name", "473287f8298dba7163a897908958f7c0eae733e25d2e027992ea2edc9bed2fa8") ] ``` + +is applied to a previous state `m1` as: + +```elm +m2 : State +m2 = + apply a1 m1 + +m2 == [ ("id", "aff64e4fd520bd185cb01adab98d2d20060f621c62d5cad5204712cfa2294ef7") + , ("name", "701d021d08c54579f23343581e45b65ffb1150b2c99f94352fdac4b7036dbbd5") + , ("field:country", "473287f8298dba7163a897908958f7c0eae733e25d2e027992ea2edc9bed2fa8") + , ("field:name", "473287f8298dba7163a897908958f7c0eae733e25d2e027992ea2edc9bed2fa8") + ] +``` + + From 6cd1c7af005761cd8edd56620c9f7fac9238ca30 Mon Sep 17 00:00:00 2001 From: Arnau Siches Date: Fri, 15 Jun 2018 10:31:45 +0100 Subject: [PATCH 04/35] Add Blob description Signed-off-by: Arnau Siches --- content/metadata-log/index.md | 24 ++++++++++++++++++------ 1 file changed, 18 insertions(+), 6 deletions(-) diff --git a/content/metadata-log/index.md b/content/metadata-log/index.md index 25d4cbc..9e85436 100644 --- a/content/metadata-log/index.md +++ b/content/metadata-log/index.md @@ -52,8 +52,11 @@ implemented for the data log. ### Delta +--- TODO: Ensure fields express everything they need to express (e.g. datatype, -cardinality) +cardinality, description). Ideally description should be handled aside from +datatype and cardinality so we could hash-check the pair never changes. 
+--- The `delta` property keeps the data to apply on top of the previous metadata state. A delta allows mutliple bits of data so it can describe an update for @@ -64,7 +67,7 @@ a0 : Delta a0 = [ ("id", "aff64e4fd520bd185cb01adab98d2d20060f621c62d5cad5204712cfa2294ef7") , ("name", "701d021d08c54579f23343581e45b65ffb1150b2c99f94352fdac4b7036dbbd5") - , ("field:country", "473287f8298dba7163a897908958f7c0eae733e25d2e027992ea2edc9bed2fa8") + , ("field:country", "d22869a1fd9fc929c2a07f476dd579af97691b2d0f4d231e8300e20c0326dd6b") ] ``` @@ -83,7 +86,7 @@ m1 = m1 == a0 == [ ("id", "aff64e4fd520bd185cb01adab98d2d20060f621c62d5cad5204712cfa2294ef7") , ("name", "701d021d08c54579f23343581e45b65ffb1150b2c99f94352fdac4b7036dbbd5") - , ("field:country", "473287f8298dba7163a897908958f7c0eae733e25d2e027992ea2edc9bed2fa8") + , ("field:country", "d22869a1fd9fc929c2a07f476dd579af97691b2d0f4d231e8300e20c0326dd6b") ] ``` @@ -92,7 +95,7 @@ A second delta `a1` such as ```elm a1 : Delta a1 = - [ ("field:name", "473287f8298dba7163a897908958f7c0eae733e25d2e027992ea2edc9bed2fa8") ] + [ ("field:name", "d22869a1fd9fc929c2a07f476dd579af97691b2d0f4d231e8300e20c0326dd6b") ] ``` is applied to a previous state `m1` as: @@ -104,9 +107,18 @@ m2 = m2 == [ ("id", "aff64e4fd520bd185cb01adab98d2d20060f621c62d5cad5204712cfa2294ef7") , ("name", "701d021d08c54579f23343581e45b65ffb1150b2c99f94352fdac4b7036dbbd5") - , ("field:country", "473287f8298dba7163a897908958f7c0eae733e25d2e027992ea2edc9bed2fa8") - , ("field:name", "473287f8298dba7163a897908958f7c0eae733e25d2e027992ea2edc9bed2fa8") + , ("field:country", d22869a1fd9fc929c2a07f476dd579af97691b2d0f4d231e8300e20c0326dd6b") + , ("field:name", "d22869a1fd9fc929c2a07f476dd579af97691b2d0f4d231e8300e20c0326dd6b") ] ``` +**Blobs**, similarly to Items, can be treated as a dictionary (hash map): +```elm +blobs: Dict String Blob +blobs = + Dict [ ("aff64e4fd520bd185cb01adab98d2d20060f621c62d5cad5204712cfa2294ef7", Blob.String "country") + , 
("701d021d08c54579f23343581e45b65ffb1150b2c99f94352fdac4b7036dbbd5", Blob.String "Country") + , (d22869a1fd9fc929c2a07f476dd579af97691b2d0f4d231e8300e20c0326dd6b", Blob.FieldType {cardinality: "1", datatype: "string"}) + ] +``` From fef0534a79641de007e0be769c82f1942ba96566 Mon Sep 17 00:00:00 2001 From: Arnau Siches Date: Fri, 15 Jun 2018 15:35:11 +0100 Subject: [PATCH 05/35] Add blob example Signed-off-by: Arnau Siches --- content/metadata-log/index.md | 26 +++++++++++++++++++++----- 1 file changed, 21 insertions(+), 5 deletions(-) diff --git a/content/metadata-log/index.md b/content/metadata-log/index.md index 9e85436..a7ba2c6 100644 --- a/content/metadata-log/index.md +++ b/content/metadata-log/index.md @@ -84,6 +84,7 @@ m1 : State m1 = apply a0 m0 +-- TODO: What if each pair has an Action (e.g. Add foo or Remove bar?) m1 == a0 == [ ("id", "aff64e4fd520bd185cb01adab98d2d20060f621c62d5cad5204712cfa2294ef7") , ("name", "701d021d08c54579f23343581e45b65ffb1150b2c99f94352fdac4b7036dbbd5") , ("field:country", "d22869a1fd9fc929c2a07f476dd579af97691b2d0f4d231e8300e20c0326dd6b") @@ -107,7 +108,7 @@ m2 = m2 == [ ("id", "aff64e4fd520bd185cb01adab98d2d20060f621c62d5cad5204712cfa2294ef7") , ("name", "701d021d08c54579f23343581e45b65ffb1150b2c99f94352fdac4b7036dbbd5") - , ("field:country", d22869a1fd9fc929c2a07f476dd579af97691b2d0f4d231e8300e20c0326dd6b") + , ("field:country", "d22869a1fd9fc929c2a07f476dd579af97691b2d0f4d231e8300e20c0326dd6b") , ("field:name", "d22869a1fd9fc929c2a07f476dd579af97691b2d0f4d231e8300e20c0326dd6b") ] ``` @@ -115,10 +116,25 @@ m2 == [ ("id", "aff64e4fd520bd185cb01adab98d2d20060f621c62d5cad5204712cfa2294ef7 **Blobs**, similarly to Items, can be treated as a dictionary (hash map): ```elm -blobs: Dict String Blob +type Cardinality + = One + | Many + +type Blob + = BlobString String + | BlobFieldType {cardinality: Cardinality, datatype: Datatype} + +blobs: Dict Key Blob blobs = - Dict [ 
("aff64e4fd520bd185cb01adab98d2d20060f621c62d5cad5204712cfa2294ef7", Blob.String "country") - , ("701d021d08c54579f23343581e45b65ffb1150b2c99f94352fdac4b7036dbbd5", Blob.String "Country") - , (d22869a1fd9fc929c2a07f476dd579af97691b2d0f4d231e8300e20c0326dd6b", Blob.FieldType {cardinality: "1", datatype: "string"}) + Dict [ ("aff64e4fd520bd185cb01adab98d2d20060f621c62d5cad5204712cfa2294ef7", BlobString "country") + , ("701d021d08c54579f23343581e45b65ffb1150b2c99f94352fdac4b7036dbbd5", BlobString "Country") + , ("d22869a1fd9fc929c2a07f476dd579af97691b2d0f4d231e8300e20c0326dd6b", BlobFieldType {cardinality: One, datatype: DatatypeString}) ] ``` + +Blob hashing uses the same algorithm as the Items. + +--- +TODO: Blobs that are strings, to be valid JSON need to have quotes. Do we want +this? +--- From db640204fba4c74ea3fa971af78f585d4a67a94b Mon Sep 17 00:00:00 2001 From: Arnau Siches Date: Fri, 15 Jun 2018 15:44:21 +0100 Subject: [PATCH 06/35] Add changeset timestamp description Signed-off-by: Arnau Siches --- content/metadata-log/index.md | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/content/metadata-log/index.md b/content/metadata-log/index.md index a7ba2c6..d2b02b6 100644 --- a/content/metadata-log/index.md +++ b/content/metadata-log/index.md @@ -29,14 +29,19 @@ coordination with the original data log. A metadata log is a list of **changesets** where each changeset has: +* `timestamp`: The datetime where this changeset was created. * `target`: A reference to the data log entry hash it applies. * `parent`: A reference to the previous changeset hash. * `delta`: A list of pairs (`key`, `blob`) describing the **delta** of changes where: * `key`: Name of the piece of data (e.g. "name", "description", "field:country"). * `blob`: A reference to the relevant **blob** hash. -* `timestamp`: The datetime where this changeset was created. +### Timestamp + +The `timestamp` property describes when the changeset was recorded. 
Similarly +to Entry timestamps, they don't define the order of the metadata log, they are +mostly informative. ### Target @@ -53,9 +58,11 @@ implemented for the data log. ### Delta --- + TODO: Ensure fields express everything they need to express (e.g. datatype, cardinality, description). Ideally description should be handled aside from datatype and cardinality so we could hash-check the pair never changes. + --- The `delta` property keeps the data to apply on top of the previous metadata @@ -135,6 +142,8 @@ blobs = Blob hashing uses the same algorithm as the Items. --- + TODO: Blobs that are strings, to be valid JSON need to have quotes. Do we want this? + --- From ea95f619c57969af4c25282a9680c1527331554b Mon Sep 17 00:00:00 2001 From: Arnau Siches Date: Fri, 15 Jun 2018 16:09:30 +0100 Subject: [PATCH 07/35] Add example of changesets Signed-off-by: Arnau Siches --- content/metadata-log/index.md | 50 +++++++++++++++++++++++++++++++++++ 1 file changed, 50 insertions(+) diff --git a/content/metadata-log/index.md b/content/metadata-log/index.md index d2b02b6..c8996ce 100644 --- a/content/metadata-log/index.md +++ b/content/metadata-log/index.md @@ -49,12 +49,27 @@ The `target` property is crucial to keep a connection between the two logs. It uses the data entry hash to prevent unexpected replacements of data that could occur in the data log. +The first changeset can optionally not provide a `target` and it's assumed it +will apply to the data log from the first entry. This allows recording the +first changeset when the data log doesn't exist which is expected because you +need a schema to validate the data is valid. + +--- + +TODO: Is is acceptable to have multiple changesets with no `target` if none of +the previous changesets have one? It would allow incrementally defining the +schema before the first data entry gets recorded. + +--- + ### Parent The `parent` property works in a similar way as in Git's commits. 
The intention is to explore a linked list structure instead of the ordered list implemented for the data log. +The first changeset is expected to have no parent. + ### Delta --- @@ -147,3 +162,38 @@ TODO: Blobs that are strings, to be valid JSON need to have quotes. Do we want this? --- + +### Changeset example + +This code exemplifies two changesets where the first is the parent of the +second. + +```elm +type Changeset = + { timestamp: DateTime + , target: Option Hash + , parent: Option Hash + , delta: Delta + } + +chs0 = + { timestamp: DateTime "2018-06-14T15:51:00Z" + , target: Nothing + , parent: Nothing + , delta: [ ("id", "aff64e4fd520bd185cb01adab98d2d20060f621c62d5cad5204712cfa2294ef7") + , ("name", "701d021d08c54579f23343581e45b65ffb1150b2c99f94352fdac4b7036dbbd5") + , ("field:country", "d22869a1fd9fc929c2a07f476dd579af97691b2d0f4d231e8300e20c0326dd6b") + ] + } + +getHash chs0 -- Hash "adcd501c027ad83fbdf4c3423630da89b2c013b9e8641ec0c2679ed33b2cc0d6" + +chs1 = + { timestamp: DateTime "2018-06-14T15:59:00Z" + , target: Some "0000000000000000000000000000000000000000000000000000000000000000" + , parent: Some "adcd501c027ad83fbdf4c3423630da89b2c013b9e8641ec0c2679ed33b2cc0d6" + , delta: [ ("field:name", "d22869a1fd9fc929c2a07f476dd579af97691b2d0f4d231e8300e20c0326dd6b") ] + } + +getHash chs1 -- Hash "62bf2dae9312a9080f945caaf035fd512c8d5ddd1189cfb7ae04489e564ca379" +``` From 350bdede8a4bdec6d62f284e31a46e4ee2d716d2 Mon Sep 17 00:00:00 2001 From: Arnau Siches Date: Mon, 18 Jun 2018 11:48:31 +0100 Subject: [PATCH 08/35] Reword changeset's target section Signed-off-by: Arnau Siches --- content/metadata-log/index.md | 40 ++++++++++++++++++----------------- 1 file changed, 21 insertions(+), 19 deletions(-) diff --git a/content/metadata-log/index.md b/content/metadata-log/index.md index c8996ce..2d4308b 100644 --- a/content/metadata-log/index.md +++ b/content/metadata-log/index.md @@ -39,28 +39,30 @@ A metadata log is a list of **changesets** where each changeset 
has: ### Timestamp -The `timestamp` property describes when the changeset was recorded. Similarly -to Entry timestamps, they don't define the order of the metadata log, they are -mostly informative. +The `timestamp` property describes when the changeset was recorded. It doesn't +define the order of the metadata log. ### Target -The `target` property is crucial to keep a connection between the two logs. It -uses the data entry hash to prevent unexpected replacements of data that could -occur in the data log. - -The first changeset can optionally not provide a `target` and it's assumed it -will apply to the data log from the first entry. This allows recording the -first changeset when the data log doesn't exist which is expected because you -need a schema to validate the data is valid. - ---- - -TODO: Is is acceptable to have multiple changesets with no `target` if none of -the previous changesets have one? It would allow incrementally defining the -schema before the first data entry gets recorded. - ---- +The `target` property is crucial to keep a connection between the data log and +the metadata log. It uses the data entry hash to prevent unexpected +replacements of data that could occur in the data log. + +The first changeset expects `target` to be nil given that the first item in +the data log must conform to a defined schema. Optionally, new changesets can +be recorded on top of the first one without `target`. Once there is a +changeset with an explicit `target` no more nil `target` properties are +allowed. + +Rough algorithm given a new changeset: + +1. If it's the first changeset: + * Succeed if `target` is nil. + * Fail otherwise. +2. If it's not the first changeset: + * If `target` is nil: + * Succeed if parent's `target` is nil. + * Fail otherwise. 
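
The rough algorithm above can be sketched in Python as follows (illustrative only, not part of the RFC text). `Changeset` is a hypothetical, trimmed-down shape holding only `target` and `parent`, and `check_target` is an illustrative name. The last branch is an assumption: the list does not say what happens when a later changeset carries an explicit `target`, so this sketch accepts it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Changeset:
    """Hypothetical shape: only the properties the target check needs."""

    target: Optional[str]  # data log entry hash, or None for nil
    parent: Optional[str]  # parent changeset hash, or None for the first one


def check_target(new: Changeset, parent: Optional[Changeset]) -> bool:
    """Return True when `new` may be recorded on top of `parent`.

    `parent` is None when `new` is the first changeset in the metadata log.
    """
    if parent is None:
        # Step 1: the first changeset must have a nil target.
        return new.target is None
    if new.target is None:
        # Step 2: a nil target is valid only while the parent's target is nil.
        return parent.target is None
    # Assumption: the rough algorithm leaves this branch unspecified; an
    # explicit target on a later changeset is accepted here.
    return True
```

Checking against the parent alone is enough to enforce "no more nil `target` once one is explicit", provided every earlier changeset was validated the same way.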
### Parent From c1de34a13894980adc74ae5fb99fa9f16dc8b8f4 Mon Sep 17 00:00:00 2001 From: Arnau Siches Date: Mon, 18 Jun 2018 12:16:19 +0100 Subject: [PATCH 09/35] Sketch two ideas to express field identity Signed-off-by: Arnau Siches --- content/metadata-log/index.md | 42 ++++++++++++++++++++++++++++++++--- 1 file changed, 39 insertions(+), 3 deletions(-) diff --git a/content/metadata-log/index.md b/content/metadata-log/index.md index 2d4308b..a5a02ff 100644 --- a/content/metadata-log/index.md +++ b/content/metadata-log/index.md @@ -70,15 +70,51 @@ The `parent` property works in a similar way as in Git's commits. The intention is to explore a linked list structure instead of the ordered list implemented for the data log. -The first changeset is expected to have no parent. +The first changeset is expected to have no parent. Any other changeset will +have a single parent hash informed. ### Delta --- TODO: Ensure fields express everything they need to express (e.g. datatype, -cardinality, description). Ideally description should be handled aside from -datatype and cardinality so we could hash-check the pair never changes. +cardinality, description). A change needs to allow for changing label and +description but not id, datatype or cardinality. + +```elm +type PrimitiveType + = StringType + | IntegerType + | CurieType + | ... -- Not all listed for brevity + +-- Cardinality renamed here because it's combined with the data type itself as +-- they are two parts of the same thing. 
+type Datatype + = One PrimitiveType + | Many PrimitiveType + +type Field = + { id: String + , datatype: Datatype + , label: Maybe String + , description: Maybe String + } + +f1: Field +f1 = + { id: "name" + , datatype: One StringType + , label: Just "Name" + , description: Just "The name of the country" + } + +-- Returns the result of hashing the relevant parts of the identity: id, datatype +Field.identity: Field -> Hash + +-- Returns the Field Blob hash +Field.hash: Field -> Hash +``` --- From d7c4393507b48707fc7885a15cbfed63c958ddce Mon Sep 17 00:00:00 2001 From: Arnau Siches Date: Mon, 18 Jun 2018 13:49:40 +0100 Subject: [PATCH 10/35] Reshuffle definitions Signed-off-by: Arnau Siches --- content/metadata-log/index.md | 161 ++++++++++++++++++++-------------- 1 file changed, 94 insertions(+), 67 deletions(-) diff --git a/content/metadata-log/index.md b/content/metadata-log/index.md index a5a02ff..673d1fd 100644 --- a/content/metadata-log/index.md +++ b/content/metadata-log/index.md @@ -24,6 +24,9 @@ This RFC aims to keep backwards compatibility by creating a new metadata log to encode metadata changes with references to the data log to keep coordination with the original data log. +### Use cases (potential) + +TODO ## Explanation @@ -32,10 +35,29 @@ A metadata log is a list of **changesets** where each changeset has: * `timestamp`: The datetime where this changeset was created. * `target`: A reference to the data log entry hash it applies. * `parent`: A reference to the previous changeset hash. -* `delta`: A list of pairs (`key`, `blob`) describing the **delta** of changes where: +* `delta`: An ordered set of pairs (`key`, `hash`) describing the **delta** of changes where: * `key`: Name of the piece of data (e.g. "name", "description", "field:country"). - * `blob`: A reference to the relevant **blob** hash. + * `hash`: A reference to the relevant **blob** hash. 
+ +``` +type Delta = + OrdSet (Key, Hash) + +type Changeset = + { timestamp: Timestamp + , target: Maybe Hash + , parent: Maybe Hash + , delta: Delta + } + + +-- TODO: Remove if it's too prescriptive and irrelevant +-- An object stored in the store +type Object + = Blb Blob -- TODO: Define Blob + | Chs Changeset +``` ### Timestamp @@ -75,11 +97,29 @@ have a single parent hash informed. ### Delta ---- +The `delta` property has the data to apply on top of the previous metadata +state. A delta allows mutliple bits of data so it can describe an update for +multiple unique keys at the same time. For example: + +```elm +type Delta = + OrdSet Key Hash + +a0 : Delta +a0 = + [ ("custodian", "72bb44793c2e42872ebc892f411dab0f700049231a6a4169f87a20580d7cd516") + , ("field:country", "d22869a1fd9fc929c2a07f476dd579af97691b2d0f4d231e8300e20c0326dd6b") + , ("id", "aff64e4fd520bd185cb01adab98d2d20060f621c62d5cad5204712cfa2294ef7") + , ("name", "dcd1d5223f73b3a965c07e3ff5dbee3eedcfedb806686a05b9b3868a2c3d6d50") + ] +``` + +Note a delta is an ordered set ordered by key. -TODO: Ensure fields express everything they need to express (e.g. datatype, -cardinality, description). A change needs to allow for changing label and -description but not id, datatype or cardinality. + +TODO: This attempt to describe the resulting metadata state shows that the very +first changeset requires at least `id`, `custodian` and the field acting as +the identifier for data items (i.e. primary key). 
```elm type PrimitiveType @@ -101,54 +141,38 @@ type Field = , description: Maybe String } -f1: Field -f1 = - { id: "name" - , datatype: One StringType - , label: Just "Name" - , description: Just "The name of the country" - } - --- Returns the result of hashing the relevant parts of the identity: id, datatype -Field.identity: Field -> Hash - --- Returns the Field Blob hash -Field.hash: Field -> Hash +type State + = Empty + | State { id: String + , name: Maybe String + , description: Maybe String + , custodian: String + , fields: Set Field + , primaryKey: FieldId -- TODO: Any other better name to describe the Id? + } ``` ---- - -The `delta` property keeps the data to apply on top of the previous metadata -state. A delta allows mutliple bits of data so it can describe an update for -multiple keys at the same time. For example: - -```elm -a0 : Delta -a0 = - [ ("id", "aff64e4fd520bd185cb01adab98d2d20060f621c62d5cad5204712cfa2294ef7") - , ("name", "701d021d08c54579f23343581e45b65ffb1150b2c99f94352fdac4b7036dbbd5") - , ("field:country", "d22869a1fd9fc929c2a07f476dd579af97691b2d0f4d231e8300e20c0326dd6b") - ] -``` - -This delta applied to an empty state yields the same state: +TODO (relevant?): This delta applied to an empty state yields the +dereferenced objects: ```elm apply : Delta -> State -> State m0 : State m0 = - [] + Empty m1 : State m1 = apply a0 m0 --- TODO: What if each pair has an Action (e.g. Add foo or Remove bar?) 
-m1 == a0 == [ ("id", "aff64e4fd520bd185cb01adab98d2d20060f621c62d5cad5204712cfa2294ef7") - , ("name", "701d021d08c54579f23343581e45b65ffb1150b2c99f94352fdac4b7036dbbd5") - , ("field:country", "d22869a1fd9fc929c2a07f476dd579af97691b2d0f4d231e8300e20c0326dd6b") - ] +m1 == State { id: "country" + , name: Just "Country" + , description: Nothing + , custodian: "Foreign & Commonwealth Office" + , fields: Set [ { id: "country", datatype: One StringType } ] + , primaryKey: "country" + } ``` A second delta `a1` such as @@ -166,40 +190,43 @@ m2 : State m2 = apply a1 m1 -m2 == [ ("id", "aff64e4fd520bd185cb01adab98d2d20060f621c62d5cad5204712cfa2294ef7") - , ("name", "701d021d08c54579f23343581e45b65ffb1150b2c99f94352fdac4b7036dbbd5") - , ("field:country", "d22869a1fd9fc929c2a07f476dd579af97691b2d0f4d231e8300e20c0326dd6b") - , ("field:name", "d22869a1fd9fc929c2a07f476dd579af97691b2d0f4d231e8300e20c0326dd6b") - ] +m2 == State { id: "country" + , name: Just "Country" + , description: Nothing + , custodian: "Foreign & Commonwealth Office" + fields: Set [ Field { id: "name" + , datatype: One StringType + , label: Just "Name" + , description: Just "the name of the country" + } + , Field { id: "name" + , datatype: One StringType + , label: Just "Name" + , description: Just "the name of the country" + } + ] + , primaryKey: "country" + } + ``` **Blobs**, similarly to Items, can be treated as a dictionary (hash map): ```elm -type Cardinality - = One - | Many +-- TODO: Probably better to stick to the current definition and use a +-- JSON-like string +type Blob = String -type Blob - = BlobString String - | BlobFieldType {cardinality: Cardinality, datatype: Datatype} - -blobs: Dict Key Blob +blobs: Dict Hash Blob blobs = - Dict [ ("aff64e4fd520bd185cb01adab98d2d20060f621c62d5cad5204712cfa2294ef7", BlobString "country") - , ("701d021d08c54579f23343581e45b65ffb1150b2c99f94352fdac4b7036dbbd5", BlobString "Country") - , ("d22869a1fd9fc929c2a07f476dd579af97691b2d0f4d231e8300e20c0326dd6b", 
BlobFieldType {cardinality: One, datatype: DatatypeString}) + Dict [ ("aff64e4fd520bd185cb01adab98d2d20060f621c62d5cad5204712cfa2294ef7", "\"country\"") + , ("701d021d08c54579f23343581e45b65ffb1150b2c99f94352fdac4b7036dbbd5", "\"Country\"") + , ("d22869a1fd9fc929c2a07f476dd579af97691b2d0f4d231e8300e20c0326dd6b", "{\"cardinality\":\"1\",\"datatype\":\"string\"}") ] ``` Blob hashing uses the same algorithm as the Items. ---- - -TODO: Blobs that are strings, to be valid JSON need to have quotes. Do we want -this? - ---- ### Changeset example @@ -209,8 +236,8 @@ second. ```elm type Changeset = { timestamp: DateTime - , target: Option Hash - , parent: Option Hash + , target: Maybe Hash + , parent: Maybe Hash , delta: Delta } @@ -228,8 +255,8 @@ getHash chs0 -- Hash "adcd501c027ad83fbdf4c3423630da89b2c013b9e8641ec0c2679ed33b chs1 = { timestamp: DateTime "2018-06-14T15:59:00Z" - , target: Some "0000000000000000000000000000000000000000000000000000000000000000" - , parent: Some "adcd501c027ad83fbdf4c3423630da89b2c013b9e8641ec0c2679ed33b2cc0d6" + , target: Just "0000000000000000000000000000000000000000000000000000000000000000" + , parent: Just "adcd501c027ad83fbdf4c3423630da89b2c013b9e8641ec0c2679ed33b2cc0d6" , delta: [ ("field:name", "d22869a1fd9fc929c2a07f476dd579af97691b2d0f4d231e8300e20c0326dd6b") ] } From f3c955f86c7c6aef9d0eda6fe5835063b65d3289 Mon Sep 17 00:00:00 2001 From: Arnau Siches Date: Tue, 19 Jun 2018 09:08:34 +0100 Subject: [PATCH 11/35] Remove unnecessary details about a hypothetical object store Signed-off-by: Arnau Siches --- content/metadata-log/index.md | 7 ------- 1 file changed, 7 deletions(-) diff --git a/content/metadata-log/index.md b/content/metadata-log/index.md index 673d1fd..685d017 100644 --- a/content/metadata-log/index.md +++ b/content/metadata-log/index.md @@ -50,13 +50,6 @@ type Changeset = , parent: Maybe Hash , delta: Delta } - - --- TODO: Remove if it's too prescriptive and irrelevant --- An object stored in the store -type Object - = 
Blb Blob -- TODO: Define Blob - | Chs Changeset ``` ### Timestamp From 574686bf9ff6ffec48a25f5eaaa8c51047803ffe Mon Sep 17 00:00:00 2001 From: Arnau Siches Date: Tue, 19 Jun 2018 16:56:55 +0100 Subject: [PATCH 12/35] Add json example Signed-off-by: Arnau Siches --- content/metadata-log/index.md | 51 +++++++++++++++++++++++++++++++---- 1 file changed, 46 insertions(+), 5 deletions(-) diff --git a/content/metadata-log/index.md b/content/metadata-log/index.md index 685d017..59a97df 100644 --- a/content/metadata-log/index.md +++ b/content/metadata-log/index.md @@ -40,7 +40,7 @@ A metadata log is a list of **changesets** where each changeset has: "field:country"). * `hash`: A reference to the relevant **blob** hash. -``` +```elm type Delta = OrdSet (Key, Hash) @@ -187,15 +187,15 @@ m2 == State { id: "country" , name: Just "Country" , description: Nothing , custodian: "Foreign & Commonwealth Office" - fields: Set [ Field { id: "name" + fields: Set [ Field { id: "country" , datatype: One StringType - , label: Just "Name" - , description: Just "the name of the country" + , label: Just "Country" + , description: Just "The country's 2-letter ISO 3166-2 alpha2 code." } , Field { id: "name" , datatype: One StringType , label: Just "Name" - , description: Just "the name of the country" + , description: Just "The name of the country." } ] , primaryKey: "country" @@ -255,3 +255,44 @@ chs1 = getHash chs1 -- Hash "62bf2dae9312a9080f945caaf035fd512c8d5ddd1189cfb7ae04489e564ca379" ``` + +## High-level (porcelain) API + +There are new endpoints that surface the computed state for the metadata log. + +### Get the computed schema + +* Endpoint: `GET /schema/` +* Parameters: + * `entry-number`: A valid data log entry number (range: 1..[end of log]). + +TODO: The schema will not be provided as CSV unless we find it essential and +we find a reasonable way to flatten the structure. 
+ +Response example in JSON: + +```json +{ + "id": "country", + "name": "Country", + "custodian": "Foreign & Commonwealth Office", + "fields": [ + { + "id": "country", + "datatype": "string", + "label": "Name", + "description": "The country's 2-letter ISO 3166-2 alpha2 code.", + }, { + "id": "name", + "datatype": "string", + "label": "Name", + "description": "The name of the country." + } + ], + "primary-key": "country" +} +``` + +## Low-level (plumbing) API + +The following endpoints are low level From 967616dfcb6926cb85c1b6db105ac41743fec2e1 Mon Sep 17 00:00:00 2001 From: Arnau Siches Date: Wed, 20 Jun 2018 14:49:48 +0100 Subject: [PATCH 13/35] Fix hash values in pseudo-code Signed-off-by: Arnau Siches --- content/metadata-log/index.md | 28 ++++++++++++++-------------- 1 file changed, 14 insertions(+), 14 deletions(-) diff --git a/content/metadata-log/index.md b/content/metadata-log/index.md index 59a97df..a7634bb 100644 --- a/content/metadata-log/index.md +++ b/content/metadata-log/index.md @@ -100,10 +100,10 @@ type Delta = a0 : Delta a0 = - [ ("custodian", "72bb44793c2e42872ebc892f411dab0f700049231a6a4169f87a20580d7cd516") - , ("field:country", "d22869a1fd9fc929c2a07f476dd579af97691b2d0f4d231e8300e20c0326dd6b") - , ("id", "aff64e4fd520bd185cb01adab98d2d20060f621c62d5cad5204712cfa2294ef7") - , ("name", "dcd1d5223f73b3a965c07e3ff5dbee3eedcfedb806686a05b9b3868a2c3d6d50") + [ ("custodian", Hash "72bb44793c2e42872ebc892f411dab0f700049231a6a4169f87a20580d7cd516") + , ("field:country", Hash "d22869a1fd9fc929c2a07f476dd579af97691b2d0f4d231e8300e20c0326dd6b") + , ("id", Hash "aff64e4fd520bd185cb01adab98d2d20060f621c62d5cad5204712cfa2294ef7") + , ("name", Hash "dcd1d5223f73b3a965c07e3ff5dbee3eedcfedb806686a05b9b3868a2c3d6d50") ] ``` @@ -173,7 +173,7 @@ A second delta `a1` such as ```elm a1 : Delta a1 = - [ ("field:name", "d22869a1fd9fc929c2a07f476dd579af97691b2d0f4d231e8300e20c0326dd6b") ] + [ ("field:name", Hash 
"d22869a1fd9fc929c2a07f476dd579af97691b2d0f4d231e8300e20c0326dd6b") ] ``` is applied to a previous state `m1` as: @@ -212,9 +212,9 @@ type Blob = String blobs: Dict Hash Blob blobs = - Dict [ ("aff64e4fd520bd185cb01adab98d2d20060f621c62d5cad5204712cfa2294ef7", "\"country\"") - , ("701d021d08c54579f23343581e45b65ffb1150b2c99f94352fdac4b7036dbbd5", "\"Country\"") - , ("d22869a1fd9fc929c2a07f476dd579af97691b2d0f4d231e8300e20c0326dd6b", "{\"cardinality\":\"1\",\"datatype\":\"string\"}") + Dict [ (Hash "aff64e4fd520bd185cb01adab98d2d20060f621c62d5cad5204712cfa2294ef7", "\"country\"") + , (Hash "701d021d08c54579f23343581e45b65ffb1150b2c99f94352fdac4b7036dbbd5", "\"Country\"") + , (Hash "d22869a1fd9fc929c2a07f476dd579af97691b2d0f4d231e8300e20c0326dd6b", "{\"cardinality\":\"1\",\"datatype\":\"string\"}") ] ``` @@ -238,9 +238,9 @@ chs0 = { timestamp: DateTime "2018-06-14T15:51:00Z" , target: Nothing , parent: Nothing - , delta: [ ("id", "aff64e4fd520bd185cb01adab98d2d20060f621c62d5cad5204712cfa2294ef7") - , ("name", "701d021d08c54579f23343581e45b65ffb1150b2c99f94352fdac4b7036dbbd5") - , ("field:country", "d22869a1fd9fc929c2a07f476dd579af97691b2d0f4d231e8300e20c0326dd6b") + , delta: [ ("id", Hash "aff64e4fd520bd185cb01adab98d2d20060f621c62d5cad5204712cfa2294ef7") + , ("name", Hash "701d021d08c54579f23343581e45b65ffb1150b2c99f94352fdac4b7036dbbd5") + , ("field:country", Hash "d22869a1fd9fc929c2a07f476dd579af97691b2d0f4d231e8300e20c0326dd6b") ] } @@ -248,9 +248,9 @@ getHash chs0 -- Hash "adcd501c027ad83fbdf4c3423630da89b2c013b9e8641ec0c2679ed33b chs1 = { timestamp: DateTime "2018-06-14T15:59:00Z" - , target: Just "0000000000000000000000000000000000000000000000000000000000000000" - , parent: Just "adcd501c027ad83fbdf4c3423630da89b2c013b9e8641ec0c2679ed33b2cc0d6" - , delta: [ ("field:name", "d22869a1fd9fc929c2a07f476dd579af97691b2d0f4d231e8300e20c0326dd6b") ] + , target: Just (Hash "0000000000000000000000000000000000000000000000000000000000000000") + , parent: Just (Hash 
"adcd501c027ad83fbdf4c3423630da89b2c013b9e8641ec0c2679ed33b2cc0d6") + , delta: [ ("field:name", Hash "d22869a1fd9fc929c2a07f476dd579af97691b2d0f4d231e8300e20c0326dd6b") ] } getHash chs1 -- Hash "62bf2dae9312a9080f945caaf035fd512c8d5ddd1189cfb7ae04489e564ca379" From feb2b156e50cf3405e6273403edff571174a75a8 Mon Sep 17 00:00:00 2001 From: Arnau Siches Date: Thu, 21 Jun 2018 10:36:01 +0100 Subject: [PATCH 14/35] Fix schema example in json (missing cardinality) Signed-off-by: Arnau Siches --- content/metadata-log/index.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/content/metadata-log/index.md b/content/metadata-log/index.md index a7634bb..2ddb36f 100644 --- a/content/metadata-log/index.md +++ b/content/metadata-log/index.md @@ -280,11 +280,13 @@ Response example in JSON: { "id": "country", "datatype": "string", + "cardinality": "1", "label": "Name", "description": "The country's 2-letter ISO 3166-2 alpha2 code.", }, { "id": "name", "datatype": "string", + "cardinality": "1", "label": "Name", "description": "The name of the country." } From f0233189be63d71680b364cfcc88acc34c1ab342 Mon Sep 17 00:00:00 2001 From: Arnau Siches Date: Thu, 21 Jun 2018 10:44:28 +0100 Subject: [PATCH 15/35] Add changeset API endpoint Signed-off-by: Arnau Siches --- content/metadata-log/index.md | 26 +++++++++++++++++++++++++- 1 file changed, 25 insertions(+), 1 deletion(-) diff --git a/content/metadata-log/index.md b/content/metadata-log/index.md index 2ddb36f..0acf7d9 100644 --- a/content/metadata-log/index.md +++ b/content/metadata-log/index.md @@ -264,7 +264,7 @@ There are new endpoints that surface the computed state for the metadata log. * Endpoint: `GET /schema/` * Parameters: - * `entry-number`: A valid data log entry number (range: 1..[end of log]). + * `entry-number` (Optional): A valid data log entry number (range: 1..[end of log]). TODO: The schema will not be provided as CSV unless we find it essential and we find a reasonable way to flatten the structure. 
@@ -298,3 +298,27 @@ Response example in JSON: ## Low-level (plumbing) API The following endpoints are low level + +## Get the list of changesets + +* Endpoint: `GET /meta/changesets/` + +```json +[ + { + "id": "adcd501c027ad83fbdf4c3423630da89b2c013b9e8641ec0c2679ed33b2cc0d6", + "timestamp": "2018-06-14T15:51:00Z", + "delta": { + "id": "aff64e4fd520bd185cb01adab98d2d20060f621c62d5cad5204712cfa2294ef7", + "name": "701d021d08c54579f23343581e45b65ffb1150b2c99f94352fdac4b7036dbbd5", + "field:country": "d22869a1fd9fc929c2a07f476dd579af97691b2d0f4d231e8300e20c0326dd6b" + } + }, { + "id": "62bf2dae9312a9080f945caaf035fd512c8d5ddd1189cfb7ae04489e564ca379", + "timestamp": "2018-06-14T15:59:00Z", + "target": "0000000000000000000000000000000000000000000000000000000000000000", + "parent": "adcd501c027ad83fbdf4c3423630da89b2c013b9e8641ec0c2679ed33b2cc0d6", + "delta": { "field:name": "d22869a1fd9fc929c2a07f476dd579af97691b2d0f4d231e8300e20c0326dd6b" } + } +] +``` From 157f42fb81ee0d9efa20a6a77f34924f46b15bb8 Mon Sep 17 00:00:00 2001 From: Arnau Siches Date: Thu, 21 Jun 2018 13:31:19 +0100 Subject: [PATCH 16/35] Note the possibility of using empty hash instead of nil targets Signed-off-by: Arnau Siches --- content/metadata-log/index.md | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/content/metadata-log/index.md b/content/metadata-log/index.md index 0acf7d9..d8bc2da 100644 --- a/content/metadata-log/index.md +++ b/content/metadata-log/index.md @@ -69,6 +69,14 @@ be recorded on top of the first one without `target`. Once there is a changeset with a explicit `target` no more nil `target` properties are allowed. +--- + +TODO: Alternative to nil, use the empty hash: + + "target": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855", + +--- + Rough algorithm given a new changeset: 1. 
If it's the first changeset: From cdae0eb026603ddee6828cd3bf359aa5a52b188f Mon Sep 17 00:00:00 2001 From: Arnau Siches Date: Thu, 21 Jun 2018 13:54:16 +0100 Subject: [PATCH 17/35] Change examples to be HTTP request/response Signed-off-by: Arnau Siches --- content/metadata-log/index.md | 51 ++++++++++++++++++++++++++++++++--- 1 file changed, 48 insertions(+), 3 deletions(-) diff --git a/content/metadata-log/index.md b/content/metadata-log/index.md index d8bc2da..a4610cd 100644 --- a/content/metadata-log/index.md +++ b/content/metadata-log/index.md @@ -277,9 +277,16 @@ There are new endpoints that surface the computed state for the metadata log. TODO: The schema will not be provided as CSV unless we find it essential and we find a reasonable way to flatten the structure. -Response example in JSON: +```http +GET /schema/ HTTP/2 +Host: country.register.gov.uk +Accept: application/json +``` + +```http +HTTP/2 200 +Content-Type: application/json -```json { "id": "country", "name": "Country", @@ -310,8 +317,21 @@ The following endpoints are low level ## Get the list of changesets * Endpoint: `GET /meta/changesets/` +* Parameters: + * `page-index` (Optional): Collection page number. Defaults to 1. + * `page-size` (Optional): Collection page size. Defaults to 100. 
+ +```http +GET /meta/changesets/ HTTP/2 +Host: country.register.gov.uk +Accept: application/json +``` + +```http +HTTP/2 200 +Content-Type: application/json +Link: ; rel="next" -```json [ { "id": "adcd501c027ad83fbdf4c3423630da89b2c013b9e8641ec0c2679ed33b2cc0d6", @@ -330,3 +350,28 @@ The following endpoints are low level } ] ``` + +## Get a single changeset + +* Endpoint: `GET /meta/changesets/{id}` + +```http +GET /meta/changesets/adcd501c027ad83fbdf4c3423630da89b2c013b9e8641ec0c2679ed33b2cc0d6 HTTP/2 +Host: country.register.gov.uk +Accept: application/json +``` + +```http +HTTP/2 200 +Content-Type: application/json + +{ + "id": "adcd501c027ad83fbdf4c3423630da89b2c013b9e8641ec0c2679ed33b2cc0d6", + "timestamp": "2018-06-14T15:51:00Z", + "delta": { + "id": "aff64e4fd520bd185cb01adab98d2d20060f621c62d5cad5204712cfa2294ef7", + "name": "701d021d08c54579f23343581e45b65ffb1150b2c99f94352fdac4b7036dbbd5", + "field:country": "d22869a1fd9fc929c2a07f476dd579af97691b2d0f4d231e8300e20c0326dd6b" + } +} +``` From 20b093f8db363b289a00b940a641c6deee942b01 Mon Sep 17 00:00:00 2001 From: Arnau Siches Date: Thu, 21 Jun 2018 14:30:58 +0100 Subject: [PATCH 18/35] Add blob endpoints Signed-off-by: Arnau Siches --- content/metadata-log/index.md | 64 +++++++++++++++++++++++++++++++++++ 1 file changed, 64 insertions(+) diff --git a/content/metadata-log/index.md b/content/metadata-log/index.md index a4610cd..9f3368e 100644 --- a/content/metadata-log/index.md +++ b/content/metadata-log/index.md @@ -354,6 +354,8 @@ Link: ; rel="next" ## Get a single changeset * Endpoint: `GET /meta/changesets/{id}` +* Parameters: + * `id` (Required): The changeset hash. ```http GET /meta/changesets/adcd501c027ad83fbdf4c3423630da89b2c013b9e8641ec0c2679ed33b2cc0d6 HTTP/2 @@ -375,3 +377,65 @@ Content-Type: application/json } } ``` + +## Get the list of blobs + +* Endpoint: `GET /meta/blobs/` +* Parameters: + * `page-index` (Optional): Collection page number. Defaults to 1. 
+ * `page-size` (Optional): Collection page size. Defaults to 100. + + +```http +GET /meta/blobs/ HTTP/2 +Host: country.register.gov.uk +Accept: application/json +``` + +```http +HTTP/2 200 +Content-Type: application/json +Link: ; rel="next" + +[ + { + "id": "701d021d08c54579f23343581e45b65ffb1150b2c99f94352fdac4b7036dbbd5", + "value": "\"country\"" + }, { + "id": "d22869a1fd9fc929c2a07f476dd579af97691b2d0f4d231e8300e20c0326dd6b", + "value": "{\"cardinality\":\"1\",\"datatype\":\"string\"}" + } +] +``` + +--- + +TODO: Blobs return their value stringified. What are the implications of +returning the value in JSON instead? + +--- + +## Get a blob by id + +* Endpoint: `GET /meta/blobs/{id}` +* Parameters: + * `id` (Required): The Blob hash. + + +```http +GET /meta/blobs/701d021d08c54579f23343581e45b65ffb1150b2c99f94352fdac4b7036dbbd5 HTTP/2 +Host: country.register.gov.uk +Accept: application/json +``` + +```http +HTTP/2 200 +Content-Type: application/json + +{ + "id": "701d021d08c54579f23343581e45b65ffb1150b2c99f94352fdac4b7036dbbd5", + "value": "\"country\"" +} +``` + + From d08ed42096fcbddbc6ee514dc7beb8d9ed7c48c7 Mon Sep 17 00:00:00 2001 From: Arnau Siches Date: Thu, 21 Jun 2018 15:45:16 +0100 Subject: [PATCH 19/35] Add note on hash algorithm at the schema level Signed-off-by: Arnau Siches --- content/metadata-log/index.md | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+) diff --git a/content/metadata-log/index.md b/content/metadata-log/index.md index 9f3368e..2212ce7 100644 --- a/content/metadata-log/index.md +++ b/content/metadata-log/index.md @@ -148,6 +148,7 @@ type State , name: Maybe String , description: Maybe String , custodian: String + , hashAlgorithm: HashAlg, -- TODO: Can we introduce this bit of information at the register level? , fields: Set Field , primaryKey: FieldId -- TODO: Any other better name to describe the Id? } @@ -413,6 +414,33 @@ Link: ; rel="next" TODO: Blobs return their value stringified. 
What are the implications of returning the value in JSON instead? +Should blobs be annotated by type? + +``` +type Blob + = Value String -- Leaf like label, description or custodian + | Field String + | ... ? +``` + +```http +HTTP/2 200 +Content-Type: application/json +Link: ; rel="next" + +[ + { + "id": "701d021d08c54579f23343581e45b65ffb1150b2c99f94352fdac4b7036dbbd5", + "type": "value", + "value": "country" + }, { + "id": "d22869a1fd9fc929c2a07f476dd579af97691b2d0f4d231e8300e20c0326dd6b", + "type": "field", + "value": {"cardinality": "1", "datatype": "string"} + } +] +``` + --- ## Get a blob by id From ede33f81fe8766a409a4867f9761e78029d79180 Mon Sep 17 00:00:00 2001 From: Arnau Siches Date: Fri, 22 Jun 2018 09:34:09 +0100 Subject: [PATCH 20/35] Change examples to HTTP 1.1 Signed-off-by: Arnau Siches --- content/metadata-log/index.md | 22 +++++++++++----------- 1 file changed, 11 insertions(+), 11 deletions(-) diff --git a/content/metadata-log/index.md b/content/metadata-log/index.md index 2212ce7..a57f5f9 100644 --- a/content/metadata-log/index.md +++ b/content/metadata-log/index.md @@ -279,13 +279,13 @@ TODO: The schema will not be provided as CSV unless we find it essential and we find a reasonable way to flatten the structure. ```http -GET /schema/ HTTP/2 +GET /schema/ HTTP/1.1 Host: country.register.gov.uk Accept: application/json ``` ```http -HTTP/2 200 +HTTP/1.1 200 OK Content-Type: application/json { @@ -323,13 +323,13 @@ The following endpoints are low level * `page-size` (Optional): Collection page size. Defaults to 100. ```http -GET /meta/changesets/ HTTP/2 +GET /meta/changesets/ HTTP/1.1 Host: country.register.gov.uk Accept: application/json ``` ```http -HTTP/2 200 +HTTP/1.1 200 OK Content-Type: application/json Link: ; rel="next" @@ -359,13 +359,13 @@ Link: ; rel="next" * `id` (Required): The changeset hash. 
```http -GET /meta/changesets/adcd501c027ad83fbdf4c3423630da89b2c013b9e8641ec0c2679ed33b2cc0d6 HTTP/2 +GET /meta/changesets/adcd501c027ad83fbdf4c3423630da89b2c013b9e8641ec0c2679ed33b2cc0d6 HTTP/1.1 Host: country.register.gov.uk Accept: application/json ``` ```http -HTTP/2 200 +HTTP/1.1 200 OK Content-Type: application/json { @@ -388,13 +388,13 @@ Content-Type: application/json ```http -GET /meta/blobs/ HTTP/2 +GET /meta/blobs/ HTTP/1.1 Host: country.register.gov.uk Accept: application/json ``` ```http -HTTP/2 200 +HTTP/1.1 200 OK Content-Type: application/json Link: ; rel="next" @@ -424,7 +424,7 @@ type Blob ``` ```http -HTTP/2 200 +HTTP/1.1 200 OK Content-Type: application/json Link: ; rel="next" @@ -451,13 +451,13 @@ Link: ; rel="next" ```http -GET /meta/blobs/701d021d08c54579f23343581e45b65ffb1150b2c99f94352fdac4b7036dbbd5 HTTP/2 +GET /meta/blobs/701d021d08c54579f23343581e45b65ffb1150b2c99f94352fdac4b7036dbbd5 HTTP/1.1 Host: country.register.gov.uk Accept: application/json ``` ```http -HTTP/2 200 +HTTP/1.1 200 OK Content-Type: application/json { From ccb8cd7419995c1d0fdc6adf8c0a554ccddccfaf Mon Sep 17 00:00:00 2001 From: Arnau Siches Date: Fri, 22 Jun 2018 10:32:26 +0100 Subject: [PATCH 21/35] Add metalog type Signed-off-by: Arnau Siches --- content/metadata-log/index.md | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/content/metadata-log/index.md b/content/metadata-log/index.md index a57f5f9..2a4ce18 100644 --- a/content/metadata-log/index.md +++ b/content/metadata-log/index.md @@ -50,6 +50,16 @@ type Changeset = , parent: Maybe Hash , delta: Delta } + +type MetaLog = + Dict Hash Changeset + +-- TODO: Relevant? 
+first : MetaLog -> Changeset + +current : MetaLog -> Changeset + +previous : Changeset -> MetaLog -> Changeset ``` ### Timestamp From 62c3ca1e247e4ac36983c7d7734211813083d9f7 Mon Sep 17 00:00:00 2001 From: Arnau Siches Date: Fri, 22 Jun 2018 10:46:12 +0100 Subject: [PATCH 22/35] Add schema function Signed-off-by: Arnau Siches --- content/metadata-log/index.md | 10 ++++------ 1 file changed, 4 insertions(+), 6 deletions(-) diff --git a/content/metadata-log/index.md b/content/metadata-log/index.md index 2a4ce18..965197c 100644 --- a/content/metadata-log/index.md +++ b/content/metadata-log/index.md @@ -20,7 +20,7 @@ metadata. The reference implementation has a few bits of metadata (e.g. description, name, fields) but the specification offers no way to consume them. -This RFC aims to keep backwards compatibility by creating a new metadata log +This RFC aims to keep backwards compatibility by creating a new log to encode metadata changes with references to the data log to keep coordination with the original data log. @@ -54,12 +54,10 @@ type Changeset = type MetaLog = Dict Hash Changeset --- TODO: Relevant? -first : MetaLog -> Changeset - -current : MetaLog -> Changeset - previous : Changeset -> MetaLog -> Changeset + +-- Computes a schema from the given changeset +schema : Changeset -> MetaLog -> Schema ``` ### Timestamp From 02d8cbf9360d756a259f71371492a54ff9027973 Mon Sep 17 00:00:00 2001 From: Arnau Siches Date: Wed, 4 Jul 2018 14:30:41 +0100 Subject: [PATCH 23/35] Add use cases for schema consumption Signed-off-by: Arnau Siches --- content/metadata-log/index.md | 29 ++++++++++++++++++++++++++++- 1 file changed, 28 insertions(+), 1 deletion(-) diff --git a/content/metadata-log/index.md b/content/metadata-log/index.md index 965197c..9cd4411 100644 --- a/content/metadata-log/index.md +++ b/content/metadata-log/index.md @@ -26,7 +26,34 @@ coordination with the original data log. 
### Use cases (potential) -TODO +#### As a user I want to get a records and validate them against the schema. + +1. `GET /records/foo` +2. `GET /schema/` +3. validate + +#### As a user I want to validate a record I got _some_ time ago and validate it against the schema. + +1. `GET /records/foo` +2. (time passes) +3. `GET /schema/` +4. validate + + +#### As a user I want to get a record at an arbitrary log size and validate it against the schema. + +1. `GET /records/foo?size=10` +2. `GET /schema/?size=10` +3. validate + +--- + +TODO: Is the last use case artificial? Also, if schemas only allow adding +fields and all fields are optional, the schema at size HEAD should be +sufficient. + +--- + ## Explanation From 8d841aa1d4d8763017efbddc722dc3eddac46528 Mon Sep 17 00:00:00 2001 From: Arnau Siches Date: Wed, 4 Jul 2018 14:35:52 +0100 Subject: [PATCH 24/35] Add use case with obsolete schema Signed-off-by: Arnau Siches --- content/metadata-log/index.md | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/content/metadata-log/index.md b/content/metadata-log/index.md index 9cd4411..03f4a72 100644 --- a/content/metadata-log/index.md +++ b/content/metadata-log/index.md @@ -54,6 +54,22 @@ sufficient. --- +#### As a user I want to validate a record against the latest (correct) schema. + +1. `GET /schema/` +2. (time passes) +3. `GET /records/foo` +4. `GET /schema/` +5. validate + +Essentially this means that either we provide a way to know if the schema is +the latest or we require to always fetch a new version. + +The issue arises when a new record is validated against an old schema if and +only if the new record has fields informed that were defined in newer versions +of the schema. 
+ + ## Explanation From b24633a79c5d3de33e465c9f06c3adc74d72b1dc Mon Sep 17 00:00:00 2001 From: Arnau Siches Date: Wed, 4 Jul 2018 14:41:30 +0100 Subject: [PATCH 25/35] Add potential use case for schema arbitrary version Signed-off-by: Arnau Siches --- content/metadata-log/index.md | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/content/metadata-log/index.md b/content/metadata-log/index.md index 03f4a72..7bed2bb 100644 --- a/content/metadata-log/index.md +++ b/content/metadata-log/index.md @@ -69,6 +69,15 @@ The issue arises when a new record is validated against an old schema if and only if the new record has fields informed that were defined in newer versions of the schema. +#### As a user I want to get a schema version + +1. `GET /schema/{hash}` + +--- + +TODO: Is this needed? It's not well defined use case. + +--- ## Explanation From 96d4f9570382393094f7313e949eca81c931f7bb Mon Sep 17 00:00:00 2001 From: Arnau Siches Date: Wed, 4 Jul 2018 14:49:41 +0100 Subject: [PATCH 26/35] Add list of benefits Signed-off-by: Arnau Siches --- content/metadata-log/index.md | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/content/metadata-log/index.md b/content/metadata-log/index.md index 7bed2bb..5740b3c 100644 --- a/content/metadata-log/index.md +++ b/content/metadata-log/index.md @@ -375,7 +375,7 @@ Content-Type: application/json The following endpoints are low level -## Get the list of changesets +### Get the list of changesets * Endpoint: `GET /meta/changesets/` * Parameters: @@ -412,7 +412,7 @@ Link: ; rel="next" ] ``` -## Get a single changeset +### Get a single changeset * Endpoint: `GET /meta/changesets/{id}` * Parameters: @@ -439,7 +439,7 @@ Content-Type: application/json } ``` -## Get the list of blobs +### Get the list of blobs * Endpoint: `GET /meta/blobs/` * Parameters: @@ -503,7 +503,7 @@ Link: ; rel="next" --- -## Get a blob by id +### Get a blob by id * Endpoint: `GET /meta/blobs/{id}` * Parameters: @@ -526,4 +526,9 @@ 
Content-Type: application/json } ``` +## Benefits +* It's not a breaking change. + * Users consuming from `/records/` are not affected. + * Users consuming from `/entries/` are not affected. +* Users interested in data changes don't need to care about metadata changes. From 3c8e88ba07aaf7161fa0418ea817815f8a66a1f5 Mon Sep 17 00:00:00 2001 From: Arnau Siches Date: Thu, 5 Jul 2018 08:56:55 +0100 Subject: [PATCH 27/35] Cleanup Signed-off-by: Arnau Siches --- content/metadata-log/index.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/metadata-log/index.md b/content/metadata-log/index.md index 5740b3c..77f95b3 100644 --- a/content/metadata-log/index.md +++ b/content/metadata-log/index.md @@ -24,7 +24,7 @@ This RFC aims to keep backwards compatibility by creating a new log to encode metadata changes with references to the data log to keep coordination with the original data log. -### Use cases (potential) +### Use cases #### As a user I want to get a records and validate them against the schema. From d05f481d239e003ca3125fbd07abbe5fdeec4cd8 Mon Sep 17 00:00:00 2001 From: Arnau Siches Date: Thu, 5 Jul 2018 15:32:01 +0100 Subject: [PATCH 28/35] Add note about blob keyset pagination Signed-off-by: Arnau Siches --- content/metadata-log/index.md | 63 +++++++++++++++++++++++++++++------ 1 file changed, 53 insertions(+), 10 deletions(-) diff --git a/content/metadata-log/index.md b/content/metadata-log/index.md index 77f95b3..71af047 100644 --- a/content/metadata-log/index.md +++ b/content/metadata-log/index.md @@ -15,10 +15,9 @@ affecting the current data log. ## Motivation -Registers started as pure data, and slowly they added different bits of -metadata. The reference implementation has a few bits of metadata (e.g. -description, name, fields) but the specification offers no way to consume -them. +Registers started as pure data, and slowly added different bits of metadata. +The reference implementation has a few bits of metadata (e.g. 
description, +name, fields) but the specification offers no way to consume them. This RFC aims to keep backwards compatibility by creating a new log to encode metadata changes with references to the data log to keep @@ -79,6 +78,12 @@ TODO: Is this needed? It's not well defined use case. --- +--- + +TODO: What are the use cases for metadata outside of field information? + +--- + ## Explanation @@ -335,9 +340,15 @@ There are new endpoints that surface the computed state for the metadata log. * Parameters: * `entry-number` (Optional): A valid data log entry number (range: 1..[end of log]). +--- + TODO: The schema will not be provided as CSV unless we find it essential and we find a reasonable way to flatten the structure. +Also our datatypes are not compatible with CSVW, XSD or JSON-Schema. + +--- + ```http GET /schema/ HTTP/1.1 Host: country.register.gov.uk @@ -379,8 +390,8 @@ The following endpoints are low level * Endpoint: `GET /meta/changesets/` * Parameters: - * `page-index` (Optional): Collection page number. Defaults to 1. - * `page-size` (Optional): Collection page size. Defaults to 100. + * `cursor` (Optional): Opaque pointer to the next collection page. + * `size` (Optional): Collection size. Defaults to 100. ```http GET /meta/changesets/ HTTP/1.1 @@ -443,12 +454,44 @@ Content-Type: application/json * Endpoint: `GET /meta/blobs/` * Parameters: - * `page-index` (Optional): Collection page number. Defaults to 1. - * `page-size` (Optional): Collection page size. Defaults to 100. + * `cursor` (Optional): Opaque pointer to the next collection page. + * `size` (Optional): Collection size. Defaults to 100. + +--- + +TODO: This pagination diverges from the ones offered by the data log. This is +to explore the idea of paginating naturally unordered collections in an +ordered way. 
In SQL this could take the form of a keyset pagination such as: + +```sql +CREATE TABLE ( + n SERIAL, + value VARCHAR(255) NOT NULL CHECK (value <> ''), + id bytea PRIMARY KEY +); + +CREATE INDEX n_idx ON blob USING btree (n); + +SELECT id, value +FROM blob +WHERE n > ?cursor +ORDER BY n ASC +LIMIT 2; +``` + +Size could be fixed if we don't need to provide this flexibility. Cursor could +be more obscure so users don't attempt to change the querystring by hand. + +The value of using something like keyset pagination is to ensure pages are +always the same because the set has a forced order that doesn't depend on the +hash but an incremental number (could be a timestamp) so effectively only +half-empty pages would change when adding a new element. + +--- ```http -GET /meta/blobs/ HTTP/1.1 +GET /meta/blobs/?size=2 HTTP/1.1 Host: country.register.gov.uk Accept: application/json ``` @@ -456,7 +499,7 @@ Accept: application/json ```http HTTP/1.1 200 OK Content-Type: application/json -Link: ; rel="next" +Link: ; rel="next" [ { From 33ac2cb02a5bca8412e22f9e34e5f99037fef3c3 Mon Sep 17 00:00:00 2001 From: Arnau Siches Date: Thu, 5 Jul 2018 16:06:41 +0100 Subject: [PATCH 29/35] Add note on parameters and spec Signed-off-by: Arnau Siches --- content/metadata-log/index.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/content/metadata-log/index.md b/content/metadata-log/index.md index 71af047..3103c9d 100644 --- a/content/metadata-log/index.md +++ b/content/metadata-log/index.md @@ -459,6 +459,9 @@ Content-Type: application/json --- +TODO: Parameters are informative and implementation dependant in the current +specification. + TODO: This pagination diverges from the ones offered by the data log. This is to explore the idea of paginating naturally unordered collections in an ordered way. 
In SQL this could take the form of a keyset pagination such as: From 3aac2e8854d441e11bc925661fc45ab548ed3658 Mon Sep 17 00:00:00 2001 From: Arnau Siches Date: Mon, 9 Jul 2018 15:41:45 +0100 Subject: [PATCH 30/35] Fix sql todo Signed-off-by: Arnau Siches --- content/metadata-log/index.md | 66 +++++++++++++++++++---------------- 1 file changed, 35 insertions(+), 31 deletions(-) diff --git a/content/metadata-log/index.md b/content/metadata-log/index.md index 3103c9d..ca9d407 100644 --- a/content/metadata-log/index.md +++ b/content/metadata-log/index.md @@ -90,7 +90,7 @@ TODO: What are the use cases for metadata outside of field information? A metadata log is a list of **changesets** where each changeset has: * `timestamp`: The datetime where this changeset was created. -* `target`: A reference to the data log entry hash it applies. +* `target`: A reference to the data log entry hash it applies to. * `parent`: A reference to the previous changeset hash. * `delta`: An ordered set of pairs (`key`, `hash`) describing the **delta** of changes where: * `key`: Name of the piece of data (e.g. "name", "description", @@ -134,14 +134,6 @@ be recorded on top of the first one without `target`. Once there is a changeset with a explicit `target` no more nil `target` properties are allowed. ---- - -TODO: Alternative to nil, use the empty hash: - - "target": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855", - ---- - Rough algorithm given a new changeset: 1. If it's the first changeset: @@ -152,6 +144,14 @@ Rough algorithm given a new changeset: * Succeed if parent's `target` is nil. * Fail otherwise. +--- + +TODO: In a situation (arguably common) where metadata is defined once and not +changed at all, the `target` would be nil for a long time (forever?). Does +it matter? + +--- + ### Parent The `parent` property works in a similar way as in Git's commits. 
The @@ -207,16 +207,19 @@ type Field = , description: Maybe String } +type Snapshot + { id: String + , name: Maybe String + , description: Maybe String + , custodian: String + , hashAlgorithm: HashAlg, -- TODO: Can we introduce this bit of information at the register level? + , fields: Set Field + , primaryKey: Field + } + type State = Empty - | State { id: String - , name: Maybe String - , description: Maybe String - , custodian: String - , hashAlgorithm: HashAlg, -- TODO: Can we introduce this bit of information at the register level? - , fields: Set Field - , primaryKey: FieldId -- TODO: Any other better name to describe the Id? - } + | State Snapshot ``` TODO (relevant?): This delta applied to an empty state yields the @@ -237,8 +240,12 @@ m1 == State { id: "country" , name: Just "Country" , description: Nothing , custodian: "Foreign & Commonwealth Office" - , fields: Set [ { id: "country", datatype: One StringType } ] - , primaryKey: "country" + , fields: Set [] -- Empty set because only the Primary key is defined + , primaryKey: Field { id: "country" + , datatype: One StringType + , label: Just "Country" + , description: Just "The country's 2-letter ISO 3166-2 alpha2 code." + } } ``` @@ -261,18 +268,17 @@ m2 == State { id: "country" , name: Just "Country" , description: Nothing , custodian: "Foreign & Commonwealth Office" - fields: Set [ Field { id: "country" - , datatype: One StringType - , label: Just "Country" - , description: Just "The country's 2-letter ISO 3166-2 alpha2 code." - } - , Field { id: "name" + fields: Set [ Field { id: "name" , datatype: One StringType , label: Just "Name" , description: Just "The name of the country." } ] - , primaryKey: "country" + , primaryKey: Field { id: "country" + , datatype: One StringType + , label: Just "Country" + , description: Just "The country's 2-letter ISO 3166-2 alpha2 code." 
+ } } ``` @@ -280,8 +286,6 @@ m2 == State { id: "country" **Blobs**, similarly to Items, can be treated as a dictionary (hash map): ```elm --- TODO: Probably better to stick to the current definition and use a --- JSON-like string type Blob = String blobs: Dict Hash Blob @@ -467,16 +471,16 @@ to explore the idea of paginating naturally unordered collections in an ordered way. In SQL this could take the form of a keyset pagination such as: ```sql -CREATE TABLE ( +CREATE TABLE blobs ( n SERIAL, value VARCHAR(255) NOT NULL CHECK (value <> ''), id bytea PRIMARY KEY ); -CREATE INDEX n_idx ON blob USING btree (n); +CREATE INDEX n_idx ON blobs USING btree (n); SELECT id, value -FROM blob +FROM blobs WHERE n > ?cursor ORDER BY n ASC LIMIT 2; From b46aea06669ebb4ccdf6d0192c47c567d64e6d36 Mon Sep 17 00:00:00 2001 From: Arnau Siches Date: Mon, 9 Jul 2018 15:53:32 +0100 Subject: [PATCH 31/35] Change from label to name Signed-off-by: Arnau Siches --- content/metadata-log/index.md | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/content/metadata-log/index.md b/content/metadata-log/index.md index ca9d407..c746f5d 100644 --- a/content/metadata-log/index.md +++ b/content/metadata-log/index.md @@ -203,7 +203,7 @@ type Datatype type Field = { id: String , datatype: Datatype - , label: Maybe String + , name: Maybe String , description: Maybe String } @@ -243,7 +243,7 @@ m1 == State { id: "country" , fields: Set [] -- Empty set because only the Primary key is defined , primaryKey: Field { id: "country" , datatype: One StringType - , label: Just "Country" + , name: Just "Country" , description: Just "The country's 2-letter ISO 3166-2 alpha2 code." } } @@ -270,13 +270,13 @@ m2 == State { id: "country" , custodian: "Foreign & Commonwealth Office" fields: Set [ Field { id: "name" , datatype: One StringType - , label: Just "Name" + , name: Just "Name" , description: Just "The name of the country." 
} ] , primaryKey: Field { id: "country" , datatype: One StringType - , label: Just "Country" + , name: Just "Country" , description: Just "The country's 2-letter ISO 3166-2 alpha2 code." } } @@ -372,13 +372,13 @@ Content-Type: application/json "id": "country", "datatype": "string", "cardinality": "1", - "label": "Name", + "name": "Name", "description": "The country's 2-letter ISO 3166-2 alpha2 code.", }, { "id": "name", "datatype": "string", "cardinality": "1", - "label": "Name", + "name": "Name", "description": "The name of the country." } ], @@ -528,7 +528,7 @@ Should blobs be annotated by type? ``` type Blob - = Value String -- Leaf like label, description or custodian + = Value String -- Leaf like name, description or custodian | Field String | ... ? ``` From 14a39523c9c380752356db82a708b65b5901d620 Mon Sep 17 00:00:00 2001 From: Arnau Siches Date: Mon, 9 Jul 2018 16:02:08 +0100 Subject: [PATCH 32/35] Fix pagination inconsistencies Signed-off-by: Arnau Siches --- content/metadata-log/index.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/content/metadata-log/index.md b/content/metadata-log/index.md index c746f5d..67c448e 100644 --- a/content/metadata-log/index.md +++ b/content/metadata-log/index.md @@ -406,7 +406,7 @@ Accept: application/json ```http HTTP/1.1 200 OK Content-Type: application/json -Link: ; rel="next" +Link: ; rel="next" [ { @@ -506,7 +506,7 @@ Accept: application/json ```http HTTP/1.1 200 OK Content-Type: application/json -Link: ; rel="next" +Link: ; rel="next" [ { @@ -536,7 +536,7 @@ type Blob ```http HTTP/1.1 200 OK Content-Type: application/json -Link: ; rel="next" +Link: ; rel="next" [ { From e167215f7f6ca9e6f12088251111e2a11c1ed858 Mon Sep 17 00:00:00 2001 From: Arnau Siches Date: Thu, 12 Jul 2018 11:27:13 +0100 Subject: [PATCH 33/35] Rewording motivation Signed-off-by: Arnau Siches --- content/metadata-log/index.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git 
a/content/metadata-log/index.md b/content/metadata-log/index.md index 67c448e..b656afb 100644 --- a/content/metadata-log/index.md +++ b/content/metadata-log/index.md @@ -19,9 +19,9 @@ Registers started as pure data, and slowly added different bits of metadata. The reference implementation has a few bits of metadata (e.g. description, name, fields) but the specification offers no way to consume them. -This RFC aims to keep backwards compatibility by creating a new log -to encode metadata changes with references to the data log to keep -coordination with the original data log. +This RFC proposes a new log to encode metadata changes which has references to +the original data log. An important side effect of this approach is that it +maintains backwards compatibility. ### Use cases From 638df103ffb439d9659df38a56cb4d3160257b65 Mon Sep 17 00:00:00 2001 From: Arnau Siches Date: Mon, 16 Jul 2018 08:51:56 +0100 Subject: [PATCH 34/35] Fix style as suggested by @bahmady Find the original suggestions in PR #19 Signed-off-by: Arnau Siches --- content/metadata-log/index.md | 37 +++++++++++++++++------------------ 1 file changed, 18 insertions(+), 19 deletions(-) diff --git a/content/metadata-log/index.md b/content/metadata-log/index.md index b656afb..a8de27b 100644 --- a/content/metadata-log/index.md +++ b/content/metadata-log/index.md @@ -15,23 +15,22 @@ affecting the current data log. ## Motivation -Registers started as pure data, and slowly added different bits of metadata. -The reference implementation has a few bits of metadata (e.g. description, -name, fields) but the specification offers no way to consume them. +Registers started as pure data, but have evolved as metadata has been added. +The reference implementation has some metadata (e.g. description, name, +fields) but the specification offers no way to consume them. This RFC proposes a new log to encode metadata changes which has references to -the original data log. 
An important side effect of this approach is that it -maintains backwards compatibility. +the original data log, in order to maintain backwards compatibility. ### Use cases -#### As a user I want to get a records and validate them against the schema. +#### As a user I want to get records and validate them against the schema 1. `GET /records/foo` 2. `GET /schema/` 3. validate -#### As a user I want to validate a record I got _some_ time ago and validate it against the schema. +#### As a user I want to validate a record I got some time ago and validate it against the schema. 1. `GET /records/foo` 2. (time passes) @@ -39,7 +38,7 @@ maintains backwards compatibility. 4. validate -#### As a user I want to get a record at an arbitrary log size and validate it against the schema. +#### As a user I want to get a record at an arbitrary log size and validate it against the schema 1. `GET /records/foo?size=10` 2. `GET /schema/?size=10` @@ -53,7 +52,7 @@ sufficient. --- -#### As a user I want to validate a record against the latest (correct) schema. +#### As a user I want to validate a record against the latest (correct) schema 1. `GET /schema/` 2. (time passes) @@ -62,7 +61,7 @@ sufficient. 5. validate Essentially this means that either we provide a way to know if the schema is -the latest or we require to always fetch a new version. +the latest version, or we make it a requirement to always fetch a new version. The issue arises when a new record is validated against an old schema if and only if the new record has fields informed that were defined in newer versions @@ -119,8 +118,8 @@ schema : Changeset -> MetaLog -> Schema ### Timestamp -The `timestamp` property describes when the changeset was recorded. They don't -define the order of the metadata log. +The `timestamp` property describes when the changeset was recorded. It does +not define the order of the metadata log. ### Target @@ -131,10 +130,10 @@ replacements of data that could occur in the data log. 
The first changeset expects `target` to be nil given that the first item in the data log must conform to a defined schema. Optionally, new changesets can be recorded on top of the first one without `target`. Once there is a -changeset with a explicit `target` no more nil `target` properties are +changeset with an explicit `target`, no more nil `target` properties are allowed. -Rough algorithm given a new changeset: +A rough algorithm given a new changeset is as follows: 1. If it's the first changeset: * Succeed if `target` is nil. @@ -154,7 +153,7 @@ it matter? ### Parent -The `parent` property works in a similar way as in Git's commits. The +The `parent` property works in a similar way as in Git commits. The intention is to explore a linked list structure instead of the ordered list implemented for the data log. @@ -164,7 +163,7 @@ have a single parent hash informed. ### Delta The `delta` property has the data to apply on top of the previous metadata -state. A delta allows mutliple bits of data so it can describe an update for +state. A delta allows multiple bits of data so it can describe an update for multiple unique keys at the same time. For example: ```elm @@ -180,7 +179,7 @@ a0 = ] ``` -Note a delta is an ordered set ordered by key. +Note that a delta is an ordered set ordered by key. TODO: This attempt to describe the resulting metadata state shows that the very @@ -249,7 +248,7 @@ m1 == State { id: "country" } ``` -A second delta `a1` such as +A second delta `a1`, such as: ```elm a1 : Delta @@ -388,7 +387,7 @@ Content-Type: application/json ## Low-level (plumbing) API -The following endpoints are low level +The following endpoints are low-level. 
### Get the list of changesets From fe9c4885b301eb7b537cea7584138763d593ba8a Mon Sep 17 00:00:00 2001 From: Arnau Siches Date: Thu, 26 Jul 2018 10:24:05 +0100 Subject: [PATCH 35/35] Reword warning Signed-off-by: Arnau Siches --- content/metadata-log/index.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/content/metadata-log/index.md b/content/metadata-log/index.md index a8de27b..027ca7a 100644 --- a/content/metadata-log/index.md +++ b/content/metadata-log/index.md @@ -63,9 +63,9 @@ sufficient. Essentially this means that either we provide a way to know if the schema is the latest version, or we make it a requirement to always fetch a new version. -The issue arises when a new record is validated against an old schema if and -only if the new record has fields informed that were defined in newer versions -of the schema. +One potential issue arises when a new record is validated against an old +schema. This happens if (and only if) the new record has fields which were +defined in newer versions of the schema. #### As a user I want to get a schema version