This repository was archived by the owner on Mar 15, 2021. It is now read-only.
139 changes: 139 additions & 0 deletions content/entry-invalidation/index.md
@@ -0,0 +1,139 @@
---
rfc:
start_date: 2018-02-12
pr:
status: draft
---

# Entry Invalidation

## Summary

This RFC proposes a way to invalidate an entry so previous changes considered
mistakes can be removed from the computed list of records.


## Motivation

Currently, the append-only nature of Registers doesn't cope well with
mistakes. A situation that has arisen more than once is getting two records
about the same thing with different identifiers because of a human mistake.
Another situation is having a wrong change in the history of a record.

We need a way to mark entries as invalid so we can keep compatibility with
other copies of the Register (consumers, replicas, etc) and allow computing
the correct(ed) list of records.


## Explanation

[TODO: Currently two alternatives are explored]


### Alternative A: Special entry

The proposal introduces a new type of entry that allows listing a set of entry
identifiers. This mechanism would allow invalidating discrete entries and by
extension invalidating the full history for a key when all entries for that
key are flagged as invalid. It also introduces a new RSF command to avoid
overloading the `append-entry` command.

[TODO: entry number or entry hash? Entry number is aligned with existing ways
to refer to entries. Entry hash invalidates the entry by its content]

#### RSF

```
invalidate-entry user 2018-02-12T10:11:12Z 3;234;355
```

Review comment:

`invalidate-entry` doesn't match the naming in the JSON representation below:

  • invalidate vs revoke
  • entry vs entries

#### JSON

```json
{
"entry-number": "356",
"revoked-entries": ["3", "234", "355"],
"entry-timestamp": "2018-02-12T10:11:12Z"
}
```

#### Properties

* New entry type without key. Breaking change diverging from previous designs.
* Records are unaffected.
* Revoked entries require a mechanism/documentation to explain how to apply
them when computing the list of records.
* Entry proofs are kept intact and invalidations are part of the tree.
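
The RFC leaves open how revoked entries are applied when computing the list of records. One possible reading, as a minimal sketch: collect every entry number listed in an invalidation entry, then replay the log and keep the latest entry per key that has not been revoked. The function name `compute_records`, the sample keys, and the placeholder hashes are assumptions for illustration only, not part of the proposal.

```python
from typing import Iterable


def compute_records(entries: Iterable[dict]) -> dict:
    """Return the latest non-revoked entry for each key.

    Assumes Alternative A: an invalidation entry has no "key" and carries
    a "revoked-entries" list of entry numbers (as strings).
    """
    entries = list(entries)

    # First pass: collect every entry number flagged as revoked.
    revoked = set()
    for entry in entries:
        revoked.update(entry.get("revoked-entries", []))

    # Second pass: replay the log in order, skipping invalidation entries
    # and any entry that has been revoked.
    records = {}
    for entry in sorted(entries, key=lambda e: int(e["entry-number"])):
        if "revoked-entries" in entry or entry["entry-number"] in revoked:
            continue
        records[entry["key"]] = entry
    return records


# Hypothetical log: entry 2 created a duplicate key, entry 3 was a wrong update.
log = [
    {"entry-number": "1", "key": "GB", "item-hash": ["sha-256:aaa"]},
    {"entry-number": "2", "key": "UK", "item-hash": ["sha-256:bbb"]},
    {"entry-number": "3", "key": "GB", "item-hash": ["sha-256:ccc"]},
    {"entry-number": "4", "revoked-entries": ["2", "3"]},
]
records = compute_records(log)
```

In this sketch a key disappears from the records once all of its entries are revoked, which matches the "full history for a key" case described above. The entries themselves stay in the log.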

Review comment:

I think we should be careful that this feature doesn't get used excessively by custodians, as it could make data generally unavailable even though it's still part of the log. For example, if a CURIE points to a revoked record, then it's basically a dead link, and we should avoid this happening.

I think we should define the cases where we expect this to be used, and require the custodian to express why a record has been revoked as part of the entry. This would help users understand what's going on and prevent people from jumping to conclusions about government censoring data or whatever.

Valid reasons for revoking stuff:

  • A new key was added to the register by mistake
  • The custodian made an error and the item for an entry has incorrect values
  • The custodian entered information in multiple steps, so that some items are missing values that alter their meaning (for example, a full name should have been changed but only the first name field was changed)
  • The information contained in an entry was thought to be correct but later turned out to be inaccurate

Reasons why the custodian might want to revoke stuff, but they shouldn't:

  • The record is no longer applicable (use start-date and end-date instead)
  • Personal information needs to be removed for legal reasons (redact information instead)

Reply from the RFC author:

Thanks for the thorough review. I have to re-read what I wrote; it's been a while. In any case, there is another case (which is why this is important sooner rather than later):

An entry introduces a record under a new key but it should've been an update to an existing key.

That means the entry needs to be marked as "ignore this one, mistake". A start/end date is the wrong level of abstraction to flag this. It's not about the data, it's about the metadata.

In any case, I'll revisit this soon and reply to your comments.


### Alternative B: Reserved entry key + new item shape

The proposal introduces a reserved key and a new type of item where entry
identifiers are listed. This mechanism allows invalidating discrete entries
and by extension invalidating the full history for a key.


[TODO: Review if "chore:revoked-entries" is the right reserved key]


#### RSF

```
add-item {"id":"chore:revoked-entries","entry-numbers":["3","234","355"]}
append-entry user chore:revoked-entries 2018-02-12T10:11:12Z sha-256:0000000000000000000000000000000000000000000000000000000000000000
```

#### JSON

**Entry 356:**

```json
{
"index-entry-number": "356",
"entry-number": "356",
"key": "chore:revoked-entries",
"item-hash": [ "sha-256:0000000000000000000000000000000000000000000000000000000000000000"],
"entry-timestamp": "2018-02-12T10:11:12Z"
}
```

**Item sha-256:0000000000000000000000000000000000000000000000000000000000000000:**

```json
{
"id": "chore:revoked-entries",
"entry-numbers": ["3", "234", "355"]
}
```
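
Under Alternative B, a consumer has to recognise the reserved key, resolve its items, and leave it out of the records. A minimal sketch of that flow follows, assuming entries and items are already loaded as dictionaries shaped like the JSON above. The function name `compute_records_b`, the sample key `GB`, and the shortened hashes are hypothetical.

```python
RESERVED_KEY = "chore:revoked-entries"  # still marked as TODO in this RFC


def compute_records_b(entries, items):
    """Compute records for Alternative B.

    entries: list of entry dicts; items: dict mapping item hash -> item dict.
    """
    # Gather revoked entry numbers from every item under the reserved key.
    revoked = set()
    for entry in entries:
        if entry["key"] == RESERVED_KEY:
            for item_hash in entry["item-hash"]:
                revoked.update(items[item_hash]["entry-numbers"])

    # Replay the log, hiding the reserved key and any revoked entry.
    records = {}
    for entry in sorted(entries, key=lambda e: int(e["entry-number"])):
        if entry["key"] == RESERVED_KEY or entry["entry-number"] in revoked:
            continue
        records[entry["key"]] = entry
    return records


# Hypothetical data: entry 3 is revoked, entry 4 is the surviving update.
entries = [
    {"entry-number": "3", "key": "GB", "item-hash": ["sha-256:aaa"]},
    {"entry-number": "4", "key": "GB", "item-hash": ["sha-256:bbb"]},
    {"entry-number": "356", "key": RESERVED_KEY, "item-hash": ["sha-256:000"]},
]
items = {
    "sha-256:000": {"id": RESERVED_KEY, "entry-numbers": ["3", "234", "355"]},
}
records_b = compute_records_b(entries, items)
```

Unlike Alternative A, the revocation data lives in an item, so the consumer needs the items as well as the entries before it can compute records.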

#### CSV

**Entry 356:**

```csv
index-entry-number,entry-number,entry-timestamp,key,item-hash
356,356,2018-02-12T10:11:12Z,chore:revoked-entries,"sha-256:0000000000000000000000000000000000000000000000000000000000000000"
```

**Item sha-256:0000000000000000000000000000000000000000000000000000000000000000:**

```csv
id,entry-numbers
chore:revoked-entries,3;234;355;
```
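
In the CSV form the entry numbers share a single field, separated by semicolons. As a rough sketch, a consumer could split that field back into a list as shown below. The helper name `parse_entry_numbers` is an assumption, and the sketch tolerates the trailing semicolon shown in the example rather than treating it as significant.

```python
import csv
import io

# The item CSV from the example above, inlined for illustration.
ITEM_CSV = "id,entry-numbers\nchore:revoked-entries,3;234;355;\n"


def parse_entry_numbers(field: str) -> list:
    """Split a semicolon-separated field, dropping empty trailing parts."""
    return [part for part in field.split(";") if part]


rows = list(csv.DictReader(io.StringIO(ITEM_CSV)))
revoked = parse_entry_numbers(rows[0]["entry-numbers"])
```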

#### Properties

* Entries are consumed regularly (no breaking changes).
* This approach uses a reserved key in the user space. Requires avoiding
clashes with user data.
* Knowledge of the reserved key is required to filter it out when collecting
list data.
* Records don't list reserved keys. You don't get the same information if you
get it from `/records` or from `/entries`. Emphasises the dichotomy between
porcelain and plumbing.
* Revoked entries require a mechanism/documentation to explain how to apply
them when creating the list of records.
* Entry proofs are kept intact and invalidations are part of the tree.

Review comment (on the point about filtering out the reserved key):

Given users are already confused by the API including archived data, I don't think it's reasonable to expect them to filter out special keys as well.

For me this is the difference between being able to quickly put together a script that uses the API and just giving up and finding a client library to do it for me.