Feature: BODS simplification #737

Open
kd-ods opened this issue Dec 6, 2024 · 4 comments

@kd-ods
Collaborator

kd-ods commented Dec 6, 2024

This ticket helps track progress towards developing a particular feature in BODS where changes or revisions to the standard may be required. It should be placed on the BODS Feature Tracker, under the relevant status column. Comments on this ticket can be used to help track high-level work towards this feature or to refine this set of information.

See Feature development in BODS in the Handbook.

Feature name: BODS simplification

Feature background

Briefly describe the purpose of this feature

BODS has been developed in parallel with the rollout of beneficial ownership transparency reforms across the world. Assumptions about what information about beneficial ownership would be disclosed, and how, were made early in the standard's development. These are represented in the fields and objects it now contains.

The purpose of this work is to ascertain whether there are, as of BODS 0.4, objects and fields in BODS which will not be useful for the majority of users.

What user needs are met by introducing or developing this feature in BODS?

Ultimately, any paring back or simplification would aim to make the data standard easier to understand, and to put into use. The effort might include:

  • Ensuring that 'metadata' elements of statements (i.e. data outside of the recordDetails object) are not unnecessarily detailed. See this request on taking an axe to metadata.
  • Ensuring that the scope of information representable in BODS directly relates to beneficial ownership concepts.
  • Restructuring or simplifying objects.

What impact would not meeting these needs have?

Any scope creep and over-complication in modelling could deter potential users of BODS.

How urgent is it to meet the above needs?

This is less urgent than it is important.

Feature work tracking

[Link to proposals, bugs and issues in the repository to help track work on this feature]

@kd-ods
Collaborator Author

kd-ods commented Dec 6, 2024

The inclusion of information about politically-exposed persons (PEPs) in BODS could be an example of scope creep. Is PEP information really a core element of beneficial ownership information? Certainly it needs to be brought together with beneficial ownership information for anti-corruption purposes. But should it have a place in the BODS schema?

The BODS development team would be interested to hear opinions on this from users, and potential users, of the standard. (And - of course - any ideas about other areas of BODS 0.4 which might benefit from simplification.)

@jpmckinney

jpmckinney commented Jan 17, 2025

Previous simplification proposals

  • Review terms used in documentation and/or schema, including terms in JSON Schema property descriptions, for consistency and clarity.

    • All of these are synonyms in colloquial English: "statement", "declaration", "claim" (used to define or as an alternative to "statement") and "assertion" (used in Source only). Use at most two synonyms regularly (likely "statement" and "declaration") to reduce cognitive burden. Use "claim" minimally (e.g. in the definition of Statement). Avoid "assertion", "asserted", etc. entirely; they are not necessary.
  • Rename terms for clarity, consistency and familiarity to users. Note: Some of these terms might presently be used in different contexts and/or inconsistently. The proposal, in such cases, is to rename where appropriate. This should be paired with an effort to use terms clearly and consistently, in general.

    • Record Details: subject -> entity. The documentation uses forms like "the subject (an entity)" and states "The subject MUST be an entity." Instead of frequently explaining that a subject is an entity (where that is the case), it would relieve cognitive burden to simply use "entity".
    • (NEW proposal) More generally, I recommend avoiding the term "subject" across the entire documentation. For example, “person or entity” is much clearer than “subject” (in the cases where a subject can be either). It is very important to limit the number of jargon terms in order to promote reader comprehension. There are not so many additional characters to compromise length or speed; "person or entity" will be read as a chunk.
    • (NEW proposal) Source: assertedBy -> statedBy. This serves the purpose above of reducing the number of unnecessary synonyms. Similarly, reword the phrase "the information asserted in this Statement" so that it no longer uses "asserted", and change "assertion" to "statement" in the descriptions of the Source's name and uri.
    • Statement: declaration and declarationSubject (changed from declarantRecord, which caused confusion as to its semantics) -> declaration/id and either declaration/about or declaration/describes: Both the name and definition of declarationSubject are confusing ("A recordId value for the subject of a beneficial ownership network (always an entity or person)."). This example makes the concept somewhat clearer. Its meaning is "The person or entity whose beneficial ownership network is described by the declaration, identified by its recordId."
  • Revisit whether "records" are a concept desired by users. I include the bullets below about renaming, only for completeness. I think records are confusing enough that it doesn't matter much what term is used, and I'm not sure that the old proposals below do much to improve the situation.

    • Statement: recordId, recordType, recordStatus -> subjectId, ... (or something else, unless there is good evidence that publishers and users understand "record"). This proposal followed from a discussion about how users, especially, are not interested in data that models some publisher's system. They are interested in data that models entities, people and relationships – and the statements and declarations via which this information is communicated in the real world (e.g. via paper forms, electronic submissions, etc.). If I ask a random user, "What is the record of this statement?", they will not understand the question. If I ask, "What is the subject of this statement?", they will understand that I am asking about "the thing or person that is being discussed, described or dealt with" (dictionary) in the statement – which is precisely the desired interpretation ("A unique identifier for the record (within the publisher's system) the thing to which this Statement relates."). Of course, this change can only be made if "subject" is freed up per above. If that's not possible, then we can explore other terms.
    • Statement: recordDetails -> predicates. "predicates" matches "subject".
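To make two of the renames above concrete, here is a minimal sketch, assuming a statement is handled as a plain Python dict. The helper names (`rename_keys`, `apply_proposed_renames`) and all sample values are made up for illustration, and the new field names are only proposals from this comment, not part of any published BODS version.

```python
import copy

# Hypothetical rename maps taken from the proposals in this comment.
SOURCE_RENAMES = {"assertedBy": "statedBy"}
RECORD_DETAILS_RENAMES = {"subject": "entity"}


def rename_keys(obj, renames):
    """Return a new dict with top-level keys renamed according to `renames`."""
    return {renames.get(key, key): value for key, value in obj.items()}


def apply_proposed_renames(statement):
    """Return a copy of a statement dict with the proposed renames applied."""
    result = copy.deepcopy(statement)  # leave the caller's data untouched
    if isinstance(result.get("source"), dict):
        result["source"] = rename_keys(result["source"], SOURCE_RENAMES)
    if isinstance(result.get("recordDetails"), dict):
        result["recordDetails"] = rename_keys(
            result["recordDetails"], RECORD_DETAILS_RENAMES
        )
    return result


# Made-up relationship statement, trimmed to the fields that matter here.
example = {
    "statementId": "stmt-1",
    "recordId": "rec-1",
    "recordType": "relationship",
    "recordDetails": {"subject": "rec-entity", "interestedParty": "rec-person"},
    "source": {"assertedBy": [{"name": "Example Agency"}]},
}

renamed = apply_proposed_renames(example)
```

A mapping like this could also double as a checklist when searching the documentation for the old terms.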

Ask users whether they prefer fields to be collected into objects, or all put at the top level

See discussion here and here.

There seems to be resistance to using objects. At this point, I think it is best to ask a representative sample of users.

Above, I suggest a declaration object with an id field and either an about or describes field. BODS is not shy about nesting (see Address, Country, Jurisdiction, Identifier, UnspecifiedRecord, etc. all of which can be flattened instead of using objects). If, for a good reason, declaration cannot be an object, then simply keep declaration as-is, but rename declarationSubject to either declarationAbout or declarationDescribes.
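The two options in the previous paragraph can be compared side by side with a small sketch. It assumes statements are plain Python dicts; the function names and sample values are hypothetical, and `about` could equally be `describes`.

```python
def nest_declaration(statement, about_key="about"):
    """Option 1: fold the two flat fields into one declaration object."""
    result = {
        key: value
        for key, value in statement.items()
        if key not in ("declaration", "declarationSubject")
    }
    result["declaration"] = {
        "id": statement.get("declaration"),
        about_key: statement.get("declarationSubject"),
    }
    return result


def flatten_declaration(statement, about_key="about"):
    """Option 2: keep declaration as a plain value and use a renamed flat
    field (declarationAbout or declarationDescribes)."""
    result = {key: value for key, value in statement.items() if key != "declaration"}
    nested = statement.get("declaration", {})
    result["declaration"] = nested.get("id")
    result["declaration" + about_key.capitalize()] = nested.get(about_key)
    return result


# Made-up identifiers, for illustration only.
current = {
    "statementId": "stmt-1",
    "declaration": "decl-2024-001",
    "declarationSubject": "rec-entity",
}

nested = nest_declaration(current)
flat = flatten_declaration(nested)
```

Either shape carries the same two values, so the choice is about readability for users rather than expressiveness.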

Draw a clearer line between declaration and record

BODS describes both declarations and records as being "within the publisher’s system". I understand that "record" is meant to reference a (not necessarily unique) representation of an entity (organization), natural person or relationship within a publisher's electronic system. I don't think it's necessary for a declaration to be defined as being "within the publisher's system"; this just muddies the distinction between "record" and "declaration". Its definition should be closer to what's on the Concepts page.

People or entities are obliged in some jurisdictions to disclose their beneficial ownership. They declare this information to a designated agency. Each declaration is a set of claims about the entities, people and relationships within the subject’s beneficial ownership network. Information about those entities, people and relationships is captured by the agency in records which are updated as new claims are made.

Some improvements that can be made to the quoted text based on the above:

  • Avoid introducing the term "claim" (which is quickly abandoned in the text). Instead, use the term "statement". As much as possible, "statement" should be used, such that "claim" only appears in the definition of a Statement ("A claim about a person, entity or relationship, made at a particular point in time."). For example, within the definition of declaration, the phrase "Where a Statement is a claim from a particular declaration" can be simplified so that it no longer uses "claim".
  • "subject" is unclear. In Record details (relationship), "The subject MUST be an entity." In the quoted text, I think the intended meaning is "the person or entity's beneficial ownership network". If that's the case, simply write "the person or entity's beneficial ownership network", instead of adding a word that means different things in different parts of the standard's documentation. On the other hand, if the meaning is just "entity", simply write "entity".

Avoid "element"

"Element" is used in place of "entity, person or relationship". In general, I think it is clearer to consistently use "entity, person or relationship", to reduce the number of words with special meaning specific to BODS. In many cases, "element" is followed by "(person, entity and relationship)" anyway, and, in some cases, the part of the phrase using "element" could be deleted entirely.

I believe in some cases, "element" is used where a specific element is intended. For example, my understanding is that "intermediary elements" can only ever mean "intermediary relationships". There is just no way to "intermediate" between people, between entities, or between people and entities without relationships... Abandoning the term "element" will force greater clarity, like in this example.

If "element" is retained, then please delete all non-jargon occurrences. For example:

Other observations

  • Avoid the suffix "details", like in "publicationDetails". You are describing the publication. Simply name the field publication, like with the super-majority of fields: source, jurisdiction, identifiers, etc.
  • Are annotations necessary? Who needs these? Has anyone used them?
  • The supporting spreadsheet and slides from Implementation proposal: separating core data and metadata within statements #477 are no longer publicly accessible, making it harder to trace the discussion.
  • As an exercise, ask anyone who is not an expert in BODS to interpret this sentence from the documentation, "Each BODS statement¹ issued in relation to any element² of a subject’s³ declared⁴ beneficial ownership network⁵, at any point in time, MUST contain the subject’s³ record⁶Id as its declaration⁴Subject³."

@kd-ods
Collaborator Author

kd-ods commented Jan 20, 2025

Thanks for this audit and recommendations, @jpmckinney.

I think avoiding the terms 'element' and 'claim' would be simple enough and improve clarity. Similarly, renaming assertedBy to statedBy is a simple change which would tighten things up.

Other suggestions - such as those related to the use of 'record', 'subject' and 'declaration' - have wider-reaching implications than the above terminology changes. It will certainly be worth spending time exploring those implications with you and other interested parties as we come to the next round of development. (And it occurs to me that there's a good way to test any proposal coming out of such an exploration. It should result in a more palatable version of the unbearably clunky sentence you flag up.)

For the moment, the above is a valuable to-do list of issues to consider under the heading of simplification.

The supporting spreadsheet and slides from #477 are no longer publicly accessible, making it harder to trace the discussion.

I'm not sure how those view permissions got changed. I've reverted them now: you should be able to view the files.

@jpmckinney

'declaration' as a term is fine; the proposal is more about improving its definition and renaming 'declarationSubject'.

On 'subject', the proposal is essentially to replace it with "person or entity" or "entity", depending on the sentence. There are fewer occurrences than you might expect. They are mostly in the schema.

I think we're stuck with 'record' as the least-bad option.

Projects
Status: Backlog
Development

No branches or pull requests

2 participants