-
Notifications
You must be signed in to change notification settings - Fork 532
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(tree) Fix document corruption due to loss of change info for transient nodes in sequence field #16190
fix(tree) Fix document corruption due to loss of change info for transient nodes in sequence field #16190
Conversation
⯅ @fluid-example/bundle-size-tests: +2.6 KB
Baseline commit: b04b832 |
/** | ||
* @returns `true` iff the given delta has a visible impact on the document tree. | ||
*/ | ||
export function isDeltaVisible(delta: Delta.MarkList): boolean { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it difficult to avoid creating "invisible" deltas?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, yes, because it might require following a sequence of moves to see if they span the boundary between transient and non-transient stuff.
For the tests that use this function, we could avoid it in more cases if NodeChangeComposer
had the ability to return undefined in order to signify that the outcome of the composition turned out to be a no-op.
case Delta.MarkType.MoveOut: | ||
case Delta.MarkType.MoveIn: | ||
case Delta.MarkType.Delete: | ||
break; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why don't these marks count towards visibility?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My bad, they indeed should.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was my mistake. They do now.
if (newMarkIsTransient) { | ||
return withNodeChange(baseMark, nodeChange); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this could be moved before the outer if-block.
if (newMarkIsTransient) { | ||
return withNodeChange(baseMark, nodeChange); | ||
} | ||
// TODO: Make `withNodeChange` preserve type information so we don't need to cast here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we just need to merge the Modify
and Noop
mark types.
const nonTransient = withNodeChange(baseMark, nodeChange) as | ||
| Insert<TNodeChange> | ||
| Revive<TNodeChange>; | ||
delete nonTransient.detachedBy; | ||
return nonTransient; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to make assumptions about what kinds of marks newMark
could be, and I think we should make those explicit. Our format allows composing a transient insert with a modify to the inserted node, which would behave incorrectly here.
// TODO: Create modify mark for transient node. | ||
return { count: 0 }; | ||
|
||
assert(isDeleteMark(newMark), "Unexpected mark type"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could also assert isGenerativeMark(baseMark)
instead of doing the cast below.
const nonTransient = withNodeChange( | ||
baseMark, | ||
nodeChange, | ||
) as GenerativeMark<TNodeChange>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is safe because markIsTransient
implies the mark is a GenerativeMark
, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes.
// - MoveIn marks are invalid in an existing cell. | ||
assert(newMark.type !== "MoveIn", "Invalid MoveIn after transient"); | ||
// - ReturnTo marks are invalid in for a cell whose node has not been moved out. | ||
assert(newMark.type !== "ReturnTo", "Invalid ReturnTo after transient"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I completely understand these assumptions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I now realize it's more complicated than that.
The existence of the rebase sandwich means we can have a MoveIn mark that targets an already existing cell. This would have to be in a scenario where local changes with a move get rebased over something else. The something else can't be targetting the cell that the MoveIn is targetting though, because no concurrent change can know about that cell.
For the ReturnTo, I think it's possible for it to occur after a transient, but only a muted ReturnTo.
Why possible: if the transient is a revive, then it's possible that the newMark comes from a client that knew about the node, and tried to move it out and return it.
Why muted: only a single specific node will ever occupy a given cell, and the presence of a transient mark tells us that node just got deleted. Return marks that attempt to move a deleted node end up being muted.
Let me know if you think that makes sense. If so, I'll update the code (to check that the ReturnTo mark is muted), and if not, we should probably just chat.
@@ -223,9 +223,22 @@ export interface HasRevisionTag { | |||
} | |||
export const HasRevisionTag = Type.Object({ revision: Type.Optional(RevisionTagSchema) }); | |||
|
|||
export interface Transient { | |||
/** | |||
* When populated, the content is inserted/revived then deleted. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems slightly confusing to say "when populated," as the field is not optional on this interface.
return [inverse]; | ||
if (mark.detachedBy !== undefined) { | ||
assert(revision !== undefined, "Unable to revert to undefined revision"); | ||
return [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this is saying that the inverse of a transient insert is an active revive. Are you sure this is right?
const detachEvent = { revision: detachEventRev, index: computedDetachIndex }; | ||
const mark: Reattach<never> = { | ||
type: "Revive", | ||
content: reviver(detachedBy, computedDetachIndex, count), | ||
content: reviver(detachEventRev, computedDetachIndex, count), | ||
count, | ||
detachEvent, | ||
}; | ||
if (!isIntention) { | ||
mark.inverseOf = detachedBy; | ||
mark.inverseOf = detachEventRev; | ||
} | ||
if (detachedBy !== undefined) { | ||
mark.detachedBy = detachedBy; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reading this made me think we should consider renaming detachEvent
and detachedBy
to be more distinct somehow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. I had trouble comming up with better names and then I forgot about it. delete
may be clearer than detachedBy
. Do you have suggestions?
Ultimately we're not going to care for very long if we switch to the anchor & effects format.
this.advanceForRevision(markLength - inLength, mark.revision); | ||
this.advanceForRevision(outLength - markLength, mark.detachedBy.revision); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are inLength
and outLength
always 0 for transient marks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes
@@ -50,9 +51,34 @@ describe("SequenceField - Compose", () => { | |||
title.startsWith("((delete, insert), revive)") || | |||
title.startsWith("((move, insert), revive)") || | |||
!SF.areComposable([taggedA, taggedB, taggedC]) | |||
) { | |||
// These changes do not form a valid sequence of composable changes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment seems inconsistent with the more detailed one below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I need help seeing how. I admittedly don't recall what's wrong with ((delete, insert), revive)
and ((move, insert), revive)
, but the fact that they were grouped with !SF.areComposable([...])
makes me think they similarly didn't make sense together.
Motivation
Fixes the fact that we were dropping all nested changes to a subtree when composing the (re)creation of that subtree, changes to it, and the deletion of the subtree. See added/updated tests for details. Note that this is for sequence field only.
Dropping the nested changes used to produce unexpected results as well as potential divergence in state between clients, which eventually leads to document corruption.
Changes
This is accomplished by adding information on
Insert
andRevive
marks to convey whether the content has been deleted (in addition to inserted/revived), and aChangeAtomId
describing the deletion.The
ChangeAtomId
info is used to properly order and match up (in terms of cell index) different marks. For example, by detecting that a later revive should be composed with the transient insert.This PR also disables some composition tests. These tests check that left-associative and right-associative compositions produce the same results. The changes in this PR do not introduce the discrepancy in how those compositions are processed, but they make the existing discrepancy obvious by making the output of the composition different. This reduction in the correctness of
compose
is considered an acceptable price to pay for the benefit of handling transient node changes better, especially when considering that the prior compose logic, while consistent w.r.t. to associativity, was producing results that are consistently wrong (this is because repair data needs to be created for pairs of inserts and deletes).