-
Notifications
You must be signed in to change notification settings - Fork 467
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Modeling RichText with Automerge #193
Comments
Are you writing a new rich-text editor or trying to integrate with an existing one? I think the best approach here is going to be determined by that question. I'd suggest you consider a couple of other features like cursors (which need to be tracked in the document but won't be permanently persisted), and comments (which might be special markers?) The automerge-slate repo has an example of the pain in translating another text editor's OT-style into Automerge, and it models the document as a tree. Having done a lot of markup manipulation in past lives I'm somewhat nervous about inventing new representations without doing a bunch of extra research but maybe I can spare some time to dig in on what you're working on. |
Right now I'm integrating with quill but I think same general approach would work across the board & likely would apply even if I rolled out my own editor (which I don't want to do).
I have non rich text with cursors already which is actually fairly easy, they are stored separately because they don't get concurrent updates only actor owning that cursor does.
I've seen it, but I think it suffers from the outlined problem.
I'm nervous myself, also why I started this issue. It does seem that y-js author came to the same conclusion as myself in regards to marker tokens
That alone obviously does not mean much, but I also have not found anything else other than RGASplit (see #195) which may offer a way to deal with outlined problem. Mostly I opened to invite others (more qualified than myself) to offer some thoughts, feedback, ideas. |
Actually you're right, I was thinking of cursor as single position (which is easy to update base where inserted occurred before / after) but if you extend that to selections that becomes a lot more tricky. |
Definitely worth considering how moving text will work too. |
You mean cut & paste ? |
Side remark: this echos quite a bit constraints I had in https://github.com/gozala/dominion where node moves were preferred over remove / insert. I end up wish something like register where I could stash nodes and then insert them from. Obviously doing this in collaborative setting seems far more challenging. |
I was hoping to get a basic demo working of what I was trying to build before sharing, but I'm also happy to do it much sooner as potentially it may save me from working towards the dead end |
I have spent a while thinking about this too, and I also think that a single document sequence with marker characters is the way to go. If you represent a document as a tree, there are a lot of operations that require deleting and re-inserting nodes (e.g. hitting enter in the middle of a paragraph, causing it to split into two paragraphs), which don't merge well in a concurrent setting. If a document is just a flat sequence, hitting enter in the middle of a paragraph just means inserting a My intention was that Automerge.Text should be usable for this purpose. That is, I want Automerge.Text to be able to contain marker objects as well as characters from the text, along the lines of: new Automerge.Text(['h', 'e', 'l', 'l', 'o', ' ', {bold:true}, 'w', 'o', 'r', 'l', 'd', {bold:false}]) At the moment I think that's not working, but that's a bug that should be fixed. And I think it's a good idea to create an Regarding cursor positions, the safest would be to use the internal element ID to refer to a position within the text, since it remains stable, even as people insert and delete text before and after the position. The Text class has a |
What happens if the element cursor was at got removed ? Would index somehow be restored via tombstone ? Would you do the same for selections, meaning element IDs for begin / end indexes ? This makes me also wonder what if actor receives cursor / selection for anchored to elements that are not (yet) present in the document. |
For what it's worth I chose to avoid open / close style markers that introduce complication due to nesting {bold:true}hello {italic:true}wold{italic:false}{bold:false} with a markers that imply flat structure: {bold:true}hello {bold:true, italic:true}wold{} I'd be interested to hear arguments in favor of the nested approach assuming there are some. |
Yes, currently tombstones are retained indefinitely anyway, so the element ID resolution will work even after the element has been deleted. And yes, representing ranges as a start-ID and end-ID should work fine. |
I agree with that: in a concurrent editing setting it will probably not be possible to ensure proper nesting, so whatever thing maps the markers into a DOM-like tree structure will have to handle arbitrarily ordered markers. It seems better to think of the markers as state transitions ("stuff from here onwards is bold! stuff from here onwards is non-bold!") rather than opening and closing tags. |
Maybe I am overlooking something, but "If a document is just a flat sequence, hitting enter in the middle of a paragraph just means inserting a '\n' element into that sequence." means we can not express tree structures. I can not imagine WYSIWYG without at least an anchor which has to be split as well. |
Depends on the editor, for example Quill treats every line of text as paragraph so there model would really be just inserting However it is true that in some cases you may need something more more complex that is insert |
I've published my current draft of the discussed At the moment I'm using array / list instead of Concurrent selections are broken right now, but shouldn't be too hard to fix once #198 is fixed. Interesting / relevant bits are |
Here’s one way to do a link before link { link: { to: "https://github.com", target: "_blank", etc } }link here{ link: "end" } after link |
That is more or less what my current implementation encodes to, following to be precise: before link {attributes:{link:"https://github.com"}, length:0}link here{attributes:{}} after link P.S.: |
Okay, I've been working on this problem a bit and I think the current design is not going to work, but that we might have a way forward. The fundamental problem is that an inline-control-characters design needs to apply formatting to spans either in some order (left-to-right?) or tries to consolidate all local formatting operations based on local knowledge... but can't predict what other users are doing. This means that we either lose causal ordering in the first case, or fidelity in the second. Fortunately, we have a system already present in automerge that can manage keeping things ordered during merges: automerge.array! My new design theory is that we should keep an ordered array of formatting spans. Each span will refer to a start and end character and will be applied in-order on the user's machine when the document is materialized. This is potentially slow, but should at least provide correct results and we can worry about optimization next. |
Important: an Ordered Array of Formatting Spans will henceforth be known as the OAFS model. |
Is that separate array from the text characters? And I presume it refers to start & end chars by an id isn’t it ? What happens when span range needs to change either grow or shrink ? I don’t know what you have in mind but I’m suspecting that If removing a span is ever necessary that is going to lead to some errors. |
The structure will look like this, if you imagine a bold operation on "cada", and unbold of "raca" below:
|
@pvh I think primary problem with just computed formatting dictionaries (in front position only & no span length) is that could lead to multiple such dictionaries in the same position or is there something else ? If that is only problem is there no way to handle that differently like collapse such dictionaries ? At the camp I remember we were talking about an idea of having a special operation like insertion of empty dict that would only insert one if it’s not already there. Doing it in user space I’d imagine that by having separate dict where id of element prior to control char is used as a key and dict and as value so that all formatting attrs are changing same dict and (presumably) avoid multiple formatters next to each other. Alternative user space solution I’ve considered was to use leftmost formatting dicts as the canonical and let actor that created second formatting dict do the merge + removal which I believe would converge |
Yes, you need to collapse them. You can see a non-causal implementation here: https://github.com/inkandswitch/pushpin/blob/quill-rich-text/src/renderer/TextDelta.ts The issue is that you can't (in an open system) know who or what is doing what formatting out there. I think this approach is likely a significant improvement in fidelity of intent preservation. |
@pvh pointed major flaw with my approach on the call other day, that worth posting for anyone reading this, including future myself who already forgot this. So issue with collapsing all formatting into dictionaries is following: Alicev-a1
Alice makes change to v-a2
BobBob starts from v-a1
And makes change v-b2
MergeOnce changes a2 & b2 reach both peers they converge to
Instead of intended
That is because when each participant injected new formatting rules neither was aware of formatting that had to be copied over. So all the formatting needs to be preserved in data structure and collapsed on the fly, although I suspect we'd need to have collapsed cache for responsive local edits and more expensive re-computations could be performed when receiving changes from others, which might be deferred to be performed when local edits idle. |
I would also like to encourage someone better qualified than myself to take a look at xi-editor text representation which seems to employ interesting approach that is promising fast operation and efficient memory representation while keeping append-only history. |
@pvh I was reading on https://braid.news/ other day which took me to sync9 performance overview that GC semantics via acknowledgement messages. I have not looked into details, but I'm guessing it's similar to idea I have being entertaining in my head for quite some time - let an actor propose compaction block with pointers to the heads of all participants. If all the other actors agree / acknowledge history prior to compaction could be deleted. However I believe there is contingency with approach you were proposing - storing formatting information as a separate structure that refers to ranges based on character id's with in the text. All this got me wondering if instead of anchoring fragments to text identifiers they were anchored to the offsets instead which could be adjusting on inbound operations. This starts to sound similar to what Xi Text engine does from my recollection, but I'd need to read that over to confirm. Just wanted to throw this over here. |
Acknowledgement messages are interesting, but acknowledgement only works if you know who is out there. In other words, you have to know the set of other users in the universe to safely compact which is tricky. If you're willing to simply move forward without knowing who else is out there then you don't need acknowledgements. You just post a compaction message and anyone else who wants to keep collaborating with you will have to "rebase" work they want to keep on top of that or be dropped. I'm not too sure about the cost of tombstones in the rich text formatting. Presumably if you were compacting you could rectify those markers as well, but once you start doing offset math you're kind of back in OT land which is... fine? but reintroduces a lot of the complexity we've been trying to avoid I think. Maybe @ept will have some input here. |
@Gozala I am curious if you think cases like automerge/automerge#193 (comment) can be avoided by using markers to represent picking up and dropping attributes, rather than representing the state of all attributes: Alicev-a1
Alice makes change to v-a2
BobBob starts from v-a1
And makes change v-b2
MergeOnce changes a2 & b2 reach both peers they converge to
Potential improvements would be to allow picking up the adoption or removal of attributes, and then to drop a reference to the pickup, e.g. For example, I think @pvh's "abracadabra" edits could be represented by (assuming the formatting operations can have unique, orderable IDs (using integers here just to express the idea), so a site with both format operations knows not to apply
These obviously don't directly map to anything like a valid DOM tree, but presumably the view layer could handle the translation to the final state: <p>abraca<b>da</b>bra</p> |
In case this helps anyone: I had a ton of trouble using markers, so I just decided to stick the attribute object on each letter (definitely space inefficient, but it works!*). The sample code provided in Edit*: I had a small one line bug in my original implementation too that would sometimes mess up copy pasting large sections. Should be fixed in the code linked, but please be warned it's not rigorously tested code! |
Awesome, @j-berman! That code was never merged because I never solved those problems and then got pulled onto other tasks. This is a great contribution, even if it is inefficient. We can always make correct things faster but getting the wrong answer quickly is not much use at all. |
Glad I could give back something of value. Automerge is so awesome, I'm very grateful! |
I have a vague feeling that with ProseMirror it might be possible to overcome the limitations mentioned in the description by hooking into its low-level document transformation APIs. Not involved in ProseMirror's development myself, but have been using it and had to poke around its innards for integrations I'm working on. It seems to be among the most flexible and mature editor frameworks right now and teaching it Automerge could be a major boost to both ecosystems. |
Hi, |
Hi @inventivejon, yes, there is new development going on: next month, @geoffreylitt and @sliminality will be kicking off a research project to figure out the best way of handling rich text in CRDTs such as Automerge, with the support of the @inkandswitch research lab. I will let them reach out if they want contributions to this project. |
Hi @ept, |
Hi @ept! Do you have news about this research project? Did it end up starting? |
@burdiyan Yes, the project is currently running. We will update this issue when there are results to share. |
Update: we have now published our research on rich text CRDTs: https://www.inkandswitch.com/peritext/ The Peritext implementation is currently independent from Automerge (to allow simpler code and faster iteration), but we have designed it with a view towards integrating it into Automerge in the future. |
I would to figure out / invite to collaborate on a best practice for modeling a state for the rich text editor. Most popular rich text editors on the web (ProseMirror, Quill, Slate) use some for of tree representation. Quill is an exception as it represents document as array of formatted segments, but is still a subject to the same problem - One of the actors needs to split formatted segment of text (either insert something with a different format, or reformat) which ends up translating to deleting slice & inserting it afterwords, problem being that identity of the deleted & reinserted slice has different identity & there for any edits by other actors to the original slice no longer correspond to a new one.
I have discussed this to a greater length on slack but the best solution I could came up to was to represent rich text as sequence of tokens where tokens are either characters or formatting markers.
For example following state corresponds to:
Following markdown:
That is:
hello beautiful wold
This way splitting a formatted segment just means inserting a marker and causality is preserved, however it comes with a pain of mapping from / to text editor data model & making sure that overlapping markers are merged etc.. Furthermore
Automerge.Text
no longer seems to fit the bill.I also have considered storing markers (offset, length as counters) separate from text but that seems even more troublesome so I have opted out.
I would very much love something like
Automerge.RichText
that bakes some some of these opinions internally and exposes higher level API likeAutomerge.Text
but with methods likeformat(offset, length, format)
.P.S.: I'm also working on
above mentioned
RichText` API, but also feel like discussing it here is probably a good idea.The text was updated successfully, but these errors were encountered: