Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Snapshot create and restore support for projections #277

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

robertlemke
Copy link
Member

@robertlemke robertlemke commented Aug 21, 2020

  • interface for createSnapshot() and restoreSnapshot()
  • CLI commands for create snapshot and restore snapshot
  • trait for Doctrine based Projectors helping with creating and restoring of snapshots
  • persist Snapshot objects so that ProjectionManager and others can look up which snapshots (should) exist and retrieve metadata
  • implemente deleting of snapshots
  • provide unit tests for all important parts

Ideas for later:

  • automatically create snapshots after each x events, configurable per projection
  • automatic clean up of old / not needed snapshots
  • create / restore snapshots for multiple projections at once
  • dump snapshots to and restore from file system / cloud storage, for quick disaster recovery

This is basically working, but the snapshot metadata is not persisted
yet and thus one very concrete snapshot is currently hardcoded in
ProjectionManager
@robertlemke robertlemke marked this pull request as draft August 21, 2020 12:06
@robertlemke
Copy link
Member Author

Instructions:

  • choose a Doctrine-based projector
  • add implements SnapshotAwareProjectorInterface
  • implement the getTableNames() method
  • adjust the hardcoded Snapshot object in ProjectionManager->restoreSnapshot()
  • ./flow projection:replay --maximum-sequence-number xxx yourprojection
  • ./flow projection:createsnapshot yourprojection
  • see the new table(s) in the database, prefixed with "snap_"
  • ./flow projection:catchup
  • check that your projection is up to date. See also the applied events table
  • ./flow projection:restoresnapshot uuid-of-your-snapshot
  • your projection should be back to the xx sequence number

@bwaidelich
Copy link
Member

Sorry for being the sceptic voice.. I probably over-react a little but I removed a lot of code from this package in order to keep it focused and scaleable so please bear with me and hear me out:

In event sourcing snapshotting is usually applied to aggregates as performance optimization, i.e. in order to speed up reconstitution that is needed for every mutation.
It adds a layer of complexity and potential bugs though, so it is generally advised to only be used if there are no better alternatives.
From http://cqrs.nu/Faq/event-sourcing:

An optimization where a snapshot of the aggregate's state is also saved (conceptually) in the event queue every so often, so that event application can start from the snapshot instead of from scratch. This can speed things up. [...]

Snapshotting has a number of drawbacks related to re-introducing current state in the database. Rather than assume you will need it, start without snapshotting, and add it only after profiling shows you that it will help.

Having said that I still think that it can be quite a useful tool but I wonder if the framework is a good place for the implementation.
I.e. it's totally possible to implement aggregate snapshots today like:

<?php
class SomeAggregateRepository {

  public function load(SomeAggregateId $id): SomeAggregate
  {
    $snapshot = $this->loadSnapshot($id);
    $eventStream = $this->eventStore->load($this->streamName($id), $snapshot->sequenceNumber());
    return SomeAggregate::reconstituteFromSnapshotAndEventStream($snapshot, $eventStream);
  }

  public function save(SomeAggregate $aggregate): void
  {
    $this->eventStore->commit($this->streamName($id), $aggregate->pullUncommittedEvents(), $aggregate->getReconstitutionVersion());
    if ($this->requiresSnapshot($aggregate)) {
      $this->saveSnapshot($aggregate);
    }
  }

  // ...
}

We should add a proper example to the documentation and/or provide a base implementation, but IMO this could be done in a separate package because most of the code will be specific to the aggregate implementation.

Now this PR is about snapshots for projections and – if I get it right – it serves a different scenario.
Maybe you thought it all through but I got the feeling that we might need to discuss the scenario a bit more before we can agree on a solution.

A snapshot as described above doesn't make sense for projections IMO because the events are usually only applied once and the projection is only replayed if its implementation changed (which renders any previous snapshot invalid most probably).
From looking at the code (and from what we discussed the other day) I think, this one is more about a backup mechanism that allows to restore a previous projection state for debugging purposes.

With #269 this should already be possible via

./flow projection:replay projection --maximumSequenceNumber 123

and this PR is "merely" there to speed up that process, right?

@bwaidelich
Copy link
Member

Sorry if the above didn't sound constructive, I was mainly trying to turn my initial thoughts into writing.

TL;DR: I'm not sure about the new interfaces and traits and would prefer a first implementation in the affected model. E.g. for the event-sourced Content Repository two new CLI commands

./flow contentsubgraph:backup
./flow contentsubgraph:restore

..at least until we learned more about it :)

@albe
Copy link
Member

albe commented Aug 23, 2020

@bwaidelich that were exactly my thoughts when I saw the PR and thought about it :)

@robertlemke
Copy link
Member Author

@bwaidelich thanks for your comments. I guess it's mostly a misunderstanding.

The motivation for this feature is mostly an optimization for debugging projections. It can be implemented as a separate package, if you like. It has nothing to do and does not aim for providing something like snapshots for aggregates.

Consider having an event store full of 1.000.000 events and trying to debug / improve a specific projection. You'll end up in a replay - check database - catch up - adjust code - replay -loop. In my actual case a replay of the respective projection takes a few minutes. Now, if I could create a snapshot at event 990.000 and reset to / catch from there, this would be a big time-saver.

If that is a too exotic feature, just let me know. @skurfuerst asked me to implement it for the ESCR development.

@skurfuerst
Copy link
Member

diff --git a/Neos.ContentGraph.DoctrineDbalAdapter/Classes/Domain/Projection/GraphProjector.php b/Neos.ContentGraph.DoctrineDbalAdapter/Classes/Domain/Projection/GraphProjector.php
index 57b5196..80b6429 100644
--- a/Neos.ContentGraph.DoctrineDbalAdapter/Classes/Domain/Projection/GraphProjector.php
+++ b/Neos.ContentGraph.DoctrineDbalAdapter/Classes/Domain/Projection/GraphProjector.php
@@ -37,6 +37,8 @@ use Neos\EventSourcedContentRepository\Domain\Context\NodeAggregate\Event\RootNo
 use Neos\EventSourcedContentRepository\Domain\Context\NodeAggregate\NodeAggregateClassification;
 use Neos\EventSourcedContentRepository\Domain\ValueObject\SerializedPropertyValues;
 use Neos\EventSourcedContentRepository\Infrastructure\Projection\AbstractProcessedEventsAwareProjector;
+use Neos\EventSourcing\Projection\DoctrineProjectorSnapshotTrait;
+use Neos\EventSourcing\Projection\SnapshotAwareProjectorInterface;
 use Neos\Flow\Annotations as Flow;

 /**
@@ -44,11 +46,12 @@ use Neos\Flow\Annotations as Flow;
  *
  * @Flow\Scope("singleton")
  */
-class GraphProjector extends AbstractProcessedEventsAwareProjector
+class GraphProjector extends AbstractProcessedEventsAwareProjector implements SnapshotAwareProjectorInterface
 {
     use RestrictionRelations;
     use NodeRemoval;
     use NodeMove;
+    use DoctrineProjectorSnapshotTrait;

     const RELATION_DEFAULT_OFFSET = 128;

@@ -58,6 +61,16 @@ class GraphProjector extends AbstractProcessedEventsAwareProjector
      */
     protected $projectionContentGraph;

+    protected static function getTableNames(): array
+    {
+        return [
+            'neos_contentgraph_node',
+            'neos_contentgraph_hierarchyrelation',
+            'neos_contentgraph_referencerelation',
+            'neos_contentgraph_restrictionrelation'
+        ];
+    }
+
     /**
      * @throws \Throwable
      */
diff --git a/Neos.ContentGraph.DoctrineDbalAdapter/Classes/Domain/Repository/ContentGraph.php b/Neos.ContentGraph.DoctrineDbalAdapter/Classes/Domain/Repository/ContentGraph.php
index 1b247b1..b61dbd1 100644
--- a/Neos.ContentGraph.DoctrineDbalAdapter/Classes/Domain/Repository/ContentGraph.php
+++ b/Neos.ContentGraph.DoctrineDbalAdapter/Classes/Domain/Repository/ContentGraph.php
@@ -169,7 +169,7 @@ final class ContentGraph implements ContentGraphInterface
      */
     public function findNodeAggregateByIdentifier(
         ContentStreamIdentifier $contentStreamIdentifier,
-        NodeAggregateIdentifier $nodeAggregateIdentifier
+            NodeAggregateIdentifier $nodeAggregateIdentifier
     ): ?NodeAggregate {
         $connection = $this->client->getConnection();

@bwaidelich
Copy link
Member

bwaidelich commented Jul 23, 2021

This one got stuck after I shared my concerns. Now, almost one year later, I still think that "snapshots" should be avoided if possible (and apparently I'm not the only one, see https://twitter.com/mat_mcloughlin/status/1418191418309451780) and certainly don't be a part of the core functionality.

Having said that, of course that's just my personal opinion and – more importantly – it doesn't mean that we should not make debugging large projections easier and faster!
I'm just convinced that it is much easier to solve this for a concrete scenario rather than with a generic pattern.

btw: I shared the twitter link above not to "prove my point" (and it doesn't for all cases) but it contains an interesting idea: How can we prevent (too) long living streams?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants