Library for synchronizing ordered data with the minimum of insert and delete operations.
Optimized for large datasets. Suitable for keeping in sync internal data with outside source. See demos.
- PHP 7.4, 8.0, 8.1, 8.2, 8.3
$ composer require raigu/ordered-lists-sync
\Raigu\OrderedListsSynchronization\Synchronization
compares source and target lists. It detects which elements have
been added or removed in source compared to the target and calls corresponding callback.
The source and target must be of type Iterator
.
$synchronization = new \Raigu\OrderedListsSynchronization\Synchronization();
$synchronization(
$source = new ArrayIterator(['A', 'B', 'D']),
$target = new ArrayIterator(['B', 'C', 'D']),
$add = function ($element) { echo "ADD: {$element}\n"; },
$remove = function ($element) { echo "REMOVE: {$element}\n"; }
);
Output:
ADD: A
REMOVE: C
Sample use cases with demo code:
- Keeping a relational database table in sync with another ( demo: ./demo/database_tables.php)
- Keeping local data in sync with data available in internet ( demo: ./demo/internet_to_database.php)
- Syncing entities identified by unique id and having several attributes (demo: ./demo/sync_entities.php)
The \Raigu\OrderedListsSynchronization\Synchronization
has only one purpose. Therefore, it
is designed so the instance will be a function,
no method is exposed. It is an object because it allows extending it in future (for example adding logging).
Using Iterator
type for source and target has several advantages.
First, you can create your own iterators as separate components which are easier to develop and test.
Secondly, more complex problems can be solved. Specially if there are memory constraints.
You can make source or target as generator, as an
instance implementing Iterate
or IteratorAggregate interface or use
Sandard PHP Library (SPL) iterators.
Thirdly, it allows creating components which do not do any job in constructor. This is convenient when using dependency injection or declarative programming.
$ git clone git@github.com:raigu/ordered-lists-sync.git
$ cd ordered-lists-sync
$ composer install
$ composer test
$ composer coverage
Algorithm works same way like we use dictionary. If all words are ordered, then we know exactly where some word should be. If we have two dictionaries and start to compare them word by word from start, then we can detect added or removed words.
Example (V
denotes current position of the iterator):
V
Source: A, B, D
V
Target: B, C, D
A < B => if source is before target, then this means that value is missing in target. If there would be an A in target, then it should be here before B. But it is not. Therefore, it is missing and should be added. Add A and move source cursor:
V
Source: A, B, D
V
Target: B, C, D
B = B => if the source and target are equal, then they are sync. Move both cursors.
V
Source: A, B, D
V
Target: B, C, D
D > C => if source is after the target, then this means that value has been removed from source. Therefore, remove C and move the cursor forward:
V
Source: A, B, D
V
Target: B, C, D
D = D => they are in sync. Move both cursors.