
yaml diff Array Options

William W. Kimball, Jr., MBA, MSIS edited this page Oct 25, 2020 · 4 revisions
  1. Introduction
  2. By Position
  3. By Value
  4. Configuration File Options
    1. Configuration File Section: defaults
    2. Configuration File Section: rules

This document is part of the body of knowledge about yaml-diff, one of the reference command-line tools provided by the YAML Path project.

Introduction

The yaml-diff command-line tool enables users to control how Arrays (AKA Lists or Sequences) are compared. This is different from merging Arrays-of-Hashes, discussed elsewhere. By default, the elements from both documents are compared based on their ordinal position in each Array. While this is ideal for many use-cases, it is not so for every use-case. As such, yaml-diff offers some options for how it compares Array elements. These options include:

  1. position (the default) tests the equality of each element in the document pair by its ordinal position. Differences are reported as changes. When the left-hand document (LHS) has more elements than the right-hand document (RHS), the additional LHS elements are reported as deletions. When the LHS has fewer elements than RHS, the additional RHS elements are reported as additions.
  2. value synchronizes the two Arrays by the values of their elements and then compares the result. This is especially helpful when you are more interested in the elements unique to each Array, regardless of their relative ordinal positions. Changes are possible in this mode, but only when two elements at exactly the same position are both different and otherwise unmatched across both Arrays. Otherwise, only additions and deletions are possible because all other elements will have been matched up.

Each of these scenarios will be explored through comparisons of different arrangements of Array elements.

By Position

File: LHS1.yaml

---
same_elements:
  - alpha
  - bravo

one_change:
  - alpha
  - bravo

one_addition:
  - alpha

one_deletion:
  - alpha
  - bravo

File: RHS1.yaml

same_elements:
  - alpha
  - bravo

one_change:
  - alpha
  - charlie

one_addition:
  - alpha
  - bravo

one_deletion:
  - alpha

By default, or when the position option is selected explicitly, comparing these Arrays yields this difference report:

c one_change[1]
< bravo
---
> charlie

a one_addition[1]
> bravo

d one_deletion[1]
< bravo
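
The by-position behavior shown above can be modelled in a few lines of Python. This is only an illustrative sketch, not the yaml-diff implementation: the function name `diff_by_position` and the tuple layout `(kind, index, old, new)` are assumptions made for this example, with `kind` mirroring the `c`, `a`, and `d` markers of the report.

```python
from itertools import zip_longest

# Sentinel meaning "this side has no element at this index".
_MISSING = object()


def diff_by_position(lhs, rhs):
    """Compare two lists index by index.

    Returns (kind, index, old, new) tuples where kind is
    'c' (change), 'a' (addition), or 'd' (deletion).
    """
    report = []
    pairs = zip_longest(lhs, rhs, fillvalue=_MISSING)
    for index, (old, new) in enumerate(pairs):
        if old is _MISSING:
            # Only the RHS has an element here: an addition.
            report.append(("a", index, None, new))
        elif new is _MISSING:
            # Only the LHS has an element here: a deletion.
            report.append(("d", index, old, None))
        elif old != new:
            # Both sides have an element, but they differ: a change.
            report.append(("c", index, old, new))
    return report


print(diff_by_position(["alpha", "bravo"], ["alpha", "charlie"]))
print(diff_by_position(["alpha"], ["alpha", "bravo"]))
print(diff_by_position(["alpha", "bravo"], ["alpha"]))
```

Run against the Arrays in LHS1.yaml and RHS1.yaml, the sketch yields one change, one addition, and one deletion, each at index 1, matching the report above.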

By Value

File: LHS2.yaml

---
rearranged_array:
  - alpha
  - bravo
  - charlie

with_duplicates:
  - alpha
  - bravo
  - alpha

with_additions:
  - alpha
  - bravo

with_deletions:
  - alpha
  - bravo
  - charlie

with_change:
  - alpha
  - bravo
  - delta

File: RHS2.yaml

---
rearranged_array:
  - charlie
  - alpha
  - bravo

with_duplicates:
  - bravo
  - alpha
  - alpha

with_additions:
  - bravo
  - charlie
  - alpha
  - delta

with_deletions:
  - bravo

with_change:
  - alpha
  - charlie
  - delta

When using the value option against these arrays, the differences are revealed as:

a with_additions[1]
> charlie

a with_additions[3]
> delta

d with_deletions[0]
< alpha

d with_deletions[2]
< charlie

c with_change[1]
< bravo
---
> charlie

For contrast, a position comparison would produce a very different report:

c rearranged_array[0]
< alpha
---
> charlie

c rearranged_array[1]
< bravo
---
> alpha

c rearranged_array[2]
< charlie
---
> bravo

c with_duplicates[0]
< alpha
---
> bravo

c with_duplicates[1]
< bravo
---
> alpha

c with_additions[0]
< alpha
---
> bravo

c with_additions[1]
< bravo
---
> charlie

a with_additions[2]
> alpha

a with_additions[3]
> delta

c with_deletions[0]
< alpha
---
> bravo

d with_deletions[1]
< bravo

d with_deletions[2]
< charlie

c with_change[1]
< bravo
---
> charlie

As you can see, the report from a comparison by value is far smaller than the report from a comparison by position for these documents. Once the elements are synchronized, there are far fewer differences to report. When it is more informative to compare Arrays by the distinctiveness of their elements rather than by their order, use the value option.
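
One way to picture the by-value synchronization is as a pairing of equal elements, after which only the leftovers are reported. The sketch below is an assumption-laden model, not the algorithm yaml-diff actually uses: the name `diff_by_value`, the greedy first-match pairing, and the `(kind, index, old, new)` tuples are all hypothetical. It does reproduce the by-value report for LHS2.yaml and RHS2.yaml shown earlier in this section.

```python
def diff_by_value(lhs, rhs):
    """Pair up equal elements, then report whatever is left unmatched.

    Returns (kind, index, old, new) tuples where kind is
    'c' (change), 'a' (addition), or 'd' (deletion).
    """
    rhs_matched = [False] * len(rhs)
    lhs_unmatched = {}

    # Greedily pair each LHS element with the first unused equal RHS element.
    for index, value in enumerate(lhs):
        for r_index, r_value in enumerate(rhs):
            if not rhs_matched[r_index] and r_value == value:
                rhs_matched[r_index] = True
                break
        else:
            lhs_unmatched[index] = value

    rhs_unmatched = {i: v for i, v in enumerate(rhs) if not rhs_matched[i]}

    report = []
    for index in sorted(lhs_unmatched.keys() | rhs_unmatched.keys()):
        in_lhs = index in lhs_unmatched
        in_rhs = index in rhs_unmatched
        if in_lhs and in_rhs:
            # Unmatched on both sides at the same position: a change.
            report.append(("c", index, lhs_unmatched[index], rhs_unmatched[index]))
        elif in_lhs:
            # Unmatched LHS element: a deletion at its LHS index.
            report.append(("d", index, lhs_unmatched[index], None))
        else:
            # Unmatched RHS element: an addition at its RHS index.
            report.append(("a", index, None, rhs_unmatched[index]))
    return report


print(diff_by_value(["alpha", "bravo", "charlie"], ["charlie", "alpha", "bravo"]))
print(diff_by_value(["alpha", "bravo"], ["bravo", "charlie", "alpha", "delta"]))
print(diff_by_value(["alpha", "bravo", "delta"], ["alpha", "charlie", "delta"]))
```

Because matching is done per element rather than per distinct value, duplicates such as the two alpha entries in with_duplicates pair off one-to-one and produce no differences.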

Configuration File Options

The yaml-diff tool can read comparison options for individual YAML Paths from an INI-style configuration file via its --config (-c) argument. Whereas the --arrays (-A) argument supplies an overarching mode for comparing Arrays, a configuration file permits far more precise control whenever you need a different mode for specific parts of the comparison documents.

Configuration File Section: defaults

The [defaults] section permits a key named arrays, which behaves identically to the --arrays (-A) command-line argument to the yaml-diff tool. The [defaults]arrays setting is overridden by the same-named command-line argument, when supplied. In practice, this file may look like:

File diff-options.ini

[defaults]
arrays = position

Note that the spaces around the = sign are optional, but only an = sign may be used to separate each key from its value.

Configuration File Section: rules

The [rules] section takes any YAML Paths as keys and, as their values, any of the Array comparison modes that are available to the --arrays (-A) command-line argument. This allows each mode to be applied to exactly the Arrays that need it.

Using the same LHS2.yaml and RHS2.yaml documents as in the By Value examples, adding a configuration file with these contents:

[defaults]
arrays = value

[rules]
rearranged_array = position
with_change = position

... changes the difference report to:

c rearranged_array[0]
< alpha
---
> charlie

c rearranged_array[1]
< bravo
---
> alpha

c rearranged_array[2]
< charlie
---
> bravo

a with_additions[1]
> charlie

a with_additions[3]
> delta

d with_deletions[0]
< alpha

d with_deletions[2]
< charlie

c with_change[1]
< bravo
---
> charlie

Notice the following:

  1. The default comparison mode for all Arrays was set to value, which differs from the internal default mode, position.
  2. The Arrays at "rearranged_array" ("/rearranged_array") and "with_change" ("/with_change") were compared using the position mode.
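
The way these settings combine can be sketched with Python's standard `configparser` module. This is a hypothetical model rather than yaml-diff's own code: the helper names `load_options` and `array_mode_for` are invented for the example, rule keys are matched as plain strings after stripping a leading "/" (real YAML Path matching is richer), and the assumption that a [rules] entry takes precedence over the command-line argument is not stated in this document.

```python
import configparser

INI_TEXT = """
[defaults]
arrays = value

[rules]
rearranged_array = position
with_change = position
"""


def load_options(text):
    """Parse INI text, accepting only '=' between keys and values."""
    parser = configparser.ConfigParser(delimiters=("=",))
    parser.optionxform = str  # keep keys exactly as written (case-sensitive)
    parser.read_string(text)
    return parser


def array_mode_for(parser, path, cli_arrays=None):
    """Pick the Array comparison mode for one path (simplified lookup)."""
    key = path.lstrip("/")  # treat "/with_change" like "with_change"
    if parser.has_option("rules", key):
        # Assumption: a specific rule wins over everything else.
        return parser.get("rules", key)
    if cli_arrays is not None:
        # The command-line argument overrides [defaults]arrays.
        return cli_arrays
    # Fall back to [defaults]arrays, then to the internal default.
    return parser.get("defaults", "arrays", fallback="position")


options = load_options(INI_TEXT)
for path in ("rearranged_array", "/with_change", "with_additions"):
    print(path, "->", array_mode_for(options, path))
```

With the configuration above, the two Arrays named in [rules] resolve to position while every other Array resolves to value, which is the split visible in the report.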