Compare and contrast similar objects and highlight their differences
A simple equality test can determine whether two values are identical or not, but it offers no explanation for how two unequal values are different. Chiaroscuro provides a means to compare two objects and see their structural differences.
- structurally compare different values of the same type
- provides a recursive structural breakdown of differences between product types
- provides a diff (using Myers' algorithm) between sequence types with different elements
- presents results as an immutable datatype which can be rendered as a tabular tree of differences
- diff sequences can be customized for more meaningful output
All Chiaroscuro terms and types are defined in the chiaroscuro
package:
import chiaroscuro.*
Two values of the same type can be compared with the contrastWith
method,
provided by Chiaroscuro. This will return an instance of Semblance
,
describing the similarity of one value with the other, and will be one of three
cases:
Identical
, if the two values are the sameDifferent
, if the two values are different, without any further detailBreakdown
, if the two values are different, with a breakdown of their similarities and differences
The last of these three cases is the most interesting, though it is only a
possible result for values of certain types, namely types which can be
destructured into components which can themselves be compared, recursively.
Simple types like Text
(or String
) and primitive types like Int
or
Double
can only ever be Identical
or Different
, but product types, like
case class instances, sum types, like enums or sealed traits, and sequence
types, like List
or IArray
can result in semblances which are neither
Identical
nor Different
but a Breakdown
.
The three cases of Semblance
are defined as follows:
Identical(value: Text)
Different(left: Text, right: Text)
Breakdown(comparison: IArray[(Text, Semblance)], left: Text, right: Text)
Identical
includes just a textual representation of the two identical values.
Different
includes textual representations of both values, since they will
not be the same. Breakdown
also includes textual representations of the left
and right values, but additionally includes a sequence (an IArray
) of
labelled Semblance
s, each of which may be Identical
, Different
or another
Breakdown
. This sequence represents a breakdown of the different components of
the two objects, comparing like-for-like, however the breakdown depends on the
type of objects being contrasted.
Typically, for product types such as case classes, the comparison sequence will
contain an entry for each parameter of the case class, showing whether that
parameter is the same or differs between the two values. In the case where it
differs, and the parameter is another product type, a nested Breakdown
instance
may recursively show
For sequence types, such as List
, a diff between the elements of the left
sequence and the elements of the right sequence will yield a comparison
sequence, labeled for the index of the left and/or right sequence, where each
entry represents a one-to-one comparison between two elements, an addition (to
the right side) or a deletion (from the left side). Each one-to-one comparison
may be Identical
, Different
or a recursive Breakdown
value.
This format provides a convenient and concise way of describing the structural differences between two values. A contrast between two deep structures with few differences will yield a tree structure where identical branches are pruned, and only differing branches are expanded.
When performing a diff between two sequences of elements, whatever their type, a comparison between any two elements will judge them to be either the same or different, however small the difference. The non-appearance of an element in one sequence and its appearance in the other would be considered a deletion or insertion. In general, the result of a diff could be presented as an alternating series of blocks of identical elements and blocks of deletions (from the left side) and insertions (on the right side).
Often this is the best that can be achieved, and Chiaroscuro would present each
deletion and insertion as Different
nodes: absent on one side, and present on
the other, with no further breakdown possible.
But what if further analysis on the blocks of differences (those between the blocks of identical elements) were possible, and elements appearing on both sides deemed similar could be compared to each other? With a definition for what it means for two elements to be similar (for a given element type) Chiaroscuro makes this possible, and provides a breakdown of the differences between similar elements.
For example, consider the board members of an imaginary company,
import anticipation.Text
import gossamer.t
enum Role:
case Ceo, Cto, Coo, Cmo, Cfo
case class Member(role: Role, name: Text)
val boardMembers: List[Member] = List(
Member(Role.Ceo, t"Jane"),
Member(Role.Cto, t"Leopold"),
Member(Role.Coo, t"Simon"),
Member(Role.Cmo, t"Helen")
)
being compared to the board members a year later:
val boardMembers2: List[Member] = List(
Member(Role.Ceo, t"Leopold"),
Member(Role.Cto, t"Linda"),
Member(Role.Coo, t"Simon"),
Member(Role.Cfo, t"Anna"),
Member(Role.Cmo, t"Helen")
)
A simplistic diff would identify that the COO (Simon) and CMO (Helen) remained unchanged, and two sets of changes:
- deletion of Jane (CEO), deletion of Leopold (CTO), insertion of Leopold (CEO) and insertion of Linda (CTO), and
- insertion of Anna (CFO)
However, the first set of changes could be presented more simply if we were to
provide an indication of similarity between two Member
instances. Two
possibilities suggest themselves:
- two
Member
s with the same name are similar (regardless of role) - two
Member
s with the same role are similar (regardless of name)
The presence of a Similarity
typeclass instance can be used to specify this similarity. Defining,
given Similarity[Member] = _.role == _.role
will transform the `Semblance` output comparing these two sequences to
directly compare `Member(Role.Ceo, t"Jane")` and
`Member(Role.Ceo, t"Leopold")`, resulting in a `Breakdown` indicating
the name having changed, and a similar name change for the corresponding
members with the `Role.Cto` role.
The insertion of a `Member` with the `Role.Cfo` role would remain as an
insertion.
Likewise, providing the alternative `given` definition,
```scala
given Similarity[Member] = _.name == _.name
would show Jane as a deletion, Leopold as a role change from CTO to CEO, and Linda and Anna as insertions.
The implementation of Similarity
can be any function comparing two elements,
and a good implementation can dramatically improve the readability of the
Semblance
output. Some ideas for similarity implementations include considering:
Text
elements with a minimum edit distance less than 4 to be similarDouble
elements which differ by less than 5% to be similarJson
objects with equal"id"
fields to be similar- a choice the same case of an enum (regardless of parameters) to indicate similarity
Chiaroscuro is classified as maturescent. For reference, Soundness projects are categorized into one of the following five stability levels:
- embryonic: for experimental or demonstrative purposes only, without any guarantees of longevity
- fledgling: of proven utility, seeking contributions, but liable to significant redesigns
- maturescent: major design decisions broady settled, seeking probatory adoption and refinement
- dependable: production-ready, subject to controlled ongoing maintenance and enhancement; tagged as version
1.0.0
or later - adamantine: proven, reliable and production-ready, with no further breaking changes ever anticipated
Projects at any stability level, even embryonic projects, can still be used, as long as caution is taken to avoid a mismatch between the project's stability level and the required stability and maintainability of your own project.
Chiaroscuro is designed to be small. Its entire source code currently consists of 179 lines of code.
Chiaroscuro will ultimately be built by Fury, when it is published. In the meantime, two possibilities are offered, however they are acknowledged to be fragile, inadequately tested, and unsuitable for anything more than experimentation. They are provided only for the necessity of providing some answer to the question, "how can I try Chiaroscuro?".
-
Copy the sources into your own project
Read the
fury
file in the repository root to understand Chiaroscuro's build structure, dependencies and source location; the file format should be short and quite intuitive. Copy the sources into a source directory in your own project, then repeat (recursively) for each of the dependencies.The sources are compiled against the latest nightly release of Scala 3. There should be no problem to compile the project together with all of its dependencies in a single compilation.
-
Build with Wrath
Wrath is a bootstrapping script for building Chiaroscuro and other projects in the absence of a fully-featured build tool. It is designed to read the
fury
file in the project directory, and produce a collection of JAR files which can be added to a classpath, by compiling the project and all of its dependencies, including the Scala compiler itself.Download the latest version of
wrath
, make it executable, and add it to your path, for example by copying it to/usr/local/bin/
.Clone this repository inside an empty directory, so that the build can safely make clones of repositories it depends on as peers of
chiaroscuro
. Runwrath -F
in the repository root. This will download and compile the latest version of Scala, as well as all of Chiaroscuro's dependencies.If the build was successful, the compiled JAR files can be found in the
.wrath/dist
directory.
Contributors to Chiaroscuro are welcome and encouraged. New contributors may like to look for issues marked beginner.
We suggest that all contributors read the Contributing Guide to make the process of contributing to Chiaroscuro easier.
Please do not contact project maintainers privately with questions unless there is a good reason to keep them private. While it can be tempting to repsond to such questions, private answers cannot be shared with a wider audience, and it can result in duplication of effort.
Chiaroscuro was designed and developed by Jon Pretty, and commercial support and training on all aspects of Scala 3 is available from Propensive OÜ.
Chiaroscuro is a Renaissance painting technique of expressing strong contrasts between light and dark, while Chiaroscuro provides the means to highlight the contrasts between two values.
/kɪˌɑːɹəˈskʊəɹəʊ/
In general, Soundness project names are always chosen with some rationale, however it is usually frivolous. Each name is chosen for more for its uniqueness and intrigue than its concision or catchiness, and there is no bias towards names with positive or "nice" meanings—since many of the libraries perform some quite unpleasant tasks.
Names should be English words, though many are obscure or archaic, and it should be noted how willingly English adopts foreign words. Names are generally of Greek or Latin origin, and have often arrived in English via a romance language.
The logo shows a stylized crescent moon, illustrating the contrast of light against darkness.
Chiaroscuro is copyright © 2024 Jon Pretty & Propensive OÜ, and is made available under the Apache 2.0 License.