These rules concern design of models in Scala and are only useful to Scala developers diving into the codebase. They aren't inventing anything new, but they had to be written down.
Domain models should be separate from API objects.
Domain models (values, entities, events) are both part of the published language of the domain and part of the internal API.
API models are part of the public (HTTP) API. They serve a different purpose than domain models, have a separate lifecycle, and may follow slightly different naming and conventions that suit a RESTful API.
If models read from the database (or another read source) don't perfectly match domain objects, then database DTOs should be used.
This separation lets us design each model in a way that would:
- express the goal of each model the best way
- have type class derivation with the fewest annotations or other configuration, as each model simply matches its use case with no redundant mappings (have I mentioned that we want to derive the dumb part of our code as much as possible?)
- avoid complex mappings and special cases all over the place: we just derive the code for the simple case and THEN write the API <=> Domain mapping or the Domain <=> database DTO mapping in one place, keeping complexity at bay
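A minimal sketch of this separation, using only the standard library. All names here (User, UserResponse, UserApiMapping) are hypothetical and not taken from the codebase. The mapping is written by hand to keep the example dependency-free; it is the kind of code a transformation library could generate.

```scala
// Domain model: part of the published language, named after the business concept.
final case class UserName(value: String)
final case class Age(value: Int)
final case class User(name: UserName, age: Age)

// API model: shaped for the HTTP API, free to follow its own naming conventions.
final case class UserResponse(user_name: String, age: Int)

// The only place that knows about both models, so special cases stay here.
object UserApiMapping {
  def toApi(user: User): UserResponse =
    UserResponse(user_name = user.name.value, age = user.age.value)

  def toDomain(response: UserResponse): User =
    User(name = UserName(response.user_name), age = Age(response.age))
}
```

Because neither model refers to the other, renaming a JSON field touches only UserResponse and UserApiMapping.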
Primitive types (Int, String, Double, etc.) should be avoided in configs,
Domain models, API models and database DTO models. They carry no information
about what each piece of data represents, and our goal is to make self-evident
what can be easily made self-evident. So primitives in our models should be
replaced by @newtypes or AnyVals. Where it makes sense, primitives should
be constrained by Refined types:
// refined type _within_ newtype
@newtype final case class Model(value: Primitive Refined Constraint)
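To show the idea without any library on the classpath, here is a sketch of a constrained wrapper written as a plain class. It is a stand-in for the @newtype plus Refined combination above, not a replacement for it, and the name Quantity and its "must be positive" rule are assumptions made for illustration.

```scala
// Dependency-free stand-in for something like `Int Refined Positive` inside a @newtype.
final class Quantity private (val value: Int) {
  override def equals(other: Any): Boolean = other match {
    case that: Quantity => that.value == value
    case _              => false
  }
  override def hashCode: Int = value.hashCode
  override def toString: String = s"Quantity($value)"
}

object Quantity {
  // The constructor is private, so this is the only way to obtain a Quantity:
  // the constraint is checked once and holds everywhere the type is used.
  def parse(raw: Int): Either[String, Quantity] =
    if (raw > 0) Right(new Quantity(raw))
    else Left(s"quantity must be positive, got $raw")
}
```

A method that takes a Quantity no longer has to document or re-check that the number is positive; the type says so.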
Enumerations should be implemented using Enumeratum.
Models aggregating more data should be designed as case classes and sealed
hierarchies, as they are the canonical representation of Algebraic Data Types
in Scala. This helps us limit the use cases to handle. If some use case is invalid,
quite often it is relatively easy to make it unrepresentable; then we are
forced to explicitly handle all the possible use cases, and we can ignore a lot of
cases which should never happen. On the edge of our Domain we can take
non-validated input, parse it, and end up with either a valid model or an error
message to return. Mind that this way we avoid errors only inside our Domain;
on the Context Boundary we still need to handle them, e.g. by informing users
that their data is invalid, or that an event or database entity is somehow broken.
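The following sketch illustrates parsing at the edge of the Domain and exhaustive handling inside it. ContactMethod, Notifier and the validation rules are hypothetical, and the fields are left as plain String only to keep the example short.

```scala
// A sealed hierarchy: a contact method is exactly one of these cases, nothing else.
sealed trait ContactMethod
object ContactMethod {
  final case class Email(address: String) extends ContactMethod
  final case class Phone(number: String) extends ContactMethod

  // Edge of the Domain: raw input goes in, a valid model or an error message comes out.
  def parse(kind: String, value: String): Either[String, ContactMethod] =
    kind.toLowerCase match {
      case "email" if value.contains("@") => Right(Email(value))
      case "email"                        => Left(s"invalid email: $value")
      case "phone" if value.nonEmpty && value.forall(_.isDigit) => Right(Phone(value))
      case "phone"                        => Left(s"invalid phone number: $value")
      case other                          => Left(s"unknown contact kind: $other")
    }
}

object Notifier {
  // Inside the Domain: no validation, and the compiler checks that every case is handled.
  def describe(method: ContactMethod): String = method match {
    case ContactMethod.Email(address) => s"send an email to $address"
    case ContactMethod.Phone(number)  => s"send a text message to $number"
  }
}
```

The Left branch is what the Context Boundary turns into a response for the user; Notifier never sees it.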
If it makes sense, models shouldn't be flat. Instead of repeating the same group of values embedded in a bigger value, we could extract a separate value with a distinct name describing its function.
Values (as understood by DDD) can be either @newtypes, enumerations or case classes.
Entities (as understood by DDD) can use the following pattern:
final case class Entity(
  id: ID[Entity],
  data: Entity.Data
)
object Entity {
  final case class Data(
    // model properties
  )
}
This approach has the following advantages:
- automatically generated .equals and Eq compare instances for identical content
- automatically generated .equals and Eq for .id compare for identical IDs
- automatically generated .equals and Eq for .data compare for identical entities' content
So we can easily choose whether we are testing that everything is the same, that only the IDs are the same, or that the content is the same but the IDs might differ.
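The three kinds of comparison can be sketched as follows. The ID type is not defined in this document, so the definition below (a UUID with a phantom type parameter) is an assumption, as are Customer and CustomerChecks.

```scala
import java.util.UUID

// Hypothetical ID type: the phantom type parameter ties an ID to its entity type.
final case class ID[A](uuid: UUID)

final case class Customer(id: ID[Customer], data: Customer.Data)
object Customer {
  final case class Data(name: String, email: String)
}

object CustomerChecks {
  // Same entity, possibly at a different point of its lifecycle.
  def sameEntity(a: Customer, b: Customer): Boolean = a.id == b.id
  // Same content, possibly stored under different IDs.
  def sameContent(a: Customer, b: Customer): Boolean = a.data == b.data
  // Same ID and same content.
  def identical(a: Customer, b: Customer): Boolean = a == b
}
```

All three checks rely only on the .equals generated for case classes, so nothing has to be overridden by hand.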
Events (as understood by DDD) are represented by ADTs.
Since all of our models are @newtypes, Enumeratum enumerations or ADTs, type class
derivation should be easy, and (where possible) we could use the defaults.
Because automatic derivation can easily get out of hand (my personal experience,
YMMV), I decided to use semiautomatic derivation where possible. Conveniently,
there is the Catnip library (by yours truly) which allows using the @Semi
annotation for derivation, and it can be configured to use semiauto
for any type class which has some semiautomatic derivation defined. I used it for:
- Cats' Show and Eq
- Jsoniter Scala codecs
- Avro4s Decoder, Encoder and SchemaFor
- Tapir's Schema
Conversions between Domain and API and between Domain and database DTO can be automated by Chimney.
Definitions should be placed in separate packages and namespaces, so that:
- published language doesn't know anything about how services are implemented
- HTTP API doesn't know anything about implementation
- domains using another domain's published language don't know anything about its implementation details nor how it's exposed through HTTP
Each domain should use a separate data source: Kafka topics, Postgres databases, Redis cache and so on.
Build dependencies (and tests in particular) should be configured to enforce these requirements.
This way we make it easy to develop against the published language rather than against some particular implementation, which would break tests after a change that doesn't affect any of our end users.
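One way to enforce this through build dependencies is sketched below as an sbt build definition. Both the use of sbt and every module name and path are assumptions made for illustration; the point is only that a module can't import what it doesn't depend on.

```scala
// build.sbt (sketch; module names and paths are hypothetical)

// Published language of the "orders" domain: models only, no implementation.
lazy val ordersModel = project.in(file("orders/model"))

// HTTP API definitions: may see the published language, never the implementation.
lazy val ordersApi = project.in(file("orders/api"))
  .dependsOn(ordersModel)

// Service implementation: sees its own published language and API definitions.
lazy val ordersImpl = project.in(file("orders/impl"))
  .dependsOn(ordersModel, ordersApi)

// Another domain depends only on the published language of "orders".
lazy val billingImpl = project.in(file("billing/impl"))
  .dependsOn(ordersModel)
```

With this layout, code in billingImpl that reached into the orders implementation would fail to compile rather than fail in review.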
If some derivation would require having the same imports in each file,
then (if possible) the definitions brought in by these imports should be placed
into one object which could be imported instead. Examples are:
- DoobieSupports - provides support for core Doobie concepts, Postgres extensions, Refined Types, @newtypes and own types
- AvroSupport - provides support for Refined and Newtypes to Avro4s
- TapirSupport - provides support for Tapir and derivation of Circe codecs, Refined Types and @newtypes
- JsoniterSupport - adds .map and .mapDecode, support for Refined and @newtypes
- PureconfigSupport - provides support for core Pureconfig, Refined, @newtypes and Cats
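The shape of such a support object can be sketched without any of the libraries above. Render is a made-up type class standing in for a library one, and RenderSupport, Title, PageCount and BookSummary are hypothetical names.

```scala
// Minimal local type class standing in for a library one (e.g. a codec).
trait Render[A] { def render(a: A): String }

final case class Title(value: String)
final case class PageCount(value: Int)

// Hypothetical support object: every instance a file needs, behind a single import.
object RenderSupport {
  implicit val renderString: Render[String] = new Render[String] {
    def render(a: String): String = "\"" + a + "\""
  }
  implicit val renderInt: Render[Int] = new Render[Int] {
    def render(a: Int): String = a.toString
  }
  // Wrapper types reuse the instance of the type they wrap.
  implicit val renderTitle: Render[Title] = new Render[Title] {
    def render(a: Title): String = renderString.render(a.value)
  }
  implicit val renderPageCount: Render[PageCount] = new Render[PageCount] {
    def render(a: PageCount): String = renderInt.render(a.value)
  }

  def render[A](a: A)(implicit r: Render[A]): String = r.render(a)
}

object BookSummary {
  import RenderSupport._ // the one import instead of several repeated in every file

  def describe(title: Title, pages: PageCount): String =
    s"${render(title)} has ${render(pages)} pages"
}
```

Adding support for a new wrapper type then means touching RenderSupport once instead of the import lists of every file.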
Domain models should derive only Cats instances which are universal (Show,
Eq, Order). API models should derive instances required by Tapir
(JsCodec, JsSchema). Events (and commands) are the only Domain models that should
have instances for Avro4s Decoders, Encoders and SchemaFors.
Show instances and .toString hold no Domain meaning whatsoever and should
be used only for debugging purposes. Show should be preferred to .toString
as it allows better handling of the many cases where .toString is not overridable
and exposes some sensitive or nonsensical data. If data is sensitive, the Show
instance should hide it, allowing for safe debugging and GDPR-friendly
logs.
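A sketch of a Show instance that hides sensitive data. The Show trait below is a minimal local stand-in for cats.Show so that the example needs no dependencies; Password, Credentials and Log are hypothetical.

```scala
// Minimal stand-in for cats.Show, to keep the sketch dependency-free.
trait Show[A] { def show(a: A): String }

final case class Password(value: String)
object Password {
  // Never reveals the content, not even its length.
  implicit val showPassword: Show[Password] = new Show[Password] {
    def show(a: Password): String = "Password(***)"
  }
}

final case class Credentials(login: String, password: Password)
object Credentials {
  // Delegates to the Password instance, so the masking can't be bypassed by accident.
  implicit val showCredentials: Show[Credentials] = new Show[Credentials] {
    def show(a: Credentials): String = {
      val masked = Password.showPassword.show(a.password)
      s"Credentials(login = ${a.login}, password = $masked)"
    }
  }
}

object Log {
  // Accepts only values that have a Show instance, so .toString is never used implicitly.
  def line[A](a: A)(implicit s: Show[A]): String = s.show(a)
}
```

The generated .toString of Credentials still contains the raw password, which is the reason for routing log output through Show.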