Improve filters, key, protobuf and query documentation.
jurmous committed Aug 30, 2024
1 parent d6f8e06 commit 52a8097
Showing 4 changed files with 276 additions and 219 deletions.
61 changes: 38 additions & 23 deletions core/documentation/filters.md
@@ -1,16 +1,17 @@
# Filters

Filters can be applied to both [Get](query.md#get) and [Scan](query.md#scan) objects in
queries. Complex operations can be created by using the [`And`](#and)
and [`Or`](#or) filters. , and filters can be reversed using the [`Not`](#not)
filter.
Filters are useful tools that can be applied to both [Get](query.md#get) and [Scan](query.md#scan) objects in queries.
You can create complex operations by using the [`And`](#and) and [`Or`](#or) filters, while also having the ability to
reverse filters with the [`Not`](#not) filter.

## Reference Filters
The following filters operate on a property within a data object which is referred to with a

The following filters operate on a property within a data object, which is identified by a
[property reference](properties/references.md).

### Exists
Checks if a value exists.

This filter checks whether a value exists.

```kotlin
Exists(
@@ -21,7 +22,8 @@ Exists(
```

### Equals
Checks if a property reference's value is equal to the given value.

Use this filter to check if a property reference's value is equal to a specified value.

```kotlin
Equals(
@@ -31,7 +33,8 @@ Equals(
```

### GreaterThan
Checks if the referenced values are greater than the given value.

This filter checks if the referenced values are greater than a specific value.

```kotlin
GreaterThan(
@@ -41,7 +44,8 @@ GreaterThan(
```

### GreaterThanEquals
Checks if the referenced value is greater than or equal to the given value.

Use this filter to check if the referenced value is greater than or equal to a specified value.

```kotlin
GreaterThanEquals(
@@ -51,7 +55,8 @@ GreaterThanEquals(
```

### LessThan
Checks if the referenced value is less than the given value.

This filter checks if the referenced value is less than a particular value.

```kotlin
LessThan(
@@ -61,7 +66,8 @@ LessThan(
```

### LessThanEquals
Checks if the referenced value is less than or equal to the given value.

Use this filter to check if the referenced value is less than or equal to a specified value.

```kotlin
LessThanEquals(
@@ -71,9 +77,11 @@ LessThanEquals(
```

### Range
Checks if the referenced value is within the given range.

This filter checks if the referenced value falls within a specified range.

Maryk YAML:

```kotlin
Range(
intPropertyReference with 2..42,
@@ -86,7 +94,8 @@ Range(
```

### Prefix
Checks if the referenced value is prefixed by the given value.

Use this filter to check if the referenced value starts with a given prefix.

```kotlin
Prefix(
@@ -96,7 +105,8 @@ Prefix(
```

### RegEx
Checks if the referenced value matches with the given regular expression.

This filter checks if the referenced value matches a specified regular expression.

```kotlin
RegEx(
@@ -106,7 +116,8 @@ RegEx(
```

### ValueIn
Checks if the referenced value is within the set of given values.

Use this filter to check if the referenced value is included in a set of specified values.

```kotlin
ValueIn(
@@ -116,12 +127,14 @@ ValueIn(
```

## Filter operations
Filter operations are powerful tools that allow you to construct complex queries.
They run on top of other filters, making it possible to create intricate and highly customizable search criteria.

Filter operations provide powerful capabilities for constructing complex queries. They can run on top of other filters,
allowing you to create intricate and highly customizable search criteria.

### And
The And filter returns `true` if all the specified filters match.
This is an ideal filter to use if you want to find records that meet multiple conditions.

The And filter returns `true` if all specified filters match. This filter is ideal when you want to find records that
meet multiple conditions.

```kotlin
And(
@@ -135,8 +148,9 @@ And(
```

### Or
The Or filter returns `true` if one of the specified filters matches.
This is useful if you want to find records that match either of several conditions.

The Or filter returns `true` if at least one of the specified filters matches. This is useful for finding records that
satisfy any of several conditions.

```kotlin
Or(
@@ -150,8 +164,9 @@ Or(
```

### Not
The Not filter inverts the meaning of the specified filters. If multiple filters are passed, it performs an And operation.
This filter is useful if you want to exclude records that meet certain criteria.

The Not filter inverts the meaning of specified filters. If multiple filters are provided, it performs an And operation.
This filter is beneficial for excluding records that meet specific criteria.

```kotlin
Not(
66 changes: 38 additions & 28 deletions core/documentation/key.md
@@ -1,45 +1,55 @@
# Keys in DataObjects

All DataObjects must be stored under a unique key, which acts as an index to the object.
The key serves as a permanent identifier for the object and can be used to retrieve it.
If you store new data under an existing key, the existing data will be overwritten.
All DataObjects must be stored under a unique key, which acts as an index to the object. The key serves as a permanent
identifier for the object and can be used to retrieve it. If you store new data under an existing key, the existing data
will be overwritten.

## Default UUID Keys

If you don't specify a key definition, the model will automatically generate 128-bit v4 UUID keys.
These keys are guaranteed to be unique, but they won't provide any benefits for scanning the data.
If you don't specify a key definition, the model will automatically generate 128-bit v4 UUID keys. These keys are
guaranteed to be unique, ensuring that no two DataObjects will share the same identifier. However, please note that
while they are unique, they do not provide any inherent benefits for scanning or ordering the data.

## Choosing the Right Key from the Start

It's essential to be mindful when designing the key for a DataModel, as the
key structure cannot be changed after data has been stored (without complex migrations).
Consider the primary use case and ordering of the data and make sure the key is optimized
for this purpose. Secondary use cases can be addressed by adding an index.
It's essential to be mindful when designing the key for a DataModel since the key structure cannot be changed after data
has been stored (without complex migrations). Consider the primary use case and the desired ordering of the data
carefully. Make sure the key is optimized for this purpose upfront. If secondary use cases arise, they can often be
addressed by adding an index later.

## Properties for Key Structure

Key structures must have fixed byte lengths. This way the location of key elements are
predictable which are beneficial for scans over keys.
Key structures must have fixed byte lengths. This predictability is crucial as it allows for efficient indexing and
scanning over keys.

Properties that can be used for key elements include numbers, dates and times, fixed bytes, references, enums, booleans,
multi-type objects, and ValueDataModels containing similar values.
Properties that can be utilized for key elements include numbers, dates and times, fixed bytes, references, enums,
booleans, multi-type objects, and ValueDataModels containing similar values.

Properties that cannot be used in keys include strings, flexible bytes, sets, lists, maps, and
embedded models, as they have varying byte lengths.
On the other hand, properties that cannot be used in keys include strings, flexible bytes, sets, lists, maps, and
embedded models, as they possess varying byte lengths, which disrupts the consistency required for effective key
structures.

## The order of keys
Keys are stored in order, which means that data scans will traverse or skip
the data in the same order. If the key starts with a reference, the data scan
can start at the exact location for that reference.
## The Order of Keys

If the data is often requested in the newest-first order, it is recommended to
reverse the date of creation so that new data is retrieved first.
Keys are stored in a specific order, meaning that data scans will traverse or skip the data in the same organized
manner. If the key begins with a reference, scans can efficiently start at the exact location corresponding to that
reference.

## Tips on designing a key
If the data is often requested in a newest-first order, it is advisable to reverse the date of creation in the key
structure. This way, newer data will be retrieved first during scans, enhancing the user experience.

- Consider performance: The key structure should be optimized for the most common use cases, as it will affect the performance of data retrieval and scans.
- If data "belongs" to a particular entity or person, start the key with a reference to that entity or person.
- If data needs to be ordered by time, include the time in the key. If newer data is frequently requested first, reverse the time.
- If the data has a primary multi-type property, include the type ID in the key so you can quickly retrieve data objects of a specific type. If time is also included in the key, make sure to place it after the type ID so that data is still ordered by time.
- If date precision in nanoseconds is somehow not enough, consider adding a random number to the key.
- Use indexing: If you have a key structure that is not optimized for your use case, you can use an index to improve performance. This is especially useful if you have large datasets.
## Tips on Designing a Key

- **Consider performance:** The key structure should be optimized for the most common use cases, as this will directly
impact the performance of data retrieval and scans.
- **Entity association:** If data "belongs" to a particular entity or person, start the key with a reference to that
entity or person. This associativity can streamline searches.
- **Time ordering:** If data needs to be ordered by time, include the time in the key. If newer data is frequently
requested first, reverse the time to prioritize its retrieval.
- **Type identification:** If the data has a primary multi-type property, include the type ID in the key to enable quick
retrieval of data objects of a specific type. Ensure that any time information follows the type ID to maintain
chronological order.
- **Date precision:** If date precision in nanoseconds is insufficient for your application's needs, consider appending
a random number to the key for additional uniqueness.
- **Use indexing:** If you find that your key structure is not optimized for your primary use case, utilize indexing to
improve performance. This is especially beneficial when dealing with large datasets.
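
The tips on entity association and reversed time can be illustrated with a small sketch. It is not Maryk's actual key implementation: `longToBytes`, `buildKey`, and `compareKeys` are hypothetical helpers, the 8-byte owner ID plus 8-byte reversed timestamp layout is an assumption chosen for illustration, and it assumes non-negative IDs and timestamps.

```kotlin
// Hypothetical helper (not part of Maryk): writes a Long as 8 big-endian bytes.
fun longToBytes(value: Long): ByteArray =
    ByteArray(8) { i -> (value ushr (8 * (7 - i))).toByte() }

// Assumed key layout: 8-byte owner ID followed by an 8-byte reversed creation time.
// Reversing the time (MAX_VALUE - time) makes newer entries sort before older ones.
fun buildKey(ownerId: Long, createdAtMillis: Long): ByteArray =
    longToBytes(ownerId) + longToBytes(Long.MAX_VALUE - createdAtMillis)

// Compares keys byte by byte as unsigned values, as an ordered key store would.
fun compareKeys(a: ByteArray, b: ByteArray): Int {
    for (i in 0 until minOf(a.size, b.size)) {
        val diff = (a[i].toInt() and 0xFF) - (b[i].toInt() and 0xFF)
        if (diff != 0) return diff
    }
    return a.size - b.size
}
```

Because every key in this sketch is exactly 16 bytes, a scan for one owner can start at that owner's 8-byte prefix and will meet the newest entries first.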
38 changes: 24 additions & 14 deletions core/documentation/protobuf.md
@@ -1,24 +1,34 @@
# Protobuf Transportation

The encoding standard for [ProtoBuf V3](https://developers.google.com/protocol-buffers/) has been adopted for efficient
and compact transportation of data. Developed by Google, ProtoBuf is a widely adopted standard. Currently, only the
encoding standard has been adopted, and schema generation is yet to be implemented.
and compact transportation of data. Developed by Google, ProtoBuf is a widely adopted standard used across various
platforms and languages for serializing structured data. Currently, only the encoding standard has been implemented,
while schema generation is yet to be developed.

For a more in-depth understanding of how values are encoded, refer to the [ProtoBuf encoding documentation](https://developers.google.com/protocol-buffers/docs/encoding)
For a more in-depth understanding of how values are encoded, refer to
the [ProtoBuf encoding documentation](https://developers.google.com/protocol-buffers/docs/encoding), which provides
detailed insights on the encoding mechanisms and examples.

## Key Value pairs
## Key Value Pairs

A ProtoBuf message is built using key-value pairs, where the key contains a tag that identifies the encoded property and
a wire_type that indicates the type of value that was encoded. The value is encoded in the byte format for transport,
and the encoding format for each property type is documented in the [properties documentation](properties/properties.md).
A ProtoBuf message is constructed using key-value pairs. In this structure, the key consists of a tag that uniquely
identifies the encoded property, along with a wire type that specifies the type of value being encoded. The actual value
is then represented in byte format for efficient transport. The encoding format for each property type is thoroughly
documented in the [properties documentation](properties/properties.md), which serves as a reference for developers
looking to implement and understand the specifics of encoding different data types.
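
As a rough illustration of how a tag and a wire type share one key, the sketch below follows the layout described in the ProtoBuf encoding documentation: the wire type occupies the lowest three bits and the tag the remaining bits. The names `WireType`, `encodeKey`, and `decodeKey` are chosen for this example and are not taken from Maryk's code.

```kotlin
// Wire type IDs as listed in the ProtoBuf encoding documentation.
enum class WireType(val id: Int) {
    VAR_INT(0), BIT_64(1), LENGTH_DELIMITED(2), START_GROUP(3), END_GROUP(4), BIT_32(5)
}

// Combines tag and wire type: the tag is shifted left by 3 bits, the wire type fills the low 3 bits.
fun encodeKey(tag: Int, wireType: WireType): Int = (tag shl 3) or wireType.id

// Splits a key back into its tag and wire type.
fun decodeKey(key: Int): Pair<Int, WireType> {
    val wireType = WireType.values().first { it.id == (key and 0x7) }
    return Pair(key ushr 3, wireType)
}
```

For example, tag 1 with a VarInt value yields the key `0x08`, and tag 2 with a length-delimited value yields `0x12`. On the wire the key itself is written as a VarInt.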

## Wire Types

Maryk supports all wire types supported by ProtoBuf, including:
Maryk supports all wire types defined by ProtoBuf, including:

* VarInt: A variable integer used for numeric values that grow in size with the value.
* Length Delimited: Used for variable length values. The length of the bytes is preceded by the value.
It can also contain key-value pairs of embedded messages.
* 32 Bit: Used for values of 4 bytes.
* 64 Bit: Used for values of 8 bytes.
* Start Group / End Group: Not currently used and also deprecated in ProtoBuf.
* **VarInt**: A variable-length integer used for numeric values, allowing for efficient storage of small values while
accommodating larger integers without wasting space.
* **Length Delimited**: This type is utilized for values that can vary in length. The actual bytes of the value are
prefixed by a length field, making it versatile for use with strings and byte arrays. Additionally, it can encapsulate
key-value pairs of embedded messages.
* **32 Bit**: Specifically employed for values that are exactly 4 bytes in size, commonly used for fixed-width numerical
types.
* **64 Bit**: Designed for values that are exactly 8 bytes, typically used for large numerical types or higher-precision
floating-point numbers.
* **Start Group / End Group**: These wire types are currently not in use and are also deprecated in the ProtoBuf
specification. It is advisable to avoid using them in new implementations.
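
To make the VarInt wire type more concrete, here is a minimal sketch of the base-128 scheme from the ProtoBuf encoding documentation: each byte carries 7 bits of the value, least significant group first, and the high bit signals that more bytes follow. `encodeVarInt` and `decodeVarInt` are illustrative names, not Maryk's API.

```kotlin
// Encodes a value as a base-128 VarInt: 7 payload bits per byte, low groups first.
fun encodeVarInt(value: Long): ByteArray {
    val out = ArrayList<Byte>()
    var remaining = value
    // While more than 7 bits remain, emit a byte with the continuation bit set.
    while ((remaining and 0x7FL.inv()) != 0L) {
        out.add(((remaining and 0x7FL) or 0x80L).toByte())
        remaining = remaining ushr 7
    }
    out.add(remaining.toByte())
    return out.toByteArray()
}

// Decodes a VarInt from the start of the array, stopping at the first byte without the continuation bit.
fun decodeVarInt(bytes: ByteArray): Long {
    var result = 0L
    var shift = 0
    for (b in bytes) {
        result = result or ((b.toLong() and 0x7FL) shl shift)
        if ((b.toInt() and 0x80) == 0) break
        shift += 7
    }
    return result
}
```

The value 1 fits in a single byte, while 300 takes two bytes (`0xAC 0x02`), which is why small numbers are cheap to transport.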