Apex.Serialization

A high performance contract-less binary serializer capable of handling data trees or object graphs.

Suitable for realtime workloads where the serialized data will not persist for long, as most assembly changes will render the data format incompatible with older versions. Performance is optimized for throughput at the expense of initialization time.

Nuget Package

Using the latest package version is recommended. Version 1.3.4 is the last version that supports netstandard or .NET Framework targets.

Use Cases

Good

  • Caching of data that can be reproduced if lost or unable to be deserialized
  • Distributed processing where all workers are using the same runtime/application versions and are hosted on the same type of hardware

Bad

  • Any type of long term storage, such as save files or data archiving, since changes to schema, runtime, library versions, etc. could make deserialization no longer possible

Limitations

As the serialization is contract-less, the binary format produced depends on precise characteristics of the types serialized. Most changes to types, such as adding or removing fields, renaming types, or changing relationships between types will break compatibility with previously serialized data. Serializing and deserializing between different chip architectures and .NET runtimes is not supported.

For performance reasons, the serializer and deserializer use pointers and direct memory access. As a result, attempting to deserialize incompatible data will often crash the application immediately instead of throwing an exception.

NEVER deserialize data from an untrusted source.

Some types aren't supported:

  • Objects that use randomized hashing or other runtime-specific data to determine their behavior (including HashSet<>, Dictionary<,> and their immutable counterparts), unless you explicitly supply a comparer or key objects that do not use that randomization
  • Objects containing pointers or handles to unmanaged resources
  • BlockingCollection<> and types in System.Collections.Concurrent
  • Non-generic standard collections

The serializer also requires runtime code generation capabilities.

Migrating to version 2.x

Version 2 adds type whitelisting, which means no types can be serialized unless marked by calling Settings.MarkSerializable(Type | Func<Type, bool>). To restore the previous behavior for backwards compatibility you can simply pass a function that always returns true.
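As a minimal sketch of the whitelisting described above, assuming the predicate overload of MarkSerializable named in this section returns the Settings instance (as the generic form in the Usage section does), the first call restores the permissive 1.x behavior and the second restricts serialization to a single namespace. The namespace MyApp.Messages is hypothetical.

```csharp
using Apex.Serialization;

// Restores the pre-2.x behavior: the predicate accepts every type.
var allowAll = new Settings().MarkSerializable(_ => true);

// A narrower alternative: accept only types from one namespace.
// "MyApp.Messages" is a hypothetical namespace used for illustration.
var allowMessages = new Settings()
    .MarkSerializable(t => t.Namespace == "MyApp.Messages");
```

The permissive form is meant only for backwards compatibility; where practical, prefer the narrowest predicate that covers your types.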

Usage

Serialization

var obj = new ClassToSerializeType();
var binarySerializer = Binary.Create(new Settings().MarkSerializable<ClassToSerializeType>());
binarySerializer.Write(obj, outputStream);

Deserialization

var obj = binarySerializer.Read<ClassToSerializeType>(inputStream);

Class instances are not thread-safe. Static methods are thread-safe unless otherwise noted in their documentation.

Always reuse serializer instances when possible, as the instance caches a lot of data to improve performance when repeatedly serializing or deserializing objects. Since the instances are not thread-safe, you should use an object pool or some other method to ensure that only one thread uses an instance at a time.
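One way to follow the pooling advice above is a small rent/return pool. The sketch below is not part of Apex.Serialization: SimplePool is a hypothetical helper built on ConcurrentBag from the standard library, and the commented usage assumes that Binary.Create returns a type named IBinary.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical helper: hands out an idle instance if one exists,
// otherwise builds a new one with the supplied factory.
public sealed class SimplePool<T> where T : class
{
    private readonly ConcurrentBag<T> _idle = new ConcurrentBag<T>();
    private readonly Func<T> _factory;

    public SimplePool(Func<T> factory)
    {
        _factory = factory ?? throw new ArgumentNullException(nameof(factory));
    }

    // The caller has exclusive use of the instance until it calls Return.
    public T Rent() => _idle.TryTake(out var item) ? item : _factory();

    // Makes the instance, including its warmed-up caches, available again.
    public void Return(T item) => _idle.Add(item);
}

// Assumed usage with this library (IBinary is an assumption):
//   var pool = new SimplePool<IBinary>(() => Binary.Create(settings));
//   var serializer = pool.Rent();
//   try { serializer.Write(obj, outputStream); }
//   finally { pool.Return(serializer); }
```

The pool is unbounded, so it keeps as many instances as the peak number of concurrent users. That is a reasonable trade when the cached state is the expensive part, but a bounded pool may suit memory-constrained hosts better.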

Fields with the [NonSerialized] attribute will not be serialized or deserialized.
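The following sketch shows where the attribute goes. UserSession is a hypothetical type. The standard System.NonSerializedAttribute can only target fields, so an auto-property needs the field: specifier to reach its backing field; that the serializer honors the attribute on compiler-generated backing fields is an assumption here.

```csharp
using System;

// Hypothetical type used for illustration.
public sealed class UserSession
{
    public string UserName = "";

    // Plain field: the attribute is applied directly.
    [NonSerialized]
    public int CachedHash;

    // Auto-property: "field:" applies the attribute to the
    // compiler-generated backing field instead of the property.
    [field: NonSerialized]
    public DateTime LastAccessUtc { get; set; }
}
```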

Settings

You must pass a Settings object to Binary.Create that lets you choose:

  • between tree or graph serialization (graph serialization is required for cases where you have a cyclical reference or need to maintain object identity)
  • whether functions should be serialized
  • whether serialization hooks should be called (any methods with the [AfterDeserialization] attribute will be called after the object graph is completely deserialized)
  • whether to disable inlining (reduces startup time and stack frame sizes at the cost of throughput)
  • whether to disable flattening of class hierarchies (has an effect similar to disabling inlining)
  • whether to emit autogenerated type IDs, which will prevent incompatible data from being deserialized at the cost of some overhead per object
  • custom serialization actions
  • what types are allowed to be serialized
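To illustrate the serialization hook option from the list above, here is a hedged sketch of a round trip in which a derived field is rebuilt after deserialization. SupportSerializationHooks and [AfterDeserialization] come from this README. Rectangle is a hypothetical type, and the sketch assumes that a public parameterless instance method is an acceptable hook and that the attribute lives in the Apex.Serialization namespace.

```csharp
using System;
using System.IO;
using Apex.Serialization;

// Hooks are only called when enabled on the Settings instance.
var settings = new Settings { SupportSerializationHooks = true }
    .MarkSerializable<Rectangle>();
var serializer = Binary.Create(settings);

using var stream = new MemoryStream();
serializer.Write(new Rectangle { Width = 3, Height = 4 }, stream);
stream.Position = 0; // rewind before reading back

var copy = serializer.Read<Rectangle>(stream);
// Area is not serialized; the hook is expected to rebuild it.
Console.WriteLine(copy.Area);

// Hypothetical type used for illustration.
public sealed class Rectangle
{
    public int Width;
    public int Height;

    [NonSerialized]
    public int Area; // derived value, skipped by the serializer

    [AfterDeserialization]
    public void RecomputeArea() => Area = Width * Height;
}
```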

Performance

Performance is a feature! Apex.Serialization is an extremely fast binary serializer. See benchmarks for comparisons with other fast binary serializers.

Custom serialization/deserialization

You can define custom serialization and deserialization simply by calling

Settings.RegisterCustomSerializer<CustomType>(writeAction, readAction)

In order for custom serialization to be used, the SupportSerializationHooks property on the Settings used to instantiate the Binary class must also be set to true.

Both the write Action and read Action will be called with an instance of the type being serialized/deserialized and a BinaryWriter/BinaryReader interface which exposes three methods:

        void Write(string input);
        void Write<T>(T value) where T : struct;
        void WriteObject<T>(T value);

The Actions can optionally take a third parameter for context, which is set on the Binary instance with SetCustomHookContext.

The reader has corresponding methods for reading back the values. The behavior of the generic Write/Read methods when passed a non-primitive is undefined. If multiple custom serializers match an object, they will all be called in the order in which they were registered.
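A custom serializer registration might look like the sketch below. It makes several assumptions that this README does not confirm: that RegisterCustomSerializer and MarkSerializable can be called on a Settings instance, and that the reader's counterpart to Write<T> is named Read<T>. Temperature is a hypothetical type.

```csharp
using System.IO;
using Apex.Serialization;

// Custom serializers are only used when hooks are enabled.
var settings = new Settings { SupportSerializationHooks = true };
settings.MarkSerializable<Temperature>();
settings.RegisterCustomSerializer<Temperature>(
    // Write action: emit a single primitive value.
    (value, writer) => writer.Write(value.Kelvin),
    // Read action: read values back in the order they were written.
    // Read<T> is the assumed name of the reader's generic method.
    (value, reader) => value.Kelvin = reader.Read<double>());

var serializer = Binary.Create(settings);
using var stream = new MemoryStream();
serializer.Write(new Temperature { Kelvin = 293.15 }, stream);
stream.Position = 0; // rewind before reading back
var copy = serializer.Read<Temperature>(stream);

// Hypothetical type used for illustration.
public sealed class Temperature
{
    public double Kelvin;
}
```

Only primitives go through the generic Write here, in line with the note above that passing a non-primitive to it is undefined.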

Tips for best performance

  • Use sealed type declarations when possible - this allows the serializer to skip writing any type information
  • Create empty constructors (or constructors that assign to every field from parameters matching the field types) for classes that will be serialized/deserialized a lot (only helps if there's no inline field initialization as well)
  • Use different serializer instances for different workloads (e.g. one for serializing a few objects at a time and one for large graphs), and pool serializer instances
  • Don't inherit from standard collections
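The type-shape tips above can be sketched as follows. PriceTick and PriceHistory are hypothetical types; the comments restate the tips and are not measurements.

```csharp
// Sealed, so per the first tip the serializer can skip writing
// type information for values of this type.
public sealed class PriceTick
{
    public long Timestamp;
    public double Price;

    // Empty constructor and no inline field initializers (second tip).
    public PriceTick() { }
}

// Holds a collection as a field instead of inheriting from
// List<PriceTick> (last tip).
public sealed class PriceHistory
{
    public PriceTick[] Ticks;

    public PriceHistory() { }
}
```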