
New json serializer serializes empty complex types as {}, which fails to deserialize with new deserializer #2235

Closed
almostchristian opened this issue Sep 13, 2022 · 8 comments

@almostchristian
Contributor

The new System.Text.Json serializer serializes empty complex types as empty JSON objects ({}), which then fail to deserialize with the new System.Text.Json deserializer. From what I understand, the new deserializer is supposed to fail on empty objects and empty lists, so the serializer should have treated these as null, as the older serializer did.

To Reproduce

using System.Text.Json;
using Hl7.Fhir.Model;
using Hl7.Fhir.Serialization;

var serializerOptions = new JsonSerializerOptions().ForFhir();

var patient = new Patient
{
   MaritalStatus = new CodeableConcept(),
};

// {"resourceType":"Patient","maritalStatus":{}}
var patientJson = JsonSerializer.Serialize(patient, serializerOptions);

// throws Hl7.Fhir.Serialization.DeserializationFailedException
var deserialized = JsonSerializer.Deserialize<Patient>(patientJson, serializerOptions);

Version used:

  • FHIR Version: R4B
  • Version: 4.2.1
@ewoutkramer
Member

ewoutkramer commented Sep 13, 2022

That's right. I have documented this behaviour here: https://docs.fire.ly/projects/Firely-NET-SDK/parsing/system-text-json-serialization.html#serialization-with-pocos-and-system-text-json. The serializer does not do any validation and can easily be misused to generate incorrect JSON when pushed to do so.

The serializer will serialize what you have added to the POCO, in this case an empty CodeableConcept. Trying to avoid serializing empty objects would cost extra computation time, e.g. to first traverse the children to see whether the object is empty or not, before starting to serialize the object. Basically, making serialization slower in the general case to fix this specific case. The idea is that it is easier (and cheaper) for the caller of the serializer to avoid adding empty objects in the first place...
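For illustration, a minimal sketch of that caller-side guard (not code from the SDK); sourceCodings is a hypothetical list of (System, Code) pairs being mapped from, and the point is simply to skip the assignment when there is nothing to put in the element:

// Minimal sketch of the caller-side guard suggested above. sourceCodings is a
// hypothetical list of (System, Code) pairs being mapped from.
// Requires: using System.Linq; using Hl7.Fhir.Model;
var patient = new Patient();

if (sourceCodings.Count > 0)
{
    // Only create the CodeableConcept when there is actual content for it.
    patient.MaritalStatus = new CodeableConcept
    {
        Coding = sourceCodings.Select(c => new Coding(c.System, c.Code)).ToList()
    };
}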

But I am all ears to hear why this is a wrong assumption.

@ewoutkramer
Member

ewoutkramer commented Sep 13, 2022

I have been thinking about this a bit. The problem of optimizing for empty objects is that we only know an object is not empty when we hit the first property that is set (which could possibly be buried deeply in nested "almost empty" objects). Whether we ever find that first property or not, we will already have written the opening brackets to the Utf8JsonWriter for each (nested) object by then.

A possible solution would be to delay writing the opening brackets (basically caching them up in some kind of serialization state) until we hit the first property, and then flush out all the brackets. If we never encounter a single property, we just clear that cached set of brackets.
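As a rough sketch of that idea (not the SDK's actual implementation): queue the object starts in a small piece of serialization state and only flush them to the Utf8JsonWriter once a real value shows up; if none ever does, discard the queued starts.

// Rough sketch of the "delayed opening brackets" idea; not the SDK's actual code.
// Object starts are queued instead of written, and are only flushed to the
// Utf8JsonWriter when the first real property value appears. An object that
// stays empty never reaches the output, because its queued start is discarded.
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.Json;

class DelayedStartWriter
{
    private readonly Utf8JsonWriter _writer;
    // Starts of objects we have entered but not yet written (null = unnamed/root object).
    private readonly Stack<string?> _pending = new();

    public DelayedStartWriter(Utf8JsonWriter writer) => _writer = writer;

    public void BeginObject(string? propertyName = null) => _pending.Push(propertyName);

    public void WriteString(string propertyName, string value)
    {
        FlushPending();                        // the enclosing objects are not empty after all
        _writer.WriteString(propertyName, value);
    }

    public void EndObject()
    {
        if (_pending.Count > 0)
            _pending.Pop();                    // the object stayed empty: never emit it
        else
            _writer.WriteEndObject();          // the object had content: close it normally
    }

    private void FlushPending()
    {
        // Emit the queued starts oldest-first (a Stack enumerates newest-first).
        foreach (var name in _pending.Reverse())
        {
            if (name is null) _writer.WriteStartObject();
            else _writer.WriteStartObject(name);
        }
        _pending.Clear();
    }
}

With that in place, the repro from above would come out without the empty maritalStatus:

// Produces {"resourceType":"Patient"} instead of {"resourceType":"Patient","maritalStatus":{}}
using var stream = new MemoryStream();
using var writer = new Utf8JsonWriter(stream);
var delayed = new DelayedStartWriter(writer);

delayed.BeginObject();                  // root object
delayed.WriteString("resourceType", "Patient");
delayed.BeginObject("maritalStatus");   // the empty CodeableConcept
delayed.EndObject();                    // discarded, nothing was written for it
delayed.EndObject();                    // closes the root
writer.Flush();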

@almostchristian
Contributor Author

Yeah, I am reading the FhirJsonPocoSerializer, and I think there may be places in it where empty arrays and objects could be avoided, e.g. by using writer.Reset() as you are alluding to. Not sure whether the performance cost is an issue here, but we can always write benchmarks to test this.

We found this issue because, in our web application, a developer wrote code during development that modified a resource before saving it to the document DB. Naturally, it serialized to the DB just fine but then failed during deserialization.

@ghost

ghost commented Nov 12, 2022

Why not set DefaultIgnoreCondition in the serializerOptions to WhenWritingDefault? An object with no properties set is a default object.

https://learn.microsoft.com/en-us/dotnet/api/system.text.json.serialization.jsonignorecondition?view=net-6.0
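For reference, this is roughly what that suggestion looks like on plain System.Text.Json options; WhenWritingDefault skips properties whose value equals the type's default (null for reference types, 0/false/etc. for value types):

// The suggested setting on plain System.Text.Json options.
using System.Text.Json;
using System.Text.Json.Serialization;

var options = new JsonSerializerOptions
{
    DefaultIgnoreCondition = JsonIgnoreCondition.WhenWritingDefault
};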


@ewoutkramer
Member

Though this looks nice, I think this is only applicable when you do the "built-in" object-to-JSON serialization. FHIR JSON serialization is complex enough that we wrote our own serializer (using the Utf8JsonWriter directly); I cannot imagine this setting applies to that lower-level API.


@ewoutkramer
Member

So, to finalize this discussion:

On the serializer

It is a feature that the serializer is able to serialize incorrect data, for three reasons:

  • The POCO model can contain incorrect data by design (e.g. unrecognized enum values, incorrect date values), so we don't lose information. It allows you to parse incorrect data, inspect the errors, and correct them. At the very least, we can roundtrip these kinds of errors, which is nice if you have a database of incorrect data.
  • Adding validation to serialization would mean slowing down the serializer. If you really want that, you can run the attribute validation yourself before serializing (see the sketch below this section).
  • Serialization is not supposed to fail or throw exceptions; users simply do not expect that to happen.

This behaviour is also documented.
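As a sketch of the "validate before you serialize" route mentioned above: the POCOs carry validation attributes, and the SDK has an attribute-validation helper to run them. The DotNetAttributeValidation class and TryValidate call below are assumptions about that helper; verify the exact class and signature for your SDK version.

// Hedged sketch of running the POCO attribute validation before serializing.
// DotNetAttributeValidation and the TryValidate signature are assumptions;
// check the validation helper shipped with your SDK version.
using System;
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;
using Hl7.Fhir.Validation;

var errors = new List<ValidationResult>();
if (!DotNetAttributeValidation.TryValidate(patient, errors, true))
{
    // Fix (or reject) the problems before handing the resource to the serializer.
    foreach (var error in errors)
        Console.WriteLine(error.ErrorMessage);
}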

On the parser

  • The FHIR spec states on the XML and JSON pages that empty objects are not allowed, so this is something we should detect.
  • The old parsers could be tuned to be strict or lenient. In the latter case, empty arrays are acceptable, so it is easy to fix the problem of not being able to parse the serialized JSON by using a lenient parser.
  • This was also a point of contention: what if we don't want it to be lenient for other errors? Your "lenient" is not my "lenient". It proved hard to find a sweet spot.
  • So, the newer parsers now detect everything that deviates from the spec in a syntactic way (we're not a full profile validator, just catching the basic structural errors). The issues found are now coded. The parser will continue parsing, even if it finds issues. The returned exception (DeserializationFailedException) contains a PartialResult property that holds all the data we were able to store in the POCO, regardless of the issues found.
  • In the new parser, the "empty array" error has its own code, so you can choose to use the PartialResult if this is the only error that occurs. This way, you can determine exactly how "lenient" you want the parsing to be (see the sketch after this list).
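A sketch of that "pick your own leniency" pattern follows. PartialResult is named above; the Exceptions collection and the ErrorCode member are assumptions about DeserializationFailedException to verify against your SDK version, and the actual code string for the empty-object/array issue is left as a placeholder.

// Hedged sketch: fall back to the partially parsed POCO when the only issues
// found are ones you decide to tolerate. The Exceptions collection and the
// ErrorCode member are assumptions; the tolerated code is a placeholder.
using System.Linq;
using System.Text.Json;
using Hl7.Fhir.Model;
using Hl7.Fhir.Serialization;

Patient? DeserializeLeniently(string json, JsonSerializerOptions fhirOptions)
{
    try
    {
        return JsonSerializer.Deserialize<Patient>(json, fhirOptions);
    }
    catch (DeserializationFailedException e)
    {
        // Tolerate only the coded issue(s) you consider harmless, e.g. the
        // "empty object/array" code for your SDK version (placeholder below).
        bool onlyTolerable = e.Exceptions.All(issue =>
            issue.ErrorCode == "<empty object/array code>");

        if (onlyTolerable)
            return e.PartialResult as Patient;

        throw;
    }
}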

So, I think we have all the ingredients to handle these situations. The only thing missing is that it's not obvious that all of this is possible. It's documented, but the API does not make it obvious or easy to ignore specific errors.

What I will do is create a new feature request for the next update of the SDK to improve the interface so this becomes easy to do.

@ewoutkramer
Member

See also:

I'd love to get feedback on the proposal before we implement it, to see if it fits your use case!
