Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DateTime is serialized as string instead of binary by DataContractSerializer with XmlDictionaryWrite.CreateBinaryWriter #61528

Open
sschmieta-qontigo opened this issue Nov 12, 2021 · 5 comments
Assignees
Milestone

Comments

@sschmieta-qontigo
Copy link

Description

The WCF binary XML format written by XmlDictionaryWriter.CreateBinaryWriter allows for a binary representation of DateTime objects that is shorter and faster to deserialize than the standard string presentation, e.g. 2021-01-01T00:00:00. The .NET Framework implementation uses the binary format everywhere except in XML attributes. The .NET Core implementation, on the other hand, uses it when serializing DateTime arrays (DataTime[]), but not for lists or object members, e.g. List<DateTime>. That leads to much larger blobs when the serialized objects contain many DateTimes and, more importantly, significantly longer deserialization times.

Reproduction Steps

The following unit test that can be added to src/libraries/System.Runtime.Serialization.Xml/tests/DataContractSerializer.cs demonstrates the issue by binary-serializing a single DateTime object.

    [Fact]
    public static void DCS_BinarySerializationOfDateTime()
    {
        DateTime dateTime = DateTime.Parse("2021-01-01");
        MemoryStream ms = new();
        DataContractSerializer dcs = new(dateTime.GetType());
        using (XmlDictionaryWriter writer = XmlDictionaryWriter.CreateBinaryWriter(ms, null, null, ownsStream: true))
            dcs.WriteObject(writer, dateTime);
        var serializedBytes = ms.ToArray();
        Assert.Equal(72, serializedBytes.Length);
    }

Expected behavior

The binary serialization should be 72 bytes long, matching .NET Framework 4.8, and the above unit test should pass.

Actual behavior

The binary serialization is 84 bytes long, using the textual representation of DateTime, and the unit test fails.

Regression?

Yes, this works in .NET Framework 4.7.2 and 4.8. Probably earlier versions as well.
The bug has been presented in all versions of .NET Core as far as I can tell.

Known Workarounds

No response

Configuration

Confirmed in .NET Core 3.1 and .NET 5.0 on x64 Windows. By code inspection the same issue is present in .NET 6.0 and also versions before 3.1. As far as I can tell, the issue is not specific to these configurations.

Other information

The bug has been presented in all versions of .NET Core as far as I can tell. The initial import in the "runtime" git repo has the defect. I traced it to a difference in the implementation of writeDateTime(DateTime value) in src/libraries/System.Private.DataContractSerialization/src/System/Runtime/Serialization/XmlWriterDelegator.cs. The .NET Core version is

         internal virtual void WriteDateTime(DateTime value)
         {
             WriteString(XmlConvert.ToString(value, XmlDateTimeSerializationMode.RoundtripKind));
         }

which forces the use of the textual DateTime representation. The .NET Framework sources have

         internal virtual void WriteDateTime(DateTime value)
         {
            writer.WriteValue(value);
         }

which delegate to the inner writer object which will use the binary representation in most cases.

@dotnet-issue-labeler dotnet-issue-labeler bot added area-Serialization untriaged New issue has not been triaged by the area owner labels Nov 12, 2021
@HongGit HongGit removed the untriaged New issue has not been triaged by the area owner label Dec 2, 2021
@HongGit HongGit added this to the 6.0.x milestone Dec 2, 2021
@sschmieta-qontigo
Copy link
Author

@StephenMolloy , any update on the schedule for this ticket? Which 6.0.x release will include the fix?

@sschmieta-qontigo
Copy link
Author

Can this please be included in the next 6.0.x release.?The objects we are working with are time-series of yield curves and volatility cubes which contain tens of thousands of DateTime objects. This bug is causing deserialization to take much longer in .NET vs .NET Framework and is causing a serious performance regression for our Azure compute grid when run under .NET Core.

@danmoseley
Copy link
Member

Cc @HongGit @StephenMolloy for comment above.

@StephenMolloy
Copy link
Member

I believe this was fixed with our alignment of DCS & friends with .Net 4.8 in net7.0-rc1 with #71752.

Since this was not a regression from a previous version of .Net Core, it's not a slam dunk that we'll be able to service this issue in 6.0.x. But the fix here is a very small part of #71752 and relatively low risk. So with vocal customer and business justification, we would like to try to bring it into 6.0 servicing. Can folks tell us more about how this impacts them?

@sschmieta-qontigo
Copy link
Author

@StephenMolloy, as I mentioned in the initial bug statement, this is a regression against .NET Framework 4.7.1 that has been present in all .NET Core versions so far. It is causing a serious performance regression when running a .NET Core version of our Azure compute grid which is how we found this bug. We are currently targeting the 6.0 LTS version for our port (up from 3.1 when I filed the bug), so a fix in 6.0.x would be much appreciated.

@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Sep 14, 2022
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Oct 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants