Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[API Proposal]: DataContract API's to enable external Schema Import/Export package #72242

Closed
StephenMolloy opened this issue Jul 15, 2022 · 5 comments
Labels
api-approved API was approved in API review, it can be implemented area-Serialization blocking Marks issues that we want to fast track in order to unblock other important work
Milestone

Comments

@StephenMolloy
Copy link
Member

StephenMolloy commented Jul 15, 2022

Background and motivation

We're trying to bring DCS and friends back to par with .Net 4.8 in this release. One of the capabilities of the DCS library in .Net 4.8 is to generate XSD schemas or import XSD schemas into usable code. This functionality is used to support WCF and svcutil. The code that handles schema importing however, had a dependency on System.CodeDom. After some extended discussion around the goals and motivations, we decided to take a two-pronged approach here. The existing DCS assembly in the runtime will be recieving updates to align with 4.8 and to support this schema work under the covers... but the top-level functionality (and public surface area) of the schema support will live in a separate package in dotnet/runtime that is not part of the runtime. (Like the System.CodeDom package it depends on.) See the PR #71752 and other half of the API review #72243.

Schema support in 4.8 relied heavily on DCS internals, which wasn't a problem since those and CodeDom were all there. To do this schema work in an external package though, we need to expose some API points from the System.Runtime.Serialization namespace. I've tried to keep these to a minimum, and at a somewhat logical and abstract level so they don't look too contrived or out of place for a public API. But the intent here is to support what we need for XSD schema processing - not to open up the internals of DCS anymore than we need to.

Classes being exposed by this proposal are as follows:

  • DataContractSet
  • DataContract
  • XmlDataContract
  • DataMember

Additionally, a new interface is being added, and the DataContractJsonSerializer is receiving an extension method that the Xml DCS already had for supplying surrogate providers.

API Proposal

namespace System.Runtime.Serialization
{
    public sealed class DataContractSet
    {
        [RequiresUnreferencedCode("")]
        public DataContractSet(DataContractSet dataContractSet);
        public DataContractSet(ISerializationSurrogateProvider? dataContractSurrogate, ICollection<Type>? referencedTypes, ICollection<Type>? referencedCollectionTypes);

        public Dictionary<XmlQualifiedName, DataContract> Contracts { get; }
        public Dictionary<XmlQualifiedName, DataContract>? KnownTypesForObject { get; }
        public Dictionary<DataContract, object> ProcessedContracts { get; }
        public Hashtable SurrogateData { get; }

        [RequiresUnreferencedCode("")]
        public void Add(Type type);
        [RequiresUnreferencedCode("")]
        public void ExportSchemaSet(XmlSchemaSet schemaSet);
        [RequiresUnreferencedCode("")]
        public DataContract GetDataContract(Type type);
        [RequiresUnreferencedCode("")]
        public DataContract? GetDataContract(XmlQualifiedName key);
        [RequiresUnreferencedCode("")]
        public Type? GetReferencedType(XmlQualifiedName stableName, DataContract dataContract, out DataContract? referencedContract, out object[]? genericParameters, bool? supportGenericTypes = null);
        [RequiresUnreferencedCode("")]
        public void ImportSchemaSet(XmlSchemaSet schemaSet, ICollection<XmlQualifiedName>? typeNames, bool importXmlDataType);
        [RequiresUnreferencedCode("")]
        public IList<XmlQualifiedName> ImportSchemaSet(XmlSchemaSet schemaSet, ICollection<XmlSchemaElement> elements, bool importXmlDataType);
    }

    public abstract class DataContract
    {
        public virtual bool IsBuiltInDataContract { get; }
        [DynamicallyAccessedMembers(*)]
        public virtual Type UnderlyingType { get; }
        public virtual XmlQualifiedName StableName { get; }
        public virtual Type OriginalUnderlyingType { get; }
        public virtual List<DataMember>? Members { get; }
        public virtual Dictionary<XmlQualifiedName, DataContract>? KnownDataContracts { get; set; }
        public virtual bool IsValueType { get; }
        public virtual bool IsReference { get; }
        public virtual bool IsISerializable { get; }
        public virtual XmlDictionaryString? TopLevelElementName { get; }
        public virtual XmlDictionaryString? TopLevelElementNamespace { get; }
        public virtual DataContract? BaseContract { get; }
        public virtual string? ContractType { get; }

        public static string EncodeLocalName(string localName);
        [RequiresUnreferencedCode("")]
        public static DataContract? GetBuiltInDataContract(string name, string ns);
        [RequiresUnreferencedCode("")]
        public static DataContract GetDataContract(Type type);
        [RequiresUnreferencedCode("")]
        public static XmlQualifiedName GetStableName(Type type);
        [RequiresUnreferencedCode("")]
        public static Type GetSurrogateType(ISerializationSurrogateProvider surrogateProvider, Type type);
        [RequiresUnreferencedCode("")]
        public static bool IsTypeSerializable(Type type);
        [RequiresUnreferencedCode("")]
        public virtual XmlQualifiedName GetArrayTypeName(bool isNullable);
        public virtual bool IsKeyValue(out string? keyName, out string? valueName, out string? itemName);
    }

    public sealed class XmlDataContract : DataContract
    {
        public bool HasRoot { get; }
        public bool IsAnonymous { get; }
        public bool IsTopLevelElementNullable { get; }
        public bool IsTypeDefinedOnImport { get; set; }
        public bool IsValueType { get; set; }
        public XmlSchemaType? XsdType { get; }
    }

    public sealed class DataMember
    {
        public bool EmitDefaultValue { get; }
        public bool IsNullable { get; }
        public bool IsRequired { get; }
        public DataContract MemberTypeContract { get; }
        public string Name { get; }
        public long Order { get; }
    }

    public interface ISerializationExtendedSurrogateProvider : ISerializationSurrogateProvider
    {
        object? GetCustomDataToExport(MemberInfo memberInfo, Type dataContractType);
        object? GetCustomDataToExport(Type clrType, Type dataContractType);
        void GetKnownCustomDataTypes(Collection<Type> customDataTypes);
        Type? GetReferencedTypeOnImport(string typeName, string typeNamespace, object? customData);
    }
}

namespace System.Runtime.Serialization.Json
{
    public static class DataContractJsonSerializerExtensions
    {
        public static System.Runtime.Serialization.ISerializationSurrogateProvider? GetSerializationSurrogateProvider(this DataContractJsonSerializer serializer);
        public static void SetSerializationSurrogateProvider(this DataContractJsonSerializer serializer, System.Runtime.Serialization.ISerializationSurrogateProvider? provider);
    }
}

API Usage

I'm not sure where to begin on this one. The schema support package needs to reason about and manage DataContracts, and these API's allow it to do so without exposing all the internals.

Alternative Designs

No response

Risks

These are largely methods that existed as-is or very close to how they are in 4.8. But they were internal before, not public. I do believe they reasonably support what we need. I expect there will be quite a bit of discussion though as it is a large and publicly un-proven surface we are opening up.

@StephenMolloy StephenMolloy added the api-suggestion Early API idea and discussion, it is NOT ready for implementation label Jul 15, 2022
@ghost ghost added the untriaged New issue has not been triaged by the area owner label Jul 15, 2022
@StephenMolloy StephenMolloy added this to the 7.0.0 milestone Jul 15, 2022
@ghost ghost removed the untriaged New issue has not been triaged by the area owner label Jul 15, 2022
@teo-tsirpanis
Copy link
Contributor

Are these APIs expected to be used outside the BCL? If not, perhaps they should be put in an Internal namespace, to avoid confusion with the similarly named DataContract and DataMember attributes.

Another idea I thought is to make these APIs public on the System.Private.DataContractSerialization assembly only, and reference them from the OOB package through it, avoiding adding new APIs on the public user-facing surface.

@HongGit HongGit added api-ready-for-review API is ready for review, it is NOT ready for implementation blocking Marks issues that we want to fast track in order to unblock other important work labels Jul 25, 2022
@bartonjs
Copy link
Member

bartonjs commented Jul 28, 2022

Video

  • Since these types aren't expected to be used by callers, other than tooling, let's avoid a popular namespace.
    • Moved to System.Runtime.Serialization.Schema, except the ISerializationSurrogateProvider2 interface.
  • DataContractSet.ImportSchemaSet: Be List<T> instead of IList<T> if the implementation always returns that concrete type.
    • Or ReadOnlyCollection<T> if the caller isn't supposed to mutate the data.
  • DataContractSet.ImportSchemaSet's elements parameter should be IEnumerable<T> if possible.
  • DataContractSet's "surrogate" ctor should take the inputs as IEnumerable<T> instead of ICollection<T> if possible.
  • DataContract.KnownDataContracts should not be settable, if possible
  • DataContract.StableName doesn't feel like an obvious name (or a name consistent with existing tech). XmlName seems to be a better fit (or XmlQualifiedName if you prefer)
    • DataContract.GetStableName should match.
  • DataContract.EncodeLocalName doesn't feel like it lives on the right type. Consider removing.
  • DataContract.IsKeyValue => DataContract.IsKeyedCollection or DataContract.IsDictionary or DataContract.IsDictionaryLike
    • Check the nullable annotations on this member, e.g. can all three gain a [NotNullWhen]?
  • Is Collection<Type> the right type for ISerializationExtendedSurrogateProvider.GetKnownCustomDataTypes?
    • ICollection, IEnumerable, etc.
  • ISerializationExtendedSurrogateProvider.GetCustomDataToExport's clrType should be runtimeType instead.
  • ISerializationExtendedSurrogateProvider => ISerializationSurrogateProvider2
  • DataContractJsonSerializerExtensions: These members should go on DataContractJsonSerializer directly
  • If you want to copy (NOT move) the DataContractSerializerExtension extension methods to DataContractSerializer, go ahead.
namespace System.Runtime.Serialization.Schema
{
    public sealed class DataContractSet
    {
        [RequiresUnreferencedCode("")]
        public DataContractSet(DataContractSet dataContractSet);

        public DataContractSet(
            ISerializationSurrogateProvider? dataContractSurrogate,
            ICollection<Type>? referencedTypes,
            ICollection<Type>? referencedCollectionTypes);

        public Dictionary<XmlQualifiedName, DataContract> Contracts { get; }
        public Dictionary<XmlQualifiedName, DataContract>? KnownTypesForObject { get; }
        public Dictionary<DataContract, object> ProcessedContracts { get; }
        public Hashtable SurrogateData { get; }

        [RequiresUnreferencedCode("")]
        public void Add(Type type);
        [RequiresUnreferencedCode("")]
        public void ExportSchemaSet(XmlSchemaSet schemaSet);
        [RequiresUnreferencedCode("")]
        public DataContract GetDataContract(Type type);
        [RequiresUnreferencedCode("")]
        public DataContract? GetDataContract(XmlQualifiedName key);
        [RequiresUnreferencedCode("")]
        public Type? GetReferencedType(
            XmlQualifiedName stableName,
            DataContract dataContract,
            out DataContract? referencedContract,
            out object[]? genericParameters,
            bool? supportGenericTypes = null);
        [RequiresUnreferencedCode("")]
        public void ImportSchemaSet(
            XmlSchemaSet schemaSet,
            ICollection<XmlQualifiedName>? typeNames,
            bool importXmlDataType);
        [RequiresUnreferencedCode("")]
        public IList<XmlQualifiedName> ImportSchemaSet(
            XmlSchemaSet schemaSet,
            ICollection<XmlSchemaElement> elements,
            bool importXmlDataType);
    }

    public abstract class DataContract
    {
        public virtual bool IsBuiltInDataContract { get; }
        [DynamicallyAccessedMembers(*)]
        public virtual Type UnderlyingType { get; }
        public virtual XmlQualifiedName XmlName { get; }
        public virtual Type OriginalUnderlyingType { get; }
        public virtual List<DataMember>? Members { get; }
        public virtual Dictionary<XmlQualifiedName, DataContract>? KnownDataContracts { get; set; }
        public virtual bool IsValueType { get; }
        public virtual bool IsReference { get; }
        public virtual bool IsISerializable { get; }
        public virtual XmlDictionaryString? TopLevelElementName { get; }
        public virtual XmlDictionaryString? TopLevelElementNamespace { get; }
        public virtual DataContract? BaseContract { get; }
        public virtual string? ContractType { get; }

        public static string EncodeLocalName(string localName);
        [RequiresUnreferencedCode("")]
        public static DataContract? GetBuiltInDataContract(string name, string ns);
        [RequiresUnreferencedCode("")]
        public static DataContract GetDataContract(Type type);
        [RequiresUnreferencedCode("")]
        public static XmlQualifiedName GetXmlName(Type type);
        [RequiresUnreferencedCode("")]
        public static Type GetSurrogateType(ISerializationSurrogateProvider surrogateProvider, Type type);
        [RequiresUnreferencedCode("")]
        public static bool IsTypeSerializable(Type type);
        [RequiresUnreferencedCode("")]
        public virtual XmlQualifiedName GetArrayTypeName(bool isNullable);
        public virtual bool IsDictionaryLike(out string? keyName, out string? valueName, out string? itemName);
    }

    public sealed class XmlDataContract : DataContract
    {
        public bool HasRoot { get; }
        public bool IsAnonymous { get; }
        public bool IsTopLevelElementNullable { get; }
        public bool IsTypeDefinedOnImport { get; set; }
        public bool IsValueType { get; set; }
        public XmlSchemaType? XsdType { get; }
    }

    public sealed class DataMember
    {
        public bool EmitDefaultValue { get; }
        public bool IsNullable { get; }
        public bool IsRequired { get; }
        public DataContract MemberTypeContract { get; }
        public string Name { get; }
        public long Order { get; }
    }
}

namespace System.Runtime.Serialization
{
    public interface ISerializationSurrogateProvider2 : ISerializationSurrogateProvider
    {
        object? GetCustomDataToExport(MemberInfo memberInfo, Type dataContractType);
        object? GetCustomDataToExport(Type runtimeType, Type dataContractType);
        void GetKnownCustomDataTypes(Collection<Type> customDataTypes);
        Type? GetReferencedTypeOnImport(string typeName, string typeNamespace, object? customData);
    }
}

namespace System.Runtime.Serialization.Json
{
    public partial class DataContractJsonSerializer
    {
        public System.Runtime.Serialization.ISerializationSurrogateProvider? GetSerializationSurrogateProvider();
        public void SetSerializationSurrogateProvider(System.Runtime.Serialization.ISerializationSurrogateProvider? provider);
    }
}

@bartonjs bartonjs added api-approved API was approved in API review, it can be implemented and removed api-ready-for-review API is ready for review, it is NOT ready for implementation labels Jul 28, 2022
@StephenMolloy
Copy link
Member Author

StephenMolloy commented Jul 30, 2022

After moving the schema exporter stuff from the other review back into the runtime assembly, there are 5 API's above that no longer need to be exposed. Additionally, I believe we discussed moving the utility method EncodeLocalName out of the API since it is easily duplicated externally. I was also able to remove the public setter of DataContract.KnownDataContracts as suggested, so now only the getter is in the public API. Unless there are objections, I will be removing these from the API approved above:

namespace System.Runtime.Serialization.DataContracts
{
    public sealed class DataContractSet
    {
        [RequiresUnreferencedCode("")]
        public void Add(Type type);
        [RequiresUnreferencedCode("")]
        public void ExportSchemaSet(XmlSchemaSet schemaSet);
    }

    public abstract class DataContract
    {
        // Removing setter only. Getter still exists on public API.
        public virtual Dictionary<XmlQualifiedName, DataContract>? KnownDataContracts { set; }

        public static string EncodeLocalName(string localName);
        [RequiresUnreferencedCode("")]
        public static DataContract GetDataContract(Type type);
        [RequiresUnreferencedCode("")]
        public static Type GetSurrogateType(ISerializationSurrogateProvider surrogateProvider, Type type);
        [RequiresUnreferencedCode("")]
        public static bool IsTypeSerializable(Type type);
    }
}

With regards to the various collection types in the API signature:

  • I believe Collection<Type> is the correct type for ISerializationSurrogateProvider2.GetKnownCustomDataTypes. The collection is passed to the instance of the surrogate provider, which is defined and created by users. We don't do anything with it that can't be satisfied by using something like IEnumerable<T>, but this was a public API in NetFx. So folks reusing code may depend on that actually being a Collection<T> rather than some looser interface.
  • As requested in the notes above, but missed in the codebox recap - the DataContractSurrogate.ImportSchema overloads accept IEnumerable<> now instead of ICollection<>, and the return type for the second is the concrete List<>.

Also, to be consistent with .Net 4.8 and the new ImportOptions, I used the name 'DataContractSurrogate' for the new property on ExportOptions, rather than the 'SurrogateProvider' name shown above.

Finally, instead of stashing these newly exposed types in a new 'System.Runtime.Serialization.Schema' namespace, we settled on 'System.Runtime.Serialization.DataContracts' since they are all classes that implement/support the DataContract model. (All the internal sub-classes of DataContract have moved into this namespace as well, not that that is relevant to the public API discussion.)

@StephenMolloy
Copy link
Member Author

@bartonjs - As referenced above, we were able to remove several of the API's originally listed when we brought the schema exporter back into the runtime package.

Additionally, @mconnew and I discussed the use of DataContract.Members both internally in the runtime and in the 'external' schema package and realized we don't do any editing of the collection externally. So to tighten up the API even more, we thought it would be prudent to only expose the collection as a ReadOnlyCollection<> rather than an editable collection. And make it not-null as well. To separate the internal use from the external use, the external property name has been updated to "DataMembers." So one more change to the above looks like this:

namespace System.Runtime.Serialization.DataContracts
{
    public abstract class DataContract
    {
        // This previously approved API
        //public virtual List<DataMember>? Members { get; }
        // Now looks like this:
        public virtual ReadOnlyCollection<DataMember> DataMembers { get; }
    }
}

Do we see any problems with this tightening up?

@StephenMolloy
Copy link
Member Author

Implemented in #71752

@teo-tsirpanis teo-tsirpanis removed the api-suggestion Early API idea and discussion, it is NOT ready for implementation label Aug 14, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Sep 13, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
api-approved API was approved in API review, it can be implemented area-Serialization blocking Marks issues that we want to fast track in order to unblock other important work
Projects
None yet
Development

No branches or pull requests

4 participants