[bug] Go and Java Schema Incompatibility in Pulsar #1031

Open
liangyepianzhou opened this issue Jun 20, 2023 · 1 comment

Description

This issue documents schema incompatibilities between the Go and Java Pulsar clients, covering the Avro, JSON, Proto, and ProtoNative schemas.

Avro Schema

The Avro schema is compatible between Go and Java. The Go schema definition and the Java class below describe the same record and work together.

Go schema definition

schema := NewAvroSchema(`{"fields":
    [
        {"name":"id","type":"int"},{"default":null,"name":"name","type":["null","string"]}
    ],
    "name":"MyAvro","namespace":"schemaNotFoundTestCase","type":"record"}`, nil)

Java producer code

@AllArgsConstructor
@NoArgsConstructor
public static class Example {
    public String name;
    public int id;
}

Producer<Example> producer = pulsarClient.newProducer(Schema.AVRO(Example.class))
                .topic(topic).create();

JSON Schema

With the JSON schema, both sides must use identical field names and handle null fields the same way. The defaults differ: Java's Schema.JSON(Example.class) implicitly allows null fields, while the Go JSON schema does not permit a null field unless the definition declares it. Of the two Go definitions below, the first declares Name as a union with null and is compatible with the Java schema; the second declares Name as a plain string and is not.

exampleSchemaDefCompatible := "{\"type\":\"record\",\"name\":\"Example\",\"namespace\":\"test\"," +
	"\"fields\":[{\"name\":\"ID\",\"type\":\"int\"},{\"name\":\"Name\",\"type\":[\"null\", \"string\"]}]}"

consumerJSCompatible := NewJSONSchema(exampleSchemaDefCompatible, nil)

exampleSchemaDefIncompatible := "{\"type\":\"record\",\"name\":\"Example\",\"namespace\":\"test\"," +
	"\"fields\":[{\"name\":\"ID\",\"type\":\"int\"},{\"name\":\"Name\",\"type\":\"string\"}]}"

consumerJSIncompatible := NewJSONSchema(exampleSchemaDefIncompatible, nil)

To achieve compatibility, change the Name field in exampleSchemaDefIncompatible to the union type ["null", "string"], and make sure the field names in the Java Example class match the schema definition.
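The difference between the two definitions can be checked mechanically. The following is a minimal sketch using only the Go standard library, not the Pulsar client; the helper name nullableFields and the constant names are hypothetical. It lists the fields of a schema definition whose type is a union containing "null".

```go
package main

import (
	"encoding/json"
	"fmt"
)

// avroField mirrors one entry of the "fields" array in a schema definition.
type avroField struct {
	Name string          `json:"name"`
	Type json.RawMessage `json:"type"`
}

// avroRecord holds only the part of the definition this check needs.
type avroRecord struct {
	Fields []avroField `json:"fields"`
}

// nullableFields (hypothetical helper) returns the names of the fields
// whose type is a union that contains "null".
func nullableFields(def string) ([]string, error) {
	var rec avroRecord
	if err := json.Unmarshal([]byte(def), &rec); err != nil {
		return nil, err
	}
	var out []string
	for _, f := range rec.Fields {
		var union []json.RawMessage
		// A non-union type (plain string or object) does not decode as an array.
		if err := json.Unmarshal(f.Type, &union); err != nil {
			continue
		}
		for _, branch := range union {
			var s string
			if json.Unmarshal(branch, &s) == nil && s == "null" {
				out = append(out, f.Name)
				break
			}
		}
	}
	return out, nil
}

// The two definitions from the issue, copied as raw strings.
const compatibleDef = `{"type":"record","name":"Example","namespace":"test",` +
	`"fields":[{"name":"ID","type":"int"},{"name":"Name","type":["null", "string"]}]}`

const incompatibleDef = `{"type":"record","name":"Example","namespace":"test",` +
	`"fields":[{"name":"ID","type":"int"},{"name":"Name","type":"string"}]}`

func main() {
	a, _ := nullableFields(compatibleDef)
	b, _ := nullableFields(incompatibleDef)
	fmt.Println(a) // [Name]
	fmt.Println(b) // []
}
```

This only inspects the definition text. Whether the broker accepts a given pair of schemas still depends on its compatibility check.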

Proto Schema

The Proto schemas that Go and Java generate from the same proto message are different and incompatible. If the Java-generated schema definition is used on both sides, the Go consumer registers successfully and receives messages, but decoding still fails: no error is reported, and every field of every decoded message holds its default value.

message TestMessage {
    string stringField = 1;
    int32 intField = 2;
}

Java producer code

    for (int i = 0; i < 10; i++) {
        // Generated protobuf messages are built through newBuilder(); their constructors are private.
        producer.newMessage().value(org.apache.pulsar.client.api.schema.proto.Test.TestMessage.newBuilder()
                .setStringField("message").setIntField(i).build()).send();
    }

Go consumer code

	for {
		msg, err := consumer.Receive(context.Background())
		assert.Nil(t, err)
		err = msg.GetSchemaValue(&unobj)
		assert.Nil(t, err)
		log.Printf("Receive message %s, %d", unobj.StringField, unobj.IntField)
	}

Log in Go

2023/06/20 20:00:40 Receive message , 0
2023/06/20 20:00:40 Receive message , 0
2023/06/20 20:00:40 Receive message , 0
2023/06/20 20:00:40 Receive message , 0
2023/06/20 20:00:40 Receive message , 0
2023/06/20 20:00:40 Receive message , 0
2023/06/20 20:00:40 Receive message , 0
2023/06/20 20:00:40 Receive message , 0
2023/06/20 20:00:40 Receive message , 0
2023/06/20 20:00:40 Receive message , 0
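
Because the decode fails without an error, a consumer can at least flag messages that come back entirely empty. The sketch below is one possible guard, not part of the Pulsar client: testMessage is a hypothetical plain-struct stand-in for the generated TestMessage type, and looksUndecoded and countUndecoded are hypothetical helpers.

```go
package main

import "fmt"

// testMessage is a hypothetical stand-in for the generated TestMessage type.
type testMessage struct {
	StringField string
	IntField    int32
}

// looksUndecoded reports whether every field still holds its zero value,
// which is what the Go consumer in this issue observed for each message.
func looksUndecoded(m testMessage) bool {
	return m == testMessage{}
}

// countUndecoded counts how many messages in a batch look undecoded.
func countUndecoded(msgs []testMessage) int {
	n := 0
	for _, m := range msgs {
		if looksUndecoded(m) {
			n++
		}
	}
	return n
}

func main() {
	// What the Java producer sent for i = 0 and i = 1.
	sent := []testMessage{{"message", 0}, {"message", 1}}
	// What the Go consumer logged: empty string and 0 each time.
	received := []testMessage{{}, {}}

	fmt.Println(countUndecoded(sent), countUndecoded(received))
}
```

A message whose fields are legitimately all defaults is indistinguishable from a failed decode, so a check like this is a warning signal rather than proof. In this issue the producer always sets stringField to "message", so an all-default result cannot be a valid message.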

ProtoNative Schema

As with the Proto schema, Go and Java generate two incompatible ProtoNative schemas for the same proto message. Creating the Go consumer schema from the Java producer's schema definition resolves the schema incompatibility, but the consumer still decodes every message with default values.

Please note that the provided examples and code snippets are for illustrative purposes only and may need to be adapted to suit your specific use case.

@liangyepianzhou

The main reason is that the Java client's Proto and ProtoNative schemas use Avro's avro.java.string extension to specify the string type during Proto parsing. To make the Go side compatible, add the extension annotation to the Proto file used by Go and keep the Java Proto file unchanged. The Go client can then parse the messages correctly.

Proto file in Go:

message TestMessage {
    string stringField = 1 [(avro_java_string) = "String"];
    int32 intField = 2;
}

Proto file in Java:

message TestMessage {
    string stringField = 1;
    int32 intField = 2;
}
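
To confirm that avro.java.string is the only difference between two generated schema definitions, the definitions can be compared with that property ignored. The sketch below uses only the Go standard library. The two definitions in it are hypothetical illustrations of the difference described above, not output captured from either client, and normalize and sameSchema are hypothetical helpers.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// normalize drops the "avro.java.string" property and collapses an object
// that only carries "type" (for example {"type":"string"}) to its inner value.
func normalize(v interface{}) interface{} {
	switch t := v.(type) {
	case map[string]interface{}:
		out := make(map[string]interface{}, len(t))
		for k, val := range t {
			if k == "avro.java.string" {
				continue
			}
			out[k] = normalize(val)
		}
		if inner, ok := out["type"]; ok && len(out) == 1 {
			return inner
		}
		return out
	case []interface{}:
		out := make([]interface{}, len(t))
		for i, val := range t {
			out[i] = normalize(val)
		}
		return out
	default:
		return v
	}
}

// sameSchema reports whether two definitions match once normalized.
func sameSchema(a, b string) (bool, error) {
	var va, vb interface{}
	if err := json.Unmarshal([]byte(a), &va); err != nil {
		return false, err
	}
	if err := json.Unmarshal([]byte(b), &vb); err != nil {
		return false, err
	}
	return reflect.DeepEqual(normalize(va), normalize(vb)), nil
}

// Hypothetical definitions: the Java-style one tags the string type.
const javaDef = `{"type":"record","name":"TestMessage","fields":[` +
	`{"name":"stringField","type":{"type":"string","avro.java.string":"String"}},` +
	`{"name":"intField","type":"int"}]}`

const goDef = `{"type":"record","name":"TestMessage","fields":[` +
	`{"name":"stringField","type":"string"},` +
	`{"name":"intField","type":"int"}]}`

func main() {
	ok, err := sameSchema(javaDef, goDef)
	fmt.Println(ok, err)
}
```

This is a diagnostic aid only. It does not change how either client registers or decodes messages, and real generated definitions may differ in other properties as well.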
