-
Notifications
You must be signed in to change notification settings - Fork 4
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
1 parent
eb7287d
commit d0610fa
Showing
1 changed file
with
126 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1,127 @@ | ||
# protobuf-jackson | ||
|
||
protobuf-jackson is a library for efficient marshalling of Protocol Buffer messages to and from | ||
JSON. It is based on the streaming API of the Jackson JSON handling library and uses Byte Buddy to | ||
generate efficient bytecode at runtime for marshalling specific message types. | ||
|
||
This library requires Java 8 to use. | ||
|
||
## Why another Protocol Buffer <-> JSON marshaller | ||
|
||
The official Protocol Buffer Java library comes with a util library for marshalling to JSON, using | ||
a class called JsonFormat. Unfortunately, JsonFormat is very slow, and its core design of using | ||
manual JSON printers / GSON parsers and introspection during serialization time limit how fast it | ||
can get. | ||
|
||
JSON marshalling is important - it can allow existing REST services to be migrated to Protocol | ||
Buffers and makes it easier to evangelize their use in organizations currently tied heavily to JSON. | ||
However, slow marshalling prevents these use cases from being practical, so this library aims to | ||
allow JSON marshalling to be fast enough for production. It will never be as good as binary | ||
serialization, but should be about as fast as standard REST API libraries. | ||
|
||
If the protoc compiler supports generating source code for JSON marshalling, this library will | ||
likely be deprecated. | ||
|
||
## Usage | ||
|
||
Include protobuf-jackson as a dependency. Gradle users can add | ||
|
||
```kotlin | ||
implementation("org.curioswitch.curiostack:protobuf-jackson:1.3.0") | ||
``` | ||
|
||
to their dependencies. | ||
|
||
The API is exposed in ```MessageMarshaller```. Use its builder to configure options and register | ||
specific Protocol Buffer message types you will marshall. Any other types will not be marshallable | ||
by the built ```MessageMarshaller```. | ||
|
||
```java | ||
class MessageApi { | ||
|
||
private static final MessageMarshaller marshaller = MessageMarshaller.builder() | ||
.register(ApiRequest.getDefaultInstance()) | ||
.register(ApiResponse.getDefaultInstance()) | ||
.build(); | ||
|
||
String coolApi(String jsonRequest) { | ||
ApiRequest.Builder request = ApiRequest.newBuilder(); | ||
marshaller.mergeValue(jsonRequest, request); | ||
|
||
ApiResponse response = processRequest(request); | ||
|
||
return marshaller.writeValueAsString(response); | ||
} | ||
} | ||
``` | ||
|
||
## Design | ||
|
||
protobuf-jackson introspects registered message types during bytecode generation time and generates | ||
code specific to the type. Unlike JsonFormat, this means that during actually marshalling, there is | ||
no introspection of the message type at all - it's already been done during registration. In | ||
addition, messages are accessed through their standard Java generated API, no reflection is needed. | ||
No reflection allows the JIT to optimize generated machine code as much as possible. By generating | ||
bytecode, we approach the same speed that could be achieved by generating the source code itself. | ||
|
||
All of the same tests as ```JsonFormat``` (besides the differences listed below) pass, so | ||
protobuf-jackson should be mostly compatible with upstream and ready for production. | ||
|
||
## Differences with upstream | ||
|
||
Differences with JsonFormat are | ||
|
||
- No support for ```DynamicMessage```. The library is designed to interact with generated code for | ||
optimal performance. If you don't know what ```DynamicMessage``` is, you are not using it. If you | ||
do, stick with ```JsonFormat```. | ||
|
||
- Does not support parsing Any messages where ```@type``` is not the first field. | ||
|
||
- Correctly handles invalid array syntax. ```JsonFormat``` does not due to its dependency on GSON. | ||
See [JsonFormatTest](https://github.com/google/protobuf/blob/master/java/util/src/test/java/com/google/protobuf/util/JsonFormatTest.java#L1124). | ||
|
||
- Correctly rejects invalid JSON which specifies the same field twice. ```JsonFormat``` does not due | ||
to its dependency on GSON. See [JsonFormatTest](https://github.com/google/protobuf/blob/master/java/util/src/test/java/com/google/protobuf/util/JsonFormatTest.java#L458) | ||
|
||
- Fields already present in the passed in ```Message.Builder``` will be overwritten, rather than | ||
causing the parse to fail, as is usual with merge semantics. | ||
|
||
## Benchmarks | ||
|
||
First, a synthetic benchmark that uses the upstream ```TestAllTypes``` proto which contains all | ||
types of fields. | ||
|
||
``` | ||
Benchmark Mode Cnt Score Error Units | ||
JsonParserBenchmark.codegenJson thrpt 50 114260.817 ± 1162.079 ops/s | ||
JsonParserBenchmark.fromBinary thrpt 50 448586.946 ± 7946.499 ops/s | ||
JsonParserBenchmark.upstreamJson thrpt 50 43392.884 ± 715.184 ops/s | ||
JsonSerializeBenchmark.codegenJson thrpt 50 204768.672 ± 2459.787 ops/s | ||
JsonSerializeBenchmark.toBytes thrpt 50 1071878.319 ± 12628.059 ops/s | ||
JsonSerializeBenchmark.upstreamJson thrpt 50 49941.606 ± 707.045 ops/s | ||
``` | ||
|
||
protobuf-jackson generally shows a 4x speed up in serialization and 3x speed up in parsing vs | ||
upstream JSON parsing. It is about 4-5x slower than protobuf binary marshalling. | ||
|
||
For a more real-world example, the JSON response of the [GitHub search API](https://developer.github.com/v3/search/#example) | ||
was used. A proto file corresponding to the format was made to model what an actual data exchange | ||
may look like. This is just one example, and other examples may have different results, but | ||
hopefully the trends are similar | ||
|
||
``` | ||
Benchmark Mode Cnt Score Error Units | ||
GithubApiBenchmark.marshallerParseString thrpt 50 1696.653 ± 170.891 ops/s | ||
GithubApiBenchmark.marshallerWriteString thrpt 50 3042.433 ± 172.438 ops/s | ||
GithubApiBenchmark.protobufParseBytes thrpt 50 4599.390 ± 56.570 ops/s | ||
GithubApiBenchmark.protobufToBytes thrpt 50 21315.212 ± 210.659 ops/s | ||
GithubApiBenchmark.upstreamParseString thrpt 50 472.902 ± 6.455 ops/s | ||
GithubApiBenchmark.upstreamWriteString thrpt 50 717.227 ± 8.834 ops/s | ||
``` | ||
|
||
For both parsing and serializing, protobuf-jackson is about 4x faster. | ||
|
||
One other way to look at these results is that JSON marshalling is now less than an order of | ||
magnitude different than protobuf binary marshalling, which was not the case before. While binary | ||
marshalling will always be preferred, for use cases that demand it, JSON should also be acceptable | ||
in production. |