amousavigourabi/easy-avro

Easy Avro

Simplify dynamic Avro schema generation and usage using POJOs.

Allows for easy definition of Avro schemas using POJOs. Accompanied by a set of annotations to provide more control over the generated schemas. After schema generation, mechanisms are provided to convert to and from Avro records and POJOs for seamless usage.

Easy Avro supports all access levels, and both final and non-final fields alike.

Usage

To use Easy Avro, you will need to interact with me.atour.easyavro.AvroSchema. The AvroSchema class provides mechanisms to generate Avro schemas for provided POJOs and convert said POJOs to Avro's GenericRecords, and back.

To get started with default settings, you can use it as follows.

AvroSchema avroSchema = new AvroSchema(MyPojo.class);
avroSchema.generate();
GenericRecord record = avroSchema.convertFromPojo(pojoInstance);
MyPojo pojoInstance2 = avroSchema.convertToPojo(record);

In the snippet above, the first line creates an AvroSchema instance for the MyPojo class, and the second line generates the Avro schema definition for it. The third line converts a MyPojo instance to Avro's GenericRecord, and the fourth line converts that GenericRecord back to a POJO.

To modify the default behaviour of Easy Avro, you can use class- and field-level annotations. At class level, the @AvroRecord annotation is used. It defines the schema name and sets the naming strategy for the class's fields. By default, the snake case converter is used and the class name serves as the schema name, with $ replaced by _ in the case of nested classes.
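As a rough illustration of the default schema name rule, the sketch below derives a name for a nested class by replacing $ with _ in its binary name. It does not use Easy Avro. Whether the library starts from the fully qualified or the simple class name is not stated here, so the use of Class.getName() is an assumption, and SchemaNameSketch and defaultSchemaName are hypothetical names.

```java
/** Illustrative only: not Easy Avro's implementation. */
final class SchemaNameSketch {

  /** A nested class whose binary name contains a '$'. */
  static class Inner {}

  private SchemaNameSketch() {}

  /**
   * Assumption: the default name is based on Class.getName(), which yields
   * "Outer$Inner" for nested classes (prefixed by the package, if any).
   */
  static String defaultSchemaName(Class<?> type) {
    return type.getName().replace('$', '_');
  }

  public static void main(String[] args) {
    // The nested class separator '$' is replaced by '_'.
    System.out.println(defaultSchemaName(Inner.class));
  }
}
```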

For the naming conversion strategy, six options are available: dromedary case, lowercase, pascal case, screaming snake case, snake case, and uppercase. It is important to note that the field naming converters assume the POJO's fields are already in dromedary case, as is customary in Java. As an example, the field transportBuilder would be converted to transportbuilder in lowercase, TransportBuilder in pascal case, TRANSPORT_BUILDER in screaming snake case, transport_builder in snake case, and TRANSPORTBUILDER in uppercase, while remaining as transportBuilder in dromedary case. In the following snippet, the class-level AvroRecord annotation is used to define both the schema name and naming conversion strategy, as "SCHEMANAME" and dromedary case respectively.

import java.util.List;
import me.atour.easyavro.AvroRecord;
import me.atour.easyavro.FieldNamingStrategies;

@AvroRecord(schemaName = "SCHEMANAME", fieldStrategy = FieldNamingStrategies.DROMEDARY_CASE)
public class Pojo {
  private int[] x;
  private boolean y;
  private List<Integer> z;
}
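To make the naming conversions described above concrete, here is a minimal standalone sketch that re-implements them for dromedary case input. It is not Easy Avro's code. NamingSketch and its method names are hypothetical, and how the library treats runs of capital letters (for example in acronyms) is not covered here, so this sketch simply puts an underscore before every uppercase letter.

```java
import java.util.Locale;

/** Illustrative only: mirrors the documented conversions, not the library's code. */
final class NamingSketch {

  private NamingSketch() {}

  /** Inserts '_' before each uppercase letter (except the first) and lowercases. */
  static String toSnakeCase(String dromedary) {
    StringBuilder out = new StringBuilder();
    for (int i = 0; i < dromedary.length(); i++) {
      char c = dromedary.charAt(i);
      if (Character.isUpperCase(c) && i > 0) {
        out.append('_');
      }
      out.append(Character.toLowerCase(c));
    }
    return out.toString();
  }

  static String toScreamingSnakeCase(String dromedary) {
    return toSnakeCase(dromedary).toUpperCase(Locale.ROOT);
  }

  /** Capitalises only the first character. */
  static String toPascalCase(String dromedary) {
    if (dromedary.isEmpty()) {
      return dromedary;
    }
    return Character.toUpperCase(dromedary.charAt(0)) + dromedary.substring(1);
  }

  static String toLowercase(String dromedary) {
    return dromedary.toLowerCase(Locale.ROOT);
  }

  static String toUppercase(String dromedary) {
    return dromedary.toUpperCase(Locale.ROOT);
  }

  public static void main(String[] args) {
    String field = "transportBuilder";
    System.out.println(toLowercase(field));          // transportbuilder
    System.out.println(toPascalCase(field));         // TransportBuilder
    System.out.println(toScreamingSnakeCase(field)); // TRANSPORT_BUILDER
    System.out.println(toSnakeCase(field));          // transport_builder
    System.out.println(toUppercase(field));          // TRANSPORTBUILDER
  }
}
```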

To customise how individual fields are treated, use the @AvroField annotation. It can modify two things: the name of the field in the final Avro schema, and whether the field is included at all. It cannot override the exclusion of static fields, which remain excluded regardless. It can, however, exclude fields that would normally be included in the schema. A name specified in an @AvroField annotation always overrides the generated name, provided that the specified name is not blank. The snippet below showcases how the annotation can be used to customise the way fields in the Pojo class defined above are treated.

import java.util.List;
import me.atour.easyavro.AvroRecord;
import me.atour.easyavro.FieldNamingStrategies;
import me.atour.easyavro.field.AvroField;

@AvroRecord(schemaName = "SCHEMANAME", fieldStrategy = FieldNamingStrategies.DROMEDARY_CASE)
public class Pojo {

  @AvroField(name = "intArray", included = false)
  private int[] x;

  private boolean y;

  @AvroField(name = "int_list", included = true)
  private List<Integer> z;
}

In this snippet, the Pojo.x field is not included in the generated Avro schema, because the included property of its @AvroField annotation is set to false. This also causes the name defined in the same annotation to be ignored. The Pojo.y field has no annotation and is thus treated the same as before: it is included as y in the schema.

The Pojo.z field is annotated as well but, unlike Pojo.x, has its included property set to true, so it is included in the generated schema. The specified name int_list overrides the name z that would otherwise be generated. Note that although the class uses the dromedary case field naming strategy, int_list is still used for this field. Field-level annotations can thus "overrule" the strategy specified at class level.

As the default value for the name property is generated by the naming strategy and the default value for the included property is true, the annotation on Pojo.z could also have omitted the included value, leaving @AvroField(name = "int_list").
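The rules above (static fields are always skipped, included = false drops a field, and a non-blank name wins over the generated one) can be sketched with plain reflection. This is not Easy Avro's implementation. SketchField is a hypothetical stand-in for @AvroField so that the example compiles without the library, and the generated name is simply the Java field name, which corresponds to the dromedary case strategy.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: models the documented field rules, not the library's code. */
final class FieldRulesSketch {

  /** Hypothetical stand-in for @AvroField with the defaults described above. */
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.FIELD)
  @interface SketchField {
    String name() default "";
    boolean included() default true;
  }

  static class Pojo {
    static int counter; // static: never part of the schema

    @SketchField(name = "intArray", included = false)
    private int[] x;

    private boolean y;

    @SketchField(name = "int_list")
    private List<Integer> z;
  }

  private FieldRulesSketch() {}

  /** Returns the schema field names; reflection does not guarantee their order. */
  static List<String> schemaFieldNames(Class<?> type) {
    List<String> names = new ArrayList<>();
    for (Field field : type.getDeclaredFields()) {
      if (field.isSynthetic() || Modifier.isStatic(field.getModifiers())) {
        continue; // static (and compiler-generated) fields stay excluded
      }
      SketchField annotation = field.getAnnotation(SketchField.class);
      if (annotation != null && !annotation.included()) {
        continue; // explicitly excluded; any name given is ignored
      }
      boolean hasCustomName = annotation != null && !annotation.name().trim().isEmpty();
      names.add(hasCustomName ? annotation.name() : field.getName());
    }
    return names;
  }

  public static void main(String[] args) {
    // Contains "y" and "int_list"; x and counter are left out.
    System.out.println(schemaFieldNames(Pojo.class));
  }
}
```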

Installation

To install the project, first clone it from GitHub. Then go to the directory it was cloned to and run the Maven install command to install the project to your local Maven repository.

mvn clean install

Then, you can use the project by including the following Maven dependency in your projects.

<dependency>
  <groupId>me.atour.easy-avro</groupId>
  <artifactId>easy-avro</artifactId>
  <version>1.0.0-SNAPSHOT</version>
</dependency>

It is important to note that the sun.misc.Unsafe class is used to convert Avro records to POJOs. This means that Unsafe will have to be available to the runtime environment for Easy Avro to function properly.
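If you are unsure whether your runtime exposes Unsafe, a quick reflective check along the following lines can help. This is a sketch that does not use Easy Avro. UnsafeCheck is a hypothetical helper, and the lookup of the theUnsafe field assumes an OpenJDK-style runtime.

```java
import java.lang.reflect.Field;

/** Illustrative only: probes for sun.misc.Unsafe without using it. */
final class UnsafeCheck {

  private UnsafeCheck() {}

  /** Returns true if the Unsafe singleton can be obtained reflectively. */
  static boolean isUnsafeAvailable() {
    try {
      Class<?> unsafeClass = Class.forName("sun.misc.Unsafe");
      // Assumption: the singleton is held in a static field named "theUnsafe".
      Field theUnsafe = unsafeClass.getDeclaredField("theUnsafe");
      theUnsafe.setAccessible(true);
      return theUnsafe.get(null) != null;
    } catch (ReflectiveOperationException | RuntimeException e) {
      // Class or field missing, or access denied by the runtime.
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println("sun.misc.Unsafe available: " + isUnsafeAvailable());
  }
}
```

On Java 9 and later, sun.misc.Unsafe lives in the jdk.unsupported module, so a custom runtime image built without that module would fail this check.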

Logging

For logging, SLF4J is used. Easy Avro does not provide an implementation, so it will fall back to the no-op logger implementation by default. If the project that uses Easy Avro provides an SLF4J implementation, Easy Avro will use that logger instead.

Contribute

If you want to contribute code to the project, please fork the repository and submit a pull request explaining the contribution, linking to an issue whenever possible. Please do not forget to include tests.

To ensure a consistent codebase, Checkstyle, Spotless, and PMD are used. To run these linters together with the test suite, run the Maven verify lifecycle phase with mvn clean verify. To apply the suggestions from Spotless, run the apply goal Spotless provides with mvn spotless:apply.
