I have two similar schemas that differ in only one nested field (it is called onefield in schema1 and anotherfield in schema2).
SCHEMA1
{
  "type": "record",
  "name": "event",
  "namespace": "foo",
  "fields": [
    {
      "name": "metadata",
      "type": {
        "type": "record",
        "name": "event",
        "namespace": "foo.metadata",
        "fields": [
          { "name": "onefield", "type": [ "null", "string" ], "default": null }
        ]
      },
      "default": null
    }
  ]
}
SCHEMA2
{
  "type": "record",
  "name": "event",
  "namespace": "foo",
  "fields": [
    {
      "name": "metadata",
      "type": {
        "type": "record",
        "name": "event",
        "namespace": "foo.metadata",
        "fields": [
          { "name": "anotherfield", "type": [ "null", "string" ], "default": null }
        ]
      },
      "default": null
    }
  ]
}
I can merge both schemas programmatically with Avro 1.8.0:
Schema s1 = new Schema.Parser().parse(schema1);
Schema s2 = new Schema.Parser().parse(schema2);
Schema[] schemas = {s1, s2};

// Fold every schema into a single merged schema.
Schema mergedSchema = null;
for (Schema schema : schemas) {
    mergedSchema = AvroStorageUtils.mergeSchema(mergedSchema, schema);
}
and use the result to convert input JSON to Avro and back to JSON:
JsonAvroConverter converter = new JsonAvroConverter();
try {
    byte[] example = "{}".getBytes("UTF-8");
    // Round trip: JSON -> Avro binary -> JSON, both using the merged schema.
    byte[] avro = converter.convertToAvro(example, mergedSchema);
    byte[] json = converter.convertToJson(avro, mergedSchema);
    System.out.println(new String(json));
} catch (AvroConversionException e) {
    e.printStackTrace();
}
This code prints the expected result: {"metadata":{"onefield":null,"anotherfield":null}}. The problem is that I cannot print the merged schema itself. If I simply call System.out.println(mergedSchema), I get the following exception:
Exception in thread "main" org.apache.avro.SchemaParseException: Can't redefine: merged schema (generated by AvroStorage).merged
    at org.apache.avro.Schema$Names.put(Schema.java:1127)
    at org.apache.avro.Schema$NamedSchema.writeNameRef(Schema.java:561)
    at org.apache.avro.Schema$RecordSchema.toJson(Schema.java:689)
    at org.apache.avro.Schema$RecordSchema.fieldsToJson(Schema.java:715)
    at org.apache.avro.Schema$RecordSchema.toJson(Schema.java:700)
    at org.apache.avro.Schema.toString(Schema.java:323)
    at org.apache.avro.Schema.toString(Schema.java:313)
    at java.lang.String.valueOf(String.java:2982)
    at java.lang.StringBuilder.append(StringBuilder.java:131)
I call this the Avro uncertainty principle :). It looks like Avro can work with the merged schema, but fails as soon as it tries to serialize that schema to JSON. Merging works with simpler schemas, so to me this looks like a bug in Avro 1.8.0.
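One possible reading of the message, not confirmed against the Avro or AvroStorageUtils source: the trace fails in Schema$Names.put while writing a nested record, and the name it refuses to redefine is "merged schema (generated by AvroStorage).merged". That would fit a merge that gives both the outer record and the nested metadata record the same full name. The plain-Java sketch below is a toy model of that situation, not Avro code: RedefinitionSketch, Record, toJson and nested are hypothetical names, the registry compares by identity for simplicity, and the short name "merged" stands in for the real one.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RedefinitionSketch {

    /** Toy stand-in for a named record schema (hypothetical, not org.apache.avro.Schema). */
    public static final class Record {
        final String fullName;
        // Field name -> nested record, or null for a plain leaf field.
        final Map<String, Record> fields = new LinkedHashMap<>();

        Record(String fullName) { this.fullName = fullName; }

        Record field(String name, Record nestedRecord) {
            fields.put(name, nestedRecord);
            return this;
        }
    }

    /** Writes a record as JSON while tracking every full name already defined. */
    public static String toJson(Record record, Map<String, Record> seen) {
        Record known = seen.get(record.fullName);
        if (known == record) {
            return "\"" + record.fullName + "\""; // already defined: emit a name reference
        }
        if (known != null) {
            // A different record already owns this name, so serialization gives up.
            throw new IllegalStateException("Can't redefine: " + record.fullName);
        }
        seen.put(record.fullName, record);
        StringBuilder sb = new StringBuilder("{\"name\":\"" + record.fullName + "\",\"fields\":[");
        boolean first = true;
        for (Map.Entry<String, Record> e : record.fields.entrySet()) {
            if (!first) sb.append(',');
            first = false;
            String type = e.getValue() == null ? "\"string\"" : toJson(e.getValue(), seen);
            sb.append("{\"name\":\"").append(e.getKey()).append("\",\"type\":").append(type).append('}');
        }
        return sb.append("]}").toString();
    }

    /** Builds an outer record named "merged" whose metadata field holds a record named innerName. */
    public static Record nested(String innerName) {
        Record inner = new Record(innerName).field("onefield", null).field("anotherfield", null);
        return new Record("merged").field("metadata", inner);
    }

    public static void main(String[] args) {
        // Distinct names for the outer and nested record: serialization succeeds.
        System.out.println(toJson(nested("metadata.merged"), new LinkedHashMap<>()));
        // Same name for both: the in-memory structure is fine, but writing it fails.
        try {
            toJson(nested("merged"), new LinkedHashMap<>());
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If this reading is right, it would also explain why simpler schemas merge and print without trouble: with no nested record there is only one record carrying the merged name.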
Do you know what might be going wrong or how to solve it? Any workarounds (e.g. alternative Schema serializers) are welcome.