I am trying to write a very simple Avro schema (simple only because I am isolating my current problem) in order to write an Avro data file from data stored in JSON format. The catch is that one field is optional, and either Avro or I am not getting it right.
The goal is not to write my own serializer; the end goal is to have it in the tray, but I am still in the early stages.
Data (working), in a file named so.log:
{ "valid": {"boolean":true} , "source": {"bytes":"live"} }
The schema in a file named so.avsc:
{ "type":"record", "name":"Event", "fields":[ {"name":"valid", "type": ["null", "boolean"],"default":null} , {"name":"source","type": ["null", "bytes"],"default":null} ] }
I can easily create an avro file with the following command:
java -jar avro-tools-1.7.6.jar fromjson --schema-file so.avsc so.log
So far so good. The thing is that "source" is optional, so I expect the following data to be valid as well:
{ "valid": {"boolean":true} }
But executing the same command gives me an error:
Exception in thread "main" org.apache.avro.AvroTypeException: Expected start-union. Got END_OBJECT
    at org.apache.avro.io.JsonDecoder.error(JsonDecoder.java:697)
    at org.apache.avro.io.JsonDecoder.readIndex(JsonDecoder.java:441)
    at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:229)
    at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
    at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:206)
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:155)
    at org.apache.avro.generic.GenericDatumReader.readField(GenericDatumReader.java:193)
    at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:183)
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:151)
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
    at org.apache.avro.tool.DataFileWriteTool.run(DataFileWriteTool.java:99)
    at org.apache.avro.tool.Main.run(Main.java:84)
    at org.apache.avro.tool.Main.main(Main.java:73)
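To make the mismatch concrete, here is a small Python sketch (standard library only) of a preprocessing step I could put in front of fromjson. It rests on an assumption I have not confirmed: that the JSON decoder wants every field of the record present in the input, which is what "Expected start-union. Got END_OBJECT" seems to suggest, and that the "default" in the schema is not applied at this stage. The function name fill_missing_nullable is my own, not part of Avro.

```python
import json

# Same schema as so.avsc, inlined so the sketch is self-contained.
SCHEMA = json.loads("""
{ "type":"record", "name":"Event", "fields":[
  {"name":"valid",  "type": ["null", "boolean"], "default": null},
  {"name":"source", "type": ["null", "bytes"],   "default": null} ] }
""")


def fill_missing_nullable(record, schema):
    """Return a copy of `record` where every absent field whose type is a
    union containing "null" is set explicitly to None (JSON null)."""
    filled = dict(record)
    for field in schema["fields"]:
        ftype = field["type"]
        is_nullable_union = isinstance(ftype, list) and "null" in ftype
        if field["name"] not in filled and is_nullable_union:
            filled[field["name"]] = None
    return filled


if __name__ == "__main__":
    line = '{ "valid": {"boolean":true} }'
    print(json.dumps(fill_missing_nullable(json.loads(line), SCHEMA)))
    # prints {"valid": {"boolean": true}, "source": null}
```

If the assumption holds, the rewritten line with an explicit "source": null should be accepted by the same fromjson command, but that only works around the problem; it does not make the field truly optional in the input.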
I have tried many changes to the schema, even things that do not follow the Avro specification. The schema shown here is, as far as I know, what the specification prescribes.
Does anyone know what I'm doing wrong, and how I can actually have optional elements without writing my own serializer?
Thanks,