Creating an Avro Schema with Optional Values

I am trying to write a very simple Avro schema (simple only because I want to isolate my current problem) in order to write an Avro data file from data stored in JSON format. The catch is that one field is optional, and either the tooling or I am getting something wrong.

The goal is to avoid writing my own serializer. The end goal is to have existing tooling handle this; I am still in the early stages.

The data (a working example), in a file named so.log:

{ "valid": {"boolean":true} , "source": {"bytes":"live"} } 

The schema in a file named so.avsc:

 { "type":"record", "name":"Event", "fields":[ {"name":"valid", "type": ["null", "boolean"],"default":null} , {"name":"source","type": ["null", "bytes"],"default":null} ] } 

I can easily create an avro file with the following command:

 java -jar avro-tools-1.7.6.jar fromjson --schema-file so.avsc so.log 

So far so good. The thing is that "source" is optional, so I would expect the following data to be valid as well:

 { "valid": {"boolean":true} } 

But executing the same command gives me an error:

 Exception in thread "main" org.apache.avro.AvroTypeException: Expected start-union. Got END_OBJECT
     at org.apache.avro.io.JsonDecoder.error(JsonDecoder.java:697)
     at org.apache.avro.io.JsonDecoder.readIndex(JsonDecoder.java:441)
     at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:229)
     at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
     at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:206)
     at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:155)
     at org.apache.avro.generic.GenericDatumReader.readField(GenericDatumReader.java:193)
     at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:183)
     at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:151)
     at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
     at org.apache.avro.tool.DataFileWriteTool.run(DataFileWriteTool.java:99)
     at org.apache.avro.tool.Main.run(Main.java:84)
     at org.apache.avro.tool.Main.main(Main.java:73)

I tried a lot of variations on the schema, even things that do not follow the Avro specification. The schema shown here is, as far as I know, what the specification calls for.

Does anyone know what I'm doing wrong, and how I can actually have optional elements without writing my own serializer?

Thanks,

1 answer

According to the Java API documentation:

Using the builder requires setting all the fields, even if they are null

The Python API, on the other hand, seems to treat nullable fields as truly optional:

Since the favorite_color field is of type ["string", "null"], we are not required to specify this field

In short, since most tools are written in Java, null fields usually have to be set explicitly.
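One way to act on this without writing a serializer is to pre-process the JSON input so that every nullable field is present. The following is only a minimal sketch in Python, assuming the schema from the question and one JSON record per line. The helper name fill_missing_nullable_fields is hypothetical and not part of any Avro API. It relies on the Avro specification's JSON encoding, where the null branch of a union is written as a bare JSON null.

```python
import json

# Schema from the question (so.avsc), parsed as plain JSON.
SCHEMA = json.loads("""
{ "type":"record", "name":"Event", "fields":[
  {"name":"valid", "type": ["null", "boolean"], "default": null},
  {"name":"source", "type": ["null", "bytes"], "default": null} ] }
""")


def fill_missing_nullable_fields(schema, record):
    """Return a copy of record with every absent nullable field set to None.

    Hypothetical helper: it only handles top-level fields whose type is a
    union containing "null", which is all the question's schema needs.
    """
    filled = dict(record)
    for field in schema["fields"]:
        field_type = field["type"]
        is_nullable_union = isinstance(field_type, list) and "null" in field_type
        if field["name"] not in filled and is_nullable_union:
            filled[field["name"]] = None  # json.dumps writes None as null
    return filled


# The record from the question that triggers "Expected start-union".
line = '{ "valid": {"boolean":true} }'
fixed_line = json.dumps(fill_missing_nullable_fields(SCHEMA, json.loads(line)))
print(fixed_line)
```

The rewritten line carries an explicit "source": null, so it could be saved back to so.log and passed to the same fromjson command, which should then find a value for every field in the schema.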
