Hey,
I've been running into a case where I have a field of type int but need to allow null values. To do this, I now have a new schema that defines that field as a union of null and int:

    "type": [ "null", "int" ]

According to my reading of the spec, Avro should resolve this correctly. For reference, the relevant rule reads as follows (from http://avro.apache.org/docs/current/spec.html#Schema+Resolution):

    if reader's is a union, but writer's is not
        The first schema in the reader's union that matches the writer's schema is recursively resolved against it. If none match, an error is signaled.
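
To make my reading of that rule concrete, here is a minimal sketch of it as a toy model. It does not use Avro at all: schemas are reduced to plain type-name strings, the class and method names (UnionResolutionSketch, firstMatchingBranch) are made up for illustration, and "matches" is simplified to name equality, so type promotion and named types are left out.

```java
import java.util.Arrays;
import java.util.List;

// Toy model only: schemas are represented by their type names,
// not by Avro Schema objects. All names here are hypothetical.
final class UnionResolutionSketch {

    // Returns the index of the first branch of the reader's union whose
    // type name equals the writer's type name; throws if no branch matches.
    static int firstMatchingBranch(String writerType, List<String> readerUnion) {
        for (int i = 0; i < readerUnion.size(); i++) {
            if (readerUnion.get(i).equals(writerType)) {
                return i;
            }
        }
        throw new IllegalArgumentException(
            "No branch of " + readerUnion + " matches writer type " + writerType);
    }

    public static void main(String[] args) {
        List<String> readerUnion = Arrays.asList("null", "int");
        int branch = firstMatchingBranch("int", readerUnion);
        // Prints: int resolves to branch 1 (int)
        System.out.println("int resolves to branch " + branch
            + " (" + readerUnion.get(branch) + ")");
    }
}
```

Under this reading, a value written as int should land in the second branch of the reader's ["null", "int"] union, which is the behavior I expected from the real library.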

However, when I try this, I get:

    org.apache.avro.AvroTypeException: Attempt to process a int when a union was expected.

I've written a simple test that illustrates what I'm saying:

    @Test
    public void testReadingUnionFromValueWrittenAsPrimitive() throws Exception {
        // Writer's schema: "test" is a plain int.
        Schema writerSchema = new Schema.Parser().parse("{\n" +
            " \"type\":\"record\",\n" +
            " \"name\":\"NeighborComparisons\",\n" +
            " \"fields\": [\n" +
            " {\"name\": \"test\",\n" +
            " \"type\": \"int\" }]} ");

        // Reader's schema: "test" is a union of null and int, defaulting to null.
        Schema readersSchema = new Schema.Parser().parse(" {\n" +
            " \"type\":\"record\",\n" +
            " \"name\":\"NeighborComparisons\",\n" +
            " \"fields\": [ {\n" +
            " \"name\": \"test\",\n" +
            " \"type\": [\"null\", \"int\"],\n" +
            " \"default\": null } ] }");

        GenericData.Record record = new GenericData.Record(writerSchema);
        record.put("test", Integer.valueOf(10));

        // Write the record as JSON using the writer's schema.
        ByteArrayOutputStream output = new ByteArrayOutputStream();
        JsonEncoder jsonEncoder = EncoderFactory.get().jsonEncoder(writerSchema, output);
        GenericDatumWriter<GenericData.Record> writer =
            new GenericDatumWriter<GenericData.Record>(writerSchema);
        writer.write(record, jsonEncoder);
        jsonEncoder.flush();
        output.flush();

        System.out.println(output.toString());

        // Read it back with a JSON decoder built from the reader's schema and a
        // datum reader that is given both the writer's and the reader's schema.
        JsonDecoder jsonDecoder = DecoderFactory.get().jsonDecoder(readersSchema, output.toString());
        GenericDatumReader<GenericData.Record> reader =
            new GenericDatumReader<GenericData.Record>(writerSchema, readersSchema);
        GenericData.Record read = reader.read(null, jsonDecoder);

        assertEquals(10, read.get("test"));
    }

Am I misunderstanding how Avro should handle this case of schema resolution, or is the problem in the implementation?

Cheers!

--
Alex

Posted to the Avro user group (avro.apache.org) by Alexandre Normand, Aug 17, '12 at 1:00a